Can LSTM Networks Beat the Market? A Deep Learning Approach to S&P 500 Prediction
Dec 30, 2024
python data-science machine-learning lstm finance forecasting tensorflow keras pandas

A comparative study of neural network models (LSTM, CNN-LSTM) for S&P 500 index prediction. The analysis evaluates multiple architectures against baseline strategies such as buy-and-hold and naive forecasting. Specialized models such as LSTM with Batch Normalization and Regularization improve technical error metrics (MSE, MAE), but in simulated trading, beating market returns remains a significant challenge that requires further architectural and strategy refinement.
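
For readers unfamiliar with the baselines named above, here is a minimal sketch of a naive forecast (tomorrow's price equals today's) scored with MSE and MAE, alongside a buy-and-hold return. The price list and all function names are made up for illustration and are not taken from the project.

```python
def naive_forecast(prices):
    """Predict each day's price as the previous day's price."""
    return prices[:-1]


def mse(actual, predicted):
    """Mean squared error between two equal-length sequences."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def mae(actual, predicted):
    """Mean absolute error between two equal-length sequences."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def buy_and_hold_return(prices):
    """Simple return from buying at the first price and selling at the last."""
    return prices[-1] / prices[0] - 1.0


# Hypothetical closing prices, not real S&P 500 data.
prices = [100.0, 102.0, 101.0, 105.0]

actual = prices[1:]                  # days 2..n are the targets
predicted = naive_forecast(prices)   # each target is predicted by the prior day

baseline_mse = mse(actual, predicted)
baseline_mae = mae(actual, predicted)
hold_return = buy_and_hold_return(prices)
```

A learned model would need to beat `baseline_mse` and `baseline_mae` on held-out data, and a trading strategy built on it would need to beat `hold_return`, for either to count as an improvement over these baselines.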

Credit Score Classification with PySpark Machine Learning
Sep 16, 2024
big-data pyspark machine-learning decision-tree-classifier random-forest-classifier multilayer-perceptron python

This project involves building a credit score classification model using PySpark's machine learning library. By leveraging distributed computing, we compare the performance of Multilayer Perceptron, Decision Tree Classifier, and Random Forest Classifier to predict creditworthiness based on selected customer and financial features.

Big Data Analysis - Using Hadoop for MapReduce, Cluster Analysis, and Image Classification
Jul 08, 2024
big-data mapreduce apache-mahout machine-learning image-classification python

This project explores distributed computing across three analytical domains: descriptive statistics, clustering, and image classification on large datasets. The implementation includes Hadoop MapReduce jobs for analyzing a dataset of hourly weather observations, and unsupervised learning with Apache Mahout, using several distance metrics, on a dataset of French plays. It also showcases a scalable cat and dog classifier that uses the CLIP model within a Hadoop Streaming framework.

Pattern Recognition in Stock Price Volatility and Market Performance
Jan 11, 2024
machine-learning scikit-learn data-science web-scrape finance time-series-analysis

An in-depth exploration of stock market data through feature engineering and pattern recognition. This study analyzes historical price changes and volatility to identify trends across diverse companies using custom statistical metrics and datetime features.

Supervised Learning Benchmarks for Numeric and Textual Data
Sep 18, 2023
machine-learning scikit-learn data-visualization classification

This project conducts a detailed evaluation of popular machine learning algorithms and their performance characteristics. It benchmarks Naive Bayes, Random Forest, and k Nearest Neighbors across multiple datasets ranging from simple iris data to complex geospatial and text categories. The analysis explores the relationship between hyperparameter tuning and model efficiency while providing quantitative results on accuracy and execution time.

Wine Classification using k Nearest Neighbour from Scratch
Jul 17, 2023
python machine-learning clustering matplotlib seaborn

I created a machine learning model to categorize types of Italian wine. This project features a custom implementation of the k Nearest Neighbour algorithm. I also developed the nested cross validation logic without using external libraries. The work explores model stability when handling data with Gaussian noise.
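
As a rough illustration of what a from-scratch k Nearest Neighbour classifier can look like, here is a minimal sketch using only the Python standard library. It assumes Euclidean distance and majority voting; the project's actual distance metric, tie-breaking, and data are not stated here, so the function names and toy points below are hypothetical.

```python
import math
from collections import Counter


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def knn_predict(train_X, train_y, query, k=3):
    """Return the majority label among the k training points closest to query."""
    # Sort (features, label) pairs by distance to the query point.
    neighbours = sorted(
        zip(train_X, train_y),
        key=lambda pair: euclidean(pair[0], query),
    )
    votes = Counter(label for _, label in neighbours[:k])
    # On a tied vote, Counter returns the label encountered first,
    # which here is the one with the nearest neighbour.
    return votes.most_common(1)[0][0]


# Toy two-feature data, not the wine dataset.
train_X = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (5.0, 5.0), (5.2, 4.9), (4.8, 5.1)]
train_y = ["A", "A", "A", "B", "B", "B"]

prediction = knn_predict(train_X, train_y, (1.1, 1.0), k=3)
```

Sorting every training point per query is simple but costs O(n log n) per prediction, which is fine for a small dataset.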