This repository contains some of my completed data science projects: the ones I find interesting and that are commented well enough to serve as notes in the future.
Note: Some datasets used in this repository are fake and were created for practice purposes only.
- Install dependencies using requirements.txt:
pip install -r requirements.txt
- Run the notebooks as usual with a Jupyter Notebook server, VS Code, etc. To install Jupyter Notebook with pip:
pip install notebook
To start Jupyter Notebook, run:
jupyter notebook
-
- Action Recognition: Isolated Sign Language Recognition (ISLR): Training and testing different action recognition models and data augmentation techniques on ISLR datasets.
- Object Detection: Sign Language: Training a TensorFlow object detection model to read sign language and translate it to text in real time.
Tools: PyTorch, MMAction2, Pandas, NumPy, OpenCV, PIL, Seaborn, Matplotlib, and TensorFlow.
-
- Python
- Ankara University Turkish Sign Language Dataset (AUTSL): Analysis of one of the largest sign language recognition datasets to explore the different strengths and weaknesses of the dataset.
- Word-Level American Sign Language (ASL): Analysis of one of the most popular sign language recognition datasets to compare with the AUTSL dataset.
- Ecommerce Customer Time Spread Analysis: Analysis of the time users spend in-store, on the company's mobile app, and on its website to determine which improvements the company needs.
- Ad-Click Based on Personal Info: Analysis of the likelihood of an ad being clicked based on the features of the ad and the personal info of the user.
Tools: Pandas, Seaborn, and Matplotlib
- R: As part of Part II Software Engineering, I did some data analysis in R here. These files are uncommented and not pleasing to the eye. I will format them to be readable in the future.
- Python
-
- Decision Trees: LendingClub Loan Repayment: A model that uses decision trees to predict whether a debtor will repay a loan, trained on a real LendingClub (lendingclub.com) dataset from Kaggle.
- K-Means Clustering: Cluster Public/Private University Students: Testing the K-Means clustering algorithm to build a model that accurately clusters students from private and public universities.
- Recommender System: Movie Recommender: A supervised model to recommend movies based on users and their reviews.
- Reinforcement Learning: Frozen Lake Environment: Implementing an optimized Q-Learning agent that will navigate a non-deterministic environment with a fairly high success rate by using reinforcement learning.
Tools: scikit-learn, Pandas, Seaborn, Matplotlib, PyTorch, and Pygame
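For quick reference, here is a minimal sketch of the tabular Q-Learning update that the Frozen Lake project is built around. It is not the code from the notebook: the function names, the 4x4 grid size (16 states, 4 actions), and the hyperparameters `alpha`, `gamma`, and `epsilon` are all assumptions chosen for illustration.

```python
import random


def q_update(q_table, state, action, reward, next_state, alpha=0.1, gamma=0.99):
    """Apply one Q-Learning update to q_table in place.

    alpha (learning rate) and gamma (discount factor) are illustrative
    defaults, not values taken from the project.
    """
    best_next = max(q_table[next_state])   # value of the greedy next action
    td_target = reward + gamma * best_next  # bootstrapped return estimate
    q_table[state][action] += alpha * (td_target - q_table[state][action])


def choose_action(q_table, state, epsilon, rng=random):
    """Epsilon-greedy choice: explore with probability epsilon, else exploit."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_table[state]))
    row = q_table[state]
    return row.index(max(row))


# Assumed layout: a 4x4 grid (16 states) with 4 moves per state.
n_states, n_actions = 16, 4
q_table = [[0.0] * n_actions for _ in range(n_states)]

# One hypothetical transition: stepping from state 14 into the goal (state 15).
q_update(q_table, state=14, action=2, reward=1.0, next_state=15)
print(q_table[14][2])  # 0.1
```

Because the environment is non-deterministic, a single update like this is noisy; the estimate only becomes useful after many episodes, usually with `epsilon` decayed over time.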
-
- Yelp Reviews: Classifying the star rating of a Yelp review based on the text that the review contains.
-
- DNN Approach: LendingClub Loan Repayment: Trying out a deep learning approach to predict whether a debtor will repay their debt based on their personal information.
- DNN: King County House Prices: Using deep learning to predict house prices in King County based on house features such as location, number of bedrooms, space, etc.
Tools: Pandas, Seaborn, Matplotlib, and scikit-learn
-
- Benford's Law: US Elections: Graphing results from different US presidential elections to see if Benford's Law applies to them. The file needs to be saved as HTML and opened in a browser.
- Adding Numbers Using NN: Optimising the number of nodes required to add two numbers.
- Multiplying Numbers Using NN: Limit-testing concepts in order to implement a simple neural network that can multiply numbers.
Tools: Pandas and scikit-learn
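As a note on the Benford's Law check: the law predicts that a leading digit d appears with probability log10(1 + 1/d). The sketch below compares that expectation with observed leading-digit shares. It is an illustration only, not the project's code; the function names are hypothetical and the sample counts are made up.

```python
import math
from collections import Counter


def benford_expected():
    """Expected share of each leading digit 1-9 under Benford's Law."""
    return {d: math.log10(1 + 1 / d) for d in range(1, 10)}


def leading_digit_shares(values):
    """Observed share of each leading digit among the positive integers in values."""
    digits = [int(str(v)[0]) for v in values if v > 0]
    if not digits:
        raise ValueError("need at least one positive value")
    counts = Counter(digits)
    return {d: counts.get(d, 0) / len(digits) for d in range(1, 10)}


# Hypothetical vote counts, invented purely for this example.
sample_counts = [1203, 187, 2954, 1120, 431, 1789, 96, 1345, 210, 3021]

expected = benford_expected()
observed = leading_digit_shares(sample_counts)
for d in range(1, 10):
    print(f"{d}: expected {expected[d]:.3f}, observed {observed[d]:.3f}")
```

With only ten values the observed shares are far too coarse to say anything; a real comparison needs a large set of counts spanning several orders of magnitude.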
If any of these projects interests you, or you would like to reach out for any other reason, contact me by email at [email protected] ❤️ඞ