Netflix is all about connecting people to the movies they love. To help customers find those movies, they developed world-class movie recommendation system: CinematchSM. Its job is to predict whether someone will enjoy a movie based on how much they liked or disliked other movies. Netflix use those predictions to make personal movie recommendati…

gujralsanyam22/NETFLIX-MOVIE-RECOMMENDATION-ENGINE-1

NETFLIX-MOVIE-RECOMMENDATION-ENGINE-1

Table of Contents:

Overview:

To help customers find movies they will love, Netflix developed a world-class movie recommendation system, Cinematch℠. Its job is to predict whether someone will enjoy a movie based on how much they liked or disliked other movies. Netflix uses those predictions to make personal movie recommendations based on each customer’s unique tastes. And while Cinematch is doing pretty well, it can always be made better.

Business Problem:

Netflix is all about connecting people to the movies they love. To help customers find those movies, Netflix developed a world-class movie recommendation system, Cinematch℠. Its job is to predict whether someone will enjoy a movie based on how much they liked or disliked other movies. Netflix uses those predictions to make personal movie recommendations based on each customer’s unique tastes. And while Cinematch is doing pretty well, it can always be made better. There are a lot of interesting alternative approaches to how Cinematch works that Netflix hasn’t tried. Some are described in the literature, some aren’t. In the words of the Netflix Prize rules: "We’re curious whether any of these can beat Cinematch by making better predictions. Because, frankly, if there is a much better approach it could make a big difference to our customers and our business." Credits: https://www.netflixprize.com/rules.html

PROBLEM STATEMENT:

Netflix provided a lot of anonymous rating data, and a prediction accuracy bar that is 10% better than what Cinematch can do on the same training data set. (Accuracy is a measurement of how closely predicted ratings of movies match subsequent actual ratings.)

Motivation:

What better way to use the unfortunate lockdown period? Like most of you, I spend my weekends cooking, watching Netflix, coding and reading the latest research papers. The idea of classifying Indian currency struck me while I was browsing through some research papers. I couldn't find any relevant research paper (or, of course, a dataset!) on the subject. That led me to collect images of Indian currency to train a deep learning model using this amazing tool.

Technical Aspect:

This project is divided into two parts:

  1. Training a deep learning model using Keras. (Not covered in this repo. I'll update the link here once I make it public.)

Installation

The code is written in Python 3.7. If you don't have Python installed, you can find it here. If you are using an older version of Python, upgrade it first, and make sure you have the latest version of pip. To install the required packages and libraries, run this command in the project directory after cloning the repository:

pip install -r requirements.txt

Run

STEP 1

Linux and macOS User

Open .bashrc or .zshrc file and add the following credentials:

export AWS_ACCESS_KEY="your_aws_access_key"
export AWS_SECRET_KEY="your_aws_secret_key"
export ICP_BUCKET='your_aws_bucket_name'
export ICP_BUCKET_REGION='bucket_region'
export ICP_UPLOAD_DIR='bucket_path_to_save_images'
export ICP_PRED_DIR='bucket_path_to_save_predictions'
export ICP_FLASK_SECRET_KEY='anything_random_but_unique'
export SENTRY_INIT='URL_given_by_sentry'

Note: SENTRY_INIT is optional. Set it only if you want to catch exceptions in the app; otherwise comment out or remove the Sentry dependencies and the associated code in app/main.py.
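The variables above could be read on the Python side roughly as follows. This is a minimal sketch, not the code in app/main.py: the `load_settings` helper and the `REQUIRED`/`OPTIONAL` tuples are hypothetical names introduced for illustration, and only the variable names come from the list above.

```python
import os

# Variable names taken from the export list above.
REQUIRED = (
    "AWS_ACCESS_KEY",
    "AWS_SECRET_KEY",
    "ICP_BUCKET",
    "ICP_BUCKET_REGION",
    "ICP_UPLOAD_DIR",
    "ICP_PRED_DIR",
    "ICP_FLASK_SECRET_KEY",
)
OPTIONAL = ("SENTRY_INIT",)  # only needed when Sentry error reporting is wanted


def load_settings(environ=os.environ):
    """Hypothetical helper: collect settings and fail early if a required one is unset."""
    missing = [name for name in REQUIRED if not environ.get(name)]
    if missing:
        raise RuntimeError("missing environment variables: " + ", ".join(missing))
    settings = {name: environ[name] for name in REQUIRED}
    for name in OPTIONAL:
        settings[name] = environ.get(name)  # None when the variable is not set
    return settings
```

Calling `load_settings()` once at startup would surface a missing credential immediately rather than at the first upload.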

Windows User

Since I don't have a system running Windows, I have collected some helpful resources on adding User Environment Variables in Windows.

Attention: Please perform the steps given in these tutorials at your own risk. Please don't modify the System Variables unless you know what you're doing; a mistake there can potentially damage your PC.

STEP 2

To run the app on a local machine, run this command in the project directory:

gunicorn wsgi:app

Directory Tree

├── app 
│   ├── __init__.py
│   ├── main.py
│   ├── model
│   ├── static
│   └── templates
├── config
│   ├── __init__.py
├── processing
│   ├── __init__.py
├── requirements.txt
├── runtime.txt
├── LICENSE
├── Procfile
├── README.md
└── wsgi.py

To Do

  1. Convert the app to run without any internet connection, i.e. PWA.
  2. Add a better visualization chart to display the predictions.

Bug / Feature Request

If you'd like to request a new function, feel free to do so by opening an issue here. Please include sample queries and their corresponding results.

Technologies Used

Sources 

Real world/Business Objectives and constraints 

Objectives:

  1. Predict the rating that a user would give to a movie that they have not yet rated.
  2. Minimize the difference between predicted and actual rating (RMSE and MAPE) 
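The two error metrics named in the objectives can be sketched in plain Python as below. This is an illustration under stated assumptions rather than the notebook's own code: the function names `rmse` and `mape` and the sample ratings are made up, and MAPE is taken here as the mean of |actual - predicted| / |actual|, expressed in percent.

```python
import math


def rmse(actual, predicted):
    """Root mean squared error between actual and predicted ratings."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def mape(actual, predicted):
    """Mean absolute percentage error, in percent.

    Ratings run from 1 to 5, so the division by the actual rating is safe.
    """
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    ratios = [abs(a - p) / abs(a) for a, p in zip(actual, predicted)]
    return 100.0 * sum(ratios) / len(ratios)


# Made-up ratings, for illustration only.
actual = [4, 2, 5, 1]
predicted = [3.0, 2.0, 4.0, 2.0]
print(round(rmse(actual, predicted), 4))  # 0.866
print(round(mape(actual, predicted), 2))  # 36.25
```

RMSE penalizes large misses more heavily, while MAPE weights an error of one star on a 1-star movie far more than the same error on a 5-star movie, which is one reason to report both.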

Constraints:

  1. Some form of interpretability.
  2. There is no low latency requirement as the recommended movies can be precomputed earlier.

Type of Data:

  • There are 17770 unique movie IDs.
  • There are 480189 unique user IDs.
  • There are ratings. Ratings are on a five star (integral) scale from 1 to 5.
  • Each rating has a date, in the format YYYY-MM-DD, on which the user watched the movie.
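The fields listed above can be validated while loading. The sketch below assumes a simple CSV layout with the columns `movie,user,rating,date`; that layout, the `load_ratings` helper, and the sample rows are hypothetical and are not taken from the dataset files themselves.

```python
import csv
import io
from datetime import date, datetime

# Made-up rows in an assumed CSV layout: movie ID, user ID, rating, date.
SAMPLE = """movie,user,rating,date
1,101,3,2005-09-06
1,102,5,2005-05-13
2,101,4,2005-09-05
"""


def load_ratings(handle):
    """Parse rating rows, checking the 1-5 integer scale and the YYYY-MM-DD date."""
    rows = []
    for record in csv.DictReader(handle):
        rating = int(record["rating"])
        if not 1 <= rating <= 5:
            raise ValueError(f"rating out of range: {rating}")
        day = datetime.strptime(record["date"], "%Y-%m-%d").date()
        rows.append((int(record["movie"]), int(record["user"]), rating, day))
    return rows


ratings = load_ratings(io.StringIO(SAMPLE))
unique_movies = {movie for movie, _, _, _ in ratings}
unique_users = {user for _, user, _, _ in ratings}
print(len(ratings), len(unique_movies), len(unique_users))  # 3 2 2
```

On the full dataset the same two set sizes would be expected to match the counts of unique movie IDs and user IDs given above.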

Getting Started

Start by downloading the project, then open and run the "NetflixMoviesRecommendation.ipynb" file in ipython-notebook.

Prerequisites

You need to have the following software and libraries installed on your machine before running this project.

  1. Python 3
  2. Anaconda: It will install ipython notebook and most of the libraries which are needed like sklearn, pandas, seaborn, matplotlib, numpy, scipy.
  3. XGBoost
  4. Surprise

Installing

  1. Python 3: https://www.python.org/downloads/
  2. Anaconda: https://www.anaconda.com/download/
  3. XGBoost: conda install -c conda-forge xgboost
  4. Surprise: pip install surprise
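After installing, a quick check that the libraries are importable can save a failed notebook run. This is a small sketch using only the standard library; the `missing_modules` helper is a hypothetical name, and the module names are the usual import names of the libraries listed above (for example, sklearn for scikit-learn).

```python
from importlib.util import find_spec

# Import names of the libraries listed in the prerequisites.
REQUIRED_MODULES = (
    "sklearn",
    "pandas",
    "seaborn",
    "matplotlib",
    "numpy",
    "scipy",
    "xgboost",
    "surprise",
)


def missing_modules(names=REQUIRED_MODULES):
    """Return the top-level modules that cannot be found in this environment."""
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules()
    if missing:
        print("Missing libraries:", ", ".join(missing))
    else:
        print("All required libraries are installed.")
```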

Built With

  • ipython-notebook - Python Text Editor
  • sklearn - Machine learning library
  • seaborn, matplotlib.pyplot - Visualization libraries
  • numpy, scipy - Numerical Python libraries
  • pandas - data handling library
  • XGBoost - Used for making regression models
  • Surprise - used for making recommendation system models

License

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.

Authors

  • Sanyam Gujral - Complete work

Acknowledgments

  • From GitHub and YouTube

Credits

  • Google Images Download - This project wouldn't have been possible without this tool. It saved me an enormous amount of time while collecting the data. A huge shout-out to its creator.
