Topic modelling and text generation using Amazon and Netflix stand-up comedy scripts

slashlan/standupcomedynlp

"You think that's funny?" Topic modelling and text generation using Amazon and Netflix stand-up comedy scripts

This project explores topic extraction techniques and the construction of two language generation models on an atypical collection of documents: stand-up comedy scripts.

The goal is two-fold:

  • a) Use standard Natural Language Processing (NLP) techniques to investigate and summarize the topics discussed in a corpus of 143 scripts of stand-up comedy shows released by Amazon and Netflix between 2013 and 2021;

  • b) Build on the knowledge acquired in the first part to create two Recurrent Neural Network models with different architectures, and test the ability of the better-performing one to generate new text.

The project is structured in five parts:

  • 1) Construction of the dataset
  • 2) Exploratory Data Analysis
  • 3) Topic modelling
  • 4) Text generation
  • 5) Conclusions
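
As a rough illustration of the kind of term weighting that often precedes topic modelling, here is a minimal pure-Python sketch. It is not the code from the notebook: the three toy "scripts", the stopword list, and the function names `tokenize`, `tf_idf` and `top_terms` are all assumptions made for this example, and TF-IDF is just one common choice among the standard NLP techniques the project could rely on.

```python
import math
import re
from collections import Counter

# Hypothetical stand-in for the corpus; the real one has 143 scripts.
scripts = {
    "show_a": "My dog ate my homework and then my dog ate the teacher",
    "show_b": "Airports are weird, airports sell neck pillows and sadness",
    "show_c": "My teacher said airports are fine, my teacher has never flown",
}

# Tiny illustrative stopword list (an assumption, not the project's list).
STOPWORDS = {"my", "and", "then", "the", "are", "has", "said"}


def tokenize(text):
    """Lowercase the text and keep alphabetic tokens that are not stopwords."""
    return [t for t in re.findall(r"[a-z']+", text.lower()) if t not in STOPWORDS]


def tf_idf(docs):
    """Return {doc_id: {term: weight}} with weight = count * log(N / doc_freq)."""
    counts = {doc_id: Counter(tokenize(text)) for doc_id, text in docs.items()}
    n_docs = len(counts)
    # Number of documents in which each term appears at least once.
    doc_freq = Counter(term for c in counts.values() for term in c)
    return {
        doc_id: {term: tf * math.log(n_docs / doc_freq[term]) for term, tf in c.items()}
        for doc_id, c in counts.items()
    }


def top_terms(weights, k=2):
    """Return the k highest-weighted terms, breaking ties alphabetically."""
    ranked = sorted(weights.items(), key=lambda kv: (-kv[1], kv[0]))
    return [term for term, _ in ranked[:k]]


weights = tf_idf(scripts)
for doc_id in scripts:
    print(doc_id, top_terms(weights[doc_id]))
```

Terms that occur in several scripts (here "teacher" and "airports") are down-weighted relative to terms confined to one script, which is the property that makes such weights a useful starting point before fitting a topic model.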

The notebook "You_think_thats_funny_SINGLE_FILE.ipynb" merges the five sections into a single file.
