
hemanthkumar17/VIDEO_CAPTIONING_PIPELINE

 
 


VIDEO_CAPTIONING_PIPELINE

We tackle the challenging task of generating recipes from videos using only pre-trained models. The pipeline is divided into modules: event generation, frame extraction, frame featurization, redundant-frame removal, frame enhancement, frame captioning, and LLM-based summarization. Each stage uses a pre-trained model suited to its task. The pipeline exploits the temporal nature of videos and the power of image embeddings, and uses an LLM to extract meaningful content and generate recipes efficiently. We evaluate the quality of the generated recipes using several metrics.
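To make the redundant-frame removal stage more concrete, here is a minimal sketch of one common approach: compare each frame's embedding with that of the last kept frame and drop the frame when their cosine similarity is too high. This is an illustration under assumptions, not the code from the notebooks. The function names `cosine_similarity` and `drop_redundant_frames`, the `threshold` value of 0.95, and the use of plain Python lists for embeddings are all hypothetical.

```python
import math
from typing import List, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def drop_redundant_frames(
    embeddings: Sequence[Sequence[float]],
    threshold: float = 0.95,  # hypothetical value; tune per dataset
) -> List[int]:
    """Return indices of frames to keep, in temporal order.

    A frame is kept only if its embedding is less similar than `threshold`
    to the most recently kept frame, so runs of near-identical frames
    collapse to their first frame.
    """
    kept: List[int] = []
    for i, emb in enumerate(embeddings):
        if not kept or cosine_similarity(embeddings[kept[-1]], emb) < threshold:
            kept.append(i)
    return kept
```

In practice the embeddings would come from whichever pre-trained image model the featurization stage uses. Comparing against the last kept frame rather than the immediately preceding one prevents slow drift across many near-identical frames from slipping past the filter.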

Experiment 1

Screenshot 2023-08-30 at 8 08 28 PM

Experiment 2

Screenshot 2023-08-30 at 8 09 07 PM

Experiment 3

Screenshot 2023-08-30 at 8 09 18 PM

Experiment 4

Screenshot 2023-08-30 at 8 09 27 PM

For a detailed explanation, refer to the report and the video presentation, which contains demos: https://docs.google.com/presentation/d/1R0FjAj_QXoLjxR3NsZRVTFgYu-KnKj2BpN4EcOvyuGI/edit#slide=id.g21eccad0113_0_38


Languages

  • Jupyter Notebook 96.4%
  • Python 3.6%