CLIP Video Investigator

Overview

CLIP Video Investigator is a Flask-based web application designed to compare text and image embeddings using the CLIP model. The application integrates OpenCV for video processing and Plotly for data visualization to accomplish the following:

Play a video in a web browser.
Pause and resume video playback.
Compare CLIP embeddings of video frames with text embeddings.
Visualize the similarity between text and image embeddings in real time using a Plotly plot.
Jump to specific frames by clicking on the Plotly plot.
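
The comparison step listed above can be illustrated with a small sketch. This README does not state which similarity measure the application uses, so cosine similarity (a common choice for CLIP embeddings) is an assumption here. The names `cosine_similarity` and `score_frame` are hypothetical, and the vectors are toy values rather than real CLIP output.

```python
import math
from typing import Dict, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("embeddings must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        # Treat a zero vector as "no similarity" to avoid dividing by zero.
        return 0.0
    return dot / (norm_a * norm_b)


def score_frame(frame_embedding: Sequence[float],
                text_embeddings: Dict[str, Sequence[float]]) -> Dict[str, float]:
    """Score one frame embedding against every text prompt embedding."""
    return {prompt: cosine_similarity(frame_embedding, emb)
            for prompt, emb in text_embeddings.items()}


# Toy 3-dimensional vectors; real CLIP embeddings have hundreds of dimensions.
text_embeddings = {"wall": [1.0, 0.0, 0.0], "tile wall": [0.0, 1.0, 0.0]}
scores = score_frame([2.0, 0.0, 0.0], text_embeddings)
print(scores)  # {'wall': 1.0, 'tile wall': 0.0}
```

Running `score_frame` once per frame yields one similarity series per text prompt, which is the kind of data the plot displays.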

YouTube video: https://www.youtube.com/watch?v=XllrtZnPL6M

Why This is Useful

Understanding the relationship between text and image embeddings can provide insights into how well a model generalizes across modalities. By plotting these values in real time, researchers and engineers can:

Identify key frames where text and image embeddings are highly aligned or misaligned.
Debug and fine-tune the performance of multimodal models.
Gain insights into the temporal evolution of embeddings in video data.
Enable more effective search and retrieval tasks for video content.
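
As a rough illustration of the first point, the sketch below picks out the most and least aligned frames from a per-frame similarity series. The function name `key_frames` and the sample values are hypothetical; the application itself exposes this information through the interactive plot rather than through a function like this.

```python
from typing import List, Tuple


def key_frames(similarities: List[float], k: int = 3) -> Tuple[List[int], List[int]]:
    """Return (most_aligned, least_aligned) frame indices, k of each."""
    # Frame indices ordered from lowest to highest similarity.
    order = sorted(range(len(similarities)), key=lambda i: similarities[i])
    least_aligned = order[:k]
    most_aligned = order[::-1][:k]
    return most_aligned, least_aligned


# Made-up similarity of each frame to a single text prompt.
per_frame = [0.21, 0.25, 0.31, 0.18, 0.29, 0.12]
best, worst = key_frames(per_frame, k=2)
print(best)   # [2, 4]
print(worst)  # [5, 3]
```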

Features

Video Playback: Uses OpenCV to read video frames and displays them in the web browser.
Play/Pause: Allows the user to start and stop video playback.
Data Visualization: Uses Plotly to plot the similarity between each video frame's embedding and the text embeddings.
Interactive Plot: Allows the user to click on the plot to jump to specific frames in the video.
Reset Functionality: Resets the application to its initial state.
Embedding Caching: Pickle files of the text and image frame embeddings are saved for each video in the /embeddings folder. This allows for quicker subsequent analysis by avoiding the need to regenerate these embeddings.
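
The caching behavior can be sketched as a load-or-compute helper. This is a minimal illustration, not the application's actual code: the function name `load_or_compute`, the choice of naming the cache file after the video's file stem, and the shape of the cached dictionary are all assumptions.

```python
import pickle
import tempfile
from pathlib import Path
from typing import Callable


def load_or_compute(video_path: str, compute: Callable[[], dict],
                    cache_dir: str = "embeddings") -> dict:
    """Load cached embeddings for a video, computing and saving them on a miss."""
    # Assumed naming scheme: data/example.mp4 -> embeddings/example.pkl
    cache_file = Path(cache_dir) / (Path(video_path).stem + ".pkl")
    if cache_file.exists():
        with cache_file.open("rb") as fh:
            return pickle.load(fh)
    embeddings = compute()
    cache_file.parent.mkdir(parents=True, exist_ok=True)
    with cache_file.open("wb") as fh:
        pickle.dump(embeddings, fh)
    return embeddings


calls = []


def fake_compute() -> dict:
    """Stand-in for the expensive embedding step; records each invocation."""
    calls.append(1)
    return {"frames": [[0.1, 0.2]], "text": {"wall": [0.3, 0.4]}}


with tempfile.TemporaryDirectory() as tmp:
    first = load_or_compute("data/example.mp4", fake_compute, cache_dir=tmp)
    second = load_or_compute("data/example.mp4", fake_compute, cache_dir=tmp)

print(len(calls))  # 1, because the second call is served from the pickle file
```

Two caveats apply to any cache of this kind. A cache keyed only on the video name goes stale if the list of text prompts changes, so deleting the pickle file after editing the prompts is the safe option. Pickle files can execute code when loaded, so only load files you generated yourself.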

Configuration

A config.yaml file is used to specify various settings for the application:

roboflow_api_key: ""  # Roboflow API key
video_path: ""  # Path to video file
CLIP:
  - wall
  - tile wall
  - large tile wall

roboflow_api_key: Your API key for Roboflow.
video_path: The path to the video file you want to analyze.
CLIP: A list of text inputs for which you want to generate CLIP embeddings.
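
A quick sanity check of these settings before starting the app might look like the sketch below. It assumes the YAML file has already been parsed into a dictionary (for example with PyYAML's `yaml.safe_load`). The function `validate_config` is hypothetical and not part of the application.

```python
def validate_config(cfg: dict) -> list:
    """Return a list of problems; an empty list means the config looks usable."""
    problems = []
    for key in ("roboflow_api_key", "video_path"):
        value = cfg.get(key)
        if not isinstance(value, str) or not value.strip():
            problems.append(f"{key} must be a non-empty string")
    prompts = cfg.get("CLIP")
    if not isinstance(prompts, list) or not prompts:
        problems.append("CLIP must be a non-empty list of text prompts")
    elif not all(isinstance(p, str) and p.strip() for p in prompts):
        problems.append("every CLIP entry must be a non-empty string")
    return problems


# Mirrors what a YAML parser would return for the sample file shown above.
sample = {
    "roboflow_api_key": "",
    "video_path": "",
    "CLIP": ["wall", "tile wall", "large tile wall"],
}
for problem in validate_config(sample):
    print(problem)
```

With the unedited sample file, the check reports the empty API key and video path, which are the two values you need to fill in.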

Folder Layout

clip_investigator/
├── config.yaml
├── scripts/
│   ├── clip_app.py
│   └── clip_functions.py
├── embeddings/
│   └── example.pkl
├── static/
│   ├── css/
│   │   └── style.css
│   └── js/
│       └── main.js
└── templates/
    └── index.html

scripts/clip_app.py: The main Flask application file.
config.yaml: Configuration file for specifying settings.
embeddings/: Folder where pickle files of text and image embeddings are stored.
static/: Contains static files like CSS and JavaScript.
templates/: Contains HTML templates.

Installation

Prerequisites

Python 3.x
Virtualenv (optional but recommended)

Steps

Clone the repository.

git clone https://github.com/roboflow/clip_video_app.git

Navigate to the project directory.
```
cd clip_video_app
```
(Optional) Create a virtual environment.
Install the dependencies.
```
pip install -r requirements.txt
```

Usage

You must also be running the Roboflow inference server locally!

Update the config.yaml file with your Roboflow API key and the path to your video file (or use the sample file in the /data folder).
Start the Flask application.
```
python scripts/clip_app.py
```
Open a web browser and navigate to http://localhost:5000.
Use the "Start" and "Stop" buttons to control video playback.
Watch the similarity between the video frames and your text inputs update in real time in the Plotly plot below the video.
Click on the Plotly plot to jump to specific video frames.
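
To make the plot and click-to-jump steps more concrete, here is a sketch of the data involved. It builds a figure as a plain dictionary in Plotly's JSON layout (one line trace per text prompt, with the frame index on the x-axis) and converts a clicked x-value back into a frame index. The function names, the axis titles, and the use of the frame index as the x-axis are assumptions. In Plotly.js, a `plotly_click` event reports the x-value of the clicked point, which the browser could send back to the server.

```python
import json
from typing import Dict, List


def build_figure(similarities: Dict[str, List[float]]) -> dict:
    """Build a Plotly-style figure dict with one line trace per text prompt."""
    traces = [
        {"type": "scatter", "mode": "lines", "name": prompt,
         "x": list(range(len(values))), "y": values}
        for prompt, values in similarities.items()
    ]
    layout = {"xaxis": {"title": {"text": "Frame"}},
              "yaxis": {"title": {"text": "Similarity"}}}
    return {"data": traces, "layout": layout}


def frame_from_click(x_value: float, total_frames: int) -> int:
    """Convert a clicked x-coordinate into a valid frame index."""
    # Round to the nearest frame, then clamp into [0, total_frames - 1].
    return max(0, min(total_frames - 1, int(round(x_value))))


figure = build_figure({"wall": [0.21, 0.25, 0.31], "tile wall": [0.18, 0.22, 0.27]})
payload = json.dumps(figure)  # JSON string that could be sent to the browser

print(frame_from_click(1.4, total_frames=3))  # 1
print(frame_from_click(99, total_frames=3))   # 2
```

Clamping the index guards against clicks that land slightly outside the plotted range.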

Troubleshooting

WebSocket Errors: If you encounter WebSocket errors, check the browser console for specific error messages. The application has built-in error handling to attempt reconnections.
Plotly Click Events: If click events are not detected on the Plotly plot after a reset, reload the page.
