Before running this script, install dlib.
You can follow this blog to install it.
This script only supports the LRS3 dataset; other datasets will need some changes to the code.
Modify the paths in the scripts to your own paths before running.
- extract audio
cd audio_process
./extract_audio.sh
- (optional) cut the audio of some speakers to 4-6 s
./audio_cut.sh
- generate the text file for the mixed audio
python audio_path.py
- mix audio. This code is based on Deep Clustering
/opt18/matlab_2015b/bin/matlab -nosplash -nodesktop -r create_wav_2speakers
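
If MATLAB is not available, the idea behind two-speaker mixing can be sketched in Python. This is only an illustration, not a port of create_wav_2speakers: the function name `mix_two_speakers`, the `peak` parameter, and the choice to truncate to the shorter signal are assumptions, and the real script may resample or normalise differently.

```python
import numpy as np


def mix_two_speakers(s1, s2, snr1_db, snr2_db, peak=0.9):
    """Mix two mono signals after applying a per-speaker gain in dB.

    Hypothetical helper for illustration only.
    Returns (mixture, scaled_s1, scaled_s2), all the same length.
    """
    # Truncate both signals to the shorter one.
    n = min(len(s1), len(s2))
    s1 = np.asarray(s1, dtype=np.float64)[:n] * 10 ** (snr1_db / 20)
    s2 = np.asarray(s2, dtype=np.float64)[:n] * 10 ** (snr2_db / 20)
    mix = s1 + s2

    # Rescale all three signals together so that none of them clips.
    max_amp = max(np.abs(mix).max(), np.abs(s1).max(), np.abs(s2).max())
    scale = peak / max_amp if max_amp > 0 else 1.0
    return mix * scale, s1 * scale, s2 * scale


# Synthetic example: two sine tones of different lengths, mixed at +2.5 / -2.5 dB.
t = np.arange(16000) / 16000.0
tone_a = np.sin(2 * np.pi * 220 * t)
tone_b = np.sin(2 * np.pi * 330 * t[:12000])
mixture, src_a, src_b = mix_two_speakers(tone_a, tone_b, 2.5, -2.5)
```

The two scaled sources are returned together with the mixture because separation training needs them as targets.
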
- extract frames
cd video_process
./extract_frames.sh
- generate the list of frame file paths yourself and save it to LRS3_image.scp
- detect and crop face + lip regions
python extrac_face_and_lip.py
- convert image sequences to npy
python convert_npy.py
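
No script is provided for the LRS3_image.scp step, so here is a minimal sketch of one way to write it. It assumes the extracted frames are .jpg files under a single root directory and that the file should contain one frame path per line; both are assumptions, so check what extrac_face_and_lip.py actually reads before using it. The function name `write_frame_list` is hypothetical.

```python
from pathlib import Path


def write_frame_list(frame_root, out_path="LRS3_image.scp", pattern="*.jpg"):
    """Write every frame path under frame_root to out_path, one per line.

    Hypothetical helper: the one-path-per-line format is an assumption.
    Returns the number of paths written.
    """
    root = Path(frame_root)
    # Sort as strings so the output is deterministic; frame numbers must be
    # zero-padded for this to match temporal order.
    paths = sorted(str(p) for p in root.rglob(pattern))
    Path(out_path).write_text("".join(p + "\n" for p in paths))
    return len(paths)
```

Call it as `write_frame_list("/your/frame/dir")`, replacing the argument with the directory that extract_frames.sh writes to.
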