This repository contains the code for the paper "Rethinking the visual cues in audio-visual speaker extraction".
src
: training scripts
data
: scripts to generate the data.scp file from a folder of mixed audio.
To simulate data, you can follow this repo.
pretrainedmodel
: pretrained model
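
As a rough illustration of the data.scp step listed under data, the sketch below walks a folder of mixed audio and writes one line per file. The exact format expected by the scripts in this repository is not documented here, so the Kaldi-style "utterance-id path" layout, the function name `write_scp`, and the `.wav` extension are all assumptions; check the scripts in data before relying on it.

```python
from pathlib import Path


def write_scp(mix_dir, scp_path, ext=".wav"):
    """Write one '<utterance-id> <absolute-path>' line per audio file.

    NOTE: the line format is an assumption (Kaldi-style scp), not
    necessarily what this repository's scripts produce.
    """
    mix_dir = Path(mix_dir)
    # Collect audio files recursively; sort for a deterministic order.
    wavs = sorted(p for p in mix_dir.rglob("*" + ext) if p.is_file())
    with open(scp_path, "w", encoding="utf-8") as f:
        for p in wavs:
            # Use the file name without extension as the utterance id.
            f.write(f"{p.stem} {p.resolve()}\n")
    return len(wavs)
```

A call such as `write_scp("mix/", "data.scp")` (both paths hypothetical) would return the number of entries written.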