This repository has been archived by the owner on Sep 1, 2024. It is now read-only.
Hello, thanks for your great work.

I've been trying to reproduce the enhancement performance on the VoxCeleb2 test set, but the performance of the given pre-trained model was much lower than reported in the paper.
(I used evaluateSeparation.py from the main directory to evaluate the metrics.)
When I tried test_synthetic_script.sh, the outputs also sounded bad to my ears.
The offscreen noise in the mixture (audio_mixed.wav) was much louder than the voice, so I felt the enhancement task would be very difficult for the model.
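In case it helps pinpoint the gap, this is roughly the metric I computed on my side: a minimal SI-SDR sketch in NumPy. I'm assuming SI-SDR here, and I'm not certain this matches what evaluateSeparation.py actually implements, so the function below is my own re-implementation, not the repo's code.

```python
import numpy as np

def si_sdr(estimate, reference, eps=1e-8):
    """Scale-invariant SDR in dB (my own sketch, possibly different
    from the metric used in evaluateSeparation.py)."""
    reference = reference - reference.mean()
    estimate = estimate - estimate.mean()
    # Project the estimate onto the reference to isolate the target part.
    alpha = np.dot(estimate, reference) / (np.dot(reference, reference) + eps)
    target = alpha * reference
    noise = estimate - target
    return 10.0 * np.log10((np.dot(target, target) + eps)
                           / (np.dot(noise, noise) + eps))
```

Because of the projection step, rescaling the estimate does not change the score, which is why I used it to compare outputs at different volumes.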
I have three questions regarding this.
1. Is the pre-trained model in the av-enhancement directory your best model for speech enhancement, rather than separation?
2. Is your evaluation done on a mixture of two speech signals plus an offscreen noise, each with weight 1?
3. Isn't it too difficult for the model to separate and enhance at the same time?
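To make the mixture question concrete, this is the mixing scheme I am assuming when I listen to audio_mixed.wav; the function name, the equal weighting, and the clipping guard are my own guesses, not taken from the repo.

```python
import numpy as np

def make_mixture(speech_a, speech_b, offscreen_noise):
    """Sum two on-screen speech signals and one offscreen noise track,
    each with weight 1.0 (my assumption about how audio_mixed.wav is built)."""
    # Truncate to the shortest signal so the arrays can be summed.
    n = min(len(speech_a), len(speech_b), len(offscreen_noise))
    mixture = speech_a[:n] + speech_b[:n] + offscreen_noise[:n]
    # Renormalize only if the sum would clip when written as a wav file.
    peak = np.abs(mixture).max()
    if peak > 1.0:
        mixture = mixture / peak
    return mixture
```

If the actual evaluation uses different weights for the noise, that would explain why the mixtures I hear sound much harder than expected.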
Thanks in advance.
syl4356 changed the title from "Speech Enhancement evaluation code" to "Speech enhancement evaluation" on Nov 13, 2023.