Hi OpenVINO team,
I recently submitted another issue, #1150, but I’d also like to suggest an enhancement for WhisperPipeline.
Currently, WhisperPipeline (openvino_genai-2024.5.0.0rc1) relies on Greedy Search for transcription. While Greedy Search is efficient, it often produces prematurely shortened transcriptions because it stops early in complex scenarios.
To improve transcription quality, adding Beam Search support would allow for more comprehensive output, mitigating cases of incomplete or overly simplified results. Many applications would benefit from this feature, especially when accuracy is critical and a more extensive search process is required.
Proposed Feature:
- Implement Beam Search as an alternative search strategy within WhisperPipeline.
- Optionally, provide adjustable parameters to control beam size (like CTranslate2) and other relevant settings, giving users flexibility based on performance needs.
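To illustrate why a configurable beam size matters, here is a minimal, self-contained sketch of greedy search versus beam search over a toy next-token table. It does not use the openvino_genai API. `TOY_MODEL`, `next_token_probs`, `greedy_search`, and `beam_search` are hypothetical names invented for this example, and the probabilities are made up so that the greedy choice ends early.

```python
import math

EOS = "<eos>"

# Hypothetical toy "model": maps a token prefix to next-token probabilities.
# The numbers are invented so that the locally best first token leads to
# an early end-of-sequence.
TOY_MODEL = {
    (): {"the": 0.6, "a": 0.4},
    ("the",): {EOS: 0.51, "cat": 0.49},
    ("the", "cat"): {EOS: 1.0},
    ("a",): {"dog": 0.9, EOS: 0.1},
    ("a", "dog"): {"barks": 0.9, EOS: 0.1},
    ("a", "dog", "barks"): {EOS: 1.0},
}


def next_token_probs(prefix):
    """Look up the next-token distribution for a prefix."""
    return TOY_MODEL[tuple(prefix)]


def greedy_search(max_len=10):
    """Pick the single most likely token at every step."""
    tokens = []
    for _ in range(max_len):
        probs = next_token_probs(tokens)
        best = max(probs, key=probs.get)
        if best == EOS:
            break
        tokens.append(best)
    return tokens


def beam_search(beam_size, max_len=10):
    """Keep the beam_size best partial hypotheses at every step."""
    active = [(0.0, [])]  # (cumulative log-probability, tokens)
    finished = []
    for _ in range(max_len):
        # Expand every active hypothesis by every possible next token.
        candidates = []
        for score, tokens in active:
            for token, p in next_token_probs(tokens).items():
                candidates.append((score + math.log(p), tokens, token))
        candidates.sort(key=lambda c: c[0], reverse=True)

        # Keep only the top beam_size candidates; set finished ones aside.
        active = []
        for score, tokens, token in candidates[:beam_size]:
            if token == EOS:
                finished.append((score, tokens))
            else:
                active.append((score, tokens + [token]))
        if not active:
            break

    # Prefer completed hypotheses; fall back to unfinished ones at max_len.
    pool = finished or active
    return max(pool, key=lambda h: h[0])[1]


print(greedy_search())   # stops right after the first token
print(beam_search(1))    # beam size 1 behaves like greedy search
print(beam_search(2))    # a wider beam recovers the longer sequence
```

In this toy setup, greedy search commits to the first token with the highest probability and then ends immediately, while a beam of size 2 keeps the second-best start alive long enough to find a longer sequence with a higher overall probability. Production implementations typically add length normalization and other stopping criteria, which this sketch omits.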
Are there any plans to introduce this functionality, or might there be existing methods to achieve similar results?
Thanks in advance for considering this improvement!
Hi, thank you for the feature request! We have plans to implement Beam Search for the Whisper pipeline. The list of parameters we plan to support can be found in GenerationConfig, under the beam search and multinomial sections.