multi-speaker multi-track audio #131
What is in your stereo?
Hey, thanks. Actually, I am trying to stream a conversation between two people (an agent and a customer). Can you suggest some guides where I can study more on this? Any suggestion would be great.
If you have the voices in separate tracks, that's good: you don't need diarization (though it's a good topic to know about). Then you probably need a voice activity controller that sends the speaking parts of each track into WhisperStreaming. You can modify this class: whisper_streaming/whisper_online.py Line 518 in 0a00254
You can use multiple Silero VADIterator objects: https://github.com/ufal/whisper_streaming/blob/main/silero_vad.py , one for each track, to control sending the voice to the OnlineASRProcessor. When it returns output, wrap it with information about who spoke it. In any case, you should make sure that the context of the previous turns is not cleared: use finalize(), but do not clear the HypothesisBuffer.
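The per-track routing described above can be sketched as follows. This is a minimal, runnable illustration of the control flow only: `EnergyVAD` and `FakeASR` are toy stand-ins I made up for Silero's `VADIterator` and whisper_streaming's `OnlineASRProcessor`, which you would substitute in a real pipeline.

```python
# Hedged sketch: per-track voice-activity gating for a two-speaker stream.
# EnergyVAD and FakeASR are hypothetical stand-ins so the routing logic
# itself runs; swap in Silero's VADIterator and OnlineASRProcessor in practice.

def split_stereo(frames):
    """Split interleaved stereo samples [(L, R), ...] into two mono tracks."""
    left = [l for l, _ in frames]
    right = [r for _, r in frames]
    return left, right

class EnergyVAD:
    """Toy stand-in for a VAD: 'speech' means RMS energy above a threshold."""
    def __init__(self, threshold=0.1):
        self.threshold = threshold

    def is_speech(self, chunk):
        rms = (sum(x * x for x in chunk) / len(chunk)) ** 0.5
        return rms > self.threshold

class FakeASR:
    """Stand-in for an online ASR processor: counts the chunks it receives."""
    def __init__(self):
        self.chunks = 0

    def insert_audio_chunk(self, chunk):
        self.chunks += 1
        return f"[{self.chunks} chunks transcribed]"

def route(frames, chunk_size=4):
    """One VAD and one ASR per track; tag every output with its speaker."""
    left, right = split_stereo(frames)
    tracks = {"agent": (left, EnergyVAD(), FakeASR()),
              "customer": (right, EnergyVAD(), FakeASR())}
    outputs = []
    for speaker, (track, vad, asr) in tracks.items():
        for i in range(0, len(track), chunk_size):
            chunk = track[i:i + chunk_size]
            if vad.is_speech(chunk):          # gate: only speech reaches ASR
                text = asr.insert_audio_chunk(chunk)
                outputs.append((speaker, text))
    return outputs

# Agent speaks (loud left channel); customer is silent (quiet right channel).
frames = [(0.5, 0.0)] * 8
print(route(frames))
```

Because each track keeps its own ASR state, the previous turns' context survives across speaker changes, which is the point of using finalize() without clearing the hypothesis buffer.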
Thank you for your input, I will work on it.
Hi,
I am trying to transcribe live stereo audio. Are there any recommended methods to implement this? I have tried converting the stereo to mono, but my results are very inaccurate.
Thanks in advance for the help
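A small illustration of why downmixing hurts here: when both parties speak at once, averaging the channels mixes (or even cancels) the two voices, while keeping each channel as its own mono track preserves one clean speaker per track. The samples below are synthetic; real code would read the channels from the audio stream instead.

```python
# Hedged sketch: downmixing vs. channel-splitting for a two-speaker call,
# using synthetic interleaved samples [(L, R), ...].

def downmix(frames):
    """Average L and R into one mono track (what the question describes)."""
    return [(l + r) / 2 for l, r in frames]

def split(frames):
    """Keep each channel as its own mono track (one track per speaker)."""
    return [l for l, _ in frames], [r for _, r in frames]

# Both speakers talk at once: agent on the left, customer on the right.
frames = [(0.8, -0.8)] * 4

print(downmix(frames))  # overlapped speech cancels into silence: all zeros
left, right = split(frames)
print(left, right)      # each track still carries exactly one speaker
```

The total cancellation above is a contrived worst case, but partial interference from cross-talk is exactly what makes a downmixed transcript inaccurate; splitting avoids it entirely.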