Replies: 2 comments 3 replies
-
Hi @TheStalke, we handled this with post-processing, because the timestamps that Whisper produces aren't that accurate. We arrived at a solution that works pretty well: we use VAD (voice activity detection) to check when a speaker is actually speaking and trim the segments automatically to match. Here's the link if you want to try it out: Automatic Segment Trimming
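For anyone who wants to see the idea in code, here is a minimal sketch of the trimming step. It is not the implementation behind the linked tool, just an illustration under a few assumptions: `trim_to_speech` is a made-up helper name, segments are dicts with `start`, `end` and `text` in seconds (the shape Whisper's Python API returns), and `speech_intervals` is a list of `(start, end)` tuples from whatever VAD you run on the audio.

```python
def trim_to_speech(segments, speech_intervals):
    """Clamp each segment to the detected speech that overlaps it.

    segments: list of dicts with "start", "end", "text" (times in seconds).
    speech_intervals: list of (start, end) tuples from a VAD (assumed input).
    """
    trimmed = []
    for seg in segments:
        # Collect the parts of the segment that overlap detected speech.
        overlaps = [
            (max(seg["start"], s), min(seg["end"], e))
            for s, e in speech_intervals
            if s < seg["end"] and e > seg["start"]
        ]
        if not overlaps:
            # No speech detected here: keep the original timing
            # rather than silently dropping the text.
            trimmed.append(dict(seg))
            continue
        trimmed.append({
            **seg,
            "start": min(s for s, _ in overlaps),
            "end": max(e for _, e in overlaps),
        })
    return trimmed


# Hand-written example: a segment spanning the intro music,
# with speech only detected from 8 s onwards.
segments = [{"start": 0.0, "end": 13.0, "text": "Hello there."}]
speech = [(8.0, 13.0)]  # hypothetical VAD output
result = trim_to_speech(segments, speech)
print(result)
```

With the example values above, the segment that originally ran from 0 to 13 is trimmed to run from 8 to 13. Segments with no overlapping speech are left untouched on purpose, so a VAD miss does not delete a subtitle.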
-
Yes, VAD usually helps. Other than that, you could try --word_timestamps True.
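If you go the word-timestamps route from Python instead of the CLI, a rough sketch could look like the following. It assumes the `openai-whisper` package, where `model.transcribe(..., word_timestamps=True)` adds a `words` list (each entry with `start` and `end`) to every segment. The helper names (`tighten_with_words`, `srt_time`, `to_srt`, `transcribe_to_srt`) are made up for this example, and word timings over music can still be off, so treat it as a starting point.

```python
def tighten_with_words(segments):
    """Move each segment's start/end to its first/last word timestamp."""
    out = []
    for seg in segments:
        words = seg.get("words") or []
        if words:
            out.append({**seg, "start": words[0]["start"], "end": words[-1]["end"]})
        else:
            # No word-level data for this segment: leave it as is.
            out.append(dict(seg))
    return out


def srt_time(seconds):
    """Format seconds as an SRT timestamp, HH:MM:SS,mmm."""
    ms = round(seconds * 1000)
    h, ms = divmod(ms, 3_600_000)
    m, ms = divmod(ms, 60_000)
    s, ms = divmod(ms, 1000)
    return f"{h:02d}:{m:02d}:{s:02d},{ms:03d}"


def to_srt(segments):
    """Render segments as SRT text."""
    blocks = []
    for i, seg in enumerate(segments, start=1):
        timing = f"{srt_time(seg['start'])} --> {srt_time(seg['end'])}"
        blocks.append(f"{i}\n{timing}\n{seg['text'].strip()}\n")
    return "\n".join(blocks)


def transcribe_to_srt(path, model_name="base"):
    """Run Whisper with word timestamps and return tightened SRT text."""
    import whisper  # imported lazily; requires the openai-whisper package

    model = whisper.load_model(model_name)
    result = model.transcribe(path, word_timestamps=True)
    return to_srt(tighten_with_words(result["segments"]))


# Hand-written stand-in for a Whisper result, so the helpers can be
# tried without downloading a model.
fake_segments = [{
    "start": 0.0, "end": 13.0, "text": " Hello there.",
    "words": [
        {"word": " Hello", "start": 8.0, "end": 8.5},
        {"word": " there.", "start": 8.5, "end": 13.0},
    ],
}]
srt_text = to_srt(tighten_with_words(fake_segments))
print(srt_text)
```

For a real file you would call `transcribe_to_srt("video.mp4")` (the filename is just a placeholder) and write the returned string to a `.srt` file.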
-
Hello, I'm trying to transcribe a video for subtitles using Whisper by OpenAI, but there's a song at the start of the video, and my .srt file starts from "00:00:00,00" when no one is talking. For example, the first subtitle goes from 0 to 13, but people only talk from 8 to 13. Is there a way to automatically make the subtitles start at the same time people start talking in the video?