Suggestions for prompts/prefix to include repetitions and false starts #2003
Replies: 5 comments
-
Hi Joe, many thanks in advance!
-
Maybe you can try CrisperWhisper.
-
😍 Thank you! I'll give it a try.
-
@birknu75 were you able to achieve this? Does anyone know how to do this with the Groq Whisper model?
-
No, unfortunately not. Only German and English models are available, but I need Dutch.
-
Hi everyone!
I'm using Whisper on a research project where we're hoping to use it as a first step in transcribing data verbatim according to a strict transcription protocol. The data is real, spontaneous speech rather than subtitles for a TV show or similar, so there are a lot of hesitations, filler words, repeated words, etc., and we want all of these transcribed. I've managed to write a prompt that gets 'um's and 'uh's included, but I was wondering whether anyone has managed to get Whisper to transcribe false starts and word repetitions using prompts or any other features/settings?
Our current prompt is something like this:
This has helped a lot, but it doesn't catch all the repetitions, and it doesn't do anything for false starts. Ideally, we want to end up with transcripts that include repetitions, as well as false starts in brackets followed by an underscore, so that the transcripts look a bit like this:
Has anyone had any success getting Whisper to do anything like this, or does anyone have suggestions for what I could try? I've also tried a prompt modelled on the above examples, which includes things like '(th_) there's a cat', but unfortunately this doesn't work at all, and it has the completely unintended effect of swear words being censored with underscores.
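One way to compare prompts is to count how many disfluency markers actually make it into the output. The following is only a minimal sketch, assuming the '(th_)' convention mentioned above for false starts and treating a "repetition" as the same word twice in a row. The function names and both regular expressions are illustrative assumptions, not part of Whisper or of any transcription protocol.

```python
import re

# Assumed false-start format: a word fragment in brackets ending in an
# underscore, e.g. "(th_)". The captured group is the fragment itself.
FALSE_START = re.compile(r"\(([^\W_]+)_\)")

# Assumed repetition format: a whole word immediately followed by the
# same whole word, e.g. "the the" (compared case-insensitively).
REPETITION = re.compile(r"\b(\w+)\s+\1\b", re.IGNORECASE)


def find_false_starts(transcript: str) -> list[str]:
    """Return the fragments marked as false starts, in order."""
    return FALSE_START.findall(transcript)


def find_repetitions(transcript: str) -> list[str]:
    """Return each immediately repeated word, lowercased, in order."""
    return [m.group(1).lower() for m in REPETITION.finditer(transcript)]
```

Running both functions over the output of two different prompts on the same audio gives a rough count to compare, although it says nothing about whether the markers are in the right places.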
Any help or thoughts would be much appreciated - thanks in advance!
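For anyone wanting to try the prompt-based approach described in this post, here is a minimal sketch of passing a disfluency-rich prompt to the open-source `whisper` package via the `initial_prompt` argument of `transcribe`. The helper names, the default model size, and any example sentences passed in are hypothetical; this is not the prompt used in the project above, and there is no guarantee that it produces false starts or repetitions.

```python
def build_prompt(examples: list[str]) -> str:
    """Join example utterances into a single prompt string, skipping blanks."""
    return " ".join(e.strip() for e in examples if e.strip())


def transcribe_verbatim(audio_path: str, examples: list[str],
                        model_name: str = "small") -> str:
    """Transcribe audio_path, biasing Whisper with example utterances.

    The examples are hypothetical style hints; Whisper treats the prompt as
    preceding context, so it nudges the output style rather than enforcing it.
    """
    # Imported lazily so build_prompt can be used without Whisper installed.
    import whisper

    model = whisper.load_model(model_name)
    result = model.transcribe(audio_path, initial_prompt=build_prompt(examples))
    return result["text"]
```

A call might look like `transcribe_verbatim("interview.wav", ["Um, so I, I went to the, uh, the shop."])`, where the file name and the example sentence are placeholders. Keeping the prompt short is probably sensible, since Whisper only conditions on a limited amount of prompt text.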