🚀 Live Demo → whisper.enesxgrahovac.com
Real-time speech recognition running entirely in the browser.
No server, no external API keys – just WebAssembly, Web Audio and IndexedDB.
- Whisper model executed in the browser via WebAssembly (compiled with Emscripten).
- Next.js 15 (App Router) + React 19 + TypeScript.
- Model files (~30-140 MB) are transparently cached in IndexedDB after the first download.
- Works offline once a model is cached – use the Clear Cache button in the UI to remove cached models.
- Shadcn/UI + Tailwind v4 for the UI.
git clone https://github.com/enesgrahovac/whisper.cpp-nextjs.git
cd whisper.cpp-nextjs
pnpm install # or npm / yarn / bun
pnpm dev # localhost:3000

The first time you select a model it will be downloaded and stored locally (see “Caching” below).
The pre-built files already live under public/whisper/stream/ so you don’t need to build anything to run the demo.
If you want to rebuild them yourself:
- Clone whisper.cpp.
- Follow the instructions in examples/stream/README.md (make stream.wasm, etc.).
- Copy the generated stream.js, stream.wasm, lib*.js, … into public/whisper/stream/.
We are, quite literally, standing on the shoulders of giants – enormous thanks to Georgi Gerganov and all contributors to whisper.cpp. 🙏
Running large WebAssembly modules that use SharedArrayBuffer requires the page to be cross-origin isolated.
In this repo we do that in next.config.ts:
// next.config.ts
export default {
  async headers() {
    const securityHeaders = [
      { key: "Cross-Origin-Opener-Policy", value: "same-origin" },
      // Safari (and everything else) works with require-corp ↓
      { key: "Cross-Origin-Embedder-Policy", value: "require-corp" },
      // Recommended when using COEP=require-corp
      { key: "Cross-Origin-Resource-Policy", value: "same-origin" },
    ];
    return [{ source: "/:path*", headers: securityHeaders }];
  },
};

If you control your static asset server (and serve proper Cross-Origin-Resource-Policy headers) you can switch require-corp to credentialless.
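
If you change these headers, a quick sanity check can catch a combination that no longer isolates the page. The sketch below is not part of this repo; `enablesCrossOriginIsolation` is a hypothetical helper that encodes the browser rule as we understand it: COOP must be `same-origin`, and COEP must be either `require-corp` or `credentialless`.

```typescript
type Header = { key: string; value: string };

// Hypothetical helper (not in this repo): returns true when the header set
// carries the COOP/COEP pair that browsers require for cross-origin isolation.
function enablesCrossOriginIsolation(headers: Header[]): boolean {
  let coop: string | undefined;
  let coep: string | undefined;

  for (const header of headers) {
    // Header names are case-insensitive, so normalise before comparing.
    const name = header.key.toLowerCase();
    const value = header.value.trim().toLowerCase();
    if (name === "cross-origin-opener-policy") coop = value;
    if (name === "cross-origin-embedder-policy") coep = value;
  }

  return (
    coop === "same-origin" &&
    (coep === "require-corp" || coep === "credentialless")
  );
}

// Same values as the securityHeaders array in next.config.ts.
const securityHeaders: Header[] = [
  { key: "Cross-Origin-Opener-Policy", value: "same-origin" },
  { key: "Cross-Origin-Embedder-Policy", value: "require-corp" },
  { key: "Cross-Origin-Resource-Policy", value: "same-origin" },
];

console.log(enablesCrossOriginIsolation(securityHeaders)); // true
```

Note that this only inspects the configuration. Whether a page actually ends up isolated also depends on every embedded resource satisfying the embedder policy.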
- Models are stored in an IndexedDB database called whisper-model-cache.
- Use the Clear Cache button in the UI, or manually clear the browser’s site-data if you run out of storage.
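
The cache-first flow described above can be sketched roughly as follows. This is an illustration, not the code in this repo: `ModelStore`, `MemoryStore`, `Downloader` and `loadModel` are hypothetical names, and the in-memory store stands in for IndexedDB so the logic can be read (and run) without a browser.

```typescript
// Hypothetical storage interface. In the browser this would be backed by
// the whisper-model-cache IndexedDB database; here it is kept abstract.
interface ModelStore {
  get(key: string): Promise<Uint8Array | undefined>;
  put(key: string, bytes: Uint8Array): Promise<void>;
}

// In-memory stand-in for IndexedDB, for illustration only.
class MemoryStore implements ModelStore {
  private readonly entries = new Map<string, Uint8Array>();

  async get(key: string): Promise<Uint8Array | undefined> {
    return this.entries.get(key);
  }

  async put(key: string, bytes: Uint8Array): Promise<void> {
    this.entries.set(key, bytes);
  }
}

// Hypothetical download function; in the browser this would wrap fetch().
type Downloader = (url: string) => Promise<Uint8Array>;

// Return the cached model if present; otherwise download it once and store it.
async function loadModel(
  url: string,
  store: ModelStore,
  download: Downloader,
): Promise<{ bytes: Uint8Array; fromCache: boolean }> {
  const cached = await store.get(url);
  if (cached !== undefined) {
    return { bytes: cached, fromCache: true };
  }
  const bytes = await download(url);
  await store.put(url, bytes);
  return { bytes, fromCache: false };
}
```

Keying the cache by URL keeps the sketch simple; a real implementation may prefer a model name or version so that a changed file is not served from a stale entry.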
This repo is released under the MIT license (see LICENSE).
Whisper.cpp itself is licensed under the MIT license as well.
PRs and issues are very welcome!
For larger changes, please open an issue first so we can discuss direction and scope.
Which browsers are supported?
Any browser that supports SharedArrayBuffer and cross-origin isolation.
That includes recent versions of Chrome/Edge/Opera and Firefox with
privacy.partition.always_partition_third_party_non_partitioned_state=false.
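
To tell at runtime whether the current browser qualifies, you can feature-detect instead of sniffing user agents. A minimal sketch, assuming the two standard globals `SharedArrayBuffer` and `crossOriginIsolated`; the `canRunThreadedWasm` helper and the `IsolationEnv` type are hypothetical and not part of this repo.

```typescript
// Minimal view of the globals the check reads. Taking them as a parameter
// (rather than reading globalThis directly) keeps the function easy to test.
interface IsolationEnv {
  SharedArrayBuffer?: unknown;
  crossOriginIsolated?: boolean;
}

// Hypothetical helper: true only when SharedArrayBuffer is exposed AND the
// page reports itself as cross-origin isolated.
function canRunThreadedWasm(env: IsolationEnv): boolean {
  return (
    typeof env.SharedArrayBuffer === "function" &&
    env.crossOriginIsolated === true
  );
}

// In a browser you would call it with the global object, for example:
//   if (!canRunThreadedWasm(globalThis)) { /* show an "unsupported browser" notice */ }
```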
Can I use other Whisper models?
The UI currently lists Tiny & Base (and their Q5_1 quantised versions).
If you compile another ggml-*.bin model, just add an entry to MODELS in src/components/StreamClient.tsx.
What about uploading audio files?
The demo currently only supports real-time transcription of live audio.
Feel free to contribute a file-upload feature!
- ggerganov/whisper.cpp – the core magic.
- shadcn/ui – beautiful, headless UI primitives.
- Everyone who filed issues / PRs and tested early versions.