A beautiful, native macOS desktop application for transcribing audio and video files using whisper.cpp.
🌐 Visit our website | 📥 Download Latest Release
- Drag & Drop - Drag single or multiple files to create a batch queue
- Batch Processing - Process unlimited files sequentially with automatic queue management
- Multiple Formats - Supports MP3, WAV, M4A, FLAC, OGG, WMA, AAC, AIFF, MP4, MOV, AVI, MKV, WebM, WMV, FLV, M4V
- Multiple Models - Choose from tiny, base, small, medium, large-v3, or large-v3-turbo Whisper models (including English-only variants)
- Output Formats - Export as VTT subtitles, SRT subtitles, plain text, Word (.docx), PDF, or Markdown
- Language Support - Auto-detect or select from 12+ languages
- Apple Silicon Optimized - Native Metal GPU acceleration on M1/M2/M3/M4 Macs
- Dark Mode - Beautiful dark theme that respects your system preference
- Auto Updates - Automatic update notifications when new versions are available
- Keyboard Shortcuts - Full keyboard navigation support
- Transcription History - Keep track of your recent transcriptions
- Native Performance - Uses whisper.cpp for fast, efficient transcription
- TypeScript - Fully typed codebase for better maintainability
- Feature-Driven Architecture - Modular codebase organized by feature domains
- macOS 10.15 (Catalina) or later
- FFmpeg (Required for audio processing)
- ~500MB disk space (for whisper.cpp and models)
Note: WhisperDesk requires FFmpeg to process audio files. The app will check for it on startup and guide you if it's missing.
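As an illustration of what such a startup check can involve, here is a minimal sketch of locating an `ffmpeg` binary. It is not WhisperDesk's actual implementation: the helper names (`searchDirs`, `findExecutable`) and the injected `exists` callback are assumptions made for this example. The sketch also looks in the two default Homebrew directories (`/opt/homebrew/bin` on Apple Silicon, `/usr/local/bin` on Intel), because an app launched from Finder typically does not inherit the `PATH` configured in your shell.

```typescript
// Hypothetical helpers for illustration; not WhisperDesk's real startup check.

// Default Homebrew binary directories on macOS (Apple Silicon, then Intel).
const HOMEBREW_DIRS = ["/opt/homebrew/bin", "/usr/local/bin"];

// Build the ordered list of directories to search: PATH entries first,
// then any Homebrew default that is not already listed.
function searchDirs(pathEnv: string): string[] {
  const dirs = pathEnv.split(":").filter((d) => d.length > 0);
  for (const extra of HOMEBREW_DIRS) {
    if (dirs.indexOf(extra) === -1) dirs.push(extra);
  }
  return dirs;
}

// Return the first location where `binary` exists, or null if it is missing.
// `exists` is injected so the lookup can be exercised without touching the disk;
// in a Node.js process, fs.existsSync would be a natural choice.
function findExecutable(
  binary: string,
  pathEnv: string,
  exists: (candidate: string) => boolean
): string | null {
  for (const dir of searchDirs(pathEnv)) {
    const candidate = dir.replace(/\/+$/, "") + "/" + binary;
    if (exists(candidate)) return candidate;
  }
  return null;
}
```

A `null` result is the case where the app would need to point you at the installation steps.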
WhisperDesk requires FFmpeg to be installed on your system to process audio and video files.
/bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)"
brew install ffmpeg
brew install cmake

- Download the latest `WhisperDesk-x.x.x.dmg` from Releases
- Open the DMG file
- Drag WhisperDesk to your Applications folder
- Important: Ensure you have FFmpeg installed (see Prerequisites)
- Launch WhisperDesk from Applications
# Clone the repository
git clone https://github.com/PVAS-Development/whisperdesk.git
cd whisperdesk
# Install dependencies
npm install
# Build whisper.cpp with Metal support (downloads base model)
# For development (current architecture only):
npm run setup:whisper
# For production (universal binary - Intel + Apple Silicon):
npm run setup:whisper:universal
# Run in development mode
npm run electron:dev
# Or build for production (automatically builds universal binary)
npm run electron:build

- Open Files - Drag and drop audio/video files (single or batch) into the app, or click to browse
- Configure Settings - Choose your preferred model, language, and output format
- Transcribe - Click "Transcribe" to process the entire queue sequentially
- Save/Copy - Save the transcription from the save dialog (choose from `.txt`, `.docx`, `.pdf`, `.md`, `.srt`, or `.vtt` formats) or copy to clipboard
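The subtitle formats in the save dialog are plain text with numbered, timestamped cues. The sketch below shows how transcript segments could be rendered as SRT. The `Segment` shape and the function names are assumptions for illustration and do not describe WhisperDesk's internal exporter.

```typescript
// Hypothetical segment shape; start and end are in seconds.
interface Segment {
  start: number;
  end: number;
  text: string;
}

// Left-pad a non-negative integer with zeros to the given width.
function pad(value: number, width: number): string {
  let s = String(value);
  while (s.length < width) s = "0" + s;
  return s;
}

// SRT timestamps have the form HH:MM:SS,mmm (comma before the milliseconds).
function srtTimestamp(seconds: number): string {
  const totalMs = Math.round(seconds * 1000);
  const ms = totalMs % 1000;
  const totalSec = Math.floor(totalMs / 1000);
  const s = totalSec % 60;
  const m = Math.floor(totalSec / 60) % 60;
  const h = Math.floor(totalSec / 3600);
  return pad(h, 2) + ":" + pad(m, 2) + ":" + pad(s, 2) + "," + pad(ms, 3);
}

// Each cue is: 1-based index, time range, text, then a blank separator line.
function toSrt(segments: Segment[]): string {
  return segments
    .map(
      (seg, i) =>
        i + 1 + "\n" +
        srtTimestamp(seg.start) + " --> " + srtTimestamp(seg.end) + "\n" +
        seg.text.trim() + "\n"
    )
    .join("\n");
}
```

For example, `srtTimestamp(3661.5)` returns `01:01:01,500`. WebVTT cues look similar, but the file starts with a `WEBVTT` header line and timestamps use a period instead of a comma before the milliseconds.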
| Shortcut | Action |
|---|---|
| `Cmd+O` | Open file |
| `Cmd+S` | Save transcription |
| `Cmd+C` | Copy transcription |
| `Cmd+Return` | Start transcription |
| `Cmd+H` | Toggle history |
| `Escape` | Cancel transcription |
| Model | Size | Speed | Quality | Best For |
|---|---|---|---|---|
| `tiny` | 75 MB | ~10x | ★☆☆☆☆ | Quick drafts, testing |
| `base` | 142 MB | ~7x | ★★☆☆☆ | Fast transcription |
| `small` | 466 MB | ~4x | ★★★☆☆ | Balanced speed/quality |
| `medium` | 1.5 GB | ~2x | ★★★★☆ | High quality |
| `large-v3` | 3.1 GB | ~1x | ★★★★★ | Best quality |
| `large-v3-turbo` | 1.6 GB | ~2x | ★★★★★ | Fast + quality |
English-only variants (.en) are available for tiny, base, small, and medium models.
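If you are unsure which model to pick, the trade-off in the table can be expressed as a small selection rule: take the highest-quality model that fits your disk budget, and prefer the faster one on a tie. The sketch below hard-codes the figures from the table above (with 1 GB taken as 1000 MB). The `ModelInfo` type and `pickModel` function are hypothetical and are not part of the app.

```typescript
// Hypothetical data structure; values copied from the model table.
interface ModelInfo {
  id: string;
  sizeMB: number;  // approximate download size
  speed: number;   // relative speed multiplier from the table
  quality: number; // star rating, 1 to 5
}

const MODELS: ModelInfo[] = [
  { id: "tiny", sizeMB: 75, speed: 10, quality: 1 },
  { id: "base", sizeMB: 142, speed: 7, quality: 2 },
  { id: "small", sizeMB: 466, speed: 4, quality: 3 },
  { id: "medium", sizeMB: 1500, speed: 2, quality: 4 },
  { id: "large-v3", sizeMB: 3100, speed: 1, quality: 5 },
  { id: "large-v3-turbo", sizeMB: 1600, speed: 2, quality: 5 },
];

// Pick the best-quality model within the size budget; break ties on speed.
// Returns null when no model fits.
function pickModel(maxSizeMB: number): ModelInfo | null {
  let best: ModelInfo | null = null;
  for (const m of MODELS) {
    if (m.sizeMB > maxSizeMB) continue;
    if (
      best === null ||
      m.quality > best.quality ||
      (m.quality === best.quality && m.speed > best.speed)
    ) {
      best = m;
    }
  }
  return best;
}
```

With a 500 MB budget this rule selects `small`; with 2000 MB it selects `large-v3-turbo`, which matches the "Fast + quality" positioning in the table.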
Models are downloaded automatically on first use and cached in:
- Development: `PROJECT_ROOT/models/`
- Production: `~/Library/Application Support/WhisperDesk/models/`
- Node.js 22.12+ (use `nvm use` to auto-switch via `.nvmrc`)
- CMake (for building whisper.cpp)
- FFmpeg
# Clone and install
git clone https://github.com/PVAS-Development/whisperdesk.git
cd whisperdesk
npm install
# Build whisper.cpp and download base model
npm run setup:whisper
# Run development server
npm run electron:dev

# Build for macOS
npm run electron:build:mac
# Build directory only (faster, for testing)
npm run electron:build:dir

This project uses conventional commits for consistent commit messages.
- Create a feature branch from `main`: `git checkout -b feat/my-feature`
- Make changes with conventional commits and create a PR to `main`
Use Conventional Commits for clear history:
| Commit Type | Example | Description |
|---|---|---|
| `feat:` | `feat: add PDF export` | New feature |
| `fix:` | `fix: crash on startup` | Bug fix |
| `perf:` | `perf: faster loading` | Performance improvement |
| `refactor:` | `refactor: simplify logic` | Code refactoring |
| `docs:` | `docs: update README` | Documentation |
| `chore:` | `chore: update deps` | Maintenance |
| `style:` | `style: format code` | Code style |
| `test:` | `test: add unit tests` | Tests |
| `ci:` | `ci: fix workflow` | CI/CD changes |
| `build:` | `build: update config` | Build changes |
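To check a commit subject against these types before pushing, a small validator is enough. The following is a sketch under stated assumptions: the pattern covers the ten types listed in the table plus the optional scope and `!` breaking-change marker from the Conventional Commits specification, and the function name is made up for this example. It is not necessarily the rule set this repository enforces.

```typescript
// Commit types taken from the table above.
const COMMIT_TYPES = [
  "feat", "fix", "perf", "refactor", "docs",
  "chore", "style", "test", "ci", "build",
];

// Shape: <type>[(scope)][!]: <description>
// The scope and the "!" marker are optional; the colon must be followed
// by a single space and a non-empty description.
const COMMIT_PATTERN = new RegExp(
  "^(" + COMMIT_TYPES.join("|") + ")(\\([^()\\s]+\\))?!?: \\S.*$"
);

// Validate the subject line (first line) of a commit message.
function isConventionalCommit(subject: string): boolean {
  return COMMIT_PATTERN.test(subject);
}
```

Under this pattern, `feat: add PDF export` and `fix(ui): crash on startup` pass, while `added stuff` and `feat:missing space` are rejected. Only the subject line is checked, so pass the first line of the message rather than the full body.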
- Bug report? Open an issue via the built-in bug report template so we collect macOS version, WhisperDesk version, reproduction steps, and relevant logs automatically.
- Feature idea? Start a thread in Discussions. We prefer to explore new ideas there and will only create an issue once we understand the scope.
- Before you post: search the existing issues and discussions to avoid duplicates and help us respond faster.
WhisperDesk has a comprehensive test suite with 335+ tests covering utilities, services, hooks, and React components.
# Run all tests once (CI mode)
npm run test:run
# Run tests with watch mode
npm run test
# Run tests with UI dashboard
npm run test:ui
# Run tests with coverage
npm run test:coverage

- Unit Tests - Utilities, formatters, validators, storage, and services
- Component Tests - SettingsPanel, FileDropZone, OutputDisplay
- Service Tests - Electron API, transcription service, model service, history storage
- Test Framework: Vitest with jsdom
- Component Testing: @testing-library/react
- Pre-commit Hooks - Lint and format checks run automatically before every commit (via husky + lint-staged)
Tests run automatically in GitHub Actions on every PR and push:
- ✅ Linting & formatting checks
- ✅ TypeScript type checking
- ✅ Unit & component tests (335+ tests)
- ✅ Production build validation
| Script | Description |
|---|---|
| `npm run dev` | Start Vite dev server |
| `npm run electron:dev` | Start app in development mode |
| `npm run setup:whisper` | Build whisper.cpp (current architecture) |
| `npm run setup:whisper:universal` | Build whisper.cpp (universal binary) |
| `npm run electron:build` | Builds macOS DMG (with universal binary) |
| `npm run electron:build:mac` | Builds macOS DMG (with universal binary) |
| `npm run electron:build:dir` | Build directory only (faster, testing) |
| `npm run icons` | Generate app icons from SVG |
| `npm run lint` | Run ESLint |
| `npm run lint:fix` | Run ESLint with auto-fix |
| `npm run typecheck` | Run TypeScript type checking |
| `npm run format` | Format code with Prettier |
| `npm run format:check` | Check code formatting |
| `npm run test` | Run tests with watch mode |
| `npm run test:ui` | Run tests with dashboard UI |
| `npm run test:run` | Run tests once (CI mode) |
| `npm run test:coverage` | Run tests with coverage report |
This project follows a modern Electron architecture with strict separation of concerns:
- `src/main/`: Electron Main process (TypeScript). Handles OS integration, window management, and native services.
- `src/preload/`: Preload scripts (TypeScript). Exposes a secure, typed API to the renderer via `contextBridge`.
- `src/renderer/`: React application (TypeScript). The UI layer, built with Vite.
- `src/shared/`: Shared types and constants used by both Main and Renderer processes.
Security Features:
- Context Isolation: Enabled. Renderer cannot access Node.js primitives directly.
- Sandbox: Enabled. Renderer runs in a sandboxed environment.
- IPC: All communication happens via typed IPC channels defined in `src/main/ipc/`.
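To make the idea of typed IPC channels concrete, here is a minimal sketch of a channel map with an allow-list wrapper. The channel names, payload shapes, and `createApi` helper are all hypothetical and are not taken from the WhisperDesk source. The transport is injected so the example stays free of Electron imports; its signature is modeled on `ipcRenderer.invoke(channel, ...args)`.

```typescript
// Hypothetical channel names and payloads, for illustration only.
interface IpcChannels {
  "transcription:start": {
    request: { filePath: string; model: string };
    response: { text: string };
  };
  "history:list": { request: void; response: string[] };
}

type Channel = keyof IpcChannels;

// Runtime allow-list: only these channels may be forwarded to the main process.
const ALLOWED_CHANNELS: Channel[] = ["transcription:start", "history:list"];

// Injected transport, shaped like Electron's ipcRenderer.invoke.
type Invoke = (channel: string, payload: unknown) => Promise<unknown>;

// Wrap the transport so callers get compile-time payload checking
// and unknown channels are rejected at runtime.
function createApi(invoke: Invoke) {
  return {
    call<K extends Channel>(
      channel: K,
      payload: IpcChannels[K]["request"]
    ): Promise<IpcChannels[K]["response"]> {
      if (ALLOWED_CHANNELS.indexOf(channel) === -1) {
        throw new Error("Blocked IPC channel: " + channel);
      }
      return invoke(channel, payload) as Promise<IpcChannels[K]["response"]>;
    },
  };
}
```

In a preload script, an object like this could be built around `ipcRenderer.invoke` and exposed with `contextBridge.exposeInMainWorld`, so the renderer only ever sees the narrow, typed surface rather than the raw IPC module.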
whisperdesk/
├── src/
│ ├── main/ # Electron Main process (TypeScript)
│ │ ├── index.ts # Entry point
│ │ ├── ipc/ # IPC Handlers
│ │ ├── services/ # Business logic (Whisper, FileSystem)
│ │ └── utils/ # Utilities
│ ├── preload/ # Preload scripts (TypeScript)
│ │ └── index.ts # Secure API exposure
│ ├── renderer/ # React frontend (TypeScript)
│ │ ├── App.tsx # Main app component
│ │ ├── main.tsx # React entry point
│ │ ├── components/ # Shared UI components
│ │ ├── features/ # Feature-based modules
│ │ └── ...
│ └── shared/ # Shared types and constants
├── dist-electron/ # Output folder for main process build
├── dist/ # Output folder for renderer build
├── scripts/ # Build and setup scripts
├── bin/ # whisper-cli binary (built)
└── models/ # Downloaded GGML models (dev)
Run the setup script to build whisper.cpp:
npm run setup:whisperInstall FFmpeg via Homebrew:
brew install ffmpeg- Use a smaller model (tiny or base) for faster results
- Ensure you're using GPU acceleration (shown in app settings)
- Close other resource-intensive applications
The app is code-signed and notarized by Apple, so it should open normally. If you still see a warning:
- Right-click the app and select "Open"
- Click "Open" in the dialog that appears
For builds from source (unsigned), you may need to run:
xattr -cr /Applications/WhisperDesk.app

Contributions are welcome! Please see our Contributing Guide for details on:
- Setting up your development environment
- Code style and commit conventions
- Submitting pull requests
- Local Processing: All audio/video processing happens locally on your device. Your files never leave your computer.
- No Cloud Uploads: We do not upload your media files or transcriptions to any server.
- Anonymous Analytics: We collect minimal, anonymous usage data (e.g., app launches, feature usage) to improve the app. No personal data or file content is collected.
- Code Signing: The app is code-signed and notarized by Apple for your safety.
For more details, please read our Privacy Policy.
WhisperDesk is free and open-source software. If you find it useful, please consider supporting its development:
Your support helps cover the costs of Apple Developer Program fees and keeps the project alive!
MIT License - see LICENSE for details.
- whisper.cpp - High-performance C++ port of OpenAI Whisper
- OpenAI Whisper - The amazing speech recognition model
- Electron - Cross-platform desktop apps
- React - UI framework
- TypeScript - Type-safe JavaScript
- Vite - Build tool
- Vitest - Fast unit testing framework
Made with ❤️ for the transcription community
