🎙️ WhisperDesk

A beautiful, native macOS desktop application for transcribing audio and video files using whisper.cpp.

🌐 Visit our website | 📥 Download Latest Release

✨ Features

Drag & Drop - Drag single or multiple files to create a batch queue
Batch Processing - Process unlimited files sequentially with automatic queue management
Multiple Formats - Supports MP3, WAV, M4A, FLAC, OGG, WMA, AAC, AIFF, MP4, MOV, AVI, MKV, WebM, WMV, FLV, M4V
Multiple Models - Choose from tiny, base, small, medium, large-v3, or large-v3-turbo Whisper models (including English-only variants)
Output Formats - Export as VTT subtitles, SRT subtitles, plain text, Word (.docx), PDF, or Markdown
Language Support - Auto-detect or select from 12+ languages
Apple Silicon Optimized - Native Metal GPU acceleration on M1/M2/M3/M4 Macs
Dark Mode - Beautiful dark theme that respects your system preference
Auto Updates - Automatic update notifications when new versions are available
Keyboard Shortcuts - Full keyboard navigation support
Transcription History - Keep track of your recent transcriptions
Native Performance - Uses whisper.cpp for fast, efficient transcription
TypeScript - Fully typed codebase for better maintainability
Feature-Driven Architecture - Modular codebase organized by feature domains

📋 Requirements

macOS 10.15 (Catalina) or later
FFmpeg (Required for audio processing)
~500MB disk space (for whisper.cpp and models)

Note: WhisperDesk requires FFmpeg to process audio files. The app will check for it on startup and guide you if it's missing.

🚀 Installation

1. Install Prerequisites

WhisperDesk requires FFmpeg to be installed on your system to process audio and video files.

Install Homebrew (if not already installed)

/bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)"

Install FFmpeg

brew install ffmpeg

Install CMake (for building whisper.cpp)

brew install cmake

2. Install WhisperDesk

Option A: Download DMG (Recommended)

Download the latest WhisperDesk-x.x.x.dmg from Releases
Open the DMG file
Drag WhisperDesk to your Applications folder
Important: Ensure you have FFmpeg installed (see Prerequisites)
Launch WhisperDesk from Applications

Option B: Build from Source

# Clone the repository
git clone https://github.com/PVAS-Development/whisperdesk.git
cd whisperdesk

# Install dependencies
npm install

# Build whisper.cpp with Metal support (downloads base model)
# For development (current architecture only):
npm run setup:whisper

# For production (universal binary - Intel + Apple Silicon):
npm run setup:whisper:universal

# Run in development mode
npm run electron:dev

# Or build for production (automatically builds universal binary)
npm run electron:build

🎮 Usage

Open Files - Drag and drop audio/video files (single or batch) into the app, or click to browse
Configure Settings - Choose your preferred model, language, and output format
Transcribe - Click "Transcribe" to process the entire queue sequentially
Save/Copy - Save the transcription from the save dialog (choose from .txt, .docx, .pdf, .md, .srt, or .vtt formats) or copy to clipboard

Keyboard Shortcuts

Shortcut	Action
`Cmd+O`	Open file
`Cmd+S`	Save transcription
`Cmd+C`	Copy transcription
`Cmd+Return`	Start transcription
`Cmd+H`	Toggle history
`Escape`	Cancel transcription

🧠 Whisper Models

Model	Size	Speed	Quality	Best For
`tiny`	75 MB	~10x	★☆☆☆☆	Quick drafts, testing
`base`	142 MB	~7x	★★☆☆☆	Fast transcription
`small`	466 MB	~4x	★★★☆☆	Balanced speed/quality
`medium`	1.5 GB	~2x	★★★★☆	High quality
`large-v3`	3.1 GB	~1x	★★★★★	Best quality
`large-v3-turbo`	1.6 GB	~2x	★★★★★	Fast + quality

English-only variants (.en) are available for tiny, base, small, and medium models.

Models are downloaded automatically on first use and cached in:

Development: PROJECT_ROOT/models/
Production: ~/Library/Application Support/WhisperDesk/models/

🔧 Development

Prerequisites

Node.js 22.12+ (use nvm use to auto-switch via .nvmrc)
CMake (for building whisper.cpp)
FFmpeg

Setup

# Clone and install
git clone https://github.com/PVAS-Development/whisperdesk.git
cd whisperdesk
npm install

# Build whisper.cpp and download base model
npm run setup:whisper

# Run development server
npm run electron:dev

Building

# Build for macOS
npm run electron:build:mac

# Build directory only (faster, for testing)
npm run electron:build:dir

Contributing

This project uses conventional commits for consistent commit messages.

Development Flow

Create a feature branch from main:
```
git checkout -b feat/my-feature
```
Make changes with conventional commits and create PR to main

Commit Message Convention

Use Conventional Commits for clear history:

Commit Type	Example	Description
`feat:`	`feat: add PDF export`	New feature
`fix:`	`fix: crash on startup`	Bug fix
`perf:`	`perf: faster loading`	Performance improvement
`refactor:`	`refactor: simplify logic`	Code refactoring
`docs:`	`docs: update README`	Documentation
`chore:`	`chore: update deps`	Maintenance
`style:`	`style: format code`	Code style
`test:`	`test: add unit tests`	Tests
`ci:`	`ci: fix workflow`	CI/CD changes
`build:`	`build: update config`	Build changes

Issues & Discussions

Bug report? Open an issue via the built-in bug report template so we collect macOS version, WhisperDesk version, reproduction steps, and relevant logs automatically.
Feature idea? Start a thread in Discussions. We prefer to explore new ideas there and will only create an issue once we understand the scope.
Before you post: search the existing issues and discussions to avoid duplicates and help us respond faster.

Testing

WhisperDesk has a comprehensive test suite with 335+ tests covering utilities, services, hooks, and React components.

Run Tests

# Run all tests once (CI mode)
npm run test:run

# Run tests with watch mode
npm run test

# Run tests with UI dashboard
npm run test:ui

# Run tests with coverage
npm run test:coverage

Test Coverage

Unit Tests - Utilities, formatters, validators, storage, and services
Component Tests - SettingsPanel, FileDropZone, OutputDisplay
Service Tests - Electron API, transcription service, model service, history storage
Test Framework: Vitest with jsdom
Component Testing: @testing-library/react
Pre-commit Hooks - Lint and format checks run automatically before every commit (via husky + lint-staged)

CI/CD Pipeline

Tests run automatically in GitHub Actions on every PR and push:

✅ Linting & formatting checks
✅ TypeScript type checking
✅ Unit & component tests (335+ tests)
✅ Production build validation

Available Scripts

Script	Description
`npm run dev`	Start Vite dev server
`npm run electron:dev`	Start app in development mode
`npm run setup:whisper`	Build whisper.cpp (current architecture)
`npm run setup:whisper:universal`	Build whisper.cpp (universal binary)
`npm run electron:build`	Builds macOS DMG (with universal binary)
`npm run electron:build:mac`	Builds macOS DMG (with universal binary)
`npm run electron:build:dir`	Build directory only (faster, testing)
`npm run icons`	Generate app icons from SVG
`npm run lint`	Run ESLint
`npm run lint:fix`	Run ESLint with auto-fix
`npm run typecheck`	Run TypeScript type checking
`npm run format`	Format code with Prettier
`npm run format:check`	Check code formatting
`npm run test`	Run tests with watch mode
`npm run test:ui`	Run tests with dashboard UI
`npm run test:run`	Run tests once (CI mode)
`npm run test:coverage`	Run tests with coverage report

Architecture

This project follows a modern Electron architecture with strict separation of concerns:

src/main/: Electron Main process (TypeScript). Handles OS integration, window management, and native services.
src/preload/: Preload scripts (TypeScript). Exposes a secure, typed API to the renderer via contextBridge.
src/renderer/: React application (TypeScript). The UI layer, built with Vite.
src/shared/: Shared types and constants used by both Main and Renderer processes.

Security Features:

Context Isolation: Enabled. Renderer cannot access Node.js primitives directly.
Sandbox: Enabled. Renderer runs in a sandboxed environment.
IPC: All communication happens via typed IPC channels defined in src/main/ipc/.

Project Structure

whisperdesk/
├── src/
│   ├── main/                # Electron Main process (TypeScript)
│   │   ├── index.ts         # Entry point
│   │   ├── ipc/             # IPC Handlers
│   │   ├── services/        # Business logic (Whisper, FileSystem)
│   │   └── utils/           # Utilities
│   ├── preload/             # Preload scripts (TypeScript)
│   │   └── index.ts         # Secure API exposure
│   ├── renderer/            # React frontend (TypeScript)
│   │   ├── App.tsx          # Main app component
│   │   ├── main.tsx         # React entry point
│   │   ├── components/      # Shared UI components
│   │   ├── features/        # Feature-based modules
│   │   └── ...
│   └── shared/              # Shared types and constants
├── dist-electron/           # Output folder for main process build
├── dist/                    # Output folder for renderer build
├── scripts/                 # Build and setup scripts
├── bin/                     # whisper-cli binary (built)
└── models/                  # Downloaded GGML models (dev)

🐛 Troubleshooting

"whisper.cpp not found" error

Run the setup script to build whisper.cpp:

npm run setup:whisper

"FFmpeg not found" error

Install FFmpeg via Homebrew:

brew install ffmpeg

Slow transcription

Use a smaller model (tiny or base) for faster results
Ensure you're using GPU acceleration (shown in app settings)
Close other resource-intensive applications

App won't open (macOS Gatekeeper)

The app is code-signed and notarized by Apple, so it should open normally. If you still see a warning:

Right-click the app and select "Open"
Click "Open" in the dialog that appears

For builds from source (unsigned), you may need to run:

xattr -cr /Applications/WhisperDesk.app

🤝 Contributing

Contributions are welcome! Please see our Contributing Guide for details on:

Setting up your development environment
Code style and commit conventions
Submitting pull requests

🔒 Privacy & Security

Local Processing: All audio/video processing happens locally on your device. Your files never leave your computer.
No Cloud Uploads: We do not upload your media files or transcriptions to any server.
Anonymous Analytics: We collect minimal, anonymous usage data (e.g., app launches, feature usage) to improve the app. No personal data or file content is collected.
Code Signing: The app is code-signed and notarized by Apple for your safety.

For more details, please read our Privacy Policy.

☕ Support the Project

WhisperDesk is free and open-source software. If you find it useful, please consider supporting its development:

Your support helps cover the costs of Apple Developer Program fees and keeps the project alive!

📄 License

MIT License - see LICENSE for details.

🙏 Acknowledgments

whisper.cpp - High-performance C++ port of OpenAI Whisper
OpenAI Whisper - The amazing speech recognition model
Electron - Cross-platform desktop apps
React - UI framework
TypeScript - Type-safe JavaScript
Vite - Build tool
Vitest - Fast unit testing framework

Made with ❤️ for the transcription community

Name		Name	Last commit message	Last commit date
Latest commit History 188 Commits
.github		.github
.husky		.husky
build		build
docs		docs
scripts		scripts
src		src
.gitignore		.gitignore
.nvmrc		.nvmrc
.prettierignore		.prettierignore
.prettierrc		.prettierrc
.releaserc.json		.releaserc.json
CHANGELOG.md		CHANGELOG.md
CODE_OF_CONDUCT.md		CODE_OF_CONDUCT.md
CONTRIBUTING.md		CONTRIBUTING.md
LICENSE		LICENSE
README.md		README.md
commitlint.config.js		commitlint.config.js
eslint.config.js		eslint.config.js
index.html		index.html
package-lock.json		package-lock.json
package.json		package.json
tsconfig.json		tsconfig.json
tsconfig.node.json		tsconfig.node.json
tsup.config.ts		tsup.config.ts
vite.config.ts		vite.config.ts
vitest.config.ts		vitest.config.ts

Uh oh!

License

PVAS-Development/whisperdesk

Folders and files

Latest commit

History

Repository files navigation

🎙️ WhisperDesk

✨ Features

📋 Requirements

🚀 Installation

1. Install Prerequisites

Install Homebrew (if not already installed)

Install FFmpeg

Install CMake (for building whisper.cpp)

2. Install WhisperDesk

Option A: Download DMG (Recommended)

Option B: Build from Source

🎮 Usage

Keyboard Shortcuts

🧠 Whisper Models

🔧 Development

Prerequisites

Setup

Building

Contributing

Development Flow

Commit Message Convention

Issues & Discussions

Testing

Run Tests

Test Coverage

CI/CD Pipeline

Available Scripts

Architecture

Project Structure

🐛 Troubleshooting

"whisper.cpp not found" error

"FFmpeg not found" error

Slow transcription

App won't open (macOS Gatekeeper)

🤝 Contributing

🔒 Privacy & Security

☕ Support the Project

📄 License

🙏 Acknowledgments

About

Topics

Resources

License

Code of conduct

Contributing

Uh oh!

Stars

Watchers

Forks

Releases 24

Sponsor this project

Uh oh!

Packages 0

Contributors 2

Languages

Packages