A production-ready toolkit for designing and testing music theory prompts for large language models (LLMs). Features a modular architecture for composing reusable prompt components.
August 2025 migration: primary dataset now
fux-counterpoint
with unified--file/--files
identifiers (stems of encoded filenames). Legacy--question/--questions
flags still accepted (hidden) for backward compatibility; treat them as aliases of--file/--files
.
Quick environment bootstrap:
# Install Poetry (if not already) curl -sSL https://install.python-poetry.org | python3 - export PATH="$HOME/.local/bin:$PATH" # Use in-project virtualenvs poetry config virtualenvs.in-project true # Install base dependencies poetry install # (Optionally) add providers: poetry install --with google --with anthropic # Run tests poetry run pytest -qPython compatibility: tested on CPython 3.11β3.13 (any ^3.11 per pyproject).
For detailed information, see our comprehensive documentation:
- π User Guide - Complete usage instructions and examples
- ποΈ Architecture - System design and components
- π API Reference - Detailed API documentation
- βοΈ Development Guide - Setup and contribution guidelines
- π‘ Examples - Usage examples and tutorials
- π§ Scripts - Development and automation scriptsomated querying across multiple LLM providers. Includes comprehensive testing suite and support for various music encoding formats.
- π Project Status - Current state and next steps
π― Built for researchers and developers working on AI music theory applications
- π§ Modular Prompt Architecture: Compose prompts from reusable, testable components
- π€ Multi-LLM Provider Support: ChatGPT, Claude, and Gemini APIs
- π΅ Comprehensive Music Format Support: MEI, MusicXML, ABC notation, and Humdrum **kern
- π§ͺ Production-Grade Testing: 84% test coverage with comprehensive mock API validation
- π Context-Aware Prompts: Toggle between contextual and non-contextual prompt modes
- πΎ Built-in Data Management: Integrated support for RCM exam questions and encoded music
- π οΈ Developer Experience: Poetry dependency management, proper Python packaging, comprehensive documentation
- π§ Modular Architecture: Compose prompts from reusable components
- π€ Multi-LLM Support: ChatGPT, Claude, and Gemini integration
- π΅ Music Format Support: MEI, MusicXML, ABC notation, and Humdrum
- π§ͺ Comprehensive Testing: 47/56 tests passing with mock API validation
- π Context Learning: Toggle between contextual and non-contextual prompts
- πΎ Data Management: Built-in support for RCM exam questions and encoded music
- π οΈ Developer Friendly: Poetry-managed dependencies, proper packaging
- Quick Start
- Installation
- Configuration
- Usage
- Architecture
- Testing
- Development
- API Documentation
- Troubleshooting
- License
Get up and running in under 2 minutes:
# 1. Clone and install
git clone https://github.com/liampond/LLM-MusicTheory.git
cd LLM-MusicTheory
poetry install
# 2. Configure API keys
cp .env.example .env
# Edit .env with your API keys (see Configuration section)
# 3. Test installation
poetry run pytest tests/test_models.py -v
# 4. Run your first prompt (new flags)
poetry run run-single --model chatgpt --file Q1b --datatype mei --context --dataset fux-counterpoint
# (Legacy alias still works) --question Q1b
# 5. Run batch processing (supports provider names or specific model names)
poetry run run-batch --models chatgpt,claude --files Q1b Q1c --datatypes mei,abc --dataset fux-counterpoint
# Or use specific model names:
poetry run run-batch --models gpt-4o,claude-3-sonnet --files Q1b --datatypes mei --dataset fux-counterpoint
π That's it! You're ready to start experimenting with music theory prompts.
- Python 3.11+ (Download here)
- Poetry for dependency management (Installation guide)
git clone https://github.com/liampond/LLM-MusicTheory.git
cd LLM-MusicTheory
Using pipx (recommended):
pip install pipx
pipx install poetry
Using official installer:
curl -sSL https://install.python-poetry.org | python3 -
# Install all dependencies in a virtual environment
poetry install
# Verify installation
poetry run run-single --model chatgpt --file Q1b --datatype mei --context --dataset fux-counterpoint
# Run a quick test to ensure everything works
poetry run pytest tests/test_path_utils.py -v
If you see tests passing, you're ready to go! π
Or see [Poetry's official installation guide](https://python-poetry.org/docs/main/#installing-with-the-official-installer).
**Alternative installation via pipx (recommended):**
```bash
python3 -m pip install --user pipx
pipx install poetry
Add Poetry to your PATH:
export PATH="$HOME/.local/bin:$PATH"
You may need to add this line to your ~/.bashrc
or ~/.zshrc
file and restart your terminal.y
A modular toolkit for designing and testing music theory prompts for large language models (LLMs). Write modular prompt components, then use this tool to flexibly combine them and automate querying ChatGPT, Claude, and Gemini. Built for experimentation and evaluation on official Royal Conservatory of Music (RCM) exam questions.
-
Clone the repository
git clone https://github.com/liampond/LLM-MusicTheory.git cd LLM-MusicTheory
-
Install Poetry (if you donβt have it)
curl -sSL https://install.python-poetry.org | python3 -
-
Install dependencies
poetry install
If you get a
poetry: command not found
error, make sure Poetry is in your PATH. You may need to restart your terminal or run:export PATH="$HOME/.local/bin:$PATH"
-
Activate the Poetry environment
poetry shell
-
Check your Python version Results: Up-to-date tests passing (see STATUS.md for current counts).
python --version
-
Troubleshooting
- If you get errors about missing dependencies, try running
poetry lock --no-update
thenpoetry install
again. - If you have issues with conflicting Python versions, ensure your virtual environment uses the correct Python version:
poetry env use python3.11 # or your preferred version >=3.11
- If you get errors about missing dependencies, try running
You need to provide your own API keys for the LLM providers you want to use.
cp .env.example .env
Edit .env
and add your API keys:
# Add your actual API keys (one or more required)
OPENAI_API_KEY=sk-your-openai-key-here
ANTHROPIC_API_KEY=sk-ant-your-anthropic-key-here
GOOGLE_API_KEY=your-google-api-key-here
Provider | Sign Up | Pricing | Free Tier |
---|---|---|---|
OpenAI | platform.openai.com | $0.002/1K tokens | $5 credit |
Anthropic | console.anthropic.com | $0.003/1K tokens | $5 credit |
ai.google.dev | $0.001/1K tokens | 1M tokens/day |
π° Cost Management: Use Google's generous free tier for development. Monitor usage in provider dashboards.
Default models are optimized for cost and performance. Customize in src/llm_music_theory/config/settings.py
:
# Current defaults (cost-effective)
OPENAI_MODEL = "gpt-4o-mini" # $0.0002/1K tokens
ANTHROPIC_MODEL = "claude-3-haiku" # $0.0003/1K tokens
GOOGLE_MODEL = "gemini-1.5-flash" # Free tier available
The toolkit provides two CLI commands for different use cases:
Run one prompt at a time for testing and development:
# Basic usage
poetry run run-single --model chatgpt --file Q1b --datatype mei --context --dataset fux-counterpoint
# Advanced usage with all parameters
poetry run run-single \
--model claude \
--file Q1a \
--datatype musicxml \
--context \
--temperature 0.7 \
--max-tokens 1000 \
--save
Run multiple prompts automatically for experiments:
# Test multiple models on same prompt (provider names)
poetry run run-batch --models chatgpt,claude,gemini --questions Q1b --datatypes mei
# Or use specific model names with auto-detection
poetry run run-batch --models gpt-4o,claude-3-sonnet,gemini-1.5-pro --questions Q1b --datatypes mei
# Full experiment across all combinations
poetry run run-batch \
--models chatgpt,claude \
--questions Q1a,Q1b,Q2a \
--datatypes mei,musicxml \
--context \
--temperature 0.0
Option | Required | Description | Example Values |
---|---|---|---|
--model(s) |
β | LLM provider(s) | chatgpt , claude , gemini |
--question(s) |
β | Question ID(s) | Q1a , Q1b , Q2a |
--datatype(s) |
β | Music encoding(s) | mei , musicxml , abc , humdrum |
--context |
β | Include context guides | flag (present = with context) |
--temperature |
β | Sampling creativity | 0.0 to 2.0 (default: 0.0 ) |
--max-tokens |
β | Response length limit | 500 , 1000 , 2000 |
--save |
β | Save responses to files | flag |
Explore available data before running prompts:
# List available resources
poetry run run-single --list-questions # Shows: Q1a, Q1b, Q2a, ...
poetry run run-single --list-datatypes # Shows: mei, musicxml, abc, humdrum
poetry run run-single --list-guides # Shows: harmonic_analysis, intervals, ...
# See everything at once
poetry run run-single --list-all
For programmatic usage and custom experiments:
from llm_music_theory.core.runner import PromptRunner
from llm_music_theory.core.dispatcher import get_llm
# Initialize LLM
llm = get_llm("chatgpt")
# Create and run prompt
runner = PromptRunner(
model=llm,
question_number="Q1b",
datatype="mei",
context=True,
temperature=0.0,
save=True
)
response = runner.run()
print(f"LLM Response: {response}")
from llm_music_theory.prompts.prompt_builder import PromptBuilder
from llm_music_theory.models.base import PromptInput
# Custom prompt building
builder = PromptBuilder()
prompt_input = builder.build_prompt_input(
question_number="Q1a",
datatype="musicxml",
context=True,
temperature=0.5
)
# Direct LLM querying
llm = get_llm("claude")
response = llm.query(prompt_input)
Settings and configurations can be changed in src/llm_music_theory/config/settings.py
. Currently, the only settings that you can change are the models. Each has a different price, performance, and niche. You can find information about the models, pricing, and their string identifiers here:
You can run a single music theory prompt against any supported LLM using the run_single.py
script. This script combines your modular prompt components, sends the query to the selected API, and prints the modelβs response.
Example command (new syntax):
poetry run run-single --model gemini --file Q1b --datatype mei --context --dataset fux-counterpoint
Legacy still accepted (alias): --question Q1b
.
--model
(required): LLM provider:chatgpt
,claude
,gemini
--file
(required): File ID (stem of encoded file, e.g.Q1b
)--datatype
(required): Encoding format:mei
,musicxml
,abc
,humdrum
--context
: Include contextual guides--dataset
: Dataset folder inside--data-dir
(default:fux-counterpoint
)--temperature
: Sampling temperature (default:0.0
)--max-tokens
: Optional max tokens--save
: Persist response under outputs--data-dir
: Root data directory (default:./data
)--outputs-dir
: Output root (default:./outputs
)
Legacy aliases: --question
maps to --file
(hidden), --examdate
retained for old RCM layout but ignored for new dataset.
You can list available files (new) plus legacy questions, datatypes, or guides:
--list-files
(preferred)--list-questions
(legacy alias β same as list-files)--list-datatypes
--list-guides
Example:
poetry run run-single --list-files --dataset fux-counterpoint
data/
fux-counterpoint/
encoded/
mei/ # MEI files (Q1b.mei, ...)
musicxml/
abc/
humdrum/
prompts/
base/ # base_<datatype>.txt templates
prompt.md # unified question text (replaces per-question files)
guides/ # optional contextual guide .txt/.md files
Legacy RCM layout (now renamed RCM6
, still minimally supported for tests) used: data/RCM6/encoded/<ExamDate>/<datatype>/<Q>.mei
and per-question prompt files under prompts/questions/<context|no_context>/<datatype>/Qx.txt
.
LLM-MusicTheory/
βββ src/llm_music_theory/ # Main package
β βββ cli/ # Command-line interfaces
β β βββ run_single.py # Single prompt execution
β β βββ run_batch.py # Batch processing
β βββ config/ # Configuration management
β β βββ settings.py # Environment and model settings
β βββ core/ # Core business logic
β β βββ dispatcher.py # LLM provider selection
β β βββ runner.py # Prompt execution engine
β βββ models/ # LLM provider implementations
β β βββ base.py # Abstract base classes
β β βββ chatgpt.py # OpenAI ChatGPT
β β βββ claude.py # Anthropic Claude
β β βββ gemini.py # Google Gemini
β βββ prompts/ # Prompt building system
β β βββ prompt_builder.py # Modular prompt composition
β βββ utils/ # Utility functions
β βββ logger.py # Logging configuration
β βββ path_utils.py # File and path utilities
βββ data/RCM6/ # Legacy data (read-only, formerly LLM-RCM)
β βββ encoded/ # Music files in various formats
β βββ prompts/ # Base prompt templates
β βββ guides/ # Context guides for prompts
β βββ questions/ # Question templates
βββ tests/ # Comprehensive test suite
βββ docs/ # Additional documentation
βββ examples/ # Usage examples and tutorials
βββ scripts/ # Development and automation scripts
- π§© Modular Architecture: Each component has a single responsibility
- π Provider Abstraction: Easy to add new LLM providers
- π§ͺ Testable Design: Comprehensive mocking for cost-free testing
- π¦ Clean Packaging: Standard Python project structure
- βοΈ Configuration-Driven: Environment-based settings management
- Input: User specifies model, question, datatype, and context
- Discovery: System locates required files using path utilities
- Composition: Prompt builder assembles modular components
- Dispatch: Core dispatcher selects and initializes LLM provider
- Execution: Runner sends prompt and handles response
- Output: Response returned to user, optionally saved to file
Comprehensive test suite with 84% coverage and zero API costs during testing.
# Run all tests (recommended)
poetry run pytest
# Run with verbose output
poetry run pytest -v
# Run specific test categories
poetry run pytest tests/test_models.py # LLM provider tests
poetry run pytest tests/test_path_utils.py # File handling tests
poetry run pytest tests/test_runner.py # Core logic tests
poetry run pytest tests/test_integration.py # CLI integration tests
# Quick Make targets
make test # All tests
make test-models # Just model tests
make test-fast # Skip slow tests
Test Suite | Purpose | Coverage |
---|---|---|
test_models.py |
LLM provider implementations | Mock API validation |
test_path_utils.py |
File discovery and data loading | Path resolution, data integrity |
test_runner.py |
Core prompt execution logic | Prompt building, parameterization |
test_integration.py |
CLI command workflows | End-to-end argument processing |
test_comprehensive.py |
Real data validation | Legacy data compatibility |
- π« No Real API Calls: All LLM interactions are mocked to avoid costs
- π Comprehensive Coverage: Tests validate prompt construction, not LLM responses
- πββοΈ Fast Execution: Full test suite runs in <1 second
- π Continuous Integration: Tests run automatically on all changes
# Clone and setup
git clone https://github.com/liampond/LLM-MusicTheory.git
cd LLM-MusicTheory
poetry install
# Install development dependencies
poetry install --with dev
# Activate shell
poetry shell
# Run pre-commit checks
poetry run pytest
poetry run black --check src/
poetry run flake8 src/
# 1. Make changes to source code
# 2. Run tests to ensure nothing breaks
poetry run pytest
# 3. Format code (if using black)
poetry run black src/
# 4. Test specific changes
poetry run pytest tests/test_your_change.py -v
# 5. Commit changes
git add -A
git commit -m "feat: describe your changes"
- Create new provider in
src/llm_music_theory/models/your_provider.py
:
from .base import LLMInterface, PromptInput
class YourProvider(LLMInterface):
def query(self, input: PromptInput) -> str:
# Implement your API integration
pass
- Register in
src/llm_music_theory/core/dispatcher.py
:
def get_llm(model_name: str) -> LLMInterface:
if model_name == "your_provider":
from ..models.your_provider import YourProvider
return YourProvider()
-
Add tests in
tests/test_models.py
-
Update documentation
- Formatting: Python Black (auto-formatting)
- Imports: isort for import organization
- Type Hints: Required for public APIs
- Docstrings: Google style for functions and classes
- Testing: Pytest with comprehensive mocking
Main class for executing prompts across LLM providers.
class PromptRunner:
def __init__(self, model, question_number, datatype, context, **kwargs):
"""Initialize prompt runner with configuration."""
def run(self) -> str:
"""Execute prompt and return LLM response."""
Abstract base class for all LLM providers.
class LLMInterface(ABC):
@abstractmethod
def query(self, input: PromptInput) -> str:
"""Send prompt to LLM and return response."""
Modular prompt composition system.
class PromptBuilder:
def build_prompt_input(self, question_number, datatype, context, **kwargs) -> PromptInput:
"""Build complete prompt from modular components."""
run-single
: Execute single promptrun-batch
: Execute multiple prompts in batch
For complete API documentation, see docs/
directory.
1. Import Error: No module named 'llm_music_theory'
# Solution: Ensure Poetry virtual environment is active
poetry shell
poetry install
2. API Key Not Found
# Solution: Check your .env file
cat .env
# Ensure no extra spaces around = sign
OPENAI_API_KEY=your-key-here
3. FileNotFoundError for data files
# Solution: Check data directory structure
ls -la data/RCM6/
# Should contain: encoded/, prompts/, guides/, questions/
4. Tests failing with "system prompt not found"
# This is expected - comprehensive tests are skipped when legacy data is incomplete
# Core functionality tests should pass:
poetry run pytest tests/test_models.py tests/test_runner.py -v
- Issues: GitHub Issues
- Discussions: GitHub Discussions
- Documentation:
docs/
directory - Examples:
examples/
directory
This project is licensed under the MIT License - see the LICENSE file for details.
- Royal Conservatory of Music for official exam materials
- OpenAI, Anthropic, and Google for LLM API access
- Python Poetry for excellent dependency management
- The open-source community for inspiring this project's architecture
π΅ Happy prompting! Build something amazing with music theory and AI.
Overall: 47/56 tests passing (84% success rate)
- β No API Costs: All tests use mock responses
- β Prompt Correctness: Validates proper prompt compilation
- β Data Loading: Tests file discovery and loading
- β Error Handling: Verifies graceful failure handling
- β CLI Interface: Tests command-line tools without API calls
- β Parameter Passing: Ensures settings are correctly transmitted
- β Multi-Format Support: Tests all music encoding formats
Tests automatically run on GitHub Actions for:
- β Push to main branch
- β Pull request creation
- β Multiple Python versions (3.11, 3.12, 3.13)
For detailed information, see our comprehensive documentation:
- π User Guide - Complete usage instructions and examples
- ποΈ Architecture - System design and components
- π API Reference - Detailed API documentation
- βοΈ Development Guide - Setup and contribution guidelines
# Clone and setup
git clone https://github.com/liampond/LLM-MusicTheory.git
cd LLM-MusicTheory
poetry install
# Run tests (no API calls made)
poetry run pytest
# Try an example
poetry run python -m llm_music_theory.cli.run_single --question Q1b --datatype mei --model ChatGPT
# Run all tests - comprehensive coverage, no API calls
poetry run pytest
# Run specific test categories (Make targets)
make test-models # Model implementations
make test-runner # Core functionality
make test-integration # CLI workflows
Results: 47/56 tests passing with comprehensive coverage of core functionality.
LLM-MusicTheory/
βββ src/llm_music_theory/ # Main package code
β βββ cli/ # Command-line interfaces
β βββ config/ # Configuration and settings
β βββ core/ # Core logic (dispatcher, runner)
β βββ models/ # LLM model wrappers
β βββ prompts/ # Prompt building utilities
β βββ utils/ # Utility functions
βββ data/RCM6/ # Legacy evaluation data (formerly LLM-RCM)
β βββ encoded/ # Music files (MEI, MusicXML, etc.)
β βββ prompts/ # Prompt templates
β βββ guides/ # Context guides
βββ tests/ # Comprehensive test suite
βββ docs/ # All documentation
βββ user-guide.md # Usage instructions
βββ architecture.md # System design
βββ api-reference.md # API documentation
βββ development.md # Development setup
βββ examples.md # Usage examples
βββ scripts.md # Automation scripts
For detailed development setup, architecture details, and contribution guidelines, see Development Guide.
We welcome contributions! For detailed guidelines, see our Development Guide.
- Fork the repository
- Create a feature branch
- Write/update tests
- Ensure tests pass (
poetry run pytest
) - Update documentation if needed
- Submit a pull request
This project is licensed under the MIT License - see the LICENSE file for details.
- Royal Conservatory of Music (RCM) for exam question data
- OpenAI, Anthropic, Google for LLM APIs
- Music encoding communities for MEI, MusicXML, ABC, and Humdrum formats
- Issues: GitHub Issues
- Discussions: GitHub Discussions
- Email: [email protected]
Happy music theory prompting! π΅π€