- **Basic Pitch Prediction as Labels:**
  - Generate predictions using the Basic Pitch Temporal Convolutional Network (TCN) model.
  - The output includes MIDI files, NPZ model outputs, and CSV files containing note event labels.
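  A minimal sketch of this step, assuming the `basic-pitch` package's `predict` API; the file names and output paths are illustrative, not the repo's exact ones:

  ```python
  # Hypothetical example: run Basic Pitch on one file and save the three label
  # artifacts (MIDI, NPZ model outputs, CSV note events). Paths are illustrative.
  import csv
  import numpy as np
  from basic_pitch.inference import predict

  model_output, midi_data, note_events = predict("RawData/audio_mono-mic/example.wav")

  midi_data.write("BasicPitchTCN_prediction/midi/example.mid")          # MIDI file
  np.savez("BasicPitchTCN_prediction/npz/example.npz", **model_output)  # NPZ model outputs

  # note_events are (start_time, end_time, pitch, amplitude, pitch_bends) tuples
  with open("BasicPitchTCN_prediction/csv/example.csv", "w", newline="") as f:
      writer = csv.writer(f)
      writer.writerow(["start_time_s", "end_time_s", "pitch_midi", "amplitude"])
      for start, end, pitch, amplitude, _ in note_events:
          writer.writerow([start, end, pitch, amplitude])
  ```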
- **Label and Mel Spectrogram Processing:**
  - Extract framed Mel spectrograms from the raw audio files using note event timestamps from the Basic Pitch predictions.
  - Create structured datasets for CNN training, aligning each spectrogram with its corresponding note event labels.
  - Each Mel spectrogram is currently 2 seconds long; this may change in the future.
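  A minimal sketch of the framing step, assuming `librosa` for feature extraction; the window placement and parameter values are assumptions, not the repo's exact settings:

  ```python
  # Illustrative: cut a 2 s window starting at a note event and convert it to a
  # log-Mel spectrogram suitable as CNN input. All parameters are assumptions.
  import librosa
  import numpy as np

  def framed_mel(audio_path, note_start_s, duration_s=2.0, sr=22050, n_mels=128):
      y, _ = librosa.load(audio_path, sr=sr, offset=note_start_s, duration=duration_s)
      y = np.pad(y, (0, max(0, int(duration_s * sr) - len(y))))  # pad short tails
      mel = librosa.feature.melspectrogram(y=y, sr=sr, n_mels=n_mels)
      return librosa.power_to_db(mel, ref=np.max)  # shape: (n_mels, n_frames)
  ```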
- **Simple CNN Training:**
  - Train a simple CNN model to predict music transcription from the framed Mel spectrograms.
  - The model predicts note events, including `start time`, `end time`, `pitch`, `velocity`, and `confidence`.
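  A hypothetical PyTorch sketch of such a model; the actual `SimpleCNN.py` may differ in architecture and head design:

  ```python
  # Hypothetical sketch: a small CNN mapping one framed Mel spectrogram to the
  # five note-event values. Layer sizes and pooling are illustrative choices.
  import torch
  import torch.nn as nn

  class SimpleCNN(nn.Module):
      def __init__(self, n_outputs=5):  # start time, end time, pitch, velocity, confidence
          super().__init__()
          self.features = nn.Sequential(
              nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
              nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
              nn.AdaptiveAvgPool2d((4, 4)),  # fixed-size features for any input length
          )
          self.head = nn.Linear(32 * 4 * 4, n_outputs)

      def forward(self, x):  # x: (batch, 1, n_mels, n_frames)
          return self.head(self.features(x).flatten(1))
  ```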
```
GoogleMT3/
│
├── Models/
│   └── SimpleCNN.py                     # CNN model training and prediction script
│
├── Preprocessor/
│   └── BASIC_Pitch_spectrogram_gen.py   # Basic Pitch prediction and spectrogram extraction
│
├── BasicPitchTCN_prediction/
│   ├── csv/                             # Basic Pitch-generated CSV files
│   ├── midi/                            # Basic Pitch-generated MIDI files
│   ├── npz/                             # Basic Pitch model outputs (NPZ format)
│   └── spectrograms/                    # Extracted framed Mel spectrograms
│
├── .gitignore                           # Git ignore file
├── README.md                            # Project readme file
└── requirements.txt                     # Python dependencies
```
- **Prepare the dataset**
  - Create the directory `GoogleMT3/RawData/audio_mono-mic` and put the audio recordings into that folder as `.wav` files.
  - Run `USE_BASIC_PITCH_GEN_DATASET.py`; this script creates the datasets used as model input.
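  A small sanity-check sketch for this step; the directory handling here is illustrative, and the actual dataset generation happens inside `USE_BASIC_PITCH_GEN_DATASET.py`:

  ```python
  # Illustrative: make sure the expected input folder and .wav files are in
  # place before running USE_BASIC_PITCH_GEN_DATASET.py.
  from pathlib import Path

  audio_dir = Path("GoogleMT3/RawData/audio_mono-mic")
  audio_dir.mkdir(parents=True, exist_ok=True)  # create the folder if missing
  wavs = sorted(audio_dir.glob("*.wav"))        # the inputs the script will read
  print(f"Found {len(wavs)} .wav file(s) in {audio_dir}")
  ```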
- **Train the model**
  - Once your datasets are prepared, run `SimpleCNN.py` to train the CNN model, or `VisionTransformer.py` to train the Vision Transformer model.
  - When training finishes, the `.pth` model files appear in the same folder.
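  An illustrative training loop ending in a `.pth` checkpoint; the loss, optimizer, and dummy data below are assumptions, not the scripts' exact settings:

  ```python
  # Illustrative: train on (spectrogram, note-event) pairs, then save a .pth file.
  import torch
  import torch.nn as nn
  from torch.utils.data import DataLoader, TensorDataset

  model = SimpleCNN()  # the sketch shown earlier
  optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
  criterion = nn.MSELoss()  # regression on the five note-event values

  # Dummy tensors standing in for the framed Mel spectrogram dataset.
  specs, labels = torch.randn(64, 1, 128, 87), torch.randn(64, 5)
  train_loader = DataLoader(TensorDataset(specs, labels), batch_size=16)

  for epoch in range(10):
      for x, y in train_loader:
          optimizer.zero_grad()
          loss = criterion(model(x), y)
          loss.backward()
          optimizer.step()

  torch.save(model.state_dict(), "simple_cnn.pth")  # the .pth file this step produces
  ```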
- **Use the model**
  - Load a trained model from its `.pth` file.
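  A minimal sketch, assuming the checkpoint was saved with `torch.save(model.state_dict(), ...)` as above; the file name is illustrative:

  ```python
  # Illustrative: restore a trained model from its .pth file and run inference.
  import torch

  model = SimpleCNN()  # same architecture used during training
  model.load_state_dict(torch.load("simple_cnn.pth", map_location="cpu"))
  model.eval()  # switch to inference mode

  with torch.no_grad():
      spec = torch.randn(1, 1, 128, 87)  # one dummy framed Mel spectrogram
      start, end, pitch, velocity, confidence = model(spec).squeeze(0)
  ```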