rrtir - Rusty Real Time Instruments Recognition
rrtir, pronounced [ʁi.tœʁ], is a school project at INSA Lyon. It is a PSAT, which stands for Projet Scientifique Artistique et Technique (Scientific, Artistic and Technical Project).
The motivation is to produce a model capable of recognizing musical instruments in a real-time audio stream, to give backstage artists an efficient way to program their scenography. In practice, it is mostly an excuse to try AI, and more specifically deep learning on audio, in a constrained context, using cutting-edge frameworks and languages such as Burn and Rust.
Run rrtir
This program, written in Rust, provides 4 binaries:
- train, which trains the model
- demo, which can infer on a wav file or a microphone input
- test, which produces metrics for the test dataset (which went through the same preprocessing pipeline as train)
- cm (src/bin/confusion_matrix.rs), which generates confusion matrix images based on the training artifact directory (can be run with cargo run --bin cm --features cmf --release)
Download models
All relevant models are provided in this git repository, in the Model Registry. These are based on the SLAKH2100 dataset, see slakh.com.
You'll find many models there, but you may want to focus on versions >=2.0.0, which can be used right away with rrtir versions >=2.0.0. Download the archive, uncompress it, rename the result to artifacts at the root of the project, and then use the demo binary.
Warnings
The preprocessing pipeline uses Rayon to parallelize its work. This is required to keep training time reasonable, but it means that training is not deterministic at the time of writing. Although we did not notice large differences in performance results, you may see gaps between the models we provide and what you get by training yourself.
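As an illustration only, here is one common way parallel work can change numeric results. We have not confirmed that this is the actual source of non-determinism in rrtir. Floating-point addition is not associative, so splitting a sum into partial sums can give a different value than summing sequentially. The sketch below uses only the standard library, not Rayon, and the function names sum_sequential and sum_chunked are hypothetical.

```rust
/// Sums the values left to right, as a plain sequential loop would.
fn sum_sequential(values: &[f32]) -> f32 {
    values.iter().fold(0.0_f32, |acc, v| acc + v)
}

/// Sums the values chunk by chunk, then adds the partial sums together.
/// This mimics how a parallel reduction may split the work
/// (assumption: illustrative only, this is not rrtir's pipeline).
fn sum_chunked(values: &[f32], chunk_size: usize) -> f32 {
    values
        .chunks(chunk_size)
        .map(|chunk| chunk.iter().sum::<f32>())
        .sum::<f32>()
}

fn main() {
    // Large and small magnitudes mixed: the small terms can be
    // rounded away depending on the order of the additions.
    let values = [1.0e8_f32, 1.0, -1.0e8, 1.0];

    let sequential = sum_sequential(&values);
    let chunked = sum_chunked(&values, 2);

    println!("sequential = {sequential}, chunked = {chunked}");
}
```

With these inputs the two sums differ even though the data is identical. When a parallel runtime decides at run time how to split the work, the grouping, and therefore the rounding, can vary from one run to the next.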
Team
We completed this project in a very short amount of time (October 2024 - January 2025), as a team of only two people:
- Hugues KADI @attssystem (preprocessing pipeline, model code and training, report co-author)
- Mathis DEVIDAL @mdevidal (metrics code, test, report co-author)
This project is licensed under the MIT License - see the LICENSE file for details.