PSAT (INSA Lyon) - CS Department - IF 5th year




rrtir - Rusty Real Time Instruments Recognition

rrtir, pronounced [ʁi.tœʁ], is a school project at INSA Lyon. It is a PSAT, which stands for Projet Scientifique Artistique et Technique (Scientific, Artistic and Technical Project).

The goal is to produce a model capable of recognizing musical instruments in a real-time audio stream, giving backstage artists an efficient way to program their scenography. In practice, this is mostly an excuse to try AI, and more specifically deep learning applied to audio, in a constrained context, using cutting-edge frameworks and languages such as Burn and Rust.

Run rrtir

This program, written in Rust, provides 4 binaries:

train, which trains the model
demo, which runs inference on a wav file or a microphone input
test, which produces metrics for the test dataset (preprocessed with the same pipeline as train)
cm (src/bin/confusion_matrix.rs), which generates confusion matrix images from the training artifact directory (run it with cargo run --bin cm --features cmf --release)
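
To make the idea behind the cm binary concrete, here is a minimal sketch of how a confusion matrix can be computed from class indices. It is not taken from src/bin/confusion_matrix.rs: the function names `confusion_matrix` and `accuracy`, and the encoding of instruments as `usize` class indices, are assumptions made for illustration only.

```rust
/// Builds a confusion matrix: rows are true classes, columns are predicted classes.
/// Hypothetical helper, not rrtir's actual implementation.
/// Panics if a class index is >= n_classes.
fn confusion_matrix(truth: &[usize], predicted: &[usize], n_classes: usize) -> Vec<Vec<u32>> {
    assert_eq!(truth.len(), predicted.len(), "one prediction per label");
    let mut matrix = vec![vec![0u32; n_classes]; n_classes];
    for (&t, &p) in truth.iter().zip(predicted) {
        matrix[t][p] += 1;
    }
    matrix
}

/// Overall accuracy: sum of the diagonal divided by the total number of samples.
fn accuracy(matrix: &[Vec<u32>]) -> f64 {
    let correct: u32 = matrix.iter().enumerate().map(|(i, row)| row[i]).sum();
    let total: u32 = matrix.iter().flatten().sum();
    if total == 0 {
        0.0
    } else {
        f64::from(correct) / f64::from(total)
    }
}
```

With three classes, true labels `[0, 0, 1, 2, 2, 2]` and predictions `[0, 1, 1, 2, 2, 0]`, the off-diagonal cells show which instruments are mistaken for which, something a single accuracy figure hides.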

Download models

All relevant models are provided in this git repository, in the Model Registry. These are based on the SLAKH2100 dataset, see slakh.com.

You'll find many models, but you may want to focus on versions >=2.0.0, which can be used right away with rrtir versions >=2.0.0. Simply download the archive, extract it, rename the extracted directory to artifacts at the root of the project, and then use demo mode.

Warnings

The preprocessing pipeline uses Rayon for parallelism. This is necessary to keep training time reasonable, but it means that training is not deterministic at the time of writing. Although we did not notice large differences in performance, your results may differ from the ones we provide if you train the models yourself.
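
One common reason parallel preprocessing is not reproducible is that results arrive in whatever order the worker threads finish. We do not claim this is the exact cause in rrtir. The sketch below uses only standard library threads rather than Rayon, and `preprocess` and `preprocess_ordered` are hypothetical names. It shows one way to restore a stable order: tag each result with its input index and sort before use.

```rust
use std::sync::mpsc;
use std::thread;

/// Hypothetical stand-in for a preprocessing step (not rrtir's actual code).
fn preprocess(sample: u32) -> u32 {
    sample * sample
}

/// Processes samples on `workers` threads and returns results in input order.
fn preprocess_ordered(samples: &[u32], workers: usize) -> Vec<u32> {
    let workers = workers.max(1);
    let (tx, rx) = mpsc::channel();
    thread::scope(|scope| {
        for w in 0..workers {
            let tx = tx.clone();
            scope.spawn(move || {
                // Worker w handles indices w, w + workers, w + 2 * workers, ...
                for (idx, &s) in samples.iter().enumerate().skip(w).step_by(workers) {
                    tx.send((idx, preprocess(s))).expect("receiver is alive");
                }
            });
        }
    });
    // All workers have finished; drop the last sender so the receiver terminates.
    drop(tx);
    // Arrival order depends on thread scheduling, so it can vary between runs.
    let mut tagged: Vec<(usize, u32)> = rx.iter().collect();
    // Sorting by input index makes the output order independent of scheduling.
    tagged.sort_by_key(|&(idx, _)| idx);
    tagged.into_iter().map(|(_, value)| value).collect()
}
```

Calling `preprocess_ordered(&[1, 2, 3], 2)` returns `[1, 4, 9]` regardless of which thread finishes first. Rayon's indexed parallel iterators also preserve order when collecting, so a fix in the real pipeline could look different from this sketch.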

Team

We completed this project in a very short amount of time (October 2024 - January 2025) as a team of only two people:

Hugues KADI @attssystem (preprocessing pipeline, model code and training, report co-author)
Mathis DEVIDAL @mdevidal (metrics code, test, report co-author)

This project is licensed under the MIT License - see the LICENSE file for details.