Aurio is a .NET library for audio processing, analysis, media synchronization, and media retrieval. It implements several audio fingerprinting methods.
Features
- 32-bit floating point audio processing engine
- File I/O through NAudio and FFmpeg
- Audio playback through NAudio
- FFT through PFFFT, FFTW (optional) and Exocortex.DSP (optional)
- Resampling through Soxr and SecretRabbitCode/libsamplerate (optional)
- STFT
- Chroma
- Dynamic Time Warping
- On-line Time Warping (Dixon, Simon. "Live tracking of musical performances using on-line time warping." Proceedings of the 8th International Conference on Digital Audio Effects. 2005.)
- Fingerprinting
  - Haitsma, Jaap, and Ton Kalker. "A highly robust audio fingerprinting system." ISMIR. 2002.
  - Wang, Avery. "An Industrial Strength Audio Search Algorithm." ISMIR. 2003.
  - Echoprint (Ellis, Daniel PW, Brian Whitman, and Alastair Porter. "Echoprint: An open music identification service." ISMIR. 2011.)
  - AcoustID Chromaprint
All audio processing (incl. fingerprinting) is stream-based and supports processing of arbitrarily long streams at constant memory usage. All fingerprinting methods are implemented from scratch rather than ported from existing libraries, while keeping compatibility where possible.
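Constant memory usage follows from processing audio block by block instead of loading whole files. The following is a minimal sketch of that idea using only the .NET base class library, not Aurio's own stream classes. The class `BlockwisePeak`, its method, and the raw 32-bit float input format are assumptions made for illustration.

```csharp
using System;
using System.IO;

static class BlockwisePeak
{
    // Returns the peak absolute sample value of a stream of raw 32-bit float samples.
    // Memory usage is bounded by the block size, regardless of the stream length.
    public static float Measure(Stream source, int blockSizeInSamples = 4096)
    {
        var buffer = new byte[blockSizeInSamples * sizeof(float)];
        float peak = 0f;

        while (true)
        {
            // Fill the buffer as far as possible; Read may return fewer bytes than requested
            int filled = 0;
            while (filled < buffer.Length)
            {
                int read = source.Read(buffer, filled, buffer.Length - filled);
                if (read == 0) break; // end of stream
                filled += read;
            }

            // Ignore a trailing incomplete sample, if any
            int completeBytes = filled - filled % sizeof(float);
            for (int i = 0; i < completeBytes; i += sizeof(float))
            {
                peak = Math.Max(peak, Math.Abs(BitConverter.ToSingle(buffer, i)));
            }

            if (filled < buffer.Length) break; // the last block has been processed
        }

        return peak;
    }
}
```

Calling `BlockwisePeak.Measure(stream)` on a multi-hour recording allocates the same single buffer as on a one-second clip; only the number of loop iterations grows.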
Aurio.WaveControls provides WPF widgets for user interfaces:
- Spectrogram / Chromagram View
- Spectrum / Graph View
- VU Meter
- Correlometer
- Time Scale
- Wave View
Requirements, Build Instructions, Documentation
For requirements and build instructions, visit the GitHub repository. There is no documentation available yet, but the library comes with a few sample applications and is also used by the open-source AudioAlign application, which can serve as a reference. If you're experiencing issues or have any questions, please get in touch.
Examples
Reading, Processing & Writing
/* Read a high definition MKV video file with FFmpeg,
* convert it to telephone sound quality,
* and write it to a WAV file with NAudio. */
var sourceStream = new FFmpegSourceStream(new FileInfo("high-definition-video.mkv"));
var ieee32BitStream = new IeeeStream(sourceStream);
var monoStream = new MonoStream(ieee32BitStream);
var resamplingStream = new ResamplingStream(monoStream, ResamplingQuality.Low, 8000);
var sinkStream = new NAudioSinkStream(resamplingStream);
WaveFileWriter.CreateWaveFile("telephone-audio.wav", sinkStream);
Short-time Fourier Transform
// Setup STFT with a window size of 100ms and an overlap of 50ms
var source = AudioStreamFactory.FromFileInfoIeee32(new FileInfo("somefilecontainingaudio.ext"));
var windowSize = source.Properties.SampleRate/10;
var hopSize = windowSize/2;
var stft = new STFT(source, windowSize, hopSize, WindowType.Hann);
var spectrum = new float[windowSize/2];
// Read all frames and get their spectrum
while (stft.HasNext()) {
    stft.ReadFrame(spectrum);
    // do something with the spectrum (e.g. build spectrogram)
}
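To make the window and hop arithmetic above concrete, here is a small standalone sketch that does not use Aurio. The helper names are hypothetical, and the frame count assumes that no padding is applied at the end of the signal; Aurio's `STFT` may treat a trailing partial frame differently.

```csharp
using System;

static class StftMath
{
    // Number of complete frames that fit into the signal (no padding at the end)
    public static long FrameCount(long totalSamples, int windowSize, int hopSize)
    {
        if (totalSamples < windowSize) return 0;
        return 1 + (totalSamples - windowSize) / hopSize;
    }

    // Frequency in Hz that corresponds to a given spectrum bin
    public static double BinFrequency(int bin, int sampleRate, int windowSize)
    {
        return (double)bin * sampleRate / windowSize;
    }
}
```

For an assumed sample rate of 44100 Hz, the 100 ms window is 4410 samples and the hop is 2205 samples. Ten seconds of audio (441000 samples) then yield `StftMath.FrameCount(441000, 4410, 2205)` = 199 frames, and the bins are spaced 10 Hz apart, so `StftMath.BinFrequency(100, 44100, 4410)` is 1000 Hz.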
Generate fingerprints
// Setup the source (AudioTrack is Aurio's internal representation of an audio file)
var audioTrack = new AudioTrack(new FileInfo("somefilecontainingaudio.ext"));
// Setup the fingerprint generator (each fingerprinting algorithm has its own namespace but works the same way)
var defaultProfile = FingerprintGenerator.GetProfiles()[0]; // the first one is always the default profile
var generator = new FingerprintGenerator(defaultProfile);
// Setup the generator event listener
generator.SubFingerprintsGenerated += (sender, e) => {
    // Print the hashes
    e.SubFingerprints.ForEach(sfp => Console.WriteLine("{0,10}: {1}", sfp.Index, sfp.Hash));
};
// Generate fingerprints for the whole track
generator.Generate(audioTrack);
Fingerprinting & Matching
// Setup the sources
var audioTrack1 = new AudioTrack(new FileInfo("somefilecontainingaudio1.ext"));
var audioTrack2 = new AudioTrack(new FileInfo("somefilecontainingaudio2.ext"));
// Setup the fingerprint generator
var defaultProfile = FingerprintGenerator.GetProfiles()[0];
var generator = new FingerprintGenerator(defaultProfile);
// Create a fingerprint store
var store = new FingerprintStore(defaultProfile);
// Setup the generator event listener (a subfingerprint is a hash with its temporal index)
generator.SubFingerprintsGenerated += (sender, e) => {
    var progress = (double)e.Index / e.Indices;
    var hashes = e.SubFingerprints.Select(sfp => sfp.Hash);
    store.Add(e);
};
// Generate fingerprints for both tracks
generator.Generate(audioTrack1);
generator.Generate(audioTrack2);
// Check if tracks match
if (store.FindAllMatches().Count > 0) {
    Console.WriteLine("overlap detected!");
}
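As background on what a match can mean for hash-based fingerprints, the sketch below compares two sequences of 32-bit hashes by their bit error rate, the similarity measure described in the Haitsma and Kalker paper. It is a standalone illustration and not how Aurio's `FingerprintStore` is implemented; the class and method names are hypothetical.

```csharp
using System;

static class HashComparison
{
    // Counts the bits that differ between two 32-bit hashes
    public static int HammingDistance(uint a, uint b)
    {
        uint x = a ^ b;
        int count = 0;
        while (x != 0)
        {
            x &= x - 1; // clear the lowest set bit
            count++;
        }
        return count;
    }

    // Fraction of differing bits across two equally long hash sequences (0 = identical)
    public static double BitErrorRate(uint[] first, uint[] second)
    {
        if (first.Length != second.Length || first.Length == 0)
            throw new ArgumentException("Sequences must be non-empty and of equal length.");

        long differing = 0;
        for (int i = 0; i < first.Length; i++)
        {
            differing += HammingDistance(first[i], second[i]);
        }
        return (double)differing / (first.Length * 32L);
    }
}
```

Two sequences would be treated as matching when the bit error rate falls below a chosen threshold; the paper uses 0.35 over blocks of 256 sub-fingerprints. Other methods, such as Wang's, rely on exact hash matches with consistent time offsets instead.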
Multitrack audio playback
var drumTrack = new AudioTrack(new FileInfo("drums.wav"));
var guitarTrack = new AudioTrack(new FileInfo("guitar.wav"));
var vocalTrack = new AudioTrack(new FileInfo("vocals.wav"));
var band = new TrackList<AudioTrack>(new[] {drumTrack, guitarTrack, vocalTrack});
new MultitrackPlayer(band).Play();
AudioAlign
AudioAlign is an audio synchronization and analysis tool written for research purposes to automatically synchronize audio and video recordings that have either been recorded in parallel at the same event or contain the same aural information. It is basically a GUI for the Aurio library with a little bit of glue code in between.
Publications
Aurio has been accepted to the ACM Multimedia 2015 Open-Source Software Competition. The paper describing the library in more detail is available here.
License
This project is released under the terms of the GNU Affero General Public License. The library can be built to be free of any copyleft requirements; get in touch if the AGPL does not suit your needs.