My second project in Processing. Here I wanted to make something that could detect notes by scanning an audio input signal.
I was first inspired by the work Eric Decker did on the subject. In his Guitar Genetics project, he made use of a pitch-class-profiling method to detect various notes in the spectrum. For my project, I wanted to do the same thing, but in a more linear fashion by using math. This meant detecting multiple notes would be much harder, but hopefully individual notes would be more accurate and allow access to extra data like pitch bend information.
I started by attempting to write my own DFT. This was actually not too difficult, as there are lots of examples out there and I do have a background in rudimentary calculus. My problem was that I did not have the programming know-how to optimize it for speed. While it worked, it did so very, very slowly.
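For anyone curious what a straightforward DFT looks like, here is a minimal sketch in plain Java. It is not my original code; the class and method names are made up for illustration. The two nested loops are the reason it is slow: the work grows with the square of the number of samples.

```java
/** Naive O(N^2) discrete Fourier transform of a real signal (illustrative sketch). */
final class NaiveDft {

    /** Returns the magnitude of bins 0..N/2 for a real-valued input of length N. */
    static double[] magnitudes(double[] samples) {
        int n = samples.length;
        double[] mags = new double[n / 2 + 1];
        for (int k = 0; k < mags.length; k++) {
            double re = 0.0;
            double im = 0.0;
            // Correlate the signal with a cosine and a sine at bin frequency k.
            for (int t = 0; t < n; t++) {
                double angle = 2.0 * Math.PI * k * t / n;
                re += samples[t] * Math.cos(angle);
                im -= samples[t] * Math.sin(angle);
            }
            mags[k] = Math.sqrt(re * re + im * im);
        }
        return mags;
    }
}
```

An FFT produces the same bins with far fewer operations, which is why switching to a library version makes such a difference.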
To solve this I decided to use the built-in FFT in Minim, a sound library for Processing. It turned out to be very easy to detect pitch using the FFT by looking for the bin with the most energy. After a quick search on Wikipedia and some rummaging through my pipe organ books, I was able to calculate the note from the frequency.
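The peak-bin approach can be sketched roughly as follows. This is plain Java working on an array of bin magnitudes, not the actual Minim calls, and it assumes equal temperament with A4 = 440 Hz, which is my choice for the example. All names here are hypothetical. The leftover distance to the nearest note, in cents, is the kind of extra data that could feed pitch bend.

```java
/** Peak bin -> frequency -> note name (sketch; assumes equal temperament, A4 = 440 Hz). */
final class PitchFromSpectrum {
    static final String[] NAMES =
        {"C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"};

    /** Index of the strongest bin, skipping bin 0 (DC). Needs at least two bins. */
    static int peakBin(double[] magnitudes) {
        int best = 1;
        for (int i = 2; i < magnitudes.length; i++) {
            if (magnitudes[i] > magnitudes[best]) best = i;
        }
        return best;
    }

    /** Center frequency of a bin for the given sample rate and FFT size. */
    static double binToFrequency(int bin, double sampleRate, int fftSize) {
        return bin * sampleRate / fftSize;
    }

    /** Fractional MIDI note number: 69 is A4, one unit per semitone. */
    static double fractionalMidi(double freq) {
        return 69.0 + 12.0 * Math.log(freq / 440.0) / Math.log(2.0);
    }

    /** Nearest note name with octave, e.g. "A4". */
    static String noteName(double freq) {
        int midi = (int) Math.round(fractionalMidi(freq));
        int octave = Math.floorDiv(midi, 12) - 1;
        return NAMES[Math.floorMod(midi, 12)] + octave;
    }

    /** Signed distance in cents from the nearest note. */
    static double centsOff(double freq) {
        double m = fractionalMidi(freq);
        return 100.0 * (m - Math.round(m));
    }
}
```

Keep in mind that the resolution is limited by the bin width (sample rate divided by FFT size), so low notes that sit less than a bin apart cannot be told apart this way.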
Once that was working, I added some extra features from a project by Corban Brook, namely a trigger threshold and a noise subtraction array to remove background sounds. This noise reduction is something I’d like to improve on in the future, as I know there are more complex and effective algorithms out there.
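One simple way to read those two features is sketched below. This is my own illustration, not Corban Brook's code: it assumes the noise array holds a per-bin magnitude captured while nothing is playing, and that the trigger is just a check for any bin rising above a fixed level. The names are hypothetical.

```java
/** Per-bin noise subtraction plus a trigger threshold (illustrative sketch). */
final class NoiseGate {

    /** Subtracts the noise estimate from each bin, clamping the result at zero. */
    static double[] subtractNoise(double[] spectrum, double[] noiseFloor) {
        double[] cleaned = new double[spectrum.length];
        for (int i = 0; i < spectrum.length; i++) {
            cleaned[i] = Math.max(0.0, spectrum[i] - noiseFloor[i]);
        }
        return cleaned;
    }

    /** True if any bin of the cleaned spectrum exceeds the threshold. */
    static boolean isTriggered(double[] cleaned, double threshold) {
        for (double v : cleaned) {
            if (v > threshold) return true;
        }
        return false;
    }
}
```

Pitch detection would then only run on frames where `isTriggered` returns true, so silence and background hum do not produce random notes.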
I had always known that a sound is composed of multiple frequencies, and that with musical tones those are called harmonics. I was working on a way to have Processing search the FFT for patterns in the harmonics to help enhance the accuracy and stability of the pitch detection. Because these spikes are supposed to be at known intervals, I figured I could use an equation to find and extract them.
It turns out this is even easier than I thought. I found a paper by Gareth Middleton on the Harmonic Product Spectrum (HPS). This appeared to be only a theoretical paper, but I was able to turn the idea into a working implementation, something I am very proud of.
The HPS system basically makes a copy of the spectrum and shrinks it by half, then a third, then a fourth. Next, it takes all those copies and multiplies them together with the original. This suppresses nearly everything except the fundamental, because the fundamental is the only place where the harmonic peaks of every copy line up. It’s not perfect, but when you have a noisy spectrum and lots of harmonics (e.g. distorted guitar) it improves the stability of the note detection by a huge amount. The downside is that it can sometimes be less accurate, a problem I am still working on solving.
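The shrink-and-multiply step can be sketched like this. It is a bare-bones Java illustration with made-up names, not my actual sketch. It shrinks by simply picking every 2nd, 3rd and 4th bin, with no filtering.

```java
/** Harmonic Product Spectrum by plain decimation (illustrative sketch). */
final class HarmonicProductSpectrum {

    /**
     * Multiplies the spectrum by copies of itself shrunk by 2, 3, ... up to
     * `harmonics`. Bin i of a copy shrunk by h is bin i * h of the original.
     */
    static double[] compute(double[] spectrum, int harmonics) {
        int outLen = spectrum.length / harmonics; // keeps i * h inside the array
        double[] hps = new double[outLen];
        for (int i = 0; i < outLen; i++) {
            double product = spectrum[i];
            for (int h = 2; h <= harmonics; h++) {
                product *= spectrum[i * h];
            }
            hps[i] = product;
        }
        return hps;
    }

    /** Index of the largest value in the array. */
    static int argMax(double[] values) {
        int best = 0;
        for (int i = 1; i < values.length; i++) {
            if (values[i] > values[best]) best = i;
        }
        return best;
    }
}
```

With `harmonics` set to 4 this matches the half, third and fourth copies described above. If the second harmonic of a note is louder than its fundamental, the raw peak lands an octave too high, while the HPS peak should still land on the fundamental bin.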
Another thing I’d like to address is the re-sampling of the spectrum in the HPS system. At first, I just threw out the bins that were not being used. Now I have a filtering system similar to an anti-aliasing filter. Currently it’s a simple linear average, but I hope to develop it into a more refined sampling filter. Lastly, I was working on a way to sample the top N peaks of the HPS spectrum in hopes of finding a way to detect multiple notes. This does not seem like it will work, as there are too many harmonics from a single note that could cause a false reading. Chances are that pitch-class profiling will yield far better results, but I’m curious if there is another way.
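To show the difference between throwing bins away and averaging them, here is one possible version of an averaging downsampler. The window shape is an assumption made for this example (a small window centered on each kept bin), not necessarily what my sketch does, and the names are hypothetical.

```java
/** Shrinks a spectrum by an integer factor, averaging neighbours (illustrative sketch). */
final class AveragingDownsampler {

    /**
     * Output bin i is the mean of the input bins within factor / 2 of bin
     * i * factor, clamped to the array bounds. The factor must be at least 1.
     */
    static double[] downsample(double[] spectrum, int factor) {
        int outLen = spectrum.length / factor;
        int half = factor / 2;
        double[] out = new double[outLen];
        for (int i = 0; i < outLen; i++) {
            int center = i * factor;
            int from = Math.max(0, center - half);
            int to = Math.min(spectrum.length - 1, center + half);
            double sum = 0.0;
            for (int j = from; j <= to; j++) {
                sum += spectrum[j];
            }
            out[i] = sum / (to - from + 1);
        }
        return out;
    }
}
```

The point of the averaging is that a harmonic whose peak falls one bin to the side of a kept bin still contributes to the result, whereas plain decimation would skip it entirely. The trade-off is that peaks get smeared a little.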
Special thanks to Corban Brook and NoahBuddy from the Processing forums for posting their research/work and sharing their findings.
Extra special thanks to Eric Decker for sharing his work on the same topic.
Sources and papers consulted during this project:
http://en.wikipedia.org/wiki/Pitch_(music)#Labeling_pitches
http://en.wikipedia.org/wiki/Fast_Fourier_transform
http://www.dsprelated.com/showmessage/69952/1.php
Key-finding with interval profiles
Automatic Chord Recognition from Audio Using Enhanced Pitch Class Profile
Impact of Distance in Pitch Class Profile Computation



