That Tune, Named
How does the music-identifying app Shazam work its magic?
Shazam is the closest a cell phone can come to magic. Say you're in a restaurant, a song comes on, and you can't quite place the tune. In the past, your options were limited; you could try asking your spouse or the waiter for a clue, but that approach risked revealing your ignorance. (That's "Sex Machine," dumb ass.) Shazam, which launched in the United Kingdom in 2002 as a call-in service and became widely known in the United States last year when it hit the iPhone, solves the dilemma in a few clicks. Press a button on your phone, and in seconds you'll get the artist and song title. Other than playing video games, it's the most useful thing you can do on your phone.
Last week, Shazam announced that more than 50 million people worldwide have used the service—up from 35 million at the start of the year. The company also said that it's received an undisclosed investment from the fabled Silicon Valley venture-capital firm KPCB. Shazam's success seems justified—it's the one app you can show to iPhone skeptics to get them to reconsider their position (though Shazam is also available on Android, BlackBerry, Windows Mobile, and pretty much any other phone). Yet for all the acclaim it garners, Shazam's inner workings are pretty mysterious. How does it actually ID your song? How does the company make money? (Here's one hint: iPhone users should expect to see a pay version soon.) And what are the long-term prospects for a firm whose sole purpose is satisfying an acute, very occasional need?
First, a short explanation of how Shazam works. The company has a library of more than 8 million songs, and it has devised a technique to break down each track into a simple numeric signature—a code that is unique to each track. "The main thing here is creating a 'fingerprint' of each performance," says Andrew Fisher, Shazam's CEO. When you hold your phone up to a song you'd like to ID, Shazam turns your clip into a signature using the same method. Then it's just a matter of pattern-matching—Shazam searches its library for the code it created from your clip; when it finds that bit, it knows it's found your song.
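For the curious, the lookup step described above can be sketched in a few lines of Python. This is a toy illustration, not Shazam's actual code: the song titles, the signature numbers, and the function names `build_index` and `identify` are all made up, and the vote-counting rule (pick the song that shares the most codes with the clip) is an assumption about how a partial clip might be matched.

```python
from collections import Counter, defaultdict


def build_index(library):
    """Map each signature code to the set of song titles that contain it."""
    index = defaultdict(set)
    for title, signature in library.items():
        for code in signature:
            index[code].add(title)
    return index


def identify(index, clip_signature):
    """Return the title sharing the most codes with the clip, or None."""
    votes = Counter()
    for code in clip_signature:
        for title in index.get(code, ()):
            votes[title] += 1
    if not votes:
        return None
    return votes.most_common(1)[0][0]


# Hypothetical signatures; real fingerprints would be far longer.
library = {
    "Song A": [101, 205, 309, 412, 518],
    "Song B": [101, 222, 333, 444, 555],
}
index = build_index(library)
print(identify(index, [309, 412, 518]))  # prints "Song A"
```

Building the index once means each incoming clip costs only a handful of dictionary lookups, rather than a comparison against every song in the library.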
OK, but how does Shazam make these fingerprints? As Avery Wang, Shazam's chief scientist and one of its co-founders, explained to Scientific American in 2003, the company's approach was long considered computationally impractical: there was thought to be too much information in a song to compile a simple signature. But as he wrestled with the problem, Wang had a brilliant idea: What if he ignored nearly everything in a song and focused instead on just a few relatively "intense" moments? Thus Shazam creates a spectrogram for each song in its database, a graph that plots three dimensions of music: frequency, amplitude, and time. The algorithm then picks out just those points that represent the peaks of the graph, notes that contain "higher energy content" than all the other notes around them, as Wang explained in an academic paper he published to describe how Shazam works (PDF). In practice, this seems to work out to about three data points per second per song.
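To make the peak-picking idea concrete, here is a rough Python sketch under several stated assumptions. The function names `spectrogram` and `find_peaks`, the neighborhood `radius`, and the loudness `floor` are invented for illustration. A slow textbook Fourier transform stands in for whatever Shazam really uses, and the "louder than every neighbor" rule is one plausible reading of "higher energy content," not the company's published method.

```python
import cmath
import math


def spectrogram(samples, frame_size):
    """Split samples into frames and return DFT magnitudes per frame.

    Rows are time frames, columns are frequency bins. Uses a naive
    O(n^2) DFT, which is fine for a toy example but slow for real audio.
    """
    frames = []
    for start in range(0, len(samples) - frame_size + 1, frame_size):
        frame = samples[start:start + frame_size]
        mags = []
        for k in range(frame_size // 2):
            total = sum(
                frame[n] * cmath.exp(-2j * math.pi * k * n / frame_size)
                for n in range(frame_size)
            )
            mags.append(abs(total))
        frames.append(mags)
    return frames


def find_peaks(spec, radius=1, floor=1e-6):
    """Return (time, freq_bin) points louder than every neighbor in range."""
    peaks = []
    for t, row in enumerate(spec):
        for f, amp in enumerate(row):
            if amp <= floor:
                continue  # assumed cutoff: ignore near-silent points
            neighbors = [
                spec[tt][ff]
                for tt in range(max(0, t - radius), min(len(spec), t + radius + 1))
                for ff in range(max(0, f - radius), min(len(row), f + radius + 1))
                if (tt, ff) != (t, f)
            ]
            if all(amp > other for other in neighbors):
                peaks.append((t, f))
    return peaks


# A hand-built grid standing in for a real spectrogram (hypothetical values).
grid = [
    [0.1, 0.2, 0.1, 0.0],
    [0.2, 0.9, 0.3, 0.1],
    [0.1, 0.3, 0.2, 0.7],
]
print(find_peaks(grid))  # prints [(1, 1), (2, 3)]
```

Only two of the twelve points in the grid survive, which is the point of the trick: most of the song is thrown away, and the signature is built from the handful of moments that stand out.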
Farhad Manjoo is Slate's technology columnist and the author of True Enough: Learning To Live in a Post-Fact Society. You can email him at firstname.lastname@example.org and follow him on Twitter.