There are areas of human interest that seem to warrant and reward statistical analysis: the study of weather patterns, the financial markets. One can even make a case for baseball. But pop music? What benefit does it promise? A really killer mixtape? Yet, bringing the rigor of science to the study of the pop song is precisely what a California technology company called Savage Beast has done. The name of their project, The Music Genome Project, speaks to its outsized ambitions: The company touts it as "the most sophisticated taxonomy of musical information ever collected on this scale." Sounds great, but does it actually work?
Employing an army of rigorously trained music analysts, most with degrees in music theory, Savage Beast has dissected "the vast majority" of music that has appeared on the Billboard Music Charts since the mid-1950s, as well as large swaths of jazz and indie rock. Each song has been coded according to a proprietary list of 400 music attributes. Some, like "rhythm" and "tempo," are obvious to the lay listener; others, like "degree of chromatic harmony," are more complex, and, well, pretty much require a degree in music theory to explain. The point of all this fuss is to produce the ultimate music recommendation system, a system that's not based on the flimsy criteria that people normally use—popularity, genre, hipness, how the lead singer looks in tight jeans—but on precisely defined musical characteristics.
As novel (and quixotic) as all this sounds, it isn't even the first time a codification of music has been attempted. The Music Genome bears a striking resemblance to another, much older project begun by the famed musicologist Alan Lomax in the 1960s. Lomax, best known for recording and popularizing the likes of Leadbelly, Muddy Waters, Woody Guthrie, Mississippi Fred McDowell, and Jelly Roll Morton, dedicated the last 30 years of his career (he died in 2002) to an elaborate, lofty, and ultimately unfinished project called the Global Jukebox.
Like the Music Genome, the Global Jukebox is based on a music notation system. Lomax called his "cantometrics," a made-up word he defined as meaning "song as a measure of society." It consisted of 36 parameters that could be used to compare musical performance styles across cultures. And, just as Savage Beast later would, Lomax employed an army of rigorously trained research assistants to code and input thousands of songs into a central database. There are 4,400 songs in all, spanning 400 cultures, everything from Pygmy recordings to American pop tunes. This is only a portion of what Lomax intended. A series of strokes in the 1990s prevented him from getting the Jukebox past the prototype stage.
Despite their many similarities, the two projects have very different ambitions. The Music Genome is primarily a commercial venture, designed to take advantage of something called the Long Tail—an economic concept with new implications in the Internet age. It holds that in an environment of limitless selection and easy distribution—as created by businesses like iTunes and Rhapsody—there's money to be made by driving people beyond the blockbuster hits to the more obscure, deep catalog stuff. As the Savage Beast Web site points out: "In an industry where less than 3% of all releases currently account for over 80% of all revenue, Savage Beast is ideally positioned to unlock an enormous lost revenue potential." That's where the Music Genome comes in.
To date, the system has been used exclusively by in-store kiosks and online recommendation engines for clients like AOL, Tower Records, Best Buy, and Barnes & Noble. But in the next few months, Savage Beast plans to unveil a public interface that will enable listeners to, in CEO Tim Westergren's words, "have a full music genome conversation on the web": query it, input songs, and listen to music. To demonstrate how it works, Westergren volunteered to run a few songs through the system for me. I suggested several, representing a range of styles, artists, and eras; each song returned 10 results. What is most surprising about the Music Genome's recommendations is how Savage Beast has managed to reverse-engineer the obvious.
Although the Genome relies solely on musical attributes, it stumbles onto a surprising number of human connections. Jay-Z's "99 Problems," for instance, yielded a similar, rock-sampling song by his protégé Memphis Bleek called "Everything's a Go" that features a quick, three-bar cameo by—you guessed it—Jay-Z himself. Among the top results for Elvis Presley's "Hound Dog," another of my suggestions, was "Bottle to the Baby" by Charlie Feathers, the man who did the arrangements for Elvis' early Sun sessions and co-wrote his first No. 1 hit ("I Forgot To Remember To Forget").
It seems the Genome is more likely to deepen people's tastes than broaden them, as few of the recommendations strayed outside the genre of the original song. Three of the top matches for Gwen Stefani's "Hollaback Girl," for example, were "I'm a Slave 4 U" by Britney Spears (both songs were produced by the Neptunes), Madonna's "American Life," and "Ain't It Funny" by Jennifer Lopez. That is pretty much the same list you'd get from any music-savvy 12-year-old girl. But casual fans of Elvis' "Hound Dog" will be pleased to discover Genome recommendations such as "Flyin' Saucers Rock 'n' Roll" by fellow Sun Records artist Billy Lee Riley and "Dance to the Bop" by Gene Vincent, the man Capitol hired to compete with Elvis. Both songs are relatively obscure today.
Lomax's Global Jukebox takes an altogether different approach. It's designed to lead people down what you might call the Long Tail of culture, taking them from the familiar and commercial to the global and neglected. The project was motivated by Lomax's concern that the world's musical diversity was being trampled by the spread of mass media. "Once a universal human attribute, communication has tended to become a monopoly, a one-way channel from the powerful center to the mute periphery," he wrote. The Global Jukebox was his tool for fighting back.
To experience the Global Jukebox firsthand, I visited the Lomax Archive in New York City, where the prototype is housed on an aging Apple Quadra machine.* The Jukebox is able to produce Genome-like one-to-one recommendations, but it doesn't do so especially well: The pop data set is small and incompatible with the more-robust world-music database. But then, the recommendation function is the least interesting part of the Jukebox.