Microsoft’s “Bing It On Challenge” is a clever marketing conceit. In a series of commercials, the company bet random passersby that they’d choose Bing’s web-search results over Google’s in a “side-by-side, blind-test comparison.” The results, Microsoft finds, are pretty conclusive: By a two-to-one ratio, people choose Bing. The commercials close by inviting the viewer to “join the 5 million people who visited the challenge at Bing.com.”
That makes it sound like the “two-to-one” claim is based on a sample of 5 million. Ian Ayres, a professor at Yale Law School, found that hard to believe, so he looked into it. What he found does not make Microsoft’s claims look particularly solid.
First, the “two-to-one” figure comes not from the 5 million people who have taken the challenge online, but from a separate study of just 1,000 participants. As for the online challenge, Microsoft says it isn't actually keeping track of the results. So Ayres got four of his law students together and ran a study of his own.
Between January and March, the team hired 1,000 subjects on Mechanical Turk and asked them to take the “Bing It On” challenge on Microsoft’s own website. Some were asked to use randomly selected popular search terms, others came up with their own search terms, and a third group used the search terms suggested by Microsoft on the site. The findings:
“Our sample group generally preferred Google to Bing analyzed at both the respondent level (53% to 41%) and the individual search level (49% to 42%). The preference for Google was most pronounced when respondents used popular search terms or selected their own search terms. Respondents who used Bing-suggested search terms preferred Bing and Google in nearly equal numbers.”
Whoa: Not only did the subjects not prefer Bing two-to-one, they generally preferred Google. The one exception was when they used only the search terms Microsoft suggested—which implies that Microsoft is suggesting search terms more favorable to Bing than Google.
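For readers who want to see how far apart the two sets of numbers are, here is a rough back-of-the-envelope sketch in Python. It rests on several assumptions that are not in Ayres' write-up as quoted here: that all 1,000 subjects can be pooled into one sample, that the 6 percent not accounted for expressed no preference, and that “two-to-one” means two of every three people who picked a winner picked Bing. The function names are made up for illustration, and this is not the method either side used.

```python
import math

# Figures quoted from Ayres' respondent-level results.
N_RESPONDENTS = 1000   # assumption: all 1,000 subjects pooled into one sample
GOOGLE_SHARE = 0.53
BING_SHARE = 0.41      # assumption: the remaining 6% expressed no preference

# Microsoft's advertised claim, read as "two of every three people who pick a
# winner pick Bing" (an interpretation, not Microsoft's stated methodology).
CLAIMED_BING_SHARE = 2 / 3


def bing_share_among_decided(n, google_share, bing_share):
    """Return (Bing's share among respondents who picked a winner, their count)."""
    google = round(n * google_share)
    bing = round(n * bing_share)
    decided = google + bing
    return bing / decided, decided


def z_score(observed_share, claimed_share, n):
    """One-sample z statistic for a proportion (normal approximation)."""
    standard_error = math.sqrt(claimed_share * (1 - claimed_share) / n)
    return (observed_share - claimed_share) / standard_error


observed, decided = bing_share_among_decided(N_RESPONDENTS, GOOGLE_SHARE, BING_SHARE)
z = z_score(observed, CLAIMED_BING_SHARE, decided)
print(f"Bing share among decided respondents: {observed:.3f}")
print(f"z statistic against a two-to-one split: {z:.1f}")
```

Under those assumptions, Bing's share in Ayres' sample sits more than ten standard errors below a two-to-one split, so ordinary sampling noise would not account for the gap. That says nothing about which study is right. It only suggests the difference would have to come from something else, such as the search terms or the pool of subjects.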
In a blog post on Freakonomics, Ayres concluded that “several of Microsoft’s claims are a little fishy.” In fact, he wrote, “we think that Google has a colorable deceptive advertising claim against Microsoft.”
That seems a little strong. Microsoft can still defend itself by pointing to its own 1,000-person study, which was conducted by a professional third-party research company called Answers Research to guard against bias. It also ran a second 1,000-person study using a different set of search terms, and Bing came out on top again, though by a narrower margin. (That may be why one of Microsoft’s more recent “Bing It On” ads dropped the “two-to-one” claim and just said that people “prefer” Bing over Google.) Still, Ayres’ starkly different findings do raise some questions about how Microsoft’s studies managed to turn up such Microsoft-friendly results.
I asked Microsoft for comment, and they sent the following statement from Matt Wallaert, a behavioral scientist at Bing:
“The professor’s analysis is flawed and based on an incomplete understanding of both the claims and the Challenge. The Bing It On claim is 100% accurate and we’re glad to see we’ve nudged Google into improving their results. Bing it On is intended to be a lightweight way to challenge peoples’ assumptions about which search engine actually provides the best results. Given our share gains, it’s clear that people are recognizing our quality and unique approach to what has been a relatively static space dominated by a single service.”
At this point, it's Microsoft's study against Ayres'. That said, Microsoft might have a stronger case if it provided some more substantive criticism of Ayres' methodology than simply asserting that it's "flawed"—and then backing up and downplaying the whole thing as "lightweight." Funny: I didn't see the word "lightweight" in Microsoft's previous write-ups of the study results, let alone those national TV ads.
Update, Thursday, Oct. 3, 10:39 a.m.: Microsoft's behavioral scientist, Matt Wallaert, has put up a blog post offering a more full-throated defense of the company's "Bing It On Challenge" and the results it claims in its commercials. In the post, Wallaert argues that Microsoft's survey sample of 1,000 was in fact more statistically powerful than the sample used by Ayres. And he defends Microsoft's policy of not tracking the answers given by people taking the challenge on its website. Doing so, he says, would be both unethical and totally unscientific. You can read Wallaert's full post here.