Psychologists’ Food Fight Over Replication of “Important Findings” 

July 31 2014 12:26 PM

Why Psychologists’ Food Fight Matters

“Important findings” haven’t been replicated, and science may have to change its ways.

One of the studies that didn't successfully replicate had to do with a report that washing your hands makes you less likely to perceive moral failures in others.

Psychologists are up in arms over, of all things, the editorial process that led to the recent publication of a special issue of the journal Social Psychology. This may seem like a classic case of ivory tower navel gazing, but its impact extends far beyond academia. The issue attempts to replicate 27 “important findings in social psychology.” Replication—repeating an experiment as closely as possible to see whether you get the same results—is a cornerstone of the scientific method. Replication of experiments is vital not only because it can detect the rare cases of outright fraud, but also because it guards against uncritical acceptance of findings that were actually inadvertent false positives, helps researchers refine experimental techniques, and affirms the existence of new facts that scientific theories must be able to explain.

One of the articles in the special issue reported a failure to replicate a widely publicized 2008 study by Simone Schnall, now tenured at Cambridge University, and her colleagues. In the original study, two experiments measured the effects of people’s thoughts or feelings of cleanliness on the harshness of their moral judgments. In the first experiment, 40 undergraduates were asked to unscramble sentences, with one-half assigned words related to cleanliness (like pure or pristine) and one-half assigned neutral words. In the second experiment, 43 undergraduates watched the truly revolting bathroom scene from the movie Trainspotting, after which one-half were told to wash their hands while the other one-half were not. All subjects in both experiments were then asked to rate the moral wrongness of six hypothetical scenarios, such as falsifying one’s résumé and keeping money from a lost wallet. The researchers found that priming subjects to think about cleanliness had a “substantial” effect on moral judgment: The hand washers and those who unscrambled sentences related to cleanliness judged the scenarios to be less morally wrong than did the other subjects. The implication was that people who feel relatively pure themselves are—without realizing it—less troubled by others’ impurities. The paper was covered by ABC News, the Economist, and the Huffington Post, among other outlets, and has been cited nearly 200 times in the scientific literature.

However, the replicators—David Johnson, Felix Cheung, and Brent Donnellan (two graduate students and their adviser) of Michigan State University—found no such difference, despite testing about four times more subjects than the original studies.


Aggrieved about several aspects of the replication process, Schnall aired her concerns, first to a journalist covering the special issue for Science, and then on her personal blog. When the Michigan State researchers told her that they planned to replicate her study, Schnall had gladly provided them with the materials she used in her experiments (the moral dilemmas she asked subjects to consider, the procedures she followed, and so on). She had also accepted the journal editors’ invitation to peer review the experimental protocol and statistical analysis the replicators planned to follow. But after that, she felt shut out. Although Schnall had approved of their proposed method of collecting and analyzing data, neither she nor anyone other than the editors reviewed the results of the replication. When the replicators did share their data and analysis with her, she asked for two weeks to review them and try to determine why they had failed to reproduce her original findings, but the manuscript had already been submitted for publication.

Once the journal accepted the paper, Donnellan reported, in a much-tweeted blog post, that his team had failed to replicate Schnall’s results. Although the vast majority of the post consisted of sober academic analysis, he titled it “Go Big or Go Home”—a reference to the need for bigger sample sizes to reduce the chances of accidentally finding positive results—and at one point characterized their study as an “epic fail” to replicate the original findings.

After reviewing the new data, Schnall developed an explanation for why the Johnson group failed to replicate her study. But the guest editors of the special issue (social psychologists Brian Nosek of the University of Virginia and Daniel Lakens of the Eindhoven University of Technology in the Netherlands), having initially said that the original authors of some of the replicated studies “may” be invited to respond, now told her there was no space in the issue for responses by any original authors. The editors also disagreed with her argument that she had found an error in the replication that rendered it “invalid” and warranted editorial intervention. (Schnall’s claim of error involves some technical issues regarding measurement and statistics, and has been analyzed by several respected methodologists. At present, the consensus is running against her on this point.)

The editor in chief of Social Psychology later agreed to devote a follow-up print issue to responses by the original authors and rejoinders by the replicators, but as Schnall told Science, the entire process made her feel “like a criminal suspect who has no right to a defense and there is no way to win.” The Science article covering the special issue was titled “Replication Effort Provokes Praise—and ‘Bullying’ Charges.” Both there and in her blog post, Schnall said that her work had been “defamed,” endangering both her reputation and her ability to win grants. She feared that by the time her formal response was published, the conversation might have moved on, and her comments would get little attention.

How wrong she was. In countless tweets, Facebook comments, and blog posts, several social psychologists seized upon Schnall’s blog post as a cri de coeur against the rising influence of “replication bullies,” “false positive police,” and “data detectives.” For “speaking truth to power,” Schnall was compared to Rosa Parks. The “replication police” were described as “shameless little bullies,” “self-righteous, self-appointed sheriffs” engaged in a process “clearly not designed to find truth,” “second stringers” who were incapable of making novel contributions of their own to the literature, and—most succinctly—“assholes.” Meanwhile, other commenters stated or strongly implied that Schnall and other original authors whose work fails to replicate had used questionable research practices to achieve sexy, publishable findings. At one point, these insinuations were met with threats of legal action.

Brent Donnellan apologized for his use of “go big or go home” and “epic fail,” and another researcher apologized for comments that seemed to imply that Schnall’s original work might not have been “honest.” But for too long, the discussion continued to focus on who chose the wrong words or took the wrong tone, whose career and reputation were most likely to be hurt, whose research plans had been most chilled, and who did what to whom, as well as what their real motives were.

* * *

This all may seem like little more than a reminder of the adage that the politics of academia are so nasty because the stakes are so small. The #repligate controversy spiked to an unusual intensity, even for academia. But the stakes for the rest of us are anything but low. Scientific knowledge is not produced by scientists alone, and it certainly doesn’t affect only them.

