Do the Math: A mathematician's guide to the news.

Guilt by Calculation

It takes more than an Excel sheet to prove the Iranian election was fixed.


(Continued from page 1)

Now let's look at that first batch of votes, made up of 360 of our 1,000 regions (corresponding to the first real batch of 36 percent of the votes). Absent any reason to think that this particular sample is skewed compared with the overall vote, we can employ the following beautiful and simple formula: "The standard deviation of the average over N regions is the standard deviation of each region divided by the square root of N."

So the amount by which it's reasonable to expect that batch to differ from the overall average of 67.2 percent is 20 percent divided by the square root of 360, or 1.05 percent. In other words, even if we assume a wide spread in the support for Ahmadinejad from region to region (a standard deviation of 20 points in either direction), a batch consisting of 36 percent of the electorate is likely to wander from the average only by somewhere in the neighborhood of 1 percent.
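For readers who want to check the arithmetic, here is a minimal Python sketch of that formula. The function name batch_std is my own label, not anything from the original analysis, and the inputs are the article's assumed figures: a 20-point standard deviation per region and 360 regions in the first batch.

```python
import math

def batch_std(region_std, n_regions):
    """Standard deviation of the average over n_regions independent regions."""
    return region_std / math.sqrt(n_regions)

# Article's assumptions: 20-point spread per region, 360 regions in the batch.
first_batch = batch_std(20.0, 360)
print(f"{first_batch:.2f} percent")  # 1.05 percent
```

Because the square root of N sits in the denominator, quadrupling the number of regions in a batch only halves its expected wobble.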

And Ahmadinejad's reported total of 70 percent for the first 36 percent of the vote misses his average by substantially more than that, suggesting even messier data than our scenario predicts. The same argument estimates the standard deviations of the other five batches as 1.5 percent, 1.4 percent, 2 percent, 2.6 percent, and 2.2 percent, respectively. In other words, these figures, though they may seem eerily consistent at first glance, are actually just what we would expect. That's the nature of large batches of data, governed by what's called the Law of Large Numbers: Averages of widely varying quantities can, and usually do, yield results that look almost perfectly uniform. Given enough data, the outliers tend to cancel one another out.
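The canceling-out effect is easy to see in a quick simulation. The sketch below is an illustration under simplifying assumptions of my own, not a reconstruction of the real returns: each region is an independent draw from a normal distribution centered on 67.2 with a 20-point standard deviation, and the number of trials and the random seed are arbitrary.

```python
import math
import random
import statistics

REGION_MEAN = 67.2   # overall average from the article, in percent
REGION_STD = 20.0    # assumed spread of a single region, in percent
BATCH_SIZE = 360     # regions in the first batch
TRIALS = 2000        # hypothetical number of simulated batches

rng = random.Random(12)  # fixed seed so the run is repeatable

def simulate_batch_average():
    # Idealization: independent normal draws, so a region can stray outside 0-100.
    return statistics.fmean(
        rng.gauss(REGION_MEAN, REGION_STD) for _ in range(BATCH_SIZE)
    )

batch_averages = [simulate_batch_average() for _ in range(TRIALS)]
simulated = statistics.pstdev(batch_averages)
predicted = REGION_STD / math.sqrt(BATCH_SIZE)

print(f"simulated spread of batch averages: {simulated:.2f}")
print(f"predicted by the formula:           {predicted:.2f}")
```

Individual regions swing by 20 points, yet the batch averages cluster within about a point of 67.2, which is the Law of Large Numbers at work.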

Of course, these estimates depend vitally on the arbitrary guesses about the sizes of the regions and their individual vote totals we made when setting up our estimate. But every reasonable guess I tried yielded the same result; on purely statistical grounds, the Iranian election numbers look more or less reasonable. It might be a different story if Ahmadinejad had drawn between 67.1 percent and 67.3 percent in all six batches, suggesting a standard deviation of less than 0.1 percent—or if 500 mini-batches of data, each making up 0.2 percent of the vote, were all in that 62 percent to 70 percent range. (One reason American readers may be more used to seeing wide swings in the vote totals is that our fine-grained media start reporting results when just a few percent of the votes are in.)
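The two suspicious scenarios can be put in numbers with the same formula. This is a rough sketch under the same made-up setup of 1,000 equal regions with a 20-point spread; the helper names are hypothetical.

```python
import math

REGION_STD = 20.0      # assumed spread of a single region, in percent
TOTAL_REGIONS = 1000   # regions in the hypothetical setup

def batch_std(n_regions):
    """Standard deviation of the average over n_regions independent regions."""
    return REGION_STD / math.sqrt(n_regions)

def regions_needed(target_std):
    """How many regions a batch needs before its average is that steady."""
    return (REGION_STD / target_std) ** 2

# A mini-batch holding 0.2 percent of the vote is only about 2 regions.
mini_batch_regions = round(TOTAL_REGIONS * 0.002)
mini_batch_spread = batch_std(mini_batch_regions)

# Batches that never stray more than 0.1 percent would need this many regions.
needed_for_tenth = regions_needed(0.1)

print(f"mini-batch spread: {mini_batch_spread:.1f} percent")
print(f"regions needed for a 0.1 percent spread: {needed_for_tenth:.0f}")
```

Under these assumptions a two-region mini-batch should swing by roughly 14 points, so 500 of them all landing between 62 and 70 percent would be hard to explain. Likewise, a 0.1 percent spread would require about 40,000 regions per batch, far more than the 1,000 in the whole setup.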

I'm not saying the election wasn't fixed; Juan Cole and Richard Sexton offer more reasons for doubting the government's numbers. On the other side, Ken Ballen and Patrick Doherty argue that their pre-election polling is consistent with a big Ahmadinejad win. Either way, the final verdict on the Iranian election won't be settled by drawing a graph. The official numbers may or may not be authentic, but they're definitely messy enough to be true.

Jordan Ellenberg is an associate professor of mathematics at the University of Wisconsin. His first novel is The Grasshopper King. He blogs at Quomodocumque.
Chart by Chris Wilson.
COMMENTS

I think this analysis makes an assumption that there is no bias in the timing of the reporting for regions that support one candidate vs. another. This is not a valid assumption in most US elections - rural regions (usually supporting Republican candidates) very often report early while urban regions (usually supporting Democratic candidates) report late, with the result that early percentages are wildly at variance with final totals. My understanding is that there is a similar dynamic in Iran as far as rural vs. urban preferences.

-- wombat123

This article makes for interesting reading but grossly oversimplifies statistical analysis and criticisms of the election.

You would expect, on average, the various large batches of votes to be reasonably consistent in the percentages allocated to each candidate. However, what really matters are other factors, including geographical location: certain areas, such as the opposition candidates' home towns, were predicted to support those opposition candidates. As CNN said the other day, saying Mousavi lost the Turkic vote is like saying Obama lost the African-American vote to McCain.

It's not the average that is so worrying. It's the distribution.

85% of Iranians in U.K. voted for Ahmadinejad? Yeah right.

-- searlen
