This week 281 students between the ages of 8 and 15 will assemble outside Washington, D.C., for the Scripps National Spelling Bee. Over the last 10 years, it’s taken an average of 645 words and 5,680 letters to weed out the wannabes from the one who outspells them all. Looking at past trends, we can take a shot at predicting which letters and sounds will cause contestants to go home D-E-F-E-A-T-E-D and brō-kən.
Thanks to the folks at the National Spelling Bee (who sent me complete records for the last decade) and Merriam-Webster (which provided their pronunciations), I’ve been able to compile statistics on all of the words that have been spelled correctly (there are 5,042 of them) and incorrectly (1,409) during the traditional oral rounds. (I didn’t look at words that were part of the bee’s written test.) So, what’s most likely to throw a speller off?
You might suspect that longer words are more likely to trip up contestants. The two longest words in the data set were 17 letters apiece: triboluminescence and idiosyncratically, both of which sent their spellers home. But long words aren’t always so tricky. Five of the eight 16-letter words were spelled correctly, Michelangelesque and sphygmomanometer among them. And of the two shortest words to appear in the spelling bee in the last 10 years, gbo and rya, only the former was spelled correctly.
Looking at length more systematically, the number of letters in a word seems to have little correlation with spelling difficulty. Roughly half of the words in the bee have nine or more letters. These words were spelled correctly 78 percent of the time. By comparison, those with eight or fewer letters were spelled correctly 79 percent of the time.
In the first two oral rounds, which include a greater mix of weak and strong spellers, the effect of word length is more pronounced.
When you exclude the first two oral rounds and look only at the best spellers, words of nine letters or more are actually spelled correctly more often (70 percent of the time) than shorter words (65 percent).
It’s possible these statistics are the result of pure chance. Though more than 1,700 words have been spelled correctly in the third oral round and beyond, the difference between above-average-length words and below-average-length words barely misses out on statistical significance. Regardless, the fact that long words and short words are spelled correctly at roughly the same rate shows that, in general, the word pickers are doing a good job. Though word lengths can vary, ideally all words in a given round should be of the same difficulty.
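For readers who want to run this kind of significance check themselves, here is a minimal sketch of one common approach, a pooled two-proportion z-test. It is an assumption that a test like this fits the comparison; the specific test behind the statement above isn't named here. The function name and the counts in the usage line are hypothetical placeholders, not figures from the bee's records.

```python
import math

def two_proportion_z_test(correct_a, total_a, correct_b, total_b):
    """Compare two success rates with a pooled two-proportion z-test.

    Returns (z, two_sided_p_value). Hypothetical helper for illustration;
    it does not handle the degenerate case where the pooled rate is 0 or 1.
    """
    rate_a = correct_a / total_a
    rate_b = correct_b / total_b
    # Pooled rate under the null hypothesis that both groups share one rate.
    pooled = (correct_a + correct_b) / (total_a + total_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / total_a + 1 / total_b))
    z = (rate_a - rate_b) / std_err
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value

# Made-up counts purely to show the call: 140 of 200 long words spelled
# correctly versus 130 of 200 short words.
z, p = two_proportion_z_test(140, 200, 130, 200)
print(f"z = {z:.2f}, p = {p:.3f}")
```

A p-value below the conventional 0.05 threshold would usually be called statistically significant; one just above it is the "barely misses" situation.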
If not length, what causes the most spelling hiccups? To answer this, I grouped spelling mistakes into three categories of my own design. The first is a substitution, such as spelling atrabilious as atribilious, mistakenly subbing an I for the second A. The second is a deletion: spelling ecchymosis as echymosis, erroneously removing a C. The third is an insertion: spelling vacillant as vascillant, adding an S that shouldn’t be there. Multiple mistakes were recorded if a speller, for example, had a substitution and an insertion error in the same word.
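The three categories map neatly onto the operations of a standard edit-distance (Levenshtein) alignment, so here is a rough sketch of how the sorting could be automated. This is an assumption about one possible approach, not a description of how the categorization was actually done, and the function name classify_errors is hypothetical. When several minimal alignments exist, the sketch simply picks one of them.

```python
from collections import Counter

def classify_errors(correct: str, attempt: str) -> Counter:
    """Count substitutions, deletions and insertions in one minimal
    edit alignment between the correct spelling and the attempt."""
    m, n = len(correct), len(attempt)
    # dist[i][j] = fewest edits turning correct[:i] into attempt[:j]
    dist = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        dist[i][0] = i
    for j in range(1, n + 1):
        dist[0][j] = j
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            cost = 0 if correct[i - 1] == attempt[j - 1] else 1
            dist[i][j] = min(
                dist[i - 1][j - 1] + cost,  # match or substitution
                dist[i - 1][j] + 1,         # letter dropped by the speller
                dist[i][j - 1] + 1,         # extra letter added by the speller
            )
    # Walk back from the end of both words, tallying each edit.
    counts = Counter()
    i, j = m, n
    while i > 0 or j > 0:
        if i > 0 and j > 0 and correct[i - 1] == attempt[j - 1] \
                and dist[i][j] == dist[i - 1][j - 1]:
            i, j = i - 1, j - 1
        elif i > 0 and j > 0 and dist[i][j] == dist[i - 1][j - 1] + 1:
            counts["substitution"] += 1
            i, j = i - 1, j - 1
        elif i > 0 and dist[i][j] == dist[i - 1][j] + 1:
            counts["deletion"] += 1
            i -= 1
        else:
            counts["insertion"] += 1
            j -= 1
    return counts

# The three examples from the text above.
for word, miss in [("atrabilious", "atribilious"),
                   ("ecchymosis", "echymosis"),
                   ("vacillant", "vascillant")]:
    print(word, miss, dict(classify_errors(word, miss)))
```

Because the counts come back per word, a misspelling with both a substitution and an insertion would be recorded as two mistakes, matching the rule described above.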
It was not possible to categorize every single mistake or every single word. For example, in one case a speller began to spell idiosyncratically as I-O and immediately realized his mistake; he finished by spelling the word as I-O-Q-R-S-Z-3-cuatro-F-L-V-R-Q. This word was tossed from the analysis, but the vast majority of the 6,451 words from the last 10 years stayed in.
Most mistakes, by my categorization, were substitutions. Just short of 70 percent of spelling errors were caused by subbing in one letter for another, while 19 percent were deletions and 11 percent insertions.