How many words are there in English?

Language and how we use it.
April 10 2006 2:16 PM

Word Count

Are there really 988,968 words in the English language?

How many words are there in the English language?

No one knows. But that hasn't stopped an operation known as the Global Language Monitor from estimating that, as of this writing, there are exactly 988,968 words in English. GLM has done a remarkable job suckering even the respectable press into believing that we're on the verge of adding the millionth word to English, at which point we'll presumably see another flurry of articles about GLM. Even so, its claim is a bogus one.

The problem with trying to number the words in any language is that it's very hard to agree on the basics. For example, what is a word? If run is a verb, is the noun run another word? What about the inflected forms ran, runs, and running? What about words with run as a base, such as runner and runnable and runoff and runway? Are compounds, such as man-bites-dog, man-child, man-eater, manhandle, man-hour, man of God, man's man, and men in black, to be counted once or many times?
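To make the point concrete, here is a minimal sketch of how the tally for the run family shifts with the definition of "word." The part-of-speech, headword, and base labels are hand-assigned for illustration only; they are assumptions, not taken from any dictionary, and a different lexicographer could reasonably label them differently.

```python
# Hand-labeled toy data (an assumption for illustration, not dictionary data):
# (spelling, part of speech, headword it is an inflection of, base word)
ENTRIES = [
    ("run",      "verb",      "run",      "run"),
    ("run",      "noun",      "run",      "run"),
    ("ran",      "verb",      "run",      "run"),
    ("runs",     "verb",      "run",      "run"),
    ("running",  "verb",      "run",      "run"),
    ("runner",   "noun",      "runner",   "run"),
    ("runnable", "adjective", "runnable", "run"),
    ("runoff",   "noun",      "runoff",   "run"),
    ("runway",   "noun",      "runway",   "run"),
]

# Each "definition of a word" is just a different key to deduplicate on.
counts = {
    "every form, split by part of speech": len({(s, p) for s, p, _, _ in ENTRIES}),
    "distinct spellings only":             len({s for s, _, _, _ in ENTRIES}),
    "headwords (inflections merged)":      len({(h, p) for _, p, h, _ in ENTRIES}),
    "base words only":                     len({b for _, _, _, b in ENTRIES}),
}

for definition, total in counts.items():
    print(f"{definition}: {total}")
```

The same nine items come out as nine, eight, six, or one "words," depending only on which rule is chosen.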

Another question: What is English? The word veal, borrowed from French in the 14th century, seems to be English, as does spaghetti, a 19th-century Italian borrowing. But what about pho, a Vietnamese soup attested in English since the 1930s but only recently common? Or the yet-more-recent banh mi sandwich? What about shurpa, a Bukharian soup, which can apparently be eaten in New York? What about words used by non-native English speakers in Singlish?

Even sticking with something that we can agree is English, what about obsolete words? Variant spellings? Regional dialects? What about words that are widespread, but only in a highly limited subgroup, such as bone, "a pre-1946 Martin guitar made of Brazilian rosewood having herringbone purfling on its top"; GAS, "to ardently desire to purchase guitars" (from "Guitar Acquisition Syndrome"); or hog, "a guitar having a mahogany top, back, and sides," all used among collectors of vintage guitars?

What about Frizzie, "student of Ms. Frizzle," or busigator, "the Magic School Bus transformed into an alligator," in the books I'm reading to my daughter? What about Giant, "a player on the N.Y. Giants football team"? The most comprehensive dictionaries of abbreviations include about 500,000 entries, most of which wouldn't be found in standard dictionaries. The American Chemical Society has a registry of over 84 million named chemical substances, and there are about a million named species of insects alone; surely these must count as words?

What about obvious forms? Dictionaries include great-grandfather but not great-great-great-great-great-great-grandfather, which is real enough to get over 3,500 Google hits. Only the most basic numbers are typically included; Merriam-Webster, for example, includes twenty-one and twenty-two, but not twenty-three or thirty-one. In fact, if you were to count every number between 0 and 999,999 as a word, you'd have a cool million right there—and still have the rest of the English language to account for.
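The arithmetic is easy to check. The sketch below spells out every number from 0 to 999,999 and counts the distinct names. The spelling convention (American style, no "and," hyphenated tens) and the helper names `below_thousand` and `number_name` are assumptions made for this illustration, not any dictionary's rule.

```python
ONES = ["zero", "one", "two", "three", "four", "five", "six", "seven",
        "eight", "nine", "ten", "eleven", "twelve", "thirteen", "fourteen",
        "fifteen", "sixteen", "seventeen", "eighteen", "nineteen"]
TENS = ["", "", "twenty", "thirty", "forty", "fifty", "sixty", "seventy",
        "eighty", "ninety"]


def below_thousand(n):
    """Spell out 1..999 (returns an empty string for 0)."""
    parts = []
    if n >= 100:
        parts.append(ONES[n // 100] + " hundred")
        n %= 100
    if n >= 20:
        word = TENS[n // 10]
        if n % 10:
            word += "-" + ONES[n % 10]
        parts.append(word)
    elif n > 0:
        parts.append(ONES[n])
    return " ".join(parts)


def number_name(n):
    """Spell out an integer from 0 to 999,999."""
    if not 0 <= n <= 999_999:
        raise ValueError("only 0 through 999,999 are supported")
    if n == 0:
        return "zero"
    thousands, rest = divmod(n, 1000)
    parts = []
    if thousands:
        parts.append(below_thousand(thousands) + " thousand")
    if rest:
        parts.append(below_thousand(rest))
    return " ".join(parts)


# Collect every name in a set so that duplicates, if any, would collapse.
names = {number_name(n) for n in range(1_000_000)}
print(len(names))
print(number_name(23), "|", number_name(31))
```

Every number gets its own name, so the set holds exactly one million entries, including twenty-three and thirty-one.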

At the other end of the scale, estimates of the number of words that an average person uses range from a few thousand (the number a person might actively use in a week) to many tens of thousands (the number an educated person might understand) or more. College-size dictionaries typically include almost 200,000 words (using a formula that counts each separately listed word or word-form); unabridged dictionaries from 300,000 to 600,000 or so. But each of these words is listed not for any intrinsic reason, but because a lexicographer decided it was useful to include. Twenty-three is just as real a number as twenty-two, but it doesn't have a common bullet caliber associated with it, so it often gets left out. Team names, as a class, generally fail to make the cut. We could always add words to the dictionary if there were no limitations on time or space.

So, where does that leave us? It's probably possible to devise criteria that would allow us to conclude that there are about a million words in English. (The dictionary publisher Merriam-Webster goes for "roughly 1 million words" in its discussion of this particular question, although elsewhere, they suggest that the figure could be many millions.) But there's no possible way to count the actual number of words in the language, and the idea of having a running counter, as is found on GLM's home page, is absurd.

So, why have journalists fallen for the claim? I think it's the pseudo-scientific nature of GLM's "methodology": The company claims to use an "algorithm" called the "Predictive Quantities Indicator," so its figures must be right. According to the company's Web site, though, the PQI's count of English words is based on the entry lists of a number of major dictionaries, so from the outset we know we're just getting a summation of lexicographers' judgment calls (including scientific, obsolete, and dialect forms) rather than an accurate, independent analysis of current English. Still, it sounds impressive to some. I recently got a call about GLM from a reporter, and when I explained why the million-word claim is bogus, he practically shouted, "But they have an algorithm!"

And they'll have a good party this summer, for the credulous among us.
