I have in my possession two tiles from a prototype of the game that would become Scrabble. They were crafted by architect Alfred Butts circa 1938. They are made of plywood with the letters stenciled in India ink. Their point values are handwritten on tiny squares of paper and glued beneath the center of each letter. My Q is worth 10 points. My X is worth six.
Any Scrabble player can tell you that the X is actually worth eight points. But as Butts was creating the game, in a fifth-floor walkup in Queens, he tinkered—with the layout of the board, with the total number of tiles, with their distribution, and with their respective point values. “It’s not hit or miss,” Butts said long afterward. “It’s carefully worked out.”
Seventy-five years later, Butts’ carefully worked out point values are under attack. Late last month, a University of California–San Diego cognitive science postdoc and casual player named Joshua Lewis conducted a computer analysis to recalibrate Scrabble’s letter values based on the game’s current lexicon. Lewis reposted his findings to Hacker News, and they were picked up by Digg and went viral. Around the same time, Sam Eifling, writing for Deadspin, asked a programmer friend to do the same. Both were inspired by the fact that while the language had changed dramatically from the time Butts performed his calculations, the game of Scrabble had not.
It’s a fair observation. Since Scrabble was adopted in chess parlors in New York in the 1950s, competitive players have dissected its strategic quirks. One early realization was that short words have outsized value, so players scoured the preferred source (the now-defunct Funk & Wagnalls Standard College Dictionary) and compiled lists of two- and three-letter words. They also recognized that the most common letters showed up in a lot of words, so they recorded and memorized seven- and eight-letter words—ones that would earn the 50-point bonus for using all seven tiles at once—that contained A, E, I, N, R, S, and T, among other single-point letters. You didn’t need a computer to see that the Q, though worth the most points, was a pain in the rack but the Z not so much.
Since the publication in 1978 of the Official Scrabble Players Dictionary, a compilation of several standard college dictionaries, the game’s word list has grown by tens of thousands of words. From a playing vantage, the additions of QI (a Chinese life force) and ZA (short for pizza) in the last lexicon update, in 2006, were game-changers. Players feared the new words would cheapen Scrabble, boosting scoring and elevating the role of chance. It didn’t happen. The Q became less of a hindrance, a slightly fairer tile than before, and players adjusted strategy to account for the new gimmes.
That need to adjust validates Lewis’ and Eifling’s suspicion that the values assigned to letters aren’t in perfect harmony with the frequency of their use in English or in its narrower subset, the Scrabble word list. The two approached the problem differently. Eifling and software developer Kyle Rimkus totaled the number of letters in Scrabble-eligible words (1.58 million), isolated the frequency of each letter, and then calculated how overvalued or undervalued each letter was compared to its existing point value. Lewis’ approach was more complicated. He weighted letters not only by appearance in the Scrabble lexicon but also by the frequency with which they appear in words of different lengths (with emphasis on two-, three-, seven-, and eight-letter words) and by their ability to “transition” into and out of other letters.
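For readers who want to try the simpler, frequency-based idea at home, here is a minimal Python sketch. It is not Eifling and Rimkus’ code, which isn’t reproduced here. The three-word list is a stand-in for the real lexicon, the function names are invented for illustration, and the rule that converts frequency into points (a logarithmic rarity score stretched across the 1-to-10 range) is an assumption, since the exact formula they used isn’t described. Only the current tile values are taken as given.

```python
import math
from collections import Counter

# Standard English-language Scrabble tile values.
CURRENT_POINTS = {
    **dict.fromkeys("AEILNORSTU", 1),
    **dict.fromkeys("DG", 2),
    **dict.fromkeys("BCMP", 3),
    **dict.fromkeys("FHVWY", 4),
    "K": 5,
    **dict.fromkeys("JX", 8),
    **dict.fromkeys("QZ", 10),
}


def letter_frequencies(words):
    """Return each letter's share of all the letters in the word list."""
    counts = Counter(
        ch for word in words for ch in word.upper() if ch in CURRENT_POINTS
    )
    total = sum(counts.values())
    return {letter: n / total for letter, n in counts.items()}


def suggested_points(freqs, low=1, high=10):
    """Map frequencies to point values.

    Assumption (not from the article): rarity is -log2(frequency),
    rescaled linearly so the most common letter gets `low` points
    and the rarest gets `high`.
    """
    rarity = {letter: -math.log2(f) for letter, f in freqs.items()}
    lo, hi = min(rarity.values()), max(rarity.values())
    if hi == lo:  # every letter equally common: nothing to separate
        return {letter: low for letter in rarity}
    return {
        letter: round(low + (r - lo) * (high - low) / (hi - lo))
        for letter, r in rarity.items()
    }


def valuation_gaps(words):
    """Current value minus suggested value, per letter.

    Positive means overvalued under this sketch's rule; negative, undervalued.
    """
    suggested = suggested_points(letter_frequencies(words))
    return {letter: CURRENT_POINTS[letter] - suggested[letter]
            for letter in sorted(suggested)}


if __name__ == "__main__":
    # Hypothetical three-word sample; a real run would load the full word list.
    sample = ["QI", "ZA", "RETAINS"]
    for letter, gap in valuation_gaps(sample).items():
        print(letter, gap)
```

On a list this small the numbers mean nothing. The point is the shape of the calculation: count the letters, turn the counts into shares, and compare each letter’s share with what its tile is worth. Lewis’ added weighting by word length and letter transitions would need to be layered on top.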
The findings don’t differ much. In both analyses, the values of about half the letters change by one or two points. One or the other found that B, C, F, H, K, M, P, X, Y, and Z are overvalued, which makes some intuitive sense. For instance, the X (eight points) and the Z (10) can be easy money, especially since they occur in a number of short words; bumping them down to six points apiece is a logical move. Similarly, the H was set by Butts at four points, but it now appears in nine two-letter words and combines beautifully with other letters, while the M appears in 12 two-letter words. Living-room players detest the C, but they haven’t studied seven- and eight-letter “bingos,” in which C’s abound. The clunky U and V, by contrast, are undervalued—ratcheting them up to two points and five points respectively seems reasonable.
While the media pounced on the story (I joined in), the Scrabble community has been largely unmoved. Why? Several reasons. One, the game’s owners, Hasbro in North America and Mattel overseas, aren’t changing anything. Two, such proposed rejiggerings aren’t new. Three, players understand that variances in letter values and tile distribution (too many I’s, the Q without a U) are part of the game and strategize accordingly. Four, there are other, arguably more sophisticated ways to assess tile values. Five, and most important, adjusting any core variables would create a completely different game requiring different strategies. “It’s basically saying, Let’s change the game to make a new game,” Jason Katz-Brown, a software engineer who co-wrote Scrabble’s best computer player, Quackle, told me.