Lexicon Valley

Pangrammatic Tweets!

In recent years, Twitter has made possible a lot of wonderful text-mining projects, both for linguists doing rigorous research and for lay people just having fun. The Twitter bot @haikuinyou, for example, was designed to find accidental haikus, and, perhaps more impressively, @pentametron searches for rhyming tweets in iambic pentameter and then fashions couplets out of them. Talk about found poetry!

And then there is found wordplay. I’m a big fan of @anagramatron, which discovers paired tweets that form serendipitous anagrams of each other. (Example: “Last time I do anything” ⇔ “That’s it. I’m dying alone.”) Now, courtesy of the lexicographer Jesse Sheidlower, comes @PangramTweets, in which each tweet contains every letter of the alphabet at least once.

Sheidlower explains the project on his site:

PangramTweets is a bot (a computer program that runs on its own) that searches Twitter for, and then retweets, pangrams—texts that contain every letter of the alphabet. A famous pangram, sometimes used as a typing test, is “The quick brown fox jumps over the lazy dog.” […]

You may find the results interesting, or dull. I make no judgment on this. The bot is entirely automated; I do not curate the results.

The bot originally did not discriminate against known pangrams of the “quick brown fox” variety, but, by popular demand, Sheidlower put a filter in place for that, as well as for tweets in foreign languages. He says he gets “one real pangram in every few million tweets scanned.”
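For the curious, the core test is easy to sketch. What follows is not Sheidlower's code, which isn't shown on his site, but a minimal Python illustration of the idea: check whether a text contains all 26 letters, then skip anything that looks like a stock pangram. The names `is_pangram`, `is_candidate`, and `KNOWN_PANGRAM_FRAGMENTS` are my own inventions, and the one-entry blocklist is only a stand-in for whatever filter the bot really uses. Language filtering is not attempted here.

```python
import string

# The 26 letters of the basic Latin alphabet, lowercased.
ALPHABET = frozenset(string.ascii_lowercase)

# Hypothetical blocklist of stock pangrams (an assumption for illustration,
# not the bot's actual filter).
KNOWN_PANGRAM_FRAGMENTS = ("quick brown fox",)


def is_pangram(text: str) -> bool:
    """Return True if every letter a-z appears at least once in text."""
    seen = {ch for ch in text.lower() if ch in ALPHABET}
    return ALPHABET <= seen


def is_candidate(text: str) -> bool:
    """Return True for a pangram that is not a known stock pangram."""
    lowered = text.lower()
    if any(fragment in lowered for fragment in KNOWN_PANGRAM_FRAGMENTS):
        return False
    return is_pangram(text)


if __name__ == "__main__":
    print(is_pangram("The quick brown fox jumps over the lazy dog."))
    print(is_candidate("The quick brown fox jumps over the lazy dog."))
```

Run as a script, this reports the typing-test sentence as a pangram but rejects it as a candidate, which is roughly the behavior the "popular demand" filter was meant to produce.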

It’s unlikely that Vindu Goel, a technology reporter for the New York Times, was aware that this tweet of his, for instance, contained every letter of the alphabet:

Here are some others that have turned up so far:

As I mentioned, Sheidlower is filtering out non-English tweets, but some Indonesian ones keep seeping through. Since I’ve done research on colloquial varieties of Indonesian, I find these especially fascinating. I was initially surprised that the Indonesian Twittersphere would be generating pangrams, considering that the letters Q, V, X, and Z appear only in loanwords. But Indonesian Twitterers are apparently using quite a lot of Anglicisms, along with many txtspk-style abbreviations of Indonesian words. An example that recently popped up:

The loanwords here are EXCITED, JOIN, and LITTLEQUIZ, and 1D refers to the band One Direction. Here’s a key to the abbreviation-heavy Indonesian items:

BGT = banget ‘very’
GRGR = gara-gara ‘just because’
MW = mau ‘will’
K = ke ‘(come) to’
INDO = Indonesia
LBH = lebih ‘more’
LG = lagi ‘(even) more’
KLO = kalau ‘if’
DAN = dan ‘and’
BCA = baca ‘read’
JG = juga ‘also’
PASTI = pasti ‘definitely’
LO = (e)lo ‘you’
MKIN = makin ‘more and more’
CEK = cek ‘check’

So that would translate to: “@PutriAZSYA Very excited just because One Direction is coming to Indonesia. You’ll be even more excited if you join LittleQuiz @1D_CrazyLovers, and also read FFNY. You’ll definitely get more and more excited. Check Fav6.”

It’ll be interesting to see if the bot turns up a naturally occurring “pangrammatic window” (the stretch of consecutive letters over which a pangram plays out) that beats the current record-holder of 42 letters, which appears in Cube Route, a novel by the science-fiction and fantasy author Piers Anthony:

“We are all from Xanth,” Cube said quickly. “Just visiting Phaze. We just want to find the dragon.”

Sean Irvine announced the discovery of this pangrammatic window in Word Ways in 2012. It beat out Eric Chaikin’s 47-letter find, which he discovered after Googling “Joaquin Phoenix”:

“… movie review of The Yards: Mark Wahlberg, Joaquin Phoenix, Charlize Theron…”
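Counts like these are easy to check by machine. Here is a minimal Python sketch, my own and not the method Irvine or Chaikin used, that finds the shortest pangrammatic window in a text with a standard sliding-window scan. It counts letters only, ignoring spaces, punctuation, and case, which is the convention the figures above appear to follow. The function name `shortest_pangram_window` is hypothetical.

```python
import string
from collections import Counter

ALPHABET = frozenset(string.ascii_lowercase)


def shortest_pangram_window(text: str):
    """Return (length, letters) for the shortest run of consecutive
    letters containing all of a-z, or None if text is not a pangram.
    Only the letters a-z are counted; everything else is skipped."""
    letters = [ch for ch in text.lower() if ch in ALPHABET]
    counts = Counter()
    distinct = 0      # number of different letters inside the window
    best = None
    left = 0
    for right, ch in enumerate(letters):
        counts[ch] += 1
        if counts[ch] == 1:
            distinct += 1
        # While the window is a pangram, record it and shrink from the left.
        while distinct == len(ALPHABET):
            length = right - left + 1
            if best is None or length < best[0]:
                best = (length, "".join(letters[left:right + 1]))
            dropped = letters[left]
            counts[dropped] -= 1
            if counts[dropped] == 0:
                distinct -= 1
            left += 1
    return best


if __name__ == "__main__":
    cube = ('"We are all from Xanth," Cube said quickly. '
            '"Just visiting Phaze. We just want to find the dragon."')
    yards = ("movie review of The Yards: Mark Wahlberg, "
             "Joaquin Phoenix, Charlize Theron")
    print(shortest_pangram_window(cube))
    print(shortest_pangram_window(yards))
```

On the two passages quoted above, the scan agrees with the published figures: the Cube Route window runs 42 letters, from the F of “from” to the W of the second “We,” and the Joaquin Phoenix window runs 47, from the V of “review” to the Z of “Charlize.”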

Of course, determining if a pangram is “naturally occurring”—which is to say, occurring by chance and not design—may be difficult, since it’s always possible to game the system. But with half a billion tweeters tweeting, maybe someday one of them will authentically produce a 30-letter winner like “Benghazi quickly warped Fox’s TV jam,” which I just made up.

If it does happen, I’ll definitely get more and more excited.

A version of this post appeared on Language Log.