What it’s like to write a dictionary: A transcript of Episode 15 of Slate’s Working Podcast.

On Writing a Dictionary: Working Podcast Episode 15 Transcript

Dec. 11 2014 5:01 PM

Slate’s Working Podcast: Episode 15 Transcript

Read what David Plotz asked a Merriam-Webster lexicographer about her workday.

Kory Stamper.

Photo illustration by Slate. Photo by David Plotz.

We’re posting weekly transcripts of David Plotz’s Working podcast for Slate Plus members. This is the transcript for Episode 15, featuring Kory Stamper, a lexicographer for Merriam-Webster.

You may note some differences between this transcript and the podcast. Additional edits were made to the podcast after we completed this transcript.

David Plotz: Now we’ll really start. What is your name, and what do you do?

Kory Stamper: My name is Kory Stamper, and I’m a lexicographer at Merriam-Webster.

Plotz: And what does lexicographer mean?

Stamper: I write dictionary definitions.

Plotz: And how long have you been a lexicographer? And can we just call you a definer? Because that is a word I’m not going to be able to pronounce over and over again? How long have you been a definer?

Stamper: I’ve been a definer for 16 years now.

Plotz: What’s the training to be a definer?

Stamper: It’s all in-house. There are only a couple of requirements to be an editor. One is that you have to be a native speaker of English. One is that you have to have an accredited college degree in anything. It doesn’t have to be linguistics.

Then you get hired and you go through months of what we call “style and defining” classes. So, you start with reading through the front matter of our biggest dictionary, which is the Webster’s Third New International Dictionary, Unabridged. That front matter is 45 pages of 4-pt type. You take notes on it, and then we start—

Plotz: Do you have to actually read it in the 4-pt type, or do they give you a version which is the 12-pt type version?

Stamper: I read it in 4-pt type. Maybe they do it in 12-pt type now, but when I was trained you just opened up the front cover and started reading.

And then we start with the style and defining classes. The first thing you need to do is relearn English grammar, because you are now the person deciding what part of speech a word is, and so you need to be able to identify contextually what part of speech a word is, which you discover very quickly is not easy at all. And then as you’re doing that, you practice writing dictionary definitions. And the way that we do that is you get a batch of what we call “citations,” which are pieces of a word in context—so, they are taken from various sources—and you read through them, and then using that context you start drafting definitions. And then senior editors who are giving these classes will critique them, and pull them apart, and tell you, “No, that’s off the mark.” Or, “We don’t use that convention.” Or, “You missed—this sense doesn’t have enough evidence, or has too much clumped evidence in one source.” So, that goes on for—it depends, two to four months, maybe.

Plotz: So talk us through the process of actually defining a word. Pick a word that you want to define or that you’ve been assigned, tell us even how you get assigned it, and then how you go through the process of giving us a definition.

Stamper: OK, there’s two parts to this. The first is sort of the prework that is done before you start defining, and that is what we call reading and marking. So, every editor at the company spends an hour or two a day reading. We read everything. We read books, magazines, blogs, I’ve read and marked beer bottles and cereal boxes, other people have read and marked the yellow pages, dance programs, all sorts of stuff.

And what you’re doing when you read and mark is, you’re reading through and you’re looking for words that catch your eye. Sometimes they’re new, sometimes they’re old, sometimes it’s a new use of an old word, sometimes it’s a brand new word. What you do when you see it is you underline it and you bracket the context, and then that bit of context gets stuck into a database, and before we had databases it was printed on a 3x5 card. And those are called citations. And that’s the raw material you use when you’re defining.

Plotz: When you say you underline and bracket, do you then take a photo of it? Do you then type it out? How does it then get to somebody who can put in your archive?

Stamper: We actually have a typing department. So, we have a department of typists who go through—the Web stuff that we mark gets sort of slurped immediately into a database, but the typists go through everything print and retype into our database. And before we had that, they would, you know, feed an index card into their typewriter and type the information on that index card, and then manually file it.

So once you’ve done the citational work, let’s say it’s time to revise a dictionary. What we do is, every dictionary is broken up into tiny chunks. It’s usually one column, I believe, of a dictionary page and all of the words in that column. So, we might have 5,000 batches in something like the Collegiate Dictionary.

And we begin and you sign out a batch. So I open up the spreadsheet and I look for the defining column. Because when a batch is signed out, it’s not just for defining. Once I’m done with it there’s seven or eight different steps that it goes through. But when I get to it—let’s say I’m going to sign out the batch that is “green” to “grow”—

Plotz: Why have you signed that batch out?

Stamper: That’s the next batch available. You go by whatever batch is available next. So, I—

Plotz: And does it start at, do you start at A? Do you work your way through it in alphabetical order? Are the batches made available in some random order just so A doesn’t get more attention or A isn’t older? Or is it just a march through?

Stamper: So dictionary writers never start at A. I don’t know of any lexicography company that starts at A. You start in the middle of the alphabet, usually on one of the smaller letters in the middle, so, H or I or J. And the reason you do that is because, A, B, C, and D are actually huge letters. They take up a fifth of the book. Any dictionary you look at, A through D takes up a fifth of the book. So you start at a smaller letter, but it just takes practice. And you would much rather go in and revise the letter J because that was the first letter you did, than to have to revise A all over again at the end of the process. So we start in the middle, you work to the end, and then you start at the beginning again.

Plotz: OK, so go back to signing out. You’ve signed out.

Stamper: OK. I’ve signed out. Now, in the olden days when you would sign out a batch that was a physical thing. So you would initial a piece of paper and you would go to this huge bank of filing cabinets, and on top of all of those filing cabinets are shoeboxes, essentially. And you would find the shoebox that has all the new citations that we’ve collected since the last time we revised this dictionary, for your alphabetical section. So I go find the box for “green” to “grow.” Or boxes. Sometimes you’ve got a lot of citations for something.

And then I go find the physical paper copy. What we do is, you take that column, you triple space it, and you put it on what we call galleys, so that in the olden days you could actually make revisions on those galleys. I could shorten a definition or I could add a definition. Now everything is electronic, but the idea is the same. I still sign out a batch that’s “green to grow.” I get that file, which—how many words? Let’s see, it really depends. I would say you’re probably looking at for the most part 25 words, but it just depends. I mean, I’ve signed out batches that are 50 words and I’ve signed out batches that are one word. It just depends on where you are in the alphabet.

So then, you get your file. That’s your file. It works pretty much like a word-processing program and it’s sort of the same thing. You can insert and move things around. And now our citations are in this database. So what you do is, you start—let’s say my first word is “green.” So, what I do is, I open up the file and I read through the existing entry for green. And then I pull all of my citations. I’m either doing that physically—I take them out of the box—or I’m loading them into a reading program, and then I start reading each citation.

In the old days you would read a citation and you’d want to see, is it covered by an existing meaning of this word or not? And in the old days I’d read the card, and if it was covered it went into one pile or one of many piles, and if it wasn’t it went into a separate pile. So, it’s all contextual reading. That’s how you determine what a definition of a word is.

Plotz: How many citations might there be for a word?

Stamper: Well, that depends on what the dictionary is. So I said I was working on the Unabridged Dictionary at this point, and I—one of the words that I revised was the word “god.” And that had 16,000 citations in the file, so, you know, the word “god” gets used a lot. Another word, one of my first words that I defined, was “bodice-ripper.” “Bodice-ripper” doesn’t have many—you know, it doesn’t have 16,000 citations. It’s got, you know, maybe, I don’t know, somewhere between 50 and 100 citations, something like that.

The more common the word, generally speaking, the fewer the citations, because they don’t catch your eye as much. So, there are some times when you’re reading and marking—certainly all of us do it—where you’ll read a section of a blog post, let’s say—and I am marking along, and I’ll think, “You know what? I’m going to mark every instance of the word ‘for,’ just so that we’ve got evidence for ‘for.’” So, I underline every instance of “for,” I highlight it, and I send that in. But generally speaking, when you get to the simpler words you actually have to go through—you have to search the database for words that have specifically been marked. It’s good to just get—I might want every instance of “for” that we have or every instance of “for” followed by “me,” or “us,” or some kind of pronoun.

So, you start by reading. You’re just sort of mentally piling things now. So, I’ll sort of click things, OK, that goes there and this goes to—that goes to sense four, that goes to sense two, and this is a new sense. And that process sounds very straightforward, and it is often not as straightforward as you would think. There are some words that you think, “Well, this use could be a new, emerging use of this word, or it could—if I use sense two and I tweak it a little bit, it could be covered under sense two.”

Plotz: So, going back to the reading and marking. How many of you are reading and marking?

Stamper: All of us try and do it, so, you know, about 40 of us.

Plotz: So there are 40 of you but you don’t encompass the entire wondrous array that is American English language usage. It’s also—you’re only looking at things that are written down, so there are all these things that people say, for example, that they don’t write down. There are very specialized scientific or trade publications. How do you account for the huge variety of English that might not be in what you guys see in your telescope?

Stamper: You’d kind of be surprised. We certainly don’t have the whole scope, and nobody does because English moves faster than anyone’s attempt to record it or catalogue it. But all of us are really very conscious of trying to get a good array, a broad spectrum of things. And you’d be surprised, particularly in the Internet age, if someone says something in a TV interview for instance, that gets reported in transcript reports, that might get quoted in a blog piece about something. People will quote things on Twitter and suddenly I can pull a quote from Twitter if I need to. So it’s this weird balancing act. We always are aware of, you’re always aware of there are senses, emerging meanings, and specialized information that we know we’re not getting.

The one thing that’s interesting is that people approach the dictionary and expect all the words that they know to be in that dictionary. No dictionary has all the words in the English language or even all the words that you know. And so people write in and say, “Hey, why don’t you have…” For instance, I live near Philly. “Jawn,” which is a Philly term, JAWN, means “thing” and is not in the dictionary. It’s a regionalism. So, whenever I tell people what I do they run to the dictionary and they say, “‘Jawn’s’ not in the dictionary.” And I say, “I know,” but now we’re aware of it, you know? So, I’ve read and marked plenty of Philly local blogs and Philly local things that use “jawn,” and so “jawn” is now in the database and it will get evaluated. So, when things like that happen, and then as you’re defining if you see, for instance, “jawn” and you see, OK, there’s a lot of use from Philly. So, now I can do a little spelunking and say, all right, how popular is “jawn?” Well, it is definitely a Philly thing. It hasn’t spread into the rest of the mid-Atlantic yet, but it is really popular in Philly and it has a long, storied history in Philly. OK, so maybe I consider that for a bigger dictionary where I have more room, like the unabridged.

But it is a weird balancing act. You’re aware that you’re not getting everything and you wish there was a way that you could get it all.

Plotz: Going back to you, you’ve signed out your column and you’re looking at the citations, and at some point you have to decide, how do new words get in there that weren’t in the column before? And what gives them enough gravity to push itself in between “green” and “grow?”

Stamper: There are three criteria for entry just generally. The first is, a word has to have widespread use. That means different things for different books and for different subject areas. Widespread use generally means, we want it to have some kind of a significant national presence. We don’t have a dictionary of regionalisms. We focus on general vocabulary. We’re looking for things that are in print, though we do try to take spoken word citations when we can. That’s actually much more difficult than you’d think, because it’s very easy to mishear or miswrite what people are saying, especially with new words. If someone says a new word—like, if someone says “jawn” and I don’t tell you it’s spelled JAWN, everyone will write it JON, JOHN. So, we’re looking mostly for print.

So, widespread use, sustained use is a big deal. There are tons and tons of words that come into the language, see this huge spike in use for a year or maybe two and then drop out, and then ten years later see a huge spike again and drop out. Or, there are words that enter the language and take a hundred years to catch on. Like, a good example of a word that was coined and then sort of had this up and down, up and down over the years is the word “ollie” for the skateboard trick. It had huge use right after the creation of the “ollie,” and then it dropped out of use and was mostly used in skateboard magazines. And then it had another huge spike in use when snowboarding became really popular, and then it dropped out again. And over the last ten years we’ve seen more consistent use of it in things that are not snowboarding or skateboarding magazines.

A word that took a long time to catch on is “korma,” the Indian dish. It was coined back in the 1840s and had almost no use except in menus, where it was always explained in text, which is a sign to a lexicographer that a word hasn’t fully entered the language yet, probably up until the—I mean, not until the last maybe ten years. So, “korma” actually was entered into our dictionary last year, though it’s a word that’s over 100 years old, 150 years old at this point.

Plotz: You’re sitting there with your citations. You’ve made your piles.

Stamper: I’ve made my piles. The words that are already covered, there’s a fine tradition at Merriam-Webster called “date-stamping.” When you get hired you get a date stamp with your name on it, and you stamp all of the citations you use for any particular book. And we can do that electronically, too. But, you know, let’s say - it’s easier to visualize this working on paper, so let’s say I’ve got all my piles that are already in. So I stamp them all as used, I file them in the box in a section that’s called “used,” and then I take all of my other piles for new meanings and I start looking through those. And I start reading them in a slightly different way, because I’m going to start reading them more for bibliographic information and more for the tone. I’m going to start noticing, oh, this word is only used in Wine Spectator. So, oh, OK, maybe not for a smaller dictionary like the Collegiate. We might put that in the Unabridged, but it just doesn’t have the widespread use.

Trying to determine when a word has made it in is this weird sort of subjective call, it just comes with a lot of time and doing it. So I’ll look at a word—let’s say that I’ve got a word that is used everywhere, from, like, Vibe to the Wall Street Journal. It’s got pretty—you know, the spread I’ve got is over the last ten years and there don’t seem to be many big clumps.

And the third criteria that a word must have—and it sounds ridiculous but we have to say it—is that a word has to have a meaning. Most words have meanings. Some words are just used as examples of long words and have no lexical meaning. I can’t write a definition for them. So one question I always get is, “Why isn’t “antidisestablishmentarianism” in the dictionary?” And I say, “Well, tell me what it means. Use it in a sentence.” People will say, “Well, it’s a long word.” I’ll say, “But that’s not a meaning.” If someone said, “He used antidisestablishmentarianisms in his essay to impress his teacher,” well, that means that it means “long word,” but it doesn’t actually—it’s not used often. I think I’ve only found three or four citations in all of our files where it’s used with any meaning whatsoever.

There are very few words that don’t have a meaning. Even filler words like “uh” and “um,” you can lexicalize those. You can say those have a lexical meaning and they have lexical weight.

Plotz: Give us an example of a word and tell us how you end up writing that definition.

Stamper: One word that immediately pops to mind is the word “take,” which I defined for our Collegiate dictionary sort of by error. Some words are pulled out for senior editors to do, and “take” was left in the general pile, and I signed it out. It took me a month to get through it. So if I have a new sense of “take,” for instance, the first thing I need to do as I’m evaluating things. Like I said, “OK, is this new use of ‘take’ really an extension of a current use? Can I just tweak a current definition and get it?”

So for instance, let’s say that I’ve got a bunch of citations for, “The witness takes the stand.” OK, and let’s say that that sense of “take” isn’t explicitly covered, that a witness coming up to, you know, the stand to be interrogated. So, I could say, all right, well, I do have a sense that means to assume or be awarded a position. Or, “He took the department chair position?” All right, OK, is that close enough to taking the stand? Well, no, there’s a slightly different connotation here.

Or I might look and say, are there any other uses of take that this actually falls into? Am I being blinded by the subject? Am I being blinded by the construction? And after a while, once you decide it’s entirely new, then comes sort of the mental crunch. Everyone does it differently. What I do is, I sit down with a legal pad and I use a pencil, and I start drafting definitions. So, “take” as in “take the stand,” I might start drafting a definition. OK, I know it’s a verb, so I have to write that definition as if it were a verb. The definition itself needs to substitute in for a verb, so it always is going to start with “to.” So, I will say, “take the stand,” that sense of “take” means “to be entered into court as a witness.” Well, no, I’m going to read some more citations, no, because you can be entered into court as a witness without actually physically taking the stand. “Take the stand” generally refers to being physically present at the court.

So I’ll cross that out. And I’ll write draft, after draft, after draft. And then when I get a good working draft—so let’s say that this sense of “take” means “to ascend to the witness—or, to ascend to the witness stand in a court in order to give evidence.” Let’s say that’s what that means. Then I take that draft definition and you weigh every single word. OK, ascend implies upward movement. Are all witness stands “up?” Maybe not. OK, so, I need to find a different word than “ascend.” So it really is like taking your brain and wringing it out, and you’re shaking loose all of the extra words that are not going to stick. And that’s the part of writing a definition that is brain-twisting exhausting.

In an ideal world there are no publishing deadlines and you could spend forever trying to hone these down. But you do the best you can. You sort of get at the shortest. You want concision and precision.

And then once that’s done I will—you never write the definition itself on the galley, and even on the electronic thing, we don’t actually physically insert the definition into the entry. You write it on a separate piece of paper or you put it in a separate file, because when I’m done with it, it then goes through three, four, five more levels of editing.

Someone’s going to read through the citations that I band together with that definition and say, “Oh, she got this wrong.” Or, “Oh, no, this is pretty good.” We might send it to someone on staff who has a law background so that they can say, “Oh, no, there’s actually another legal term that we should use here instead.” They are going to send it to the cross-reference people to make sure that every word I have used in that definition is also entered into the dictionary. Then we’re going to send it to somebody who’s going to make sure I haven’t made any typos.

So you spend all of this solitary time crafting a definition but it’s never actually “your” definition by the time it gets into a dictionary. It’s been touched by almost every other person on staff. We all have multiple roles when a book comes out.

But that’s how you do it. And then you write the definition on this little piece of paper or in the file, you add some example sentences if you feel it needs it, you can pull those from our citation files. You want to find an example sentence that is a typical use of the word, that doesn’t have any narrative interest. You don’t want people to get lost in the example sentence and not in the definition. And then when you’re done you stamp your name on it, you band that together with its group of citations, and you file that. And then you move on to the next new sense or new word.

You know, it seems like, lots of people think, “Oh, your job is reading all day, I would love that.” Or, “Your job is writing all day, I would love that.” And there is a creative element to the writing and the reading can be really enjoyable, but you’re doing it at a level that is so specialized and finely tuned that—I go home at night from defining and I am the most inarticulate slob. I need about 30 minutes to decompress before I can put a sentence together, because I’ve just used up all of my words. There are no more words left at the end of a day.

Plotz: You work at home, or maybe you work—you don’t work wherever central office is. Does it have to be solitary? Could it be a collaborative effort?

Stamper: No. I take that back. You can, and I have often, asked coworkers for help in helping me puzzle out parts of speech or saying, “OK, I’ve written this definition. Does this even make sense to you?” Or, “Does this example sentence fit well with this? Can you think of another more common use?” But it is primarily just you and the files. In fact, even in the main office—the office upstairs where all of the editors work at Merriam-Webster—is kind of this half-open office plan. We all have cubicles. There are a handful of editors who have their own offices, but not many. And until 1996, I believe, there was a formal rule of silence on the floor. There was to be no talking. No conversations could be had. We had two telephones for use and they were in private telephone booths on either end of the floor. If I wanted to—if I needed to talk to somebody, physically talk to them, I could go to their cubicle. I would drop down so that I was chair level with them, and I would talk in an undertone like this, and I would say, “Where do you think we’re going to go to lunch today?” “I don’t know.” And even sometimes that would be too loud. You could hear someone in the next cubicle, “Oh! Oh!” You’d go, “I’m sorry, I’m sorry!”

So, in those days if you wanted to communicate with your fellow editors you used the interoffice mail and you sent what we called pinks, which were pink index cards. We used them for notes in the files. If I found a typo, for instance, I would write it on a pink. I’d say, “I found a typo at XYZ entry.” You write a pink. It’s the editor’s initials or the book initials. The in-house nomenclature we use goes in the top right-hand corner, the word that you’re focusing on goes in the top left-hand corner, and then your note is central and you stamp it.

So if you are a social person, if you need social interaction every day, you will die in this job. It is very quiet. It’s very alone. And people who love no interaction thrive; they just love it. You could, if you really wanted to, go days or weeks without speaking to somebody.

Plotz: What is the metric by which you measure how good a definer you are?

Stamper: That’s hard. I think one of the metrics is—and this is kind of a weird metric—but particularly for general definers, the longer you’re there and the better you are, the more boring the words you get. So one of my colleagues who is a brilliant editor got the indefinite article “a,” and she found a new sense of “a” that had not been documented before. It was a sense that, when you would say, like, “A triumphant Mrs. Smith returned home from the campaign trail.” Well, that use of, “A triumphant Mrs. Smith” is different than, “A Mrs. Smith returned home from the campaign trail.” Previously those two had been lumped together, and she is a good enough reader and editor and she was like, no, those have very subtle but very different meanings.

There are a certain set of words that are very small that tend to be done by senior editors or older editors, because they require a lot of that kind of very careful, close reading and pulling apart. “Take” is one of those words. “Do,” “go,” “run,” “set,” “get.”

So essentially, the smaller the word the more difficult the word. Defining a word like “bodice-ripper” is really easy, actually. It might have one or two meanings, and it always has the same general form, and it’s always used in the same sort of way. But a word like “for” or “as,” it’s just in the air you breathe. It’s in everything that you read. And so, a good definer will be able to go through all of those citations—you know, if you want to be comprehensive and pull 50,000 citations from the database to read them you can—but they’re going to be very, very specific about making sure that, OK, well, this sense of “as” in “as it were” is slightly different than this sense of “as” in “as if,” so maybe I need to pull those apart.

When lexicographers get together—and we do occasionally and we do have sort of awkward conversation together—that’s sort of the measure of which. Like, “Whoa, this person worked on ‘run’!” Like, “Whoa! This person worked on ‘a’!”

One of the words I worked on for the unabridged was the word “god,” and that was terrifying. I mean, it took me four months I think to do the entry, and so much of it was, OK, well, how, you know, what do I lump together? What do I split apart? If I’m going to start splitting apart different meanings of “god,” can you ever really define “god?” “God” is a word that’s used so vaguely in most writing that it’s like, well, I can’t—I don’t know. Is God merciful all the time? Every use of this sense of “god?” I don’t know. Is God omnipotent, omnipresent? I don’t know. You have so much space that now you feel like, well, there’s no reason to not subdivide all of these things or to not list every single possible use of the word “take,” the verb, but at the same time there are plenty of words that once they get into the dictionary it is really difficult to remove them from a dictionary. You need to show that there’s no significant historical use in the canon and that there’s no current use for—it depends. I tend to look back 50 years. You’re always going to find some use in the last 50 years. So even though we’ve got lots of space, there is still this judgment call of, OK, this might still be emerging, this might still be regional.

Plotz: So is this the kind of job you can die in? You can do it for 60 years and then, you know, pass away while struggling with a new sense of “run?”

Stamper: This is absolutely a job you can die in. Yeah, I think we—at Merriam-Webster we have 40-ish editors and a good chunk of those are in what we call the 25 Club, which are people that have been there for 25 years and more. It’s an interesting job because, you know, when you think of an industry you might think of institutional memory, that some people are the institutional memory for something. And in lexicography it’s much more creative, in that it’s craft. The people have been doing it for 40 years and have honed their craft to a point where they’re sort of luminaries in lexicography. The field of dictionary publishers has contracted pretty significantly over the 20 years, you know, over 20 years or so. And so, you know, good editors, there’s always a call for good editors, and generally the longer you do it the better you get at it. I mean, I’ve been defining for 16 years and I really feel like it took me 10 years to feel like I had a grasp on it.