Still, when it comes to solving tough computer programming problems, sometimes scientists discover a shortcut that allows them to make strides much faster than they assumed they could. A good example comes from another, closely related branch of the “natural language processing” field—the effort to create more accurate grammar checkers inside software like Microsoft Word. From the 1950s until the late 1990s, programmers believed good grammar checks would require a computer to truly “understand” English, so they tried to approximate the “listening” process babies and children undergo as they learn to speak, hand-coding vocabulary definitions and grammar rules into the computer. The problem was that computers turned out to be terrible at colloquialisms, metaphors, aphorisms, tone, clarity, and all the other non-rule-based features that make a language both lively and correct—and that native speakers intrinsically grasp.
So instead of teaching computers vocabulary and grammar, programmers tried scanning thousands of pages of text into the computers, and then used statistics to analyze the probabilities of various word and sound groupings. It worked. This innovation led not only to improved grammar checking and to the simple automated writing assessment we have today, but also to software like Google Translate, which, while imperfect, far outperforms previous generations of translation tools like AltaVista’s BabelFish.
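To see the flavor of this statistical approach, here is a minimal sketch (not any production grammar checker's actual method) of the core idea: count how often word pairs occur in a training corpus, then score new phrases by how probable their word sequences are. The function names and the toy corpus are illustrative inventions.

```python
from collections import Counter

def bigram_model(corpus):
    """Count word pairs in a training corpus and return a phrase-scoring function."""
    words = corpus.lower().split()
    unigrams = Counter(words)
    bigrams = Counter(zip(words, words[1:]))

    def score(phrase):
        """Average conditional probability P(next word | previous word) over the phrase."""
        ws = phrase.lower().split()
        pairs = list(zip(ws, ws[1:]))
        if not pairs:
            return 0.0
        probs = [bigrams[(a, b)] / unigrams[a] if unigrams[a] else 0.0
                 for a, b in pairs]
        return sum(probs) / len(probs)

    return score

score = bigram_model("the cat sat on the mat . the dog sat on the rug .")
# A word pair seen in the corpus scores higher than a reversed, ungrammatical one.
print(score("sat on") > score("on sat"))
```

A real system would train on millions of sentences and smooth the counts so unseen-but-valid pairs are not scored at zero, but the principle is the same: flag word sequences that are statistically improbable rather than ones that break a hand-coded rule.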
Could there be a similar innovation on the frontier of essay grading—one that would allow computers to more accurately score even sophisticated forms of writing? A paper by ETS’s Derrick Higgins and Beata Beigman Klebanov points to a potential path forward: using Web databases of human knowledge, like online encyclopedias and news repositories, to check how factual and intellectually sophisticated an essay truly is.
An experimental program called the Stanford Named Entity Recognizer can pick out proper nouns like “Chaucer” and “Albert Einstein” with 82 percent precision. Another program, called ReVerb, can recognize about one-third of the “facts” writers state, such as the century in which Chaucer lived (the 14th) and Einstein’s most famous scientific contribution (the Theory of Relativity). Since computers can already recognize phrases that hint at an argument—such as “caused by” and “led to”—it isn’t inconceivable that in coming years, a program will be able to search Web sources on a certain topic, and then use its findings to assess the plausibility of a writer’s assertions.
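The Stanford and ReVerb systems rely on trained statistical models, but a crude toy sketch conveys the two ideas in the paragraph above: spotting capitalized runs of words as candidate proper nouns, and spotting cue phrases that signal a causal argument. The function names, the example sentence, and the cue list are illustrative assumptions, not part of either real system.

```python
import re

# Illustrative cue list; real argument-mining systems learn such signals from data.
ARGUMENT_CUES = ["caused by", "led to", "resulted in", "because of"]

def find_candidates(text):
    """Crudely guess proper nouns as runs of capitalized words.
    (This also catches sentence-initial words like 'The' - a real
    named-entity recognizer uses a trained model to avoid that.)"""
    return re.findall(r"\b[A-Z][a-z]+(?:\s[A-Z][a-z]+)*\b", text)

def find_argument_cues(text):
    """Return the causal cue phrases that appear in the text."""
    low = text.lower()
    return [cue for cue in ARGUMENT_CUES if cue in low]

sentence = "The bombing of Hiroshima was partly caused by work that Albert Einstein set in motion."
print(find_candidates(sentence))     # includes 'Hiroshima' and 'Albert Einstein'
print(find_argument_cues(sentence))  # ['caused by']
```

Chaining the two steps together—find the entities, find the claimed relationship between them, then look both up in a Web knowledge base—is essentially the fact-checking pipeline the researchers envision.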
Currently, however, computers struggle with determining how trustworthy various Web sources are, and they can’t weigh or synthesize competing claims from good sources. The ETS researchers cite the example of a real-life grad school applicant who argued in an essay that “Albert Einstein’s accidental development of the atomic bomb has created a belligerent technological front.” Historians and scientists debate the nature of Einstein’s role in the development of the atomic bomb, and human graders could certainly argue endlessly about whether the writer’s use of the words “accidental” and “belligerent” is historically justified in this instance (or whether his deployment of the present perfect tense is grammatically sound).
Wes Bruce, Indiana’s chief assessment officer, has concluded that the technology is promising, but that it must improve before it can be used on exams that are both high-quality and high-stakes: the kind that not only test for knowledge, but also determine whether students graduate from high school, or whether teachers receive high “value-added” scores for raising student achievement. Artificial-intelligence scoring, he says, is “pretty artificial, not too intelligent.”
For now, at least.
This article arises from Future Tense, a collaboration among Arizona State University, the New America Foundation, and Slate. Future Tense explores the ways emerging technologies affect society, policy, and culture. To read more, visit the Future Tense blog and the Future Tense home page. You can also follow us on Twitter.