The Mathematics of Narcissism
The common statistical thread between psychiatric diagnosis and grad school rankings.
There will be no more narcissists or paranoids by 2013—at least not officially. The upcoming fifth revision of the Diagnostic and Statistical Manual of Mental Disorders, 10 years in the making, will exclude "narcissistic and paranoid personality disorder" from its list of designated psychiatric diagnoses, along with "histrionic, dependent, and schizoid personality disorder." (Psychopaths, take heart—you're back in the book, after being written out of the DSM-IV.) The reshuffle hasn't been embraced by everyone: Clinicians of the traditional school worry that existing knowledge about best treatment for narcissistic patients will be lost to history along with the diagnosis itself. John Gunderson, a psychiatrist at Harvard Medical School and the chair of the personality disorders group for the previous DSM, wrote that the new guidelines in DSM-V needed decades of research to become "scientifically credible or clinically useful."
Another long-gestating project—the National Research Council's ranking of U.S. graduate programs in 59 subject areas—wrapped up in 2010 as well. The ranking came in five years overdue, and its results, like the new DSM, roused widespread dissatisfaction. Stephen Stigler, a professor of statistics at the University of Chicago, described the methodological problems as so severe that the project was "doomed from the start."
The particulars of the two scientific disputes are too gnarled to detail here. In a mathematical sense, though, the controversies are much alike. They both rest on the tension between two fundamentally different strategies of data analysis: clustering and dimension reduction.
Suppose you've got a large collection of objects—say, graduate programs in mathematics, or psychiatric patients. For each object you have a large collection of measurements. For the graduate programs, you can assess the average time to degree or median publications per faculty member. For the patients, the measurements could be responses to diagnostic questionnaires or assessments by clinicians on various scales.
But when someone wants to know "what's wrong with my patient?" or "which graduate school should I go to?" a list of two dozen numbers isn't so helpful. You need a human-readable description of the object in question.
One way to do this is clustering—you look for divisions along which the objects of study cleave naturally into groups bearing common features. This is the DSM-IV approach to clinical diagnosis: Narcissists resemble other narcissists more than they resemble paranoids, or borderlines, or people without any personality disorder at all. Humans are born clusterers—we almost can't help doing it. Entities neither fish nor fowl discomfit us: Politicians are liberal or conservative, animals break up into phyla, pop songs are black metal or death metal or math rock or shoegaze or grime.
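To make the idea concrete, here is a minimal sketch of one common clustering method, k-means, which repeatedly assigns each object to its nearest group center and then moves each center to the middle of its group. The patient scores, the two scales, and the choice of two starting centers are all invented for illustration; nothing here is the DSM's actual procedure.

```python
from math import dist

def k_means(points, centroids, rounds=10):
    """Plain k-means: assign each point to its nearest centroid, then recenter."""
    for _ in range(rounds):
        groups = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)), key=lambda i: dist(p, centroids[i]))
            groups[nearest].append(p)
        # Move each centroid to the mean of its group (keep it if the group is empty).
        centroids = [
            tuple(sum(xs) / len(g) for xs in zip(*g)) if g else c
            for g, c in zip(groups, centroids)
        ]
    return groups, centroids

# Hypothetical (invented) scores on two questionnaire scales.
patients = [(1.0, 1.2), (0.8, 1.0), (1.1, 0.9), (5.0, 5.2), (5.3, 4.9), (4.8, 5.1)]
groups, centers = k_means(patients, centroids=[patients[0], patients[3]])
print(groups)
```

With data this tidy, the three low scorers land in one group and the three high scorers in the other. Real questionnaire data is rarely so cooperative, and the answer can depend on how many groups you ask for and where the centers start.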
Then there's dimension reduction: here, we try to boil down the many measurements to a few numbers that really matter. This is what we do when we distill all the aspects of a baseball player's performance into his batting average (or, nowadays, OPS and Wins Above Replacement). It's what the NRC was charged with doing: given all the data about graduate programs, put them in order from best to worst. And it's the way the DSM, in its latest edition, now proposes to reclassify personality disorders. Instead of being partitioned into groups, patients are now measured on six personality axes: negative emotionality, introversion, antagonism, disinhibition, compulsivity, and schizotypy. The patient previously known as "the narcissist" will now be a high scorer on four facets of "antagonism": callousness, manipulativeness, narcissism, and histrionism. In the new paradigm, there's no breakpoint where one personality disorder stops and another begins. There's such a thing as narcissism, but no such thing as a narcissist.
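One standard way to do this kind of boiling down is principal component analysis, which looks for the single direction along which the objects differ the most and scores each object by its position along that direction. The sketch below finds that direction by power iteration in plain Python. The player numbers are invented, and I'm not claiming this is how the DSM panel or the NRC did their work; it is only an illustration of the general technique.

```python
def first_principal_axis(rows, steps=200):
    """Return the direction of greatest variance, found by power iteration."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    # Sample covariance matrix of the measurements.
    cov = [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
           for i in range(d)]
    axis = [1.0] * d
    for _ in range(steps):
        # Multiply by the covariance matrix, then rescale to unit length.
        axis = [sum(cov[i][j] * axis[j] for j in range(d)) for i in range(d)]
        norm = sum(a * a for a in axis) ** 0.5
        axis = [a / norm for a in axis]
    return axis, centered

# Hypothetical (invented) data: batting average, on-base percentage, slugging.
players = [
    (0.250, 0.310, 0.400),
    (0.300, 0.380, 0.520),
    (0.270, 0.340, 0.450),
    (0.320, 0.400, 0.560),
]
axis, centered = first_principal_axis(players)
# One number per player: the position along the principal axis.
scores = [sum(c[j] * axis[j] for j in range(len(axis))) for c in centered]
print(scores)
```

Each player's three measurements collapse into a single score. In this toy data the three statistics rise and fall together, so little is lost; when the measurements pull in different directions, one number hides a lot.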
The tough part of dimension reduction is figuring out which few numbers to use. Getting to consensus on the six personality axes took years. And if that sounds hard, consider the charge given to the NRC, which had to capture the essential features of a graduate program with a single number. That's a tall order. Imagine if the DSM group had to devise a linear scale to rank patients from sanest to craziest!
In the end, the NRC group couldn't agree on a uniform ranking. Instead, they offered a range of possible metrics and an interactive tool where users can rank departments along various dimensions endorsed by the NRC or using their own homebrewed measures. This compromise, statistically principled though it may have been, satisfied nobody. After waiting 10 years for the rankings to come out, people wanted their department's standing to be more definitive than "between 6th and 16th, depending on which metric you choose." (And the complaints about the NRC rankings weren't purely methodological: many departments complained the dataset itself was hopelessly shot through with error.)
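It's easy to see why a department's standing can swing so widely. In the toy sketch below, three made-up departments are scored on two made-up metrics, each scaled from 0 to 1 so that higher is better, and ranked by a weighted sum. The department names, scores, and weights are all hypothetical and have nothing to do with the actual NRC data or formulas.

```python
# Hypothetical (invented) departments scored on two metrics, each scaled 0-1
# so that a higher number is better.
departments = {
    "A": {"publications": 0.9, "time_to_degree": 0.3},
    "B": {"publications": 0.6, "time_to_degree": 0.7},
    "C": {"publications": 0.4, "time_to_degree": 0.9},
}

def rank(weights):
    """Order departments from best to worst under one weighting of the metrics."""
    def score(name):
        return sum(weights[m] * departments[name][m] for m in weights)
    return sorted(departments, key=score, reverse=True)

research_heavy = rank({"publications": 0.8, "time_to_degree": 0.2})
student_heavy = rank({"publications": 0.2, "time_to_degree": 0.8})
print(research_heavy)  # ['A', 'B', 'C']
print(student_heavy)   # ['C', 'B', 'A']
```

Same departments, same data, opposite order. Whoever picks the weights picks the winner, which is exactly the kind of choice a ranking committee would rather not be seen making.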
So, what's the right way to tame your data? It depends on the objects. The researchers driving the changes in DSM take the view, which has some empirical support, that people really do vary continuously from normal to disordered. On that account, carving out various precincts of psychopathology is like divvying up people into "beanpoles," "fatties," and "pipsqueaks" instead of reporting their height and weight. Dimension reduction is the way to go.
The NRC, on the other hand, might have done better to toss the idea of rankings entirely and just cluster the departments into natural groupings. The statistician Leland Wilkinson ran a quick and dirty clustering on the NRC data for math departments. He found that the departments broke up into five clusters: 10 elite departments, a big group of 59 upper-tier departments, 47 lower-tier departments, and two smaller clusters whose meaning, if any, isn't clear to me. This is much coarser information than a full ranking, but it has the advantage of not depending on politically contentious choices as to which criteria matter most.
In the end, dimension reduction and clustering are going to have to coexist. We rely on continuous metrics to describe baseball players, but at the same time we form mental clusters around prototypes like the plodding slugger and the crafty slap hitter. We cluster our music collections into genres and our politicians into parties, but it can be just as illustrative to map bands and senators in two dimensions using continuous coordinates. So narcissists, and the therapists who treat them, can breathe easy—the notion of the narcissist was alive before the diagnosis broke into the DSM in 1980, and it will persist after the diagnosis is gone.
Jordan Ellenberg is a professor of mathematics at the University of Wisconsin. His book How Not To Be Wrong is forthcoming. He blogs at Quomodocumque.