When the Lights Go Out

What can reality TV teach us about clinical drug trials?

May 13, 2009, 11:33 AM

From the combined 17 seasons of ABC’s The Bachelor and The Bachelorette—which regularly featured noisy declarations of on-air love—only a single marriage has emerged. Millions regularly tune in to Fox’s American Idol, but the show that produced bona fide superstars Kelly Clarkson and Carrie Underwood also crowned Taylor Hicks, who last year was dropped by his label, Arista. The Fortune 500 has yet to include a winner of NBC’s The Apprentice, and Anna Wintour has not featured an America’s Next Top Model victor on the cover of Vogue.

Though it may seem a stretch, the lessons of reality TV can help us understand why, for example, many parents recently were told—with the suddenness of Jason Mesnick’s dumping his fiancee Melissa Rycroft on the last Bachelor finale—that a large federal study contradicted its initial findings and concluded that drug treatment for attention deficit disorder had no benefit in children who were followed for six to eight years. These results put into question the widespread use of stimulants like Ritalin and Concerta, which were prescribed roughly 40 million times last year, and led to an acrimonious public debate among the study’s co-authors.

Paul Batalden, who co-founded the Institute for Healthcare Improvement, once observed that “every system is perfectly designed to get the results it gets.” Reality TV participants desperate for fame or love sing (and lip-sync) shortened, preselected songs or go on over-the-top dates in fabulous locations, followed by a team of camera operators. The ostensible goal, the one declared to the audience, is to identify the most talented recording artist or most compatible couple. But as Batalden might observe, the shows are instead designed to win over television viewers, which is an altogether different (and possibly incompatible) goal.

The parallels to a major clinical drug trial are uncanny. In the federal Multimodal Treatment Study, known as the MTA study, hundreds of kids with ADHD, whose families were desperate enough to enroll them in a randomized study, entered a well-funded and highly supervised National Institute of Mental Health program complete with specialized therapy, regular evaluation by developmental experts, and careful drug prescription. That setup is about as realistic as a date on The Bachelor. Within that very unusual, closely monitored environment, as reported in 1999, stimulant medications caused modest improvement after about a year. In response, use of these products surged nationwide, and Ritalin and its peers became household brands. But in March, the researchers described what happened after the lights went out. In their subsequent years in the real world, the drug-treated kids ultimately ended up no better off than the others.

Epidemiologists call this the problem of “surrogate endpoints,” and it’s no surprise to fans of reality television. Garnering the greatest number of text-messaging votes after a brief performance doesn’t always mean you’ll be a successful pop star; winning the final rose after an on-air courtship doesn’t mean you’ll have a happy marriage; and getting higher scores on a simple rating scale of attention-deficit symptoms doesn’t mean you’ll later succeed in school. In medicine, this problem happens all the time.

Few drug trials have the time or money to study the actual health outcomes that people care about, such as whether the middle-aged man avoids a heart attack after a few decades, the hyperactive first-grader holds down a good job someday, or the menopausal woman remains free from a hip fracture when she’s elderly. Waiting for these events would stifle any meaningful innovation, so doctors pick surrogate endpoints, which they hope serve as short-term checkpoints. Thus drug trials for the preceding examples may just decide to measure the middle-aged man’s cholesterol level, the youngster’s symptom checklist for hyperactivity, and the woman’s bone density with a DEXA scan.

Unfortunately, surrogate endpoints can be problematic for two key reasons. First, their relation to the actual outcome of interest may be weak or nonexistent (just as impressing Donald Trump may fail to predict later business success); second, the intervention can improve the surrogate outcome but have bad side effects (just as the nonstop fancy dates that titillate viewers later lead to the couple’s inevitable romantic disappointment).

Consider high serum cholesterol levels, which are a surrogate marker for later heart attacks. In the 1970s, a drug called clofibrate was discovered to markedly reduce cholesterol levels; unfortunately, a World Health Organization trial found it increased heart attacks. In 2002, the Women’s Health Initiative trial showed hormone replacement caused blood clots and increased heart attacks by 29 percent, though it also made cholesterol numbers better. In 2006, Pfizer abruptly pulled the plug on its drug torcetrapib, which had cost roughly $1 billion to develop and had great effects on cholesterol levels—but also caused more heart attacks. Keep in mind that cholesterol levels, like blood pressure or levels of tumor markers in the blood, are at least considered “validated” markers.

In part, the increased focus on surrogate markers instead of hard outcomes came about in the 1990s, when the U.S. Food and Drug Administration was pressured to get HIV drugs to market rapidly for desperate patients, at the cost of relying on even “unvalidated” surrogate endpoints (like CD4 cell counts) that at the time were “reasonably likely” to predict benefit. For all the turkeys like clofibrate or torcetrapib, there are occasional life-saving breakthroughs like protease inhibitors for HIV, whose approval may depend on a dizzying array of complicated surrogate endpoints.

How, then, can we encourage approval of drugs like protease inhibitors and cut the number of failures like clofibrate? The truth is that you can’t, just like you can’t easily figure out how to guarantee a happy marriage or find a surefire pop icon. You do the best with the information you have, then wait and see. In his terrific 2004 book Powerful Medicines, Jerry Avorn proposed a two-phased drug evaluation process in which “the initial FDA approval of a drug should be seen as the beginning of an intensive period of assessment, not the end.” No doctor, he writes, would ever start a patient on a new medicine without scheduling any kind of follow-up, but that’s exactly what the U.S. health care system now does.

The other way to help, of course, is to maintain the special environment that existed during the drug trial. During the yearlong intervention phase of the MTA study, kids got ideal therapy, medication oversight, and personal attention from their doctors, but once the study ended, many families simply bagged the drugs and therapy that may have been helping. So it also comes as no surprise they weren’t better eight years later. Improving so-called compliance with treatment is a huge challenge (half of patients don’t take their medicines) and may mitigate some problems with surrogate endpoints.

In the end, prescribing expensive and potentially dangerous drugs isn’t exactly like a reality television show. But perhaps we should at least be equally skeptical of their outcomes until some time has gone by.