Sports Nut

Passing the Chemistry Test

Will pro sports teams ever figure out how to quantify how well teammates get along?

David Ross, right, congratulates Jon Lester for pitching a complete game for the win against the Los Angeles Dodgers on June 1 in Chicago.

Jon Durr/Getty Images

What makes a group of athletes greater than the sum of its parts? Is it the knowing glance that New England Patriots quarterback Tom Brady exchanges with Rob Gronkowski when he looks down the line of scrimmage? Is it the fire that the Chicago Cubs’ Jon Lester mustered after his personal catcher David Ross trotted out to the mound to dispense some wisdom in a tense sixth inning?

Team chemistry is the most elusive factor in sports—the “holy grail of performance analytics,” according to Harvard Business Review. It’s only logical that certain teams get along better than others, but how important are these relationships, and can teams optimize them?

The fact that the sports world’s intangibles seem, by definition, immeasurable makes them an irresistible challenge for researchers who’ve figured out how to quantify so much of what happens on the field of play. Neuroscientists have claimed to measure chemistry through the synchronized heartbeats of teammates. Other researchers have examined the correlation between high fives and wins.

The rewards for solving the chemistry riddle are high, in part because maximizing chemistry would come at almost zero cost. If a team could determine that a player would contribute more of a winning attitude than another guy with a similar statistical output, they’d get that chemistry boost for free—at least until other teams figured out how to quantify that extra boon to team spirit.

Professional sports franchises are still a long way away from figuring out how to maximize their players’ ability to work together. Sam Miller, who wrote a feature on team chemistry for ESPN the Magazine in 2013, told me that “it’s not like you have 25 guys, therefore you have 25 relationships. You have 25 guys, therefore you have probably billions of relationships.” And Russell Carleton, who has written about the quantification of chemistry for Baseball Prospectus, says major-league clubs haven’t yet come close to “understanding a baseball team as its own little culture.” The economics of baseball ensure that in-house analytics gurus focus more on a player’s hard statistics than something as squirrelly as “clubhouse presence.” At least for now, every team would be advised to build its roster based on wins above replacement rather than, say, the alleged 10 wins’ worth of value that pitcher Brandon McCarthy claimed his teammate Brandon Inge contributed off the field.

In reality, we’re not even particularly close to developing a consensus understanding of what the term chemistry means. Analysts and academics have mountains of player performance data, but these on-field metrics can only carry their research so far. Baseball players spend more time in the relative privacy of locker rooms, dugouts, bullpens, airplanes, and hotel rooms than they do on the field. The limited access researchers have to these spaces means they’re lacking a vital source of quantifiable data. With limited inputs to calculate chemistry, statisticians have to get creative to find something measurable. But what they end up measuring might not actually be chemistry.

Take the work of Katerina Bezrukova, a professor at the University at Buffalo School of Management who has worked with Major League Baseball and the National Basketball Association to shed light on chemistry’s role in team performance.* Her research focuses on the demographic “fault lines” in sports, intrateam divisions that develop from differences in teammates’ racial, ethnic, and economic backgrounds. She claims that teams must strike an optimal balance between diversity and homogeneity and that teams that fall too far on either side of the golden mean win fewer games. In an MLB season, she finds, chemistry is worth about three wins.

Although demographic factors may have some say in how a team gets along, Bezrukova’s research pays little mind to players’ individual personalities. That’s a far more difficult element to harness, but without it you end up with a circular definition of chemistry. Bezrukova has found something to measure. It’s just unclear what that something is.

A paper presented at this year’s MIT Sloan Sports Analytics Conference leans on a similar crutch. “In Search of David Ross,” named for the backup catcher and spiritual leader of the 2016 World Series champion Cubs, takes a stab at quantifying “the indirect impact that an individual player can have on team wins through making their teammates better.” The authors do some messy math to get there, employing a regression model on FanGraphs’ wins above replacement statistic. There is on average a 20 percent variance, they report, between a team’s actual win total and the cumulative WAR of all the players on that team. They attribute half of that 20 percent gap to what they call chemistry.
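To make the leftover-wins logic concrete, here is a minimal sketch in Python. It is not the authors' regression model: the team names and numbers are invented, the 48-win replacement baseline is an assumption, and the function names are hypothetical. It only shows the general shape of the idea, which is to add up what the roster's WAR predicts, compare that with actual wins, and label a share of whatever is left over as chemistry.

```python
# Hypothetical illustration only: team names and numbers are invented,
# not taken from the paper or from FanGraphs.

# Assumed number of wins for a roster made up entirely of
# replacement-level players over a full season.
REPLACEMENT_WINS = 48


def unexplained_wins(actual_wins, team_war, baseline=REPLACEMENT_WINS):
    """Wins left over after adding the roster's cumulative WAR to the baseline."""
    return actual_wins - (baseline + team_war)


def chemistry_share(gap, share=0.5):
    """Portion of the gap labeled chemistry (0.5 echoes the split described above)."""
    return gap * share


teams = {
    "Team A": {"wins": 95, "war": 42.0},
    "Team B": {"wins": 78, "war": 35.0},
}

results = {}
for name, team in teams.items():
    gap = unexplained_wins(team["wins"], team["war"])
    results[name] = (gap, chemistry_share(gap))
    print(f"{name}: gap {gap:+.1f} wins, chemistry share {results[name][1]:+.1f}")
```

Note that nothing in the calculation says what the gap is made of. Any force that pushes wins above or below the WAR total lands in the same number, whatever name it is given afterward.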

There are plenty of problems with this approach. Carleton and Miller both say such a model, which points to a negative space in the calculation of team performance and works backward to fill it in, risks sweeping a lot of unrelated stuff into the chemistry bucket. Miller points out that analysts have traditionally attributed discrepancies between team wins and cumulative WAR to a team’s relative “clutchness”—that is, random (well, probably random) fluctuations in how similarly skilled players perform in crucial moments throughout the season. Carleton says his concern is that the paper bundles on-field “interaction effects” into chemistry. His example: If shortstop A plays for a team whose pitching staff produces a lot of ground balls, he may have an inflated WAR compared with shortstop B, whose pitching staff generates a lot of fly balls. Shortstop B produces less value for his team because he’s spending a lot of time twiddling his thumbs, but that doesn’t necessarily mean he has bad relationships with his teammates or even that he is worse at baseball.

When I brought this up with the authors of the David Ross paper, they said their methods accommodate exactly this sort of scenario. That shortstop who is fielding a lot of infield ground balls? They argue he has good chemistry with his pitchers.

The issue here, then, isn’t that the authors are bad at math. It’s that their version of chemistry—essentially, anything that makes teams better than the players’ individual characteristics might suggest—is not what most of us would call chemistry.

The authors of the paper—a pair of economists at the Chicago Federal Reserve and a professor at the Indiana University Kelley School of Business—found a creative workaround given their lack of access to baseball clubhouses, using publicly available player performance data to take aim at an abstract target. If we wanted to measure chemistry for real, pro baseball would need to function as a laboratory first and a competitive arena second. In this fantasyland, statisticians would have unrestricted access to clubhouse social scenes. They could track what players talked about behind closed doors and how long those conversations lasted. They could also randomize trades, testing out different players in different circumstances. Carleton argues that measuring chemistry wouldn’t even be that hard in a world like this one. But sadly for researchers (and happily for players), that level of omniscience and omnipotence isn’t in the offing, at least in this century.

In the present day, MLB teams use personality exams that have little more validity than a Myers-Briggs test. But more advanced analytics may find their way into front offices soon. Bezrukova has presented her research to general managers, and Carleton also confirmed to me that in-house analysts from various teams are working on measuring chemistry. But even small breakthroughs will be hard to come by when no one knows what to look for. Until we reach a consensus view of what chemistry means, we’ll all just be guessing whether David Ross’ paternal drawl instilled just a touch more confidence in Jon Lester, and how much it matters if it did.

Correction, May 2, 2017: This piece originally misstated that Katerina Bezrukova is a psychology professor at the University of Buffalo. She is a professor at the University at Buffalo School of Management.