The New York Times this weekend had part one of an in-depth investigation into the environmental impacts of the Internet, and the findings are not pretty. The data centers that store and process everything from your old emails and Facebook data to tweets, Google searches, and e-commerce transactions suck up 2 percent of the nation's entire electricity supply. That's actually more than the notoriously energy-intensive paper industry, belying the myth that a shift to digital information is necessarily better for the planet.
Worse, upwards of 90 percent of that energy is simply wasted. A McKinsey & Company report found that the average data center uses just 6 to 12 percent of its electricity for actual computation. The rest, the Times reports, "was essentially used to keep servers idling and ready in case of a surge in activity that could slow or crash their operations." In short, online companies keep their facilities running "at maximum capacity around the clock, whatever the demand," because they fear what could happen if their sites ever went down. The Times adds:
To guard against a power failure, they further rely on banks of generators that emit diesel exhaust. The pollution from data centers has increasingly been cited by the authorities for violating clean air regulations, documents show. In Silicon Valley, many data centers appear on the state government’s Toxic Air Contaminant Inventory, a roster of the area’s top stationary diesel polluters.
It's not surprising, then, that the Times found the entire industry shrouded in secrecy. Its investigation required a year's worth of reporting, with thousands of public-records requests and hundreds of interviews.
What can we do about it? Some forms of computing are essential, of course. Servers that help govern transportation infrastructure or national defense systems have a compelling need for deep redundancy. But many of the companies the Times cites are providing what amounts to an inessential service to everyday consumers. Yahoo, for instance, spends huge amounts of energy storing data from people's old fantasy football leagues.
One obvious answer is that these companies could adopt some of the energy-efficiency strategies identified in the article, such as shutting down servers overnight when they aren't needed. Ironically, some of the organizations with the most essential data seem to waste less energy, perhaps because they take a more active role in monitoring their usage. The Times reports that the National Energy Research Scientific Computing Center at Lawrence Berkeley National Laboratory utilizes 95 percent of its capacity by carefully scheduling and managing data-intensive jobs.
Yet it's hard to fully blame Internet companies for excessive redundancy when we in the tech press, and the public at large, are so quick to ridicule any site that goes down, even for a few hours. I'm guilty of making fun of Twitter for its (relatively) frequent outages—which, let's face it, don't exactly bring the global economy to its knees. If we came to understand that occasional outages to nonessential sites are not the end of the world, perhaps companies would feel safer turning off at least some of their machines when they're not in use.
Meanwhile, do we really need Gmail encouraging us to archive all of our old emails rather than delete them? How many servers must be running at this very moment just to accommodate 5MB photo attachments from four years ago that we'll never again look at in our lives? How many of us recycle soda cans and turn off the lights when we leave a room but get outraged when we can't instantaneously load a website or access every bit of data we've ever stored online?
Internet storage today is like a fleet of empty buses that stay running all night just in case millions of people suddenly decide they need a ride at 3 a.m. And instead of retiring buses when they grow too old to be of use, we keep them running too, just because. No way is this sustainable in the long term.
Update, 12:06 p.m.: The Times article has prompted a couple of smart, well-informed rebuttals from bloggers pointing out a number of oversimplifications and misleading generalizations in the article. I don't think any of them overturn the main point of the piece, which is that "the cloud" is not some magical ether, but rather a network of big, power-hungry, polluting, and often wasteful physical data warehouses that store a lot of stuff we need but also tons of stuff we don't need. That may be obvious to those in the tech industry, but for much of the general public—a majority of which apparently thinks cloud computing has something to do with the weather—it's a point worth hammering home.
That said, a number of points raised in a concise piece by Dan Woods at Forbes and a more in-depth post by Diego Doval are salient enough that anyone who reads the Times piece should consider them as well. Woods notes that while the article cites companies like Amazon, Facebook, and Google, many of the numbers it cites actually come from "IT data centers, not from the state-of-the-art-data centers run by the Internet companies." That's a good point, though it's worth noting the reason for this omission: Amazon, Google, and Facebook refuse to disclose that information.
Both bloggers also make the point that redundancies and excess capacity are especially necessary for services for which demand spikes unpredictably. (You could also make the case that 2 percent of our electricity supply is not a huge price to pay for all the services the Internet provides.) Finally, Doval punctures the Times' implication that there are silver bullets that companies are ignoring out of fear or carelessness:
There are no “readily available” solutions to this problem, because there isn’t just one problem to solve. There’s multiple overlapping challenges that are all different and sometimes contradictory factors (e.g. exchange utilization for failover capabilities), and no one is “reluctant to make wholesale change” — these are huge, complex systems that can’t be just replaced for something else.
All true, as is Woods' point that companies do in fact face cost pressure to reduce their energy usage, though I would argue that, as with many other forms of energy usage, the cost to the companies doesn't capture the full cost to the environment. For that, I suppose we have the nation's energy policies to blame, more than any given company or industry.