Thanks to the cloud, websites and apps around the world can tap into vast, remote stores of data and computing power.
And thanks to the cloud, one good blow to one of those vast, remote storage centers can take down websites and apps around the world.
That’s what happened this past weekend. A ferocious lightning storm in Northern Virginia took down Netflix, Instagram, Pinterest, Heroku, and more—not because any of those companies are based in Northern Virginia, but because they all apparently rely heavily on Amazon’s Elastic Compute Cloud facility there. Amazon said the storm, for reasons not immediately explained, took out both its main power supply and its backup generator.
The outage brought to mind a similar incident a year ago, in which a failure at the same Amazon facility felled Reddit, Quora, and several other sites.
Does the ability of one local weather pattern to affect web users around the globe point to a fundamental flaw in the cloud? Have we entered a world in which Internet users in Palo Alto, Johannesburg, and Taipei must watch the weather report for Northern Virginia?
Not necessarily, but it does highlight the reality that the cloud is as much a physical system as it is a virtual one. And as with most physical systems, redundancy is essential to reliability.
Amazon already has U.S. facilities in Oregon and Northern California in addition to Northern Virginia, and it encourages major customers to take advantage of its Elastic Load Balancing service, which is supposed to shift traffic from one cloud center to another to keep things running smoothly. In the past, some of the companies affected by outages have admitted that they didn’t sign up for this option, which costs extra. By now, everyone whose business depends on reliability should know that it’s worth the price.
But in this case, it seems that some of the companies affected were in fact using Elastic Load Balancing—and it failed too.
If that’s true, this latest outage could spur a different kind of diversification in the cloud. Namely, companies might take a harder look at using providers other than Amazon, the industry leader. Rackspace and Microsoft’s Windows Azure have been its two closest competitors. But in a timely development, Google announced on Thursday, the day before the outages began, that it is poised to enter the field as well.
The increase in competition will put pressure on Amazon to beef up its reliability. If it doesn’t respond, its customers will.