If you missed that Google had a 2.5-hour Gmail outage yesterday, you were probably hiding under a rock, or possibly in one of those sensory deprivation chambers. Every major tech blog and news outlet was on it (not to mention Twitter users).
It was night-time in the US, which limited the impact there, but the rest of the world wasn’t so lucky. For example, in Europe the outage started at 9:30 in the morning.
Loss of productivity costs… how much?
A lot of companies use Gmail, and while for some the temporary lack of email service may not have had a big impact on business and productivity, for others it must have been a real problem. What could an outage like this actually cost Gmail end users?
Google has put a number on it: It was worth 15 days of free service (if you’re a paying customer, that is).
Considering each license costs $50 per year, 15 days of free service comes to the equivalent of $2.05 per license. Still, Google is being generous. If they had followed their SLA strictly, it would only have come to 41 cents… (Pointed out over at GigaOM.)
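For anyone who wants to check the math, here is a small sketch of the proration. It assumes the $50 is an annual price spread evenly over a 365-day year, which is what the $2.05 figure implies. The function name `prorated_credit` is ours, and the 3-day figure is only what the 41 cents works out to under that assumption, not something quoted from the SLA.

```python
def prorated_credit(annual_price, credit_days, days_in_year=365):
    """Dollar value of `credit_days` of free service on a yearly license.

    Assumes the license price is spread evenly over the year.
    """
    return annual_price * credit_days / days_in_year


LICENSE_PRICE = 50.0  # dollars per license, assumed to be per year

# What Google actually offered: 15 days of free service.
print(f"{prorated_credit(LICENSE_PRICE, 15):.2f}")  # 2.05

# Working backwards, 41 cents matches roughly 3 days of service.
print(f"{prorated_credit(LICENSE_PRICE, 3):.2f}")   # 0.41
```

In other words, the 15-day credit is about five times what the 41-cent figure would have been.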
A few lessons learned from the Gmail outage
The Gmail outage yesterday showed us a few interesting things.
- Email is far from dead. You only notice how much you use it once it’s gone.
- If a service as widely used as Gmail goes down for a decent amount of time, the blogosphere and press will collectively go gaga.
- If Google is involved, the effect in the previous point is multiplied by 2.
- Not even Google is immune to long outages.
- Murphy’s Law is alive and well.
Google’s explanation of what happened
We brought up Murphy’s Law, and that pretty much sums up Google’s explanation of what caused the Gmail outage:
Lots of folks are asking what happened, so we thought you’d like an explanation. This morning, there was a routine maintenance event in one of our European data centers. This typically causes no disruption because accounts are simply served out of another data center.
Unexpected side effects of some new code that tries to keep data geographically close to its owner caused another data center in Europe to become overloaded, and that caused cascading problems from one data center to another. It took us about an hour to get it all back under control.
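To make that cascade a bit more concrete, here is a toy simulation. It is purely illustrative: the data center names, capacities, and loads are made up, and the logic is a rough guess at the general failure pattern, not a model of how Google actually routes traffic. It compares sending all displaced load to the nearest surviving data center (a crude stand-in for "keep data close to its owner") with spreading it evenly.

```python
def simulate(capacity, load, offline, keep_local):
    """Return the data centers that end up down, in order.

    `capacity` and `load` map data center name -> units of traffic.
    The key order of `capacity` is assumed to reflect proximity, so the
    first surviving key counts as the "nearest" data center.
    """
    load = dict(load)      # work on a copy
    down = []
    to_shut = [offline]    # data centers waiting to go down
    while to_shut:
        dc = to_shut.pop(0)
        down.append(dc)
        displaced = load.pop(dc)
        up = [d for d in capacity if d in load]
        if not up:
            break          # nothing left to absorb the traffic
        # Locality-first: everything goes to the nearest survivor.
        targets = up[:1] if keep_local else up
        for d in targets:
            load[d] += displaced / len(targets)
            if load[d] > capacity[d] and d not in to_shut:
                to_shut.append(d)  # overloaded, so it fails next
    return down


# Hypothetical numbers, chosen only to show the effect.
capacity = {"eu-1": 100, "eu-2": 100, "us-1": 100}
load = {"eu-1": 70, "eu-2": 60, "us-1": 50}

print(simulate(capacity, load, "eu-1", keep_local=True))
# ['eu-1', 'eu-2', 'us-1']
print(simulate(capacity, load, "eu-1", keep_local=False))
# ['eu-1']
```

With these invented numbers, routine maintenance on one data center takes the others down with it when load is kept local, while an even spread absorbs the extra traffic without trouble.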
There are simply so many factors involved in providing a large and complex service that eventually something is bound to go wrong. And clearly, no one is immune to this. Not even Google.