One of the most recent software errors with widely noticed consequences was Google’s Gmail outage in February. According to Google, the cause was a bug in the software that distributes load between its data centers.
The Gmail outage only resulted in people not having access to their email for a few hours. No one got killed. Nothing exploded. It was an inconvenience, and while it was a significant inconvenience for some of Gmail’s users, it was still just that: an inconvenience.
This article is about some of the more dire consequences of software errors through the years. Incidents that make the Gmail outage seem rather trivial.
WW3, almost…
In 1980, NORAD reported that the US was under missile attack. The problem was caused by a faulty circuit, a possibility the reporting software hadn’t taken into account.
In 1983, a Soviet satellite reported incoming US missiles, but the officer in charge followed his gut feeling that it was a false alarm and chose not to act on it.
Luckily these false reports were never acted upon. Had a counterstrike been launched in either case, it would have meant full-blown nuclear war, and the world would be a very different place today.
Undetected hole in the ozone layer
The hole in the ozone layer over Antarctica remained undetected for a long period of time because the data analysis software used by NASA in its project to map the ozone layer had been designed to ignore values that deviated greatly from expected measurements.
The project had been launched in 1978, but it wasn’t until 1985 that the hole was discovered, and not by NASA. NASA didn’t find the error until they reviewed their data, which indeed showed that there was a big hole in the ozone layer.
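To illustrate the general pitfall, here is a minimal Python sketch of a filter that sets outliers aside for review instead of silently dropping them. This is not NASA's software: the function names, the threshold and the readings are all hypothetical, made up for the example.

```python
def split_outliers(readings, expected, tolerance):
    """Separate readings into accepted values and flagged outliers.

    Nothing is thrown away: values outside the tolerance band are kept
    in a second list so that someone can look at them later.
    """
    accepted, flagged = [], []
    for value in readings:
        if abs(value - expected) <= tolerance:
            accepted.append(value)
        else:
            flagged.append(value)
    return accepted, flagged


def needs_review(readings, flagged, max_fraction=0.2):
    """Return True when too many readings were flagged to be mere noise."""
    if not readings:
        return False
    return len(flagged) / len(readings) > max_fraction


# Hypothetical measurements; the numbers are invented for illustration.
readings = [310, 305, 180, 175, 300, 170]
accepted, flagged = split_outliers(readings, expected=300, tolerance=50)

if needs_review(readings, flagged):
    print("Many readings deviate from expectations, review:", flagged)
```

The point of the design is that a surprising value is treated as a question to be answered, not as noise to be deleted.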
Deadly radiation therapy
The Therac-25 medical radiation therapy device was involved in several cases where massive overdoses of radiation were administered to patients in 1985-87, a side effect of the buggy software powering the device. A number of patients received up to 100 times the intended dose, and at least three of them died as a direct result of the radiation overdose.
Another radiation dosage error happened in Panama City in 2000, where therapy planning software from US company Multidata delivered different doses depending on the order in which data was entered. This resulted in massive overdoses for some patients, and at least five died. The number of deaths could be much higher, but it is difficult to know how many of the 21 patients who died in the following years did so because of their cancer and how many because of the radiation treatment.
Rocket launch errors
In 1996, a European Ariane 5 rocket was set to deliver a payload of satellites into Earth orbit, but problems with the software caused the rocket to veer off its path a mere 37 seconds after launch. As it started disintegrating, it self-destructed (a safety measure). The problem was the result of code reused from the launch system’s predecessor, Ariane 4, which had very different flight conditions from Ariane 5. More than $370 million was lost due to this error.
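The published inquiry into the failure traced it to an unprotected conversion of a 64-bit floating-point value into a 16-bit signed integer, in code that had been safe within Ariane 4's flight envelope. The sketch below shows that class of bug in Python with two defensive alternatives. It is only an illustration: the function names and values are hypothetical, and the real flight software was not written in Python.

```python
INT16_MIN, INT16_MAX = -32768, 32767


def to_int16_checked(value: float) -> int:
    """Convert to a 16-bit signed integer, failing loudly when out of range."""
    result = int(value)  # truncates toward zero
    if not INT16_MIN <= result <= INT16_MAX:
        raise OverflowError(f"{value} does not fit in a 16-bit signed integer")
    return result


def to_int16_saturating(value: float) -> int:
    """Convert to a 16-bit signed integer, clamping to the nearest limit."""
    return max(INT16_MIN, min(INT16_MAX, int(value)))


# A value that fits converts normally.
small = to_int16_checked(1234.7)

# A value outside the old range is clamped instead of crashing the system.
clamped = to_int16_saturating(50000.0)
```

Which alternative is right depends on the system: raising an error is only safe if something sensible handles it, while clamping keeps running with a value that is known to be wrong.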
Flight crashes
In 1994 in Scotland, a Chinook helicopter crashed, killing all 29 people on board. While the pilots were initially blamed for the crash, that decision was later overturned since there was evidence that a systems error had been the actual cause.
Another example of a software-induced flight crash happened in 1993, when an error in the flight-control software for the Swedish JAS 39 Gripen fighter aircraft was behind a widely publicized crash in Sweden.
Lost in space
One of the subcontractors NASA used when building its Mars Climate Orbiter had used English units instead of the intended metric units, which caused the orbiter’s thrusters to work incorrectly. Due to this bug, the orbiter crashed almost immediately when it arrived at Mars in 1999. The cost of the project was $327 million, not to mention the lost time (it took almost a year for the orbiter to reach Mars).
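One common defense against this kind of mix-up is to never pass bare numbers between components, but values that carry their unit. Here is a minimal Python sketch of the idea, assuming the quantity in question is thruster impulse expressed in either pound-force seconds or newton-seconds. The class and function names are hypothetical and have nothing to do with the actual mission software.

```python
from dataclasses import dataclass

# One pound-force expressed in newtons.
LBF_TO_NEWTON = 4.4482216152605


@dataclass(frozen=True)
class Impulse:
    """An impulse stored internally in newton-seconds (metric)."""
    newton_seconds: float

    @classmethod
    def from_newton_seconds(cls, value: float) -> "Impulse":
        return cls(float(value))

    @classmethod
    def from_pound_force_seconds(cls, value: float) -> "Impulse":
        # The conversion happens once, at the boundary, and is explicit.
        return cls(float(value) * LBF_TO_NEWTON)


def apply_thruster_impulse(impulse: Impulse) -> float:
    """Hypothetical consumer that refuses unit-less numbers."""
    if not isinstance(impulse, Impulse):
        raise TypeError("expected an Impulse, not a bare number")
    return impulse.newton_seconds


# The supplier works in English units, the consumer in metric.
reported = Impulse.from_pound_force_seconds(10.0)
metric_value = apply_thruster_impulse(reported)
```

With bare floats, a value of 10.0 looks the same in both unit systems even though it is off by a factor of roughly 4.45. Wrapping the value forces whoever creates it to say which unit they mean.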
An explosion seen from space
In another Cold War escapade, the CIA allegedly managed to slip the Russians faulty control software to be used for a major gas pipeline (the KGB was to steal the software from a Canadian company, but the CIA had been tipped off). The planted bug eventually caused a huge Siberian gas pipeline explosion in 1982. It was “the most monumental non-nuclear explosion and fire ever seen from space” (observed from US satellites).
OK, that last example was an intentionally planted software bug, but it was such an extreme consequence of faulty software that we couldn’t help but include it. “An explosion seen from space.” Wow…
More careful testing would save money (and lives)
We have mentioned just 10 examples in this article, but the truth is that these are just the tip of a very large iceberg. Every year, software errors cause a massive number of problems all over the world.
And bugs are expensive, too. A 2002 study commissioned by the National Institute of Standards and Technology found that software bugs cost the US economy $59.5 billion every year (imagine the global costs…). The study estimated that more than a third of that amount, $22.2 billion, could be eliminated by improved testing.
If you have any doubts as to how common software bugs are, just do a news search for “software bug” or “software error”…
Some bugs may cause only trivial problems, but flight control software and software for medical equipment are examples of things that simply cannot be allowed to fail due to programming errors.
We’re happy designing software for website monitoring, thank you very much. If someone could die if we made a mistake, everyone here at Pingdom would sooner or later have an ulcer worrying about it.
On a slightly lighter note, you might want to check out another “bug post” we made a while back:
Blue Screen of Death in unexpected locations
Image from the game DEFCON by Introversion Software.