Lightning in Belgium Disrupts Google Cloud Services

Not even the mighty Google data centers are immune from acts of God, it turns out. A series of successive lightning strikes in Belgium last Thursday knocked some cloud storage systems offline briefly, causing errors for some users of Google’s cloud infrastructure services.

The lightning hit the electrical systems of one of three Google data centers in St. Ghislain, a small town about 50 miles southwest of Brussels. The data center hosts the europe-west1-b zone of Google Compute Engine, which experienced issues as a result.

Besides failover systems that switch to auxiliary power when the primary power source goes offline, servers in Google data centers have on-board batteries for extra backup. That was the case with the servers supporting Persistent Disk, the cloud storage that acts like Network Attached Storage, meaning storage that’s independent of compute. But some of the servers failed anyway because of “extended or repeated battery drain,” according to the company’s incident report.

“In almost all cases the data was successfully committed to stable storage, although manual intervention was required in order to restore the systems to their normal serving state,” the report read.

Google engineers estimated that about five percent of persistent disks in the zone saw at least one I/O read or write failure over the roughly five days during which the problems appeared. A tiny fraction of the persistent-disk space in the zone lost some data permanently: 0.000001 percent, according to Google.

The company’s infrastructure teams are in the process of replacing storage systems with hardware that’s more resilient against power failure, and most Persistent Disk storage is already running on the new hardware, Google said.

In a piece of advice cloud service providers commonly offer following outages, Google reminded users that it has multiple cloud regions around the world and multiple isolated zones within each region precisely so that users can set up resilient infrastructure that can fail over from one zone to another in case of a single-zone outage.

Google Compute Engine has three regions: Central US in Council Bluffs, Iowa; Western Europe in St. Ghislain; and East Asia in Changhua County, Taiwan. There are four zones in the Central US region and three each in Western Europe and East Asia.
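
The zone-to-zone failover Google describes can be illustrated with a minimal sketch. Everything here is an assumption for illustration: the function `pick_healthy_zone`, the simulated health check, and the two zone names other than europe-west1-b are hypothetical and are not taken from Google's report or from any Google API. A real deployment would rely on actual health probes and load balancing rather than this toy selection logic.

```python
from typing import Callable, Optional, Sequence


def pick_healthy_zone(
    zones: Sequence[str], is_healthy: Callable[[str], bool]
) -> Optional[str]:
    """Return the first zone, in preference order, whose health check passes."""
    for zone in zones:
        if is_healthy(zone):
            return zone
    # No zone passed its check, so there is nowhere to fail over to.
    return None


# Preference-ordered zones in one region. Only europe-west1-b is named in the
# article; the other two names are illustrative placeholders.
ZONES = ["europe-west1-b", "europe-west1-c", "europe-west1-d"]


def simulated_health_check(zone: str) -> bool:
    # Stand-in for a real probe: pretends the zone hit by the outage is down.
    return zone != "europe-west1-b"


active_zone = pick_healthy_zone(ZONES, simulated_health_check)
print(active_zone)
```

With the simulated outage, the sketch skips europe-west1-b and selects the next zone in the list. The point is only that a workload spread across several zones has somewhere else to go when one of them fails.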