Yesterday, a global outage on Google Cloud knocked out Google Home, Spotify, Snapchat, and even some Cloudflare features. Today, Google released a mini incident report as it continues to investigate.
Cloudflare, for its part, says in its own report that the Google failure took out a central data store for one of Cloudflare’s services.
Google’s preliminary report reads: “From our initial analysis, the issue occurred due to an invalid automated quota update to our API management system which was distributed globally, causing external API requests to be rejected. To recover we bypassed the offending quota check, which allowed recovery in most regions within 2 hours. However, the quota policy database in us-central1 became overloaded, resulting in much longer recovery in that region.”