Copyright 2017 Black Swan Technology, Inc.

China Office

383 Tianhe Rd. Tianhe District

Guangzhou, Guangdong 510620

+86 189 - 8892 - 0258

U.S. Office

Black Swan Technology, Inc.

600 1st Ave, Seattle, WA 98104

+1 (800) 838 - 9079

In 2016, there were a total of 525 cloud outage incidents spread across 15 of the most popular global cloud service providers. 

17% of outages lasted for less than 1 hour; 51% lasted between 1 and 5 hours, and the remaining 32% lasted anywhere between 5 and 72 hours.

Cloud Computing: Power & Frailty

Introduction

In 2016, there were more than 500 publicly recorded cloud failures across more than 15 global cloud service providers. Of these, roughly 17% lasted less than 1 hour, 51% lasted between 1 and 5 hours, and the remaining 32% lasted anywhere between 5 and 72 hours.

The financial ramifications can hardly be overstated. In 2016, these outages destroyed more than $3.8 billion of business value. When the cloud services that businesses count on fail, operations are disrupted and real financial pain manifests as forgone revenue, lost transactions, breaches of end-customer trust, refunds and claims, related regulatory fines, and, in some extreme cases, lawsuits. So what are the lessons here?

 

  1. No infrastructure, whether it's hosted on the cloud or in a local data center, is failproof.

  2. Operational recovery from cloud service outages is not instantaneous, and businesses are often left at the mercy of the service providers.

  3. High-availability cloud architecture is expensive and complex, and most businesses don't implement it optimally.

  4. The current state of cloud computing and financial systems leaves a major void that cannot be filled using traditional risk-management solutions.


Figure: % Workloads in Cloud

Segment (survey sample)     Non Cloud    Private Cloud    Public Cloud
All respondents (1,002)     21%          38%              41%
Enterprise (458)            25%          43%              32%
SMB (517)                   17%          33%              50%

Data courtesy of RightScale State of the Cloud Report 2017

Figure: 2016 Cloud Outage Distribution (# Incidents and Downtime Hours). Outage duration split: 17%, 51%, 32%.

* Oracle Cloud, HP, Alibaba Cloud, CenturyLink, Linode Cloud, DigitalOcean, City Cloud, Faction Cloud, GoDaddy Cloud, Elastic Hosts.

Amazon Web Services

Compute: EC2, Elastic Beanstalk, VPC, Lambda, Auto Scaling

Storage: AWS S3, EBS, Glacier, Elastic File System

Database: AWS RDS, DynamoDB, SimpleDB, Aurora, ElastiCache, Redshift

Microsoft Azure

Compute: Azure Virtual Machine, App Service, Functions, Azure Container Service

Storage: Azure Storage, Data Lake Store, StorSimple

Database: Azure SQL Database, MySQL, PostgreSQL, DW

Google Cloud Platform

Compute: Google Compute Engine, App Engine, Container Engine

Storage: Google Storage, Persistent Disk

Database: GCP BigQuery, SQL, Dataflow

Rackspace

Compute: Rackspace Cloud Servers, Cloud Load Balancers

Storage: Rackspace Storage

Database: Rackspace Big Data, Database

IBM SoftLayer

Compute: IBM Virtual Servers

Storage: IBM Storage

Database: IBM Big Data

Figure: 2016 Business Impact by Cloud Outage Category Across Top 5 Providers

Axes: Cloud Outage Category; Business Impact ($M)

Data labels, as business impact (downtime hours):

$857 M (23 hrs), $451 M (12 hrs), $947 M (25 hrs)

$248 M (29 hrs), $216 M (18 hrs), $337 M (28 hrs)

$103 M (15 hrs), $77 M (11 hrs), $142 M (20 hrs)

$90 M (15 hrs), $65 M (11 hrs), $155 M (26 hrs)

$37 M (9 hrs), $28 M (7 hrs), $76 M (19 hrs)

Unsurprisingly, AWS cloud service failures across all three categories induce the greatest negative business impact. After all, AWS is the largest cloud service provider, accounting for 38% of the market. In other words, it has the most customers, so, all else equal, any single AWS service failure has a larger economic footprint than a comparable failure at one of its peers. By comparison, the next four largest providers (MS Azure, GCP, IBM, and Rackspace) hold 12%, 7%, 6%, and 4% of the market, respectively. Regardless, as the overall cloud computing market grows, the business impact of service failures from all providers will increase.
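As a rough cross-check of the figures above, the sketch below totals the fifteen business-impact data labels from the chart and compares the largest group's share of total impact with the 38% market share quoted for AWS. The grouping into threes follows the order in which the labels appear, and treating the largest group as AWS is an assumption based on the statement that AWS failures have the greatest impact. The variable names are illustrative only.

```python
# Business-impact data labels from the chart, in $M, grouped in threes
# in the order they appear. Which provider each group belongs to is
# NOT stated by the labels themselves; see the assumption below.
impact_groups = [
    (857, 451, 947),
    (248, 216, 337),
    (103, 77, 142),
    (90, 65, 155),
    (37, 28, 76),
]

# Total impact per group and across all groups.
group_totals = [sum(group) for group in impact_groups]
total_impact = sum(group_totals)

# Assumption: the largest group corresponds to AWS, since the text says
# AWS failures induce the greatest business impact.
largest_share = max(group_totals) / total_impact
aws_market_share = 0.38  # market share quoted in the text

print(f"Total impact: ${total_impact:,} M")
print(f"Largest group share of impact: {largest_share:.1%}")
print(f"Quoted AWS market share: {aws_market_share:.0%}")
```

The labels sum to $3,829 M, consistent with the "more than $3.8 billion" cited in the introduction. Under the stated assumption, the largest group accounts for roughly 59% of the total impact, well above a 38% market share.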

Outage incident data courtesy of various cloud service provider dashboards and public reports.