What Facebook and Google Taught Enterprise IT: Smart Commodity Infrastructure Transforms Block Storage

David Noy
VP Product Management, Emerging Technologies Division at EMC

Businesses like Facebook and Google have become global giants—and their innovative approach to managing massive infrastructure growth has valuable lessons to teach enterprise organizations.

By applying smart software solutions to the challenge of data growth, these famous names have shown that it’s possible to rethink the traditional enterprise block storage model, and create a ‘Software Defined Storage’ platform based on low-cost commodity infrastructure—reducing costs, increasing performance, and delivering virtually limitless scalability.

The Rise of the Scalable Data-Driven Enterprise

Facebook, Google, Amazon, Twitter, Dropbox and other data-driven organizations emerging over the last couple of decades have transformed the business landscape. They have also, by necessity of their stratospheric growth, had to reimagine how the data center works—and particularly how to store the massive amounts of data their business models are built on.

To understand their thinking, we need to look back 20 years or so. In the mid-to-late 1990s, Google, Amazon, Yahoo and other early web giants were still relative newcomers. Most major mainstream organizations were dipping their toes into the new-fangled ‘World Wide Web’ with a website, but the big enterprise IT change was happening in the corporate data center.

The typical growing 1990s enterprise was running an increasing number of core applications and large databases, each usually on its own server with its own directly-attached data storage. Organizations were also facing data growth—on the multi-gigabyte scale! (That may seem like small potatoes to us now in our exabyte-scale future, but it was a big deal back then.) A growing problem was that these separate servers were like isolated silos of data storage. One server’s storage might be 99% full, and its neighbor almost empty—but there was no way to share the available storage and spread the load efficiently.

SAN Came To The Rescue Of The Enterprise Data Center

The innovation that solved this challenge in the mid-90s was the ‘storage area network’ or SAN. The SAN moved storage from being directly attached to each computing server, to being a centralized, shared resource for the data center—a much more flexible and efficient situation. SAN storage came in the form of a large ‘array’, using specialized high-end hard disk drives and controller technology.

The specific data storage model used by SAN is ‘block storage’, in which the available storage capacity is divided up and managed in fixed-size ‘blocks’. A SAN storage manager can easily provision standard block storage volumes, perfect for applications like large and busy enterprise databases.
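To make the idea of fixed-size blocks concrete, here is a minimal Python sketch of the arithmetic behind a block volume. It is an illustration only: the 4,096-byte block size and the function names are assumptions chosen for the example, not details of any particular SAN product.

```python
BLOCK_SIZE = 4096  # assumed block size in bytes; real systems vary


def blocks_needed(volume_bytes: int, block_size: int = BLOCK_SIZE) -> int:
    """Number of fixed-size blocks needed to hold volume_bytes, rounded up."""
    return -(-volume_bytes // block_size)  # ceiling division


def locate(byte_offset: int, block_size: int = BLOCK_SIZE) -> tuple[int, int]:
    """Map a byte offset in a volume to (block index, offset inside that block)."""
    return divmod(byte_offset, block_size)


if __name__ == "__main__":
    print(blocks_needed(10_000))  # a 10,000-byte volume needs 3 blocks
    print(locate(10_000))         # byte 10,000 is in block 2, at offset 1808
```

Because every block is the same size, finding any piece of data is simple arithmetic, which is part of why block volumes suit busy databases.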

As a result of its clear benefits, SAN quickly became the predominant data center storage model—and has stayed as the default enterprise choice for two decades. SAN remains an excellent enterprise storage solution—until your data storage requirements start to enter today’s big data realm.

The 3-Phase Evolution of Data Center Block Storage

Growing Data-Driven Businesses Look Beyond Traditional SAN

Meanwhile, the rapid growth of web businesses was astonishing the world. Companies like Google and Amazon—and, by the mid-2000s, Facebook, Twitter and Dropbox—were running thousands of servers in multiple data centers worldwide—and adding many more every day. They were scaling up to many petabytes of storage—for search data, social media data, multimedia data, and so on. But were these expanding businesses using SAN to achieve this scalability? In a word—no.

Traditional SAN provides the storage performance that mainstream enterprise IT demands, but the SAN model can struggle to keep up with rapid data growth. SAN can be expensive and difficult to scale, and complex to manage. The typical SAN is specified to meet an organization’s needs for 3–5 years or more, but modern ‘cloud-scale’ businesses may have to add new storage capacity every day, and constantly revise their estimates of future requirements.

Technologists at businesses like Facebook and Google quickly realized that scaling storage with a traditional SAN model would be too impractical and expensive. Fortunately, storage experts identified a new approach to large-scale data storage that combined lower-cost ‘commodity’ hardware with advanced software—to create a new kind of scalable storage infrastructure.

Software Defined Storage: Low-Cost Commodity Infrastructure Plus Software Intelligence

The Emergence of Software Defined Storage

This ‘Software Defined Storage’ approach presents applications, admins and users with what appears to be a single sharable ‘pool’ of storage—behaving like a high-performance SAN—even though it actually uses multiple separate commodity servers, each with their own directly-attached commodity storage.

This approach has enabled organizations to scale out quickly, easily and cheaply, adding thousands of servers over time to provide many petabytes of storage. As hardware components fail, they can be easily swapped out and replaced with new low-cost commodity replacements, with negligible effect on performance. Management of storage in a software defined world also becomes much simpler—with storage provisioned via a simple point-and-click interface in seconds.

This distributed approach has had the additional benefit of scaling up performance for these companies, as each added server contributes more IOPS. The approach also increases storage resiliency, as data can be mirrored across multiple servers, allowing any corrupted or lost data items to be healed or regenerated automatically.
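The mirroring and self-healing idea can be sketched in a few lines of Python. This is a toy model under stated assumptions, not how any real product places data: the MirroredPool class, the hash-based placement in replica_nodes, and the default of two copies per block are all hypothetical choices made for illustration.

```python
import hashlib


def replica_nodes(block_id: int, nodes: list[str], copies: int = 2) -> list[str]:
    """Pick `copies` distinct nodes for a block by hashing its id (illustrative only)."""
    if copies > len(nodes):
        raise ValueError("not enough nodes for the requested number of copies")
    digest = hashlib.sha256(str(block_id).encode()).hexdigest()
    start = int(digest, 16) % len(nodes)
    return [nodes[(start + i) % len(nodes)] for i in range(copies)]


class MirroredPool:
    """Toy pool of commodity nodes that keeps every block on `copies` nodes."""

    def __init__(self, nodes: list[str], copies: int = 2):
        self.copies = copies
        self.stores: dict[str, dict[int, bytes]] = {n: {} for n in nodes}

    def write(self, block_id: int, data: bytes) -> None:
        # Drop any stale copies first, then place fresh mirrors.
        for store in self.stores.values():
            store.pop(block_id, None)
        for node in replica_nodes(block_id, sorted(self.stores), self.copies):
            self.stores[node][block_id] = data

    def read(self, block_id: int) -> bytes:
        # Any surviving copy will do.
        for store in self.stores.values():
            if block_id in store:
                return store[block_id]
        raise KeyError(block_id)

    def fail_node(self, node: str) -> None:
        """Remove a failed node and re-mirror its blocks (assumes copies >= 2)."""
        lost = self.stores.pop(node)
        for block_id in lost:
            data = self.read(block_id)  # fetch the surviving mirror
            for store in self.stores.values():
                if block_id not in store:
                    store[block_id] = data  # restore the second copy
                    break
```

In this sketch, calling fail_node("a") on a three-node pool leaves every block readable and restores two copies of each on the remaining nodes, which mirrors the "healed or regenerated automatically" behavior described above.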

The Smart Storage Lesson For Enterprise IT

Enterprise IT is learning a valuable lesson—that commodity infrastructure tied together with smart software now offers a practical and lower-cost way to deal with data growth. This is leading to a new Software Defined Storage approach to enterprise block storage—with multiple commodity servers networked and unified into a kind of ‘virtual SAN’.

This new approach is called ‘Server SAN’, and the expert opinion is that it’s set to transform the enterprise data center. Analyst firm IDC predicts that the Server SAN market will grow at a 20% CAGR from 2014 to 2019.

Server SAN Advantages for Block Storage

Making Software Defined Block Storage a Reality For Enterprise

Data storage requirements are growing for enterprise IT—but budget isn’t. How do you do more with less? Enterprise organizations wanting to bring scalability, simplicity, efficiency and reduced cost to block storage are now looking to the latest software defined Server SAN solutions—like EMC ScaleIO.

ScaleIO is smart software that simplifies the creation and management of a Server SAN network of block storage using commodity infrastructure. It enables an organization to easily scale up its storage as needed, with no fuss and minimal cost. It’s powerful enough to unify hundreds or thousands of individual storage servers (or ‘nodes’) into a shared Server SAN. Each extra node not only adds storage capacity to your Server SAN—but also increases performance, scaling up linearly to millions of IOPS (input/output operations per second).
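As a rough back-of-the-envelope illustration of linear scale-out, the Python sketch below totals capacity and IOPS across nodes. It is an idealized model, not a ScaleIO sizing tool: the Node class, the per-node figures, and the two-copy mirroring factor are hypothetical, and real clusters will lose some performance to network and software overhead.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    capacity_tb: float  # raw capacity contributed by this node (assumed figure)
    iops: int           # IOPS contributed by this node (assumed figure)


def cluster_totals(nodes: list[Node], copies: int = 2) -> tuple[float, int]:
    """Return (usable capacity in TB, aggregate IOPS) for an idealized cluster.

    Usable capacity is raw capacity divided by the number of mirrored copies;
    IOPS are simply summed, which is the ideal linear-scaling case.
    """
    raw_tb = sum(n.capacity_tb for n in nodes)
    total_iops = sum(n.iops for n in nodes)
    return raw_tb / copies, total_iops


if __name__ == "__main__":
    # 100 hypothetical nodes, each with 10 TB and 50,000 IOPS
    usable_tb, iops = cluster_totals([Node(10.0, 50_000)] * 100)
    print(usable_tb, iops)  # 500.0 TB usable, 5,000,000 IOPS
```

Under these assumptions, doubling the node count doubles both numbers, which is the sense in which each extra node adds capacity and performance together.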

A software defined block storage solution based on EMC ScaleIO can deliver significant TCO savings compared with traditional SAN. The ScaleIO software enables you to utilize commodity hardware to create your Server SAN simply, so you avoid the expense and complexity of high-end SAN infrastructure. You no longer have to upgrade or replace large SAN storage arrays in an expensive ‘tech refresh’ and face complex data migration challenges. With ScaleIO, you simply add more server nodes to scale out as your needs grow. You can also swap out any failed disks or outdated servers from the cluster simply and cost-effectively.

EMC ScaleIO is a true software defined storage solution, and is “infrastructure-agnostic”—working in harmony with a wide range and mix of server brands, operating systems, and storage media (hard drives and flash). Organizations that want the reassurance and confidence of a turnkey solution can turn to a partner like EMC for end-to-end software, support and preconfigured hardware—with the VxRack Node appliance.

See What Software Defined Storage Can Do For Your Organization

Enterprise organizations are finding that the software defined Server SAN approach to block storage with ScaleIO enables new agility and scalability, supporting data center transformation for traditional and next-gen apps. Service providers using ScaleIO in their data centers are able to meet customer demand for storage quickly and easily.

If you’re beginning your journey to software defined block storage and Server SAN, EMC can help. With ScaleIO, you can start small—and scale up easily as you prove the benefits.

EMC ScaleIO software is available to download and try for free.

Learn more about block storage, Server SAN and EMC ScaleIO.