Business data lakes hold the keys to meeting the fast-growing business appetite for new combinations of data and to putting Big Data analytics to work across the enterprise.
To explore the business opportunities and technological capabilities, we discussed data lakes with Paul Maritz, CEO of Pivotal, the Platform-as-a-Service provider launched by EMC and VMware in 2013. A technology industry leader for three decades, Paul previously served as Chief Strategist of EMC and CEO of VMware.
The data lake is a new concept and capability, so new that as of August 1 it had no Wikipedia entry. Please explain what data lakes are and what they do.
You can look at business data lakes three ways:
First is as one place to put all the data you may want to use. That includes structured data drawn from traditional databases and unstructured data like text. It includes data generated by the enterprise and data imported from outside sources and services. It includes the social media and sensor and telemetry data that's being generated in vast quantities and that most enterprises are just learning to work with.
Second is as a platform for Big Data analytics. A data lake isn't just a landing zone for all sorts of data. It's where you can analyze the data as well, and where you can find the correlations among data that you've never before examined together. Many of the breakthroughs with business analytics come not just through looking at more data or doing more sophisticated analyses, but through new combinations of data that reveal the drivers of business performance.
Third, data lakes help resolve the longstanding tension between the corporate push to get standard data into warehouses and used consistently, and the business unit need for local views and combinations of data that get implemented in all those Excel spreadsheets. A data lake is a shared resource, and it may contain a lot of carefully administered data. But it also provides a platform for business units to get at the data and quickly build the views and data-driven applications they really need.
At Pivotal, we summarize those three uses with a slogan: "Store everything. Analyze anything. Build what you need."