DSSD D5, XtremIO, VMAX or Isilon
The first Data Mining / Analytics use case is an analytics application running on a database backed by a block data store. The database's complex data model, coupled with unpredictable data traversal paths, leads to random scans and possibly large requests. Examples of these types of workloads include highly complex calculations (models) on real-time, hot data sets, such as applications for oil and gas grid simulations, targeted patient treatment based on genomic sequencing, quantitative stock trading, and fraud and risk analysis. The appropriate platform and products for these workloads depend on the storage performance and data ingest needs of the particular application.
For many workloads, the tightly coupled scale-out architectures of XtremIO or VMAX can be the platform of choice. For those workloads requiring extremely low latency together with extremely high IOPS and bandwidth, the Shared External NVM Fabrics architecture of DSSD D5 is the right choice.
Which architecture and platform should you choose? DSSD D5 fulfills the extreme IOPS, latency and bandwidth demands of high-performance analytics applications by providing an order-of-magnitude improvement in storage performance compared to other shared flash storage (up to 10 million IOPS, 100-microsecond latency and 100 GB/s bandwidth).
For workloads that do not require such extreme performance, the most common need is for low latency when requesting patterns of blocks, which often have a low "locality of reference" value. Also commonly requested is the ability to quickly and easily create a writable replica of the data, so that ongoing development changes can be made and tested against the database before being merged back into production. This low-latency random workload, with its need for writable, space-efficient replicas, makes XtremIO the platform of choice.
For larger analytic systems, the workload skews towards high overall capacity requirements and a greater emphasis on RAS (reliability, availability and serviceability). Predictable low latency is delivered when the workload is tolerant of the promotion and demotion of data via FAST VP. When the use case has these requirements, or runs on iSeries or mainframe, VMAX is the platform of choice.
NOTE: For SAP HANA, customers may prefer to leverage their installed EMC storage infrastructure. An implementation starts with a certified SAP HANA server from your preferred vendor. You then use your installed EMC VMAX, EMC VNX, or XtremIO arrays.
The second Data Mining / Analytics use case is Hadoop analytics on HDFS unstructured data. For batch analytics, this use case mates nicely with a loosely coupled scale-out architecture. Isilon's native HDFS integration, coupled with its ability to quickly attach many different, independently scalable data sets to Hadoop compute clusters, makes Isilon the proper choice for the Hadoop use case.
For real-time analytics on HDFS unstructured data, DSSD D5 with its Shared External NVM Fabrics architecture is the platform of choice. With the DSSD Hadoop Plug-in, D5 provides Hadoop workloads with unprecedented performance, enabling real-time or near-real-time analytics on “hot data” sets, and can become a great complement to an Isilon data lake.
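The selection guidance for both use cases can be summarized as a small decision routine. The Python below is only an illustrative sketch: the `Workload` fields and the `suggest_platform` function are hypothetical names introduced here, and the order in which the block-storage rules are checked (extreme performance first, then capacity/RAS or iSeries/mainframe, then XtremIO as the default) is an assumption, since the text does not say which criterion wins when several apply.

```python
from dataclasses import dataclass


@dataclass
class Workload:
    """Hypothetical description of an analytics workload (illustrative only)."""
    data_store: str                     # "block" (database on block storage) or "hdfs"
    extreme_performance: bool = False   # needs extreme IOPS, latency and bandwidth
    real_time: bool = False             # real-time or near-real-time analytics on hot data
    high_capacity_ras: bool = False     # large system where capacity and RAS dominate
    mainframe_or_iseries: bool = False  # runs on iSeries or mainframe


def suggest_platform(w: Workload) -> str:
    """Return the platform the guidance above points to for this workload.

    The precedence of the checks is an assumption, not part of the guidance.
    """
    if w.data_store == "hdfs":
        # Real-time Hadoop analytics -> DSSD D5; batch Hadoop analytics -> Isilon.
        return "DSSD D5" if w.real_time else "Isilon"
    if w.data_store == "block":
        if w.extreme_performance:
            return "DSSD D5"
        if w.high_capacity_ras or w.mainframe_or_iseries:
            return "VMAX"
        # Low-latency random I/O with writable, space-efficient replicas.
        return "XtremIO"
    raise ValueError(f"unknown data store: {w.data_store!r}")


if __name__ == "__main__":
    print(suggest_platform(Workload("block", extreme_performance=True)))
    print(suggest_platform(Workload("block", high_capacity_ras=True)))
    print(suggest_platform(Workload("block")))
    print(suggest_platform(Workload("hdfs")))
    print(suggest_platform(Workload("hdfs", real_time=True)))
```

A real sizing exercise would weigh measured IOPS, latency and capacity figures rather than boolean flags; the sketch only mirrors the qualitative rules stated in this section.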
The Spider Chart below shows the design characteristics of the platform(s) in relation to the workload requirements. Note overall alignment and relative strengths related to the target workload.