EMC ViPR Data Services: Geo-protection and Multisite Access
This article applies to EMC ViPR 2.0.
ViPR geo-protection works across all types of ViPR-supported hardware, whether file arrays such as EMC Isilon or commodity storage. The geo-protection layer protects data across geo-distributed sites and ensures that applications continue to function seamlessly if a site fails.
ViPR geo-protection manages the storage overhead that is introduced by replicating multiple copies of data across sites. In addition to local protection, ViPR ensures that data is also replicated efficiently with minimal overhead across multiple sites.
- Tolerates one full site disaster. When commodity servers are used, it can additionally tolerate two node failures (a minimum of 8 nodes per site is required).
- Avoids WAN traffic for both node and disk failures by repairing from local fragments when the underlying physical storage platform is commodity-based servers.
- Ensures low storage overhead across multiple sites. Example: 1.77 copies across 4 sites.
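The "1.77 copies across 4 sites" figure can be approximated with a simple model. The sketch below is an assumption, not a formula stated in this article: it supposes a local protection overhead of 16/12 (roughly 1.33) and a cross-site factor of N/(N-1) for N sites. The function name and both constants are hypothetical and serve only to illustrate how overhead can shrink as sites are added.

```python
# Illustrative model only. The local overhead (16/12) and the N/(N-1)
# cross-site factor are assumptions, not values taken from this article.
LOCAL_OVERHEAD = 16 / 12


def estimated_overhead(num_sites: int) -> float:
    """Estimate stored copies per unit of user data for num_sites >= 3."""
    if num_sites < 3:
        raise ValueError("this simplified model only covers 3 or more sites")
    return LOCAL_OVERHEAD * num_sites / (num_sites - 1)


four_site_overhead = estimated_overhead(4)
```

Under these assumptions, the four-site estimate lands within 0.01 of the 1.77 quoted above, and the estimate decreases as more sites are added.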
For example, an application that writes object-1 to site A can immediately access that data from site B. And any update to object-1 in site B can immediately be read in site A. ViPR ensures that, whether the data is accessed from site A or B, the data returned for object-1 is always the latest version.
ViPR achieves this strong consistency by representing each bucket, object, directory, and file as an entity, and applying the appropriate technique on each entity based on its traffic pattern. When the technique can avoid a WAN roundtrip, average latency is reduced.
Consider these examples:
- A bucket that spans across sites
- Traffic: very large number of reads in all sites, but very few writes.
- Technique: Local reads; writes are applied synchronously across all sites.
- A user’s personal file
- Traffic: nearly all reads and writes are from one site, with very few requests from other sites.
- Technique: Local reads and writes. Cross-site access is on demand.
- A user’s shared content
- Traffic: Some reads and writes from one site, reads from other sites.
- Technique: Local site grants a lease to the other site to serve the read; notification on write to void the lease.
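The lease technique in the last example can be sketched as follows. This is a minimal in-memory illustration, not ViPR's implementation: the class names `HomeSite` and `RemoteSite` and the `wan_roundtrips` counter are hypothetical, and a real system would also need lease expiry and failure handling.

```python
class HomeSite:
    """Owns the entity; grants read leases and voids them on every write."""

    def __init__(self, value):
        self.value = value
        self.lease_holders = set()

    def grant_lease(self, remote):
        # Remember who holds a lease so the holder can be notified on write.
        self.lease_holders.add(remote)
        return self.value

    def write(self, new_value):
        # Void outstanding leases first, so no site serves a stale copy.
        for remote in self.lease_holders:
            remote.void_lease()
        self.lease_holders.clear()
        self.value = new_value


class RemoteSite:
    """Serves reads locally while it holds a valid lease."""

    def __init__(self, home):
        self.home = home
        self.has_lease = False
        self.cached = None
        self.wan_roundtrips = 0

    def void_lease(self):
        self.has_lease = False
        self.cached = None

    def read(self):
        if not self.has_lease:
            # No valid lease: one WAN round trip to the home site.
            self.wan_roundtrips += 1
            self.cached = self.home.grant_lease(self)
            self.has_lease = True
        return self.cached
```

In this sketch, repeated reads at the remote site cost a single WAN round trip until the home site writes. The next remote read then fetches the new value and a fresh lease.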
Physical storage can be a storage array, such as EMC Isilon or EMC VNX, or commodity-based hardware. The storage type does not have to be the same across sites.
Virtual Data Center: After the ViPR controller is deployed at each site, link them by the VDC name and virtual IP address, using the ViPR UI (Admin > Virtual Assets > Virtual Data Center > Add) or the REST API call POST /vdc.
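As a rough sketch of the REST alternative, the snippet below builds (but does not send) a POST /vdc request. Only the method and the /vdc path come from this article. The base URL, the JSON field names (`name`, `virtual_ip`), and the `X-Auth-Token` header are hypothetical placeholders, so consult the ViPR REST API reference for the actual request body and authentication scheme.

```python
import json
import urllib.request


def build_add_vdc_request(base_url, vdc_name, virtual_ip, auth_token):
    """Build a POST /vdc request object without sending it."""
    # Hypothetical payload: the real field names may differ.
    payload = {"name": vdc_name, "virtual_ip": virtual_ip}
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/vdc",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "X-Auth-Token": auth_token,  # hypothetical header name
        },
        method="POST",
    )


# Example values are placeholders, not real endpoints or credentials.
request = build_add_vdc_request(
    "https://vipr.example.com", "vdc2", "203.0.113.10", "token-placeholder"
)
```

Passing the resulting object to `urllib.request.urlopen` would send it. The sketch stops short of that so that it can be inspected safely.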
Virtual Array: Create virtual arrays, which are the groupings of physical storage systems with similar characteristics.
Virtual Pool: Add virtual arrays to the virtual pools across virtual data centers.
- Network latency must not exceed 1000 ms between configured sites.
- When commodity storage is used, a minimum of 8 commodity nodes per site is required.
- If your disaster plan includes running for a period of time with one site failed (instead of promptly recovering the site), the remaining sites need enough free storage to accommodate the rebalancing of data. Refer to Fail over an EMC ViPR site for details.
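The first two requirements above lend themselves to a simple pre-deployment check. The sketch below is illustrative only: the function name and input shapes are hypothetical, and the latency and node counts must be measured and supplied by the caller.

```python
MAX_LATENCY_MS = 1000      # limit between configured sites, from the list above
MIN_COMMODITY_NODES = 8    # minimum commodity nodes per site, from the list above


def check_geo_prerequisites(latencies_ms, commodity_nodes_per_site):
    """Return a list of human-readable problems; empty means all checks pass.

    latencies_ms: dict mapping (site_a, site_b) to measured latency in ms.
    commodity_nodes_per_site: dict mapping site name to commodity node count
    (include only sites that use commodity storage).
    """
    problems = []
    for (site_a, site_b), latency in latencies_ms.items():
        if latency > MAX_LATENCY_MS:
            problems.append(
                f"latency {site_a}-{site_b} is {latency} ms (max {MAX_LATENCY_MS})"
            )
    for site, nodes in commodity_nodes_per_site.items():
        if nodes < MIN_COMMODITY_NODES:
            problems.append(
                f"site {site} has {nodes} commodity nodes (min {MIN_COMMODITY_NODES})"
            )
    return problems
```

For example, calling it with `{("A", "B"): 1200}` and `{"A": 8, "B": 6}` reports two problems: one for the latency between A and B, and one for the node count at site B.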
The following table illustrates the storage overhead introduced by ViPR based on the number of sites.
In the ViPR UI this is done by clicking Failover next to the failed VDC in the Edit Object Virtual Pool dialog.
Once failover is complete, ViPR automatically starts rebalancing data across the remaining available sites to ensure geo-protection. When the failed site comes back online, its local data is lost, so the site must be redeployed as a new site and added to the geo-configuration again. In this example, the failed site is the ViPR virtual data center at Site 1.
Details on site failover and recovery are in Fail over an EMC ViPR site.