Fail Over an EMC ViPR Site

Overview

This article describes how to recover from the failure of a site in a multisite EMC ViPR configuration.

It includes a sample geo configuration, the behavior of ViPR Controller and ViPR Data Services during the failure, and the steps needed for recovery.

Example configuration

For the example in this article we use the following geo configuration:

  • Site 1 running ViPR Controller and ViPR Data Services
  • Site 2 running ViPR Controller and ViPR Data Services
  • Site 3 running ViPR Controller and ViPR Data Services
  • Application 1 in New York, with local affinity to Site 1 (New York)
  • Application 2 in Los Angeles, with local affinity to Site 2 (Los Angeles)

Description of the site failure

For this example, the failure is of the ViPR virtual data center (VDC) at Site 1.

Before any corrective step is taken by the administrator, the behavior at this point is as follows:

Procedure to fail over a site: ViPR Data Services

When the ViPR virtual data center at Site 1 fails, remove it from the geo configuration. Data Services reads and writes can then resume against the copies of the Site 1 data that reside on Sites 2 and 3.

Before you begin

Ensure that Site 1 is not processing any requests. One way to do this is to disable the network at Site 1.
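
As a rough way to confirm that Site 1 no longer accepts requests, you could probe its endpoints from another site. The following is a minimal sketch, not a ViPR tool: it only tests whether a TCP connection can be opened, and the host names and ports you pass in are your own assumptions about where the Site 1 services listen.

```python
import socket


def is_reachable(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False


def site_is_isolated(endpoints, timeout: float = 3.0) -> bool:
    """Return True only if none of the (host, port) endpoints accepts a connection."""
    return not any(is_reachable(host, port, timeout) for host, port in endpoints)
```

For example, `site_is_isolated([("site1.example.com", 443)])` (a hypothetical address) returns True only when that endpoint cannot be reached. A True result shows that the probed ports are closed from where you ran the check. It does not prove that the site is unreachable from every client network.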

Procedure

  1. Log in to the ViPR UI with an account that has the System Administrator role. Do not use the root user.
  2. In Admin mode, select Virtual Assets > Object Virtual Pools.
  3. Select an object virtual pool that is used by the VDC at the failed site.
  4. Click Remove next to the failed VDC (see screenshot below). This operation tells ViPR that the failed VDC is no longer part of the multisite configuration.
  5. Confirm that you want to remove the VDC.
  6. Repeat steps 3 through 5 for each object virtual pool used by the VDC at the failed site.
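
The loop in steps 3 through 6 can be pictured with a small in-memory model. This sketch does not call ViPR in any way. The pool names and the dictionary layout are invented for illustration, and the removal itself is still done in the UI as described above.

```python
def remove_failed_vdc(pools: dict[str, set[str]], failed_vdc: str) -> list[str]:
    """Remove failed_vdc from every pool that uses it; return the pools changed."""
    changed = []
    for name, vdcs in pools.items():
        if failed_vdc in vdcs:
            vdcs.discard(failed_vdc)
            changed.append(name)
    return changed


# Hypothetical object virtual pools and the VDCs (sites) each one spans.
pools = {
    "pool-a": {"Site 1", "Site 2", "Site 3"},
    "pool-b": {"Site 1", "Site 2"},
    "pool-c": {"Site 2", "Site 3"},
}

changed = remove_failed_vdc(pools, "Site 1")
print(changed)  # ['pool-a', 'pool-b']
```

A pool that never used the failed VDC (pool-c here) is left untouched, which mirrors step 3: only pools used by the VDC at the failed site need the removal.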

Results

After failover, data services are operational on Site 2 and Site 3.

ViPR cannot tolerate another site failure while the rebalancing process that follows failover is running, so monitor it closely and make sure that it completes. How long rebalancing takes depends on the amount of data being replicated and on network performance.
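
To get a feel for how data volume and network performance combine, the sketch below computes a lower bound on pure transfer time. It is back-of-the-envelope arithmetic only: it assumes nothing about how ViPR schedules rebalancing, and the `efficiency` factor (the fraction of link bandwidth actually usable) is an assumed value you should replace with your own measurement.

```python
def min_transfer_hours(data_tb: float, link_gbps: float, efficiency: float = 0.7) -> float:
    """Lower-bound hours to move data_tb terabytes over a link_gbps link.

    data_tb uses decimal terabytes (1 TB = 1e12 bytes); link_gbps is in
    gigabits per second; efficiency is the assumed usable fraction of the link.
    """
    if data_tb < 0 or link_gbps <= 0 or not 0 < efficiency <= 1:
        raise ValueError("data_tb must be >= 0, link_gbps > 0, 0 < efficiency <= 1")
    bits = data_tb * 1e12 * 8
    seconds = bits / (link_gbps * 1e9 * efficiency)
    return seconds / 3600
```

For instance, `min_transfer_hours(3.6, 8, 1.0)` evaluates to 1 hour: 3.6 TB is 2.88e13 bits, and a fully used 8 Gbps link moves that in 3600 seconds. Real rebalancing takes longer than this bound, so treat the result as the minimum window during which a second site failure cannot be tolerated.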

Recover from site failure

To restore the failed site (Site 1 in our example):

Procedure

  1. Restore the ViPR Controller from backups. For details, see EMC ViPR native backup and restore service.
  2. Run the ViPR Controller on the restored site.
  3. On the restored site, delete the virtual data center and all resources. (ViPR Data Services cannot be deployed to a site with the same ID.)
  4. Redeploy the ViPR Controller on the recovered site.
  5. Redeploy ViPR Data Services on the recovered site.