ECS 2.1 – What is ECS?
ECS is a scale-out, geo-distributed cloud storage platform that provides:
- Lower cost than public clouds
- Unmatched combination of storage efficiency and data access
- Anywhere read/write access with strong consistency that simplifies application development
- No single point of failure, which increases availability and performance
- Universal accessibility that eliminates storage silos and inefficient ETL/data movement processes
The ECS platform includes the following software layers and services:
Portal services include interfaces for provisioning, managing, and monitoring storage resources. The interfaces are:
- GUI: A built-in browser-based graphical user interface called the ECS Portal.
- REST: A RESTful API that you can use to develop your own management portal.
- CLI: A command-line interface that enables you to perform the same tasks as the browser-based interface.
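The REST interface is an ordinary authenticated HTTP API, so a client can be sketched with nothing beyond the standard library. The endpoint path, port, and auth-token header name below are illustrative assumptions, not values confirmed by this document; the request is built but never sent.

```python
# Sketch of how a client might build a call to the management REST API
# to list namespaces. Path, port, and header names are hypothetical.
from urllib.request import Request

def build_list_namespaces_request(host: str, token: str) -> Request:
    """Build (but do not send) an authenticated GET request."""
    url = f"https://{host}:4443/object/namespaces"  # assumed path and port
    return Request(
        url,
        headers={
            "X-SDS-AUTH-TOKEN": token,   # assumed auth header name
            "Accept": "application/json",
        },
        method="GET",
    )

req = build_list_namespaces_request("ecs.example.com", "token-123")
print(req.full_url)      # https://ecs.example.com:4443/object/namespaces
print(req.get_method())  # GET
```

The CLI and the GUI ultimately drive the same management operations, so a script like this is interchangeable with either interface for automation purposes.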
Storage services are provided by the unstructured storage engine (USE) which ensures data availability and protection against data corruption, hardware failures, and data center disasters. It enables global namespace management across geographically dispersed data centers and geo-replication. The USE enables the following storage services:
- Object service: Provides the ability to store, access, and manipulate unstructured data. The object service is compatible with the existing Amazon S3, OpenStack Swift, EMC CAS, and EMC Atmos APIs.
- HDFS: Enables you to use the ECS storage infrastructure as a Big Data repository that you can run Hadoop analytics applications against, in place.
The provisioning service manages the provisioning of storage resources and user access. Specifically, it handles:
- User management: Keeps track of which users have rights to administer the system, provision storage resources, and access objects through REST requests. ECS supports both local and domain users.
- Authorization and authentication for all provisioning requests: Queries the authentication domain to determine if users are authorized to perform management, provisioning, and access operations.
- Resource management: Enables authorized users to create storage pools, Virtual Data Centers (VDCs), and replication groups.
- Multi-tenancy: Manages the namespaces that represent tenants, along with their associated buckets and objects.
The fabric service is a distributed cluster manager that is responsible for:
- Cluster health: Aggregates node-specific hardware faults and reports on the overall health of the cluster.
- Node health: Monitors the physical state of the nodes, and detects and reports faults.
- Disk health: Monitors the health of the disks and file systems. It provides raw, fast, lock-free read/write operations to the storage engine, and exposes information about the individual disk drives and their status so that the storage engine can place data across the drives according to its built-in data protection algorithms.
- Software management: Provides command line tools for installing and running services, and for installing and upgrading the fabric software on nodes in the cluster.
This layer provides the Linux OS running on the commodity nodes, and implements network interfaces and other hardware-related tools.
The ECS appliance is available in the following models:
- U-Series: Unstructured storage servers with separate disk array enclosures, engineered to maximize storage capacity.
- C-Series: High-density compute servers with integrated disks, engineered for greater compute capacity.
The tables below describe the appliance components by series:
Nodes have the following network interfaces:
- Public: A 10GbE interface that handles all network traffic. The interface is connected to the rack's 10GbE switches in a bonded configuration. The 10GbE switches are uplinked to the customer network through 1-4 10GbE uplinks.
- Private: A 1GbE interface used for internal administrative operations. All of these interfaces are private and reserved for internal ECS traffic. Each node is automatically assigned two private IP addresses using the following scheme:
- 192.168.219.port_number: This network is used for installation and maintenance activities. It supports only rack-local traffic.
- 169.254.Rack_ID.port_number: This network handles the distributed configuration service for nodes in the cluster. It supports only data-center local traffic.
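The two private addresses are fully determined by a node's rack ID and port number, so the scheme above can be expressed as a small function. This is a sketch of the addressing rule as stated, assuming both values fit in a single IPv4 octet; it is not an ECS tool.

```python
# Derive a node's two private IP addresses from its rack ID and switch
# port number, following the scheme described above.
def private_addresses(rack_id: int, port_number: int) -> dict:
    if not (0 < rack_id < 256 and 0 < port_number < 256):
        raise ValueError("rack_id and port_number must fit in one octet")
    return {
        "maintenance": f"192.168.219.{port_number}",    # rack-local only
        "cluster": f"169.254.{rack_id}.{port_number}",  # data-center local
    }

print(private_addresses(3, 12))
# {'maintenance': '192.168.219.12', 'cluster': '169.254.3.12'}
```

Because the maintenance network omits the rack ID, its addresses are only unique within a rack, which is consistent with it carrying rack-local traffic only.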
After you deploy ECS, you can use any of the portal services interfaces (GUI, REST API, or CLI) to provision the storage resources so that they can be used by S3, Swift, CAS, or Atmos applications.
VDCs are the top-level ECS resources. They are logical constructs that represent the collection of ECS infrastructure you want to manage as a cohesive unit. You can create a VDC to manage the resources of one or more physical racks, but the ECS resources in a single VDC must be part of the same Nile Area Network (NAN). A VDC is also referred to as a site or a zone.
You can deploy ECS software in multiple data centers to create a geo-federation. In a geo-federation, ECS behaves as a loosely coupled federation of autonomous virtual data centers in which you provision each VDC separately.
Storage pools allow you to logically partition the available storage resources (nodes) in a VDC. Storage pools provide the means for physically separating data based on application or multi-tenancy requirements. Storage pools require a minimum of four nodes. Data protection levels are defined by assigning storage pools to replication groups.
Replication groups are logical constructs that define where storage pool content is protected, and the locations from which data can be read without WAN traffic. Replication groups can be local or global. Local replication groups protect objects within the same VDC against disk or node failures. Global replication groups span multiple VDCs and protect objects against disk, node, and site failures.
The strategy for defining replication groups depends on multiple factors including your requirements for data resiliency, the cost of storage, and physical versus logical separation of data.
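The relationship between storage pools, VDCs, and replication groups described above can be sketched as a minimal data model. The class and field names are illustrative assumptions, not the product's actual schema; the sketch only encodes the two stated rules: a storage pool needs at least four nodes, and a replication group spanning more than one VDC protects against site failures in addition to disk and node failures.

```python
# Minimal model of storage pools and replication groups as described
# above. Names and structure are illustrative, not ECS's data model.
from dataclasses import dataclass, field

@dataclass
class StoragePool:
    name: str
    nodes: int
    def __post_init__(self):
        if self.nodes < 4:  # stated minimum of four nodes per pool
            raise ValueError("a storage pool requires at least 4 nodes")

@dataclass
class ReplicationGroup:
    name: str
    members: dict = field(default_factory=dict)  # VDC name -> StoragePool

    @property
    def scope(self) -> str:
        # One member VDC: local (protects against disk/node failures).
        # Multiple VDCs: global (additionally protects against site failures).
        return "local" if len(self.members) == 1 else "global"

rg = ReplicationGroup("rg1", {"vdc1": StoragePool("sp1", 4)})
print(rg.scope)  # local
rg.members["vdc2"] = StoragePool("sp2", 8)
print(rg.scope)  # global
```

A planning script along these lines can make the resiliency/cost trade-off explicit: each additional member VDC widens the failure domains a group survives, at the cost of storing and replicating data in more places.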
Namespaces enable ECS to handle multi-tenant operations. Namespaces are assigned replication groups. Each tenant is defined by a namespace and a set of users who can store and access objects within that namespace. A namespace can represent a department within an enterprise, or an entirely separate enterprise. Users of one namespace cannot access objects belonging to another namespace.
Buckets are containers for object data. Buckets are created in a namespace, so they are available only to namespace users who have the appropriate permissions. Namespace users with the appropriate privileges can create buckets, and create objects within buckets, using each object protocol's API. Buckets can be configured to support HDFS; a bucket configured for HDFS access can be read and written using both its object protocol and the HDFS protocol.
Within a namespace, it is possible to use buckets as a way of creating subtenants.
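The isolation rule above — users of one namespace cannot access objects from another — amounts to a simple check on every request. This is a hedged sketch of that rule only; the real access-control model also involves bucket permissions and privileges, which are omitted here, and all names are hypothetical.

```python
# Sketch of namespace isolation: a user may only access buckets that
# live in the user's own namespace. Illustrative model only.
from dataclasses import dataclass

@dataclass(frozen=True)
class Bucket:
    name: str
    namespace: str  # the namespace the bucket was created in

@dataclass(frozen=True)
class ObjectUser:
    name: str
    namespace: str  # the namespace the user is assigned to

def can_access(user: ObjectUser, bucket: Bucket) -> bool:
    # Namespace match is a necessary condition; real checks would also
    # consult the user's bucket-level permissions.
    return user.namespace == bucket.namespace

alice = ObjectUser("alice", "engineering")
print(can_access(alice, Bucket("builds", "engineering")))  # True
print(can_access(alice, Bucket("ledger", "finance")))      # False
```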
ECS supports the following types of users and roles:
- System Admin: Users in this role configure the VDC, storage pools, replication groups, LDAP, namespaces, buckets, and users. The System Admin can also configure namespaces and perform namespace administration, or can assign a user who belongs to the namespace as the Namespace Admin. ECS has a root user account which is assigned to the System Admin role and can be used to perform the initial configuration.
- Namespace Admin: Users in this role configure namespace settings, such as quotas and retention periods, and can map domain users into the namespace and assign local users as object users for the namespace. Namespace Admin operations can also be performed from programmatic clients using the ECS REST API, or from the ECS Portal.
- Object user: Object users are the end users of ECS object storage. They access storage through object clients using the supported access protocols (S3, Swift, CAS, or Atmos). Object users can be given privileges to read and write buckets and objects within the namespace they are assigned to.
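The three roles above can be summarized as a capability map with a single lookup for authorization decisions. This is a deliberately simplified sketch — the actual permission model is richer (namespace scoping, per-bucket ACLs), and the capability names below are invented for illustration.

```python
# Hedged sketch of the three ECS roles as a capability map.
# Capability names are illustrative, not product identifiers.
ROLE_CAPABILITIES = {
    "system_admin":    {"configure_vdc", "create_storage_pool",
                        "create_replication_group", "configure_namespace"},
    "namespace_admin": {"configure_namespace", "set_quota",
                        "assign_object_users"},
    "object_user":     {"read_objects", "write_objects"},
}

def is_allowed(role: str, action: str) -> bool:
    """Return True if the given role carries the given capability."""
    return action in ROLE_CAPABILITIES.get(role, set())

print(is_allowed("namespace_admin", "set_quota"))  # True
print(is_allowed("object_user", "configure_vdc"))  # False
```

Note that "configure_namespace" appears under both admin roles, mirroring the text: the System Admin can administer namespaces directly or delegate that work to a Namespace Admin.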
ECS provides monitoring, diagnostics, and event auditing through the ECS Portal. The Monitoring pages provide overviews of storage, resources, services, and events, and let you drill down to the right view of diagnostic data.