May 1st, 2018 by Brian Beeler
Dell EMC XtremIO Replication Review
Today Dell EMC has announced the addition of native asynchronous replication in XIOS 6.1 for XtremIO all-flash storage arrays. Until now, replication for XtremIO has been handled by software tools like Dell EMC RecoverPoint for snapshot-based replication or Site Recovery Manager for VMware shops. There is also the option of pairing XtremIO with Dell EMC VPLEX for a variety of replication use cases. While these options solve the problem for many customers, they can also introduce increased complexity and cost, depending on the hardware in the environment. As such, a native metadata-aware replication solution for XtremIO has always been on the radar and is the number one feature XtremIO customers have been asking for.
Most know this already, but let's take just a quick look at the two types of replication. Asynchronous replication allows for replication over a long distance (100s of miles or more) while maintaining a dependent write-consistent copy of data between the local and remote site(s) at all times. With synchronous replication, the sites need to be much closer together, typically within what is referred to as a metro area (~30-50 miles). At times, some customers even replicate within a single data center, depending on data needs and regulatory requirements. Whatever the distance, synchronous replication works the same way: a host initiates a write to an array at the main site while at the same time the data is written to the replication target. Data must be successfully stored in both local and remote sites before an acknowledgement is sent back to the host. Synchronous replication introduces additional latency as a result, which is why being physically close to the replication target is important. While some industries require synchronous replication, the vast majority find asynchronous replication acceptable for their business requirements.
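To put the distance point in rough numbers, here is a back-of-the-envelope sketch of the minimum round-trip delay a synchronous write picks up from propagation alone. It assumes light moves through fiber at roughly 200,000 km/s (about 5 microseconds per km one way) and ignores switching, protocol overhead and array service time, so real-world figures will be higher. The function name is ours, purely for illustration.

```python
# Minimum propagation delay added to each synchronous write.
# Assumption: ~5 microseconds per km one way in optical fiber (approximation).
KM_PER_MILE = 1.609344
FIBER_US_PER_KM = 5.0


def sync_round_trip_ms(distance_miles: float) -> float:
    """Return the round-trip propagation delay in milliseconds."""
    distance_km = distance_miles * KM_PER_MILE
    # The write travels to the remote site and the acknowledgement travels back.
    return 2 * distance_km * FIBER_US_PER_KM / 1000.0


for miles in (30, 50, 500):
    print(f"{miles:>4} miles: ~{sync_round_trip_ms(miles):.2f} ms added per write")
```

At metro distances the penalty stays under a millisecond, while at hundreds of miles it grows to several milliseconds per write, which is significant next to all-flash response times.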
With XIOS 6.1, WAN optimization is a key emphasis and XtremIO is well-suited to help execute on this goal. It's important to remember that XtremIO does its data compression and deduplication inline, and the arrays rely on metadata tables for pointers as to where to find a duplicated block. This efficiency lends itself well to replication for a few reasons. First, only unique data is sent to the remote site; blocks that already exist on the target never go over the network. Second, only compressed data is sent to the remote site. These factors give XtremIO a significant optimization advantage that helps to ensure recovery goals are met. The replication load on the system is minimal, so there's never a need to disable replication or worry about a performance hit to the other native data services. While competing arrays also offer replication, Dell EMC argues that XtremIO is one of the most efficient thanks to its underlying architecture.
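The general idea of "send only unique, compressed blocks" can be sketched in a few lines. This is a conceptual illustration only, not XtremIO's actual implementation: the block size, the SHA-256 fingerprint, the zlib compression and the dictionary standing in for the target array are all our assumptions.

```python
import hashlib
import zlib

BLOCK_SIZE = 8192  # hypothetical block size, chosen only for the example


def fingerprint(block: bytes) -> str:
    """Content fingerprint used to recognize duplicate blocks."""
    return hashlib.sha256(block).hexdigest()


def replicate(blocks, target_store: dict) -> int:
    """Ship blocks the target lacks, compressed; return payload bytes sent."""
    bytes_sent = 0
    for block in blocks:
        fp = fingerprint(block)
        if fp in target_store:
            # Target already holds this content, so no payload crosses the WAN.
            continue
        payload = zlib.compress(block)
        bytes_sent += len(payload)
        target_store[fp] = payload
    return bytes_sent


source_blocks = [b"A" * BLOCK_SIZE, b"B" * BLOCK_SIZE, b"A" * BLOCK_SIZE]
target = {}
first_pass = replicate(source_blocks, target)
second_pass = replicate(source_blocks, target)
print(f"first pass sent {first_pass} bytes, second pass sent {second_pass} bytes")
```

In this toy run the duplicate block is skipped on the first pass, and a second pass over unchanged data sends no payload at all.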
Where the replication can get even more fun from an efficiency perspective is in multi-site, fan-in configs. So rather than being a one-to-one ratio of main-to-remote sites, an organization could have several XtremIO arrays replicating to a single target. Because global dedupe is in play, Dell EMC anticipates a further 38% capacity savings (4:1 fan-in setup) for the replication target array. Again, there are massive WAN gains as well due to the reduced flow of data over the fabric.
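A toy model shows why fan-in helps: when several sources hold some of the same blocks, a globally deduplicated target stores each one only once. The site names and block labels below are made up for illustration, and the resulting percentage has no connection to Dell EMC's 38% estimate.

```python
def fan_in_savings(sites: dict) -> float:
    """Fraction of capacity saved on a target that dedupes across all sources."""
    total_blocks = sum(len(blocks) for blocks in sites.values())
    unique_blocks = len(set().union(*sites.values()))
    return 1 - unique_blocks / total_blocks


# Hypothetical 4:1 fan-in: each site shares two common blocks and has one of its own.
sites = {
    "site-a": {"os-image", "db-binaries", "data-a"},
    "site-b": {"os-image", "db-binaries", "data-b"},
    "site-c": {"os-image", "db-binaries", "data-c"},
    "site-d": {"os-image", "db-binaries", "data-d"},
}
print(f"capacity saved on target: {fan_in_savings(sites):.0%}")
```

The same overlap that saves capacity on the target also means the shared blocks only need to cross the WAN once.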
With the multiple replication use cases, XtremIO is designed around being highly configurable in terms of how the ports are assigned for local storage access versus dedicated links for replication. In this setup, the unit has 2 SFP+ 10G connectors as well as one 10Gbase-T connection per controller allocated for replication, with two ports per controller configured for FC access.
In addition to the replication news, Dell EMC also has launched the XtremIO X2-T. The X2-T offers all the features and data services of the XtremIO family in a lower-cost configuration designed for the midrange. X2-T comes in a single X-Brick configuration that scales from 34.5TB raw up to 69.1TB raw or 369TB effective capacity given a 6:1 storage efficiency. X2-T is available May 3rd.
Features don't mean a lot if they're cumbersome or confusing to implement. Dell EMC has spent a great deal of time ensuring that replication configuration is simple. The entire process is wizard-driven, making setup of replication and retention policies quick and intuitive. A wizard inside the Data Protection menu guides the administrator through the process and is approachable for novice and experienced users alike. Of course, a CLI is available for advanced users; there's good demand for both the CLI and the HTML interface.
The first step prior to creating a protection session is creating a consistency group based on the volumes you plan on protecting. This can be done by clicking the "New" button and entering a consistency group name.
Next you select the volumes you want to be inside of it. Volumes can be manually selected from the full list or narrowed down through a keyword search. This process is simple, yet critical: matching the volumes at both sites keeps the main array and replication target in parity and eliminates user error. When finished, click "Apply."
With the consistency group in place, you move into the Data Protection menu. This is where you can view existing Protected Entities and get high-level information of each at a quick glance.
To create a new Protection Session, you start by selecting the consistency group for your volumes. In this case, we use our previously created Linux-Prod-01 group with 8 volumes. In the same screen you can also select whether this will be a remote or local protection type.
As a remote protection type, the next screen is where you select the target cluster.
In the following menu, the XtremIO will either automatically create a consistency group on the target cluster of the same name you created locally, or allow administrators to create one manually. Target-volume access restrictions are also set at this stage.
Next, administrators are able to set the RPO (which can be adjusted anywhere from 30 seconds to 1 day) as well as the source retention policy. These are highly customizable and can be tuned for the exact application running on the storage group being protected, accounting for elements like change rate and WAN impact/costs. The RPO should be monitored for compliance, as the rate of data change and the bandwidth of the connection between the two systems will, to a certain extent, dictate how fast data can be replicated.
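A quick sanity check on whether a given RPO is realistic can be done with simple arithmetic. The sketch below is our own simplification: it assumes a steady change rate, a fixed data reduction ratio on the wire and a link dedicated to replication, and the function names and sample numbers are hypothetical.

```python
def transfer_seconds(change_rate_mb_s: float, rpo_seconds: float,
                     reduction_ratio: float, link_mbit_s: float) -> float:
    """Seconds needed to ship the data changed during one RPO interval."""
    changed_mb = change_rate_mb_s * rpo_seconds
    # Convert megabytes to megabits, then apply dedupe/compression savings.
    wire_mbit = changed_mb * 8 / reduction_ratio
    return wire_mbit / link_mbit_s


def rpo_is_feasible(change_rate_mb_s: float, rpo_seconds: float,
                    reduction_ratio: float, link_mbit_s: float) -> bool:
    """The RPO holds only if each cycle's changes transfer within the interval."""
    needed = transfer_seconds(change_rate_mb_s, rpo_seconds,
                              reduction_ratio, link_mbit_s)
    return needed <= rpo_seconds


# Hypothetical workload: 100 MB/s of changes, 4:1 reduction, 30-second RPO.
for link in (1000, 100):
    secs = transfer_seconds(100, 30, 4, link)
    ok = rpo_is_feasible(100, 30, 4, link)
    print(f"{link} Mbit/s link: {secs:.0f} s per cycle, feasible={ok}")
```

With these made-up inputs a 1 Gbit/s link clears each cycle in 6 seconds, while a 100 Mbit/s link would need 60 seconds and fall behind a 30-second RPO.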
Finally, the target retention policy is selected, which can be duplicated from the source array, or can be customized separately for the target system (depending on the use case). This allows for multiple copies at either location based on scheduling that is dependent on an organization's snapshot requirements, i.e., how much protection they want at intervals like minute, hour, day, week and so on.
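To make tiered retention concrete, here is a minimal sketch of how a schedule such as "keep the last 5 per-minute copies and the last 2 hourly copies" could be evaluated. It is not how XtremIO implements retention; the policy format, the function name and the sample timestamps are assumptions for illustration.

```python
from datetime import datetime, timedelta


def select_retained(snapshots, now, policy):
    """Keep the newest snapshot in each of the most recent buckets per tier.

    policy is a list of (bucket_length, bucket_count) pairs.
    """
    keep = set()
    for bucket, count in policy:
        for i in range(count):
            end = now - i * bucket
            start = end - bucket
            in_bucket = [s for s in snapshots if start < s <= end]
            if in_bucket:
                keep.add(max(in_bucket))
    return sorted(keep)


now = datetime(2018, 5, 1, 12, 0)
# Hypothetical history: one snapshot per minute over the last three hours.
snapshots = [now - timedelta(minutes=m) for m in range(180)]
policy = [(timedelta(minutes=1), 5), (timedelta(hours=1), 2)]

retained = select_retained(snapshots, now, policy)
print(f"{len(retained)} of {len(snapshots)} snapshots retained")
```

A source and target could run the same function with different policies, which mirrors the option of either duplicating the source retention policy or customizing it for the target.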
The guide completes with a summary window showing the selected options, with an option to finish, or finish and start the protection policy immediately.
After the Data Protection Policy is put into place, you can view its matching information, such as the volumes below, as they start to populate on the target array.
With the new Data Protection Session in place, the same interface window allows administrators to start and stop protection services depending on the circumstances. This can be useful if a policy is running much longer than normal, or if something needs to be replicated outside of the window specified.
After the Data Protection is in place, the XtremIO interface gives administrators visibility into the day-to-day information. At a quick glance you can view the bandwidth on the connection, how much data has traversed between the units, the RPO compliance as well as many other high-level stats. Protected sessions not meeting their SLA requirements are immediately known, while those working as intended blend away into the background.
Internal reporting through the XtremIO array is very helpful when monitoring Data Protection in terms of the saturation levels of your outbound links, as well as knowing how much headroom is left based on spikes in bandwidth throughout the day or week. Below we see the target array and the bandwidth used for Remote Protection.
With Data Protection Sessions in place, administrators are also able to start a failover process manually, leveraging either the local system, or failing over to the remote system entirely.
Of course, the headline with this XtremIO update is native replication, but let's not look past the XtremIO X2-T as well. This new single X-Brick scales from 34.5TB raw up to 69.1TB raw or 369TB effective capacity given a 6:1 storage efficiency. The X2-T gives Dell EMC a new tool to bring XtremIO's power and deep feature set to midmarket and smaller remote operations. Additionally, within the context of the replication news, the X2-T gives customers a lower-cost replication target when they aren't pushing the capacity of their XtremIO boxes, as in something like VDI where the inline data reduction services can be extremely effective.
Digging into the native asynchronous replication specifically, XtremIO was clearly built for this role. Its emphasis on metadata as the authority for data residence makes XtremIO an ideal platform when considering replication efficiency. Because all of the data on the host system is compressed, deduped and compared to known data inline, XtremIO writes only occur for new blocks, while existing blocks get a pointer update in the metadata. This efficiency extends to the replication site now as well with XIOS 6.1, as only new data is passed over the WAN. These network benefits get extended further in a fan-in scenario where up to 4 XtremIO arrays replicate to a single target.
These efficiencies mean little, however, if replication is difficult to set up and manage or if the results are unreliable. Above we walked through the very simple wizard-driven process for configuring consistency groups and data protection policies. While all of this is available via CLI as well, XIOS 6.1 makes the process easy enough that non-storage experts should be able to handle the task with ease. The built-in checks and failover testing are accessed through intuitive drop-downs and menus that help ensure policies are properly configured. There is also an easy-to-understand dashboard that confirms (or not) that your RPO objectives are being met with a detailed view of your SLAs and other relevant information.
We did not complete a deep dive of performance as part of this review, though we did have several workloads running against the X-Bricks under test for the duration of our replication configuration. This is not entirely scientific, but with several replication jobs set up and running, we did not see any noticeable performance drop as those jobs continuously kicked off and completed within their 30-second windows. This is attributed again to the efficiency of the way XtremIO was designed from day one. Even though replication wasn't present then, the XtremIO team had a vision for where they wanted to be and with XIOS 6.1, they're another big step along the path. As replication was the most requested feature by XtremIO customers, there's likely to be a little extra buzz around Dell Technologies World this week as XtremIO customers anticipate what they'll be able to do once the XIOS update hits their boxes. XIOS 6.1 with replication is available now as a free update to XtremIO X2 customers.