July 25th, 2019 by StorageReview Enterprise Lab
The Importance of Hardware in Microsoft Azure Stack HCI
In a previous article we looked at Microsoft Azure Stack HCI, an on-premises implementation of the Microsoft Azure cloud service, and at how a Microsoft-certified hardware vendor (DataON) has been integrating it with its hardware to provide a turnkey solution. In this article we take a deeper look at that hardware and at the relationship DataON has built with its hardware provider to ensure that DataON customers have a platform built on reliable and technologically advanced components.
Just as DataON has built a relationship with Microsoft for software, it has built a relationship with Intel for hardware. DataON has been an Intel Platinum partner for 30 years and has several solutions verified as Intel Select Solutions, including its recently verified Intel Select Solution for Microsoft Azure Stack HCI V2. Because Intel Select Solutions are verified to work with the latest Intel technologies, IT administrators can adopt new technologies faster, which allows the business to respond more rapidly to changing demands. To be verified as an Intel Select Solution, DataON had to meet the performance benchmarks and validation requirements set by Intel and then write an implementation guide that allows for fast and easy deployment of a workload-optimized solution.
Using the latest technology often leads to compatibility issues, which DataON minimizes by using only Intel server systems and chassis. Intel uses its own servers both to develop new technology (processors, storage, etc.) and to showcase it, and then uses what it learns to create reference architectures for others to use when building systems.
DataON’s premier partner status with Intel allows it to be first to market with servers that use the latest Intel technology, often months before other server vendors can bring similar products to market. An example of this was seen on June 20, 2019, when we went to Microsoft’s Azure Stack HCI website and noticed that DataON had the only servers certified by Microsoft with Intel’s newest memory innovation, Intel Optane DC Persistent Memory (PMEM).
DataON incorporates the latest Intel processors into its systems and was one of the first vendors to offer the Intel Y series of 2nd generation Intel Xeon Scalable processors to its customers. Intel’s Y series of processors (8260Y / 6240Y / 4214Y) is interesting because it is the first to support Intel Speed Select Technology - Performance Profile (SST-PP) mode, which allows a processor to be reconfigured as either a lower core-count, higher-frequency processor or a higher core-count, lower-frequency processor. Intel SST-PP should not be confused with Intel Speed Select Technology - Base Frequency (SST-BF) mode, which allows a processor to be deployed with an asymmetric core frequency configuration in which some cores in the CPU run at a higher frequency than others.
As SST-PP is a new technology, we wanted to dive deeper into it. We found that by using SST-PP the 8260Y can be configured at three different core counts and frequencies. The chart below shows how the core count affects the base and turbo frequency of the processor.
| Configuration | Cores | Base Frequency | Turbo Frequency |
An SST-PP enabled processor has some interesting use cases. A server with an SST-PP capable processor could provide virtual desktops (which need many cores but do not need them to run that fast) during the day. At night, when it is not being used for virtual desktops, it could be reconfigured to have a faster base speed with fewer cores and used to run analytics, which tend to be single threaded and would take advantage of fewer but faster cores. Another use case is limiting the number of servers that need to be approved and qualified by an organization while still having the ability to deploy 24, 20, or 16 core servers to match business requirements. A business could even use an SST-PP enabled processor to manage software licensing: a customer could limit the number of cores a processor exposes in order to use a less expensive license and then, if the workload grows, increase the core count and purchase additional licensing without switching servers. Currently Intel has three processors that support SST-PP technology: the 8260Y, the 6240Y, and the 4214Y.
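The licensing use case comes down to simple selection logic, which can be sketched in a few lines of Python. This is only an illustration: the function name `pick_core_count` and the idea of a single "licensed cores" number are assumptions made for the example, and the core counts are the 24, 20, and 16 core options mentioned in the text. How a profile is actually applied to the processor is outside the scope of this sketch.

```python
# Core counts the article lists for SST-PP deployments (24, 20, or 16 cores).
SST_PP_CORE_COUNTS = (24, 20, 16)


def pick_core_count(licensed_cores: int) -> int:
    """Return the largest SST-PP core count covered by the license.

    `licensed_cores` is a hypothetical input: the number of cores per
    processor that the customer has paid to license.
    """
    for cores in sorted(SST_PP_CORE_COUNTS, reverse=True):
        if cores <= licensed_cores:
            return cores
    raise ValueError("license covers fewer cores than the smallest profile")


# A customer licensed for 20 cores runs the 20-core profile today and can
# move to the 24-core profile later by buying more licenses, not new servers.
print(pick_core_count(20))
print(pick_core_count(24))
```

The same lookup would fit the day and night scenario: a scheduler could request the highest core count during desktop hours and the lowest one for overnight analytics.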
DataON is one of the first vendors that we have seen use Y series processors in its servers. This shows DataON’s dedication to providing its customers with the latest technology and speaks loudly to its relationship with Intel. Not all DataON customers will opt for servers with the slightly more expensive Y series processors, but it is nice to see that DataON is one of the few vendors that gives them this option.
For storage, DataON uses Intel devices for reliability and performance. For the highest performance, DataON uses Intel’s latest storage innovation, Optane, which is based on 3D XPoint technology. Optane technology is used in NVMe drives and in persistent memory (PMEM), a new storage layer.
Intel first released Optane NVMe drives in Q1 of 2017. These drives, as do all NVMe drives, attach to a system via the PCIe bus. At the time, our testing of Optane-backed NVMe drives showed that for low-latency workloads nothing came close to the drive that we tested. Because of the performance of Intel’s Optane NVMe drives, DataON uses them for cache in its high-performance HCI systems.
Intel released a new line of PMEM storage devices (named Optane DC Persistent Memory) in Q2 2019. PMEM devices plug into DDR4 slots, where they have an unencumbered path to the CPU. While PMEM is slower than DRAM, it is many times faster than NVMe, and unlike DRAM the information stored on PMEM is nonvolatile and remains even after a system reboot. Intel labels its PMEM devices.
DRAM is currently capped at 128GiB per module, but PMEM has a maximum capacity of 512GiB per module and is less expensive per GiB than DRAM. These characteristics make PMEM attractive for high-performance HCI servers. PMEM does require a processor and motherboard designed specifically to work with it; fortunately, most of Intel’s second-generation Xeon processors have been designed to work with it.
As proof of the performance of PMEM, Microsoft states on its website that, in conjunction with Intel, it set a new HCI benchmark record of 13.7 million IOPS using Storage Spaces Direct on Windows Server 2019 on a server with Intel Optane DC PMEM.
As mentioned above, DataON was the first vendor to have servers certified by Microsoft for use with PMEM. To leverage the performance of PMEM effectively, an entire solution needs to be designed with it in mind so that nothing else becomes a bottleneck. The biggest potential bottleneck is the interconnect used to pass data between nodes in a cluster.
While 10GbE is prevalent in the enterprise, there are many advantages to moving the HCI interconnect to faster speeds. As solutions like those from DataON move to faster storage devices and processors, the networking fabric quickly becomes the performance limitation. Early-generation solutions used dual-port 10G connections in a failover configuration, giving each node an effective 10Gb (roughly 1GB/s) connection speed and capping effective cluster storage bandwidth at just over 4GB/s. While impressive at first glance, this is much slower than what the underlying storage technology is capable of. In fact, some solutions are close to surpassing this bandwidth limit on small-block I/O workloads such as random 4K, not just on large-block sequential streams. The move to 25Gb Ethernet connectivity pushes node-to-node traffic up to 2.5GB/s and allows aggregate cluster performance to reach as high as 10GB/s.
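The bandwidth figures above follow from simple arithmetic, sketched here in Python. Two assumptions are made that the article does not state outright: a four-node cluster (inferred from the 1GB/s per node and roughly 4GB/s cluster figures) and the article’s rough rule of thumb that 10Gb/s of link speed yields about 1GB/s of usable throughput. The function names are illustrative only, and the 40GbE and 100GbE rows simply apply the same rule to the fabric speeds mentioned for DataON’s clusters.

```python
# Rule of thumb implied by the article: 10 Gb/s of link speed is treated as
# about 1 GB/s of usable throughput (roughly 10 bits on the wire per byte).
BITS_PER_BYTE_ON_WIRE = 10


def node_bandwidth_gbytes(link_gbits: float) -> float:
    """Approximate per-node bandwidth in GB/s for one active link.

    In a failover (active/passive) pair only one port carries traffic,
    so a dual-port adapter still counts as a single link here.
    """
    return link_gbits / BITS_PER_BYTE_ON_WIRE


def cluster_bandwidth_gbytes(link_gbits: float, nodes: int = 4) -> float:
    """Aggregate bandwidth if every node drives its link at full rate.

    The default of four nodes is an assumption inferred from the article.
    """
    return node_bandwidth_gbytes(link_gbits) * nodes


for link in (10, 25, 40, 100):
    per_node = node_bandwidth_gbytes(link)
    total = cluster_bandwidth_gbytes(link)
    print(f"{link:>3} GbE: {per_node:.1f} GB/s per node, "
          f"{total:.1f} GB/s across 4 nodes")
```

Under these assumptions the 10GbE and 25GbE rows reproduce the approximate 4GB/s and 10GB/s cluster figures quoted above; real-world throughput will vary with protocol overhead and workload.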
For the node-to-node interconnect on mainstream solutions, DataON uses a Mellanox RDMA RoCE v2 SMB3 40GbE fabric with two switches for redundancy. For clusters that use Optane PMEM, DataON uses 100GbE switches and adaptors to support the increased bandwidth that PMEM requires.
DataON’s Microsoft Azure Stack HCI solution is certified by both Microsoft and Intel, and DataON is the first vendor to be certified by Microsoft for use of PMEM with Microsoft Azure Stack HCI. These certifications speak volumes about DataON’s ability to deliver a quality solution that incorporates the latest technology and, more importantly, they allow DataON customers to leverage the newest technology in their datacenters.