December 11th, 2015 by StorageReview Enterprise Lab
Memblaze PBlaze4 2.5" NVMe SSD Review
The PBlaze4 enterprise SSD platform is a new generation of PCIe flash storage from Memblaze, built with PMC’s Flashtec NVMe controllers and Toshiba MLC NAND. This hardware profile powers Memblaze’s MemSpeed and MemSolid SSD optimization technologies as an NVMe-compliant SSD solution with performance and endurance specifications oriented towards hyperscale datacenters and other enterprise applications. Like the PBlaze3, this family of drives is capable of offloading many of the processes that are handled by host CPUs with other SSD architectures.
PBlaze4 drives are available in capacities from 800GB to 3.2TB in both 2.5-inch and Add-in Card (AIC) form factors. A 6.4TB (8TB raw capacity) AIC is the largest PBlaze4 drive available. Our review will focus on one 2.5-inch NVMe PBlaze4 as well as one NVMe edge card variant. The PBlaze4 2.5-inch form factor is hot plug, hot removal, and hot swap compliant. The PBlaze4 series also incorporates a power loss capacitor to preserve data in transit, with a data integrity guarantee for data in non-volatile NAND media and any cached writes.
PBlaze4 is Non-Volatile Memory Express (NVMe) 1.1b and PCIe Gen3 compliant, providing native driver support across many common operating systems including recent versions of Windows, Linux, and VMware. UEFI motherboards can also boot from PBlaze4 drives. The PBlaze4 utilizes a Pseudo-SLC (pSLC) memory management mode that has been engineered to allow MLC to emulate the speed and durability of SLC. PBlaze4 drives designate the portions of the memory used for metadata as pSLC for greater metadata protection and reliability. Two other key technical systems that PBlaze4 brings to the table are MemSpeed and MemSolid, which are designed to improve PCIe SSD performance, reliability, and QoS.
Memblaze PBlaze4 Specifications
- User capacity: 800GB, 1.2TB, 1.6TB, 2.4TB, 3.2TB, 6.4TB
- Sequential Read (128kb): 2.2 GB/s, 2.8 GB/s, 2.8 GB/s, 2.8 GB/s, 2.8 GB/s, 3.4 GB/s
- Sequential Write (128kb): 700 MB/s, 1.4 GB/s, 1.4 GB/s, 2.2 GB/s, 2.2 GB/s, 2.5 GB/s
- Sustained Random Read (4kb) IOPS: 600k, 740k, 750k, 730k, 740k, 800k
- Sustained Random Write (4kb) IOPS (100% span): 60k, 240k, 150k, 320k, 200k, 250k
- Lifetime Endurance (Drive Writes per Day): 3, 4, 3, 4, 3, 3
- Latency Read/Write: 90us/20us
- Uncorrectable Bit Error Rate: < 1 sector per 10^17 bits read
- Mean Time Between Failures: 2 million hours
- Form Factor: 2.5”, HHHL (FHHL for the 6.4TB version)
- Interface: PCIe 3.0 x 4 (PCIe 3.0 x 8 for the 6.4TB version)
- Protocol: NVMe
- NAND Flash Memory: MLC
- Operating System: RHEL, SLES, CentOS, Ubuntu, Windows Server, VMware ESXi
- Power Consumption: <25W (<35W for the 6.4TB version)
- Operating Temperature:
- AIC: 0 – 55℃ ambient temperature with suggested airflow
- 2.5’’: 0–35℃ ambient temperature with suggested airflow, 0-70℃ case temperature
- Airflow (LFM): 300@25℃ (450@25℃ for the 6.4TB version)
- Software Support: CLI Management Tool, OS in-box driver
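The endurance row in the specifications above is a per-day rating, which can be harder to compare than a total-bytes-written figure. The short sketch below converts the listed ratings into total terabytes written. The 5-year service period is an assumption made for illustration only, since the specification list does not state the period the rating applies to, and the function and dictionary names are hypothetical.

```python
# Rough endurance arithmetic: drive writes per day -> total terabytes written.
# ASSUMPTION: the 5-year service period is hypothetical; the spec list
# does not state the period over which the endurance rating applies.

def total_tb_written(capacity_tb: float, dwpd: float, years: float = 5.0) -> float:
    """Return the total terabytes written implied by a per-day endurance rating."""
    return capacity_tb * dwpd * 365 * years

# User capacity (TB) -> rated drive writes per day, taken from the spec list.
PBLAZE4_DWPD = {0.8: 3, 1.2: 4, 1.6: 3, 2.4: 4, 3.2: 3, 6.4: 3}

if __name__ == "__main__":
    for capacity_tb, dwpd in PBLAZE4_DWPD.items():
        tbw = total_tb_written(capacity_tb, dwpd)
        print(f"{capacity_tb} TB at {dwpd} writes/day: about {tbw:,.0f} TB written")
```

Under that assumed period, a 1.6TB drive rated for 3 drive writes per day works out to 1.6 x 3 x 365 x 5 = 8,760 TB written. A different service period scales the result linearly.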
Design and Build
The Memblaze PBlaze4 is an NVMe SSD that has a 2.5” form factor with a 15mm Z-height. It is thicker than other SSDs but it will still fit within most arrays as many are designed to house 3.5” HDDs as well as the thicker 2.5” HDDs. The outer housing is brushed metal with a sticker on top with branding. The drive also has several grooves in line with the sticker to improve heat dissipation. There are four screws on top that hold the case together.
Along the side of the drive are the mounting holes (2 on each side) as well as a sticker with the model/serial number.
The bottom of the drive has grooves running across the entire surface for heat dissipation. There are 4 holes for mounting. The PCIe interface is also visible on the bottom; although it looks similar to a SAS interface, it enables much higher performance.
Opening up the drive one can see the dual PCB connected with a ribbon cable. Inside we can see that the PBlaze4 uses a PMC Flashtec NVMe Controller and MLC NAND.
Testing Background and Comparables
The StorageReview Enterprise Test Lab provides a flexible architecture for conducting benchmarks of enterprise storage devices in an environment comparable to what administrators encounter in real deployments. The Enterprise Test Lab incorporates a variety of servers, networking, power conditioning, and other network infrastructure that allows our staff to establish real-world conditions to accurately gauge performance during our reviews.
We incorporate these details about the lab environment and protocols into reviews so that IT professionals and those responsible for storage acquisition can understand the conditions under which we have achieved the following results. None of our reviews are paid for or overseen by the manufacturer of equipment we are testing. Additional details about the StorageReview Enterprise Test Lab and an overview of its networking capabilities are available on those respective pages.
We tested the Memblaze while comparing it to the following other NVMe SSDs:
- Samsung XS1715 Enterprise NVMe SSD
- Intel SSD DC P3700
Application Workload Analysis
In order to understand the performance characteristics of enterprise storage devices, it is essential to model the infrastructure and the application workloads found in live production environments. Our first benchmarks for the Memblaze PBlaze4 are therefore the MySQL OLTP performance via SysBench and Microsoft SQL Server OLTP performance with a simulated TPC-C workload. For our application workloads each drive will be running 2-4 identically configured VMs.
StorageReview’s Microsoft SQL Server OLTP testing protocol employs the current draft of the Transaction Processing Performance Council’s Benchmark C (TPC-C), an online transaction processing benchmark that simulates the activities found in complex application environments. The TPC-C benchmark comes closer than synthetic performance benchmarks to gauging the performance strengths and bottlenecks of storage infrastructure in database environments. Each instance of our SQL Server VM for this review uses a 333GB (1,500 scale) SQL Server database and measures the transactional performance and latency under a load of 15,000 virtual users.
When looking at SQL Server output, the Memblaze drive had a top TPS of 3,157.235 with an aggregate of 3,157.112 TPS. The top performer here was the Intel SSD DC P3700, which recorded an aggregate of 3,157.341 TPS.
Average latency results during the 15k user SQL Server benchmark showed the Memblaze drive, with an aggregate of 7.5ms, just slightly behind the Samsung and Intel SSDs (both of which posted 7.0ms).
The next application benchmark consists of a Percona MySQL OLTP database measured via SysBench. This test measures average TPS (Transactions Per Second), average latency, as well as average 99th percentile latency. Percona and MariaDB are using the Fusion-io flash-aware application APIs in the most recent releases of their databases, although for the purposes of this comparison we test each device in its "legacy" block-storage mode.
In the average transactions per second benchmark, the Memblaze was slightly edged out by the Intel SSD DC P3700. The Memblaze’s top performance of a single VM was 5,717.2 TPS, though its aggregate was 1,429.8, while the Intel SSD DC P3700 showed the best results with 5,779.7 TPS aggregate.
When looking at average latency results, the Memblaze was behind the Intel again, with individual VMs running between 22.34ms and 22.42ms and an aggregate latency of 22.38ms. The Intel drive ranked at the top of the leaderboard with 22.15ms aggregate.
In terms of our worst-case MySQL latency scenario (99th percentile latency), the Memblaze showed VMs running between 58.00ms and 58.03ms while the top performing Intel drive boasted an impressive aggregate of just 45.97ms.
Enterprise Synthetic Workload Analysis
Flash performance varies as the drive becomes conditioned to its workload, meaning that flash storage must be preconditioned before each of the fio synthetic benchmarks in order to ensure that the benchmarks are accurate. Each of the comparable drives is secure erased using the vendor's tools and preconditioned into steady-state with a heavy load of 16 threads and an outstanding queue of 16 per thread.
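To make the preconditioning load concrete, here is a minimal sketch that builds an fio command line with 16 jobs and a queue depth of 16 per job. It is not the lab's actual job file: the device path, the run time, and the choice of the libaio engine with direct I/O are assumptions for illustration, and the helper name is hypothetical. The script only assembles and prints the command. Running it against a real device would overwrite that device's contents.

```python
import shlex
from typing import List, Optional

def precondition_cmd(device: str, block_size: str = "4k", rw: str = "randwrite",
                     threads: int = 16, queue_depth: int = 16,
                     runtime_s: int = 3600,
                     rwmixread: Optional[int] = None) -> List[str]:
    """Build (but do not run) an fio command for a preconditioning pass."""
    args = [
        "fio",
        "--name=precondition",
        f"--filename={device}",
        "--ioengine=libaio",         # assumption: Linux asynchronous I/O engine
        "--direct=1",                # bypass the page cache
        f"--rw={rw}",
        f"--bs={block_size}",
        f"--numjobs={threads}",      # 16 threads
        f"--iodepth={queue_depth}",  # 16 outstanding I/Os per thread
        "--time_based",
        f"--runtime={runtime_s}",    # assumption: run time is illustrative
        "--group_reporting",
    ]
    if rwmixread is not None:
        # Only meaningful for mixed workloads such as rw=randrw.
        args.append(f"--rwmixread={rwmixread}")
    return args

if __name__ == "__main__":
    # "/dev/nvme0n1" is a hypothetical device path.
    print(shlex.join(precondition_cmd("/dev/nvme0n1")))
    print(shlex.join(precondition_cmd("/dev/nvme0n1", block_size="8k",
                                      rw="randrw", rwmixread=70)))
```

The second call shows how the same helper could describe a mixed 8k workload with 70% reads by switching to `randrw` and setting the read percentage.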
- Preconditioning and Primary Steady-State Tests:
- Throughput (Read+Write IOPS Aggregate)
- Average Latency (Read+Write Latency Averaged Together)
- Max Latency (Peak Read or Write Latency)
- Latency Standard Deviation (Read+Write Standard Deviation Averaged Together)
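The four metrics in the list above can be sketched as a small calculation over one measurement interval. This sketch assumes that "averaged together" means a simple unweighted mean of the read and write figures. The class and field names are hypothetical, and the sample values are invented for illustration rather than taken from any drive in this review.

```python
from dataclasses import dataclass

@dataclass
class IntervalSample:
    """Read and write figures for one measurement interval (hypothetical layout)."""
    read_iops: float
    write_iops: float
    read_avg_ms: float
    write_avg_ms: float
    read_max_ms: float
    write_max_ms: float
    read_stdev_ms: float
    write_stdev_ms: float

def summarize(s: IntervalSample) -> dict:
    """Combine read and write figures into the four reported metrics."""
    return {
        # Throughput: read and write IOPS added together.
        "throughput_iops": s.read_iops + s.write_iops,
        # ASSUMPTION: unweighted mean of read and write average latency.
        "avg_latency_ms": (s.read_avg_ms + s.write_avg_ms) / 2,
        # Max latency: the higher of the read and write peaks.
        "max_latency_ms": max(s.read_max_ms, s.write_max_ms),
        # ASSUMPTION: unweighted mean of the two standard deviations.
        "latency_stdev_ms": (s.read_stdev_ms + s.write_stdev_ms) / 2,
    }

if __name__ == "__main__":
    # Invented numbers, for illustration only.
    sample = IntervalSample(100_000, 50_000, 0.5, 1.5, 5.0, 20.0, 0.1, 0.3)
    for name, value in summarize(sample).items():
        print(f"{name}: {value}")
```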
Once preconditioning is complete, each device is then tested in intervals across multiple thread/queue depth profiles to show performance under light and heavy usage. Our synthetic workload analysis for the Memblaze PBlaze4 uses two profiles, which are widely used in manufacturer specifications and benchmarks. It is important to take into consideration that synthetic workloads will never 100% represent the activity seen in production workloads, and in some ways inaccurately portray a drive in scenarios that wouldn't occur in the real world.
- 100% Read and 100% Write
- 70% Read/30% Write
In our throughput 4k write preconditioning test, the Memblaze was easily the most consistent drive, starting at roughly 160,000 IOPS and hitting a steady-state at around the same speed. Both the Samsung and Intel drives showed major spikes near the beginning of the test.
Next we look at average latency. Again, the Memblaze was the most consistent drive; however, it still finished behind the Intel drive when it reached its steady-state.
When measuring max latency, the Memblaze showed major spikes during the entirety of our test. The Intel SSD was the most stable drive by a noticeable margin, hovering around the 25ms latency mark throughout the test and without any major spikes.
Standard deviation calculations are designed to make it easier to visualize the consistency of the SSD latency performance results. In this scenario, the Intel drive started out strong, though it showed a significant spike in latency around the 22 minute mark. It remained fairly stable afterwards, taking the top spot at just under 1.5ms by the end.
During the primary 4k synthetic benchmark, the Memblaze drive was the top performer by far in the read column with an impressive 717,172 IOPS (reaching 148,111 IOPS write) while the Intel drive showed the best write performance with 172,672 IOPS.
Results were more or less the same when looking at average latency, with the Memblaze boasting an impressive 0.36ms read and 1.73ms write. The Intel drive placed just behind the Memblaze with 0.56ms read and 1.48ms write.
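As a small illustration of why standard deviation is reported alongside average latency, the sketch below compares two invented latency traces that share the same mean but differ in consistency. The traces and the helper name are hypothetical, and a population standard deviation is used here as one reasonable choice.

```python
import statistics
from typing import List, Tuple

def consistency(latencies_ms: List[float]) -> Tuple[float, float]:
    """Return (mean, population standard deviation) for a latency trace."""
    return statistics.mean(latencies_ms), statistics.pstdev(latencies_ms)

# Invented traces in milliseconds; both average out to the same value.
steady = [1.0, 1.1, 0.9, 1.0, 1.0]
spiky = [0.5, 0.5, 0.5, 0.5, 3.0]

if __name__ == "__main__":
    for name, trace in (("steady", steady), ("spiky", spiky)):
        mean_ms, stdev_ms = consistency(trace)
        print(f"{name}: mean {mean_ms:.2f} ms, stdev {stdev_ms:.3f} ms")
```

The two traces are indistinguishable by average latency alone, while the standard deviation separates the drive that responds evenly from the one that mostly responds quickly but occasionally stalls.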
In max latency, the Memblaze drive came in just behind the Samsung in reads with just 6.9ms, while its max write latency was much higher at 128.2ms, taking last by a significant margin.
Looking at standard deviation shows the same rankings, with the Memblaze drive posting 2.848ms for writes and 0.195ms for reads. The top performer here was the Samsung drive with 0.08ms read.
Our next workload uses 8k transfers with a ratio of 70% read operations and 30% write operations. Again, we will start off with the preconditioning results before switching to the main tests. In throughput, the Memblaze drive showed mid-range performance at the beginning of the test while ending up with a steady-state of over 176,000 IOPS for first among the comparables.
Average latency told a similar story, with the Memblaze showing 0.8ms at the beginning of the test and reaching a steady-state around 1.5ms for the top spot on the leaderboard. The Intel drive wasn’t far behind, with 1.6ms.
When looking at max latency, the Memblaze drive was by far the least consistent drive, showing huge latency spikes throughout, similar to our 4K preconditioning results. The most consistent drive here was the Intel P3700.
Standard deviation showed pretty inconsistent results across the board during the first portion of the test. Again, the Intel P3700 showed the best results, as it only peaked over 1.2ms on a handful of occasions, while the Memblaze performed fairly close to (but a bit better than) the Samsung drive during most of the test.
After the drives were fully preconditioned, we put them through our main 8k 70/30 test. In throughput, the Intel drive showed the best results throughout the majority of the test, only to fall behind the Memblaze at the last queue depth, where the Memblaze surpassed 166,250 IOPS.
Average latency mirrored the throughput results, with the Intel and Memblaze drives performing neck and neck until the very end, where the Memblaze pulled away at the terminal queue depths to take the lead.
Max latency showed the Memblaze with the highest latency, peaking over 70ms in the terminal queue depths. The Intel drive had the best overall results, as it only surpassed the 20ms mark on a handful of occasions. The Samsung XS1715, however, boasted the most consistent results with the fewest latency spikes.
Standard deviation demonstrated nearly identical performance between the Intel and Memblaze drives through most of the test; however, the Intel drive pulled away around the 16T8Q mark, posting the best overall results.
The PBlaze name is known as a vehicle for Memblaze to start with third party controllers and NAND in order to focus their own engineering efforts on proprietary performance, reliability, and host CPU offloading technologies. In this case, PMC Flashtec controllers and Toshiba MLC NAND provide the basic NVMe interface and storage components for the PBlaze4. With hot swap functionality (for the 2.5-inch variant) and robust power loss protection across the board, the PBlaze4 features the kind of data protection that data center and enterprise clients are coming to expect. Its thermal throttling and new Memblaze out of band management functionality also increase the variety of applications possible at large scales.
The Memblaze PBlaze4 certainly had a good showing for the most part during our tested workloads when we put it against the Samsung XS1715 1.6TB and Intel SSD DC P3700 NVMe SSDs. In our SQL Server test, the Memblaze drive boasted a top TPS of 3,157.235 as well as an aggregate of 3,157.112 TPS, while its aggregate average latency of 7.5ms placed it just slightly behind the Samsung and Intel SSDs (both of which posted 7.0ms). In our SysBench tests, we saw impressive performance with a top TPS of 5,717.2, an aggregate average latency of 22.38ms, and worst-case (99th percentile) latency with VMs running between 58.00ms and 58.03ms.
During our main synthetic benchmarks, the Memblaze drive posted an impressive 4k read throughput of 717,172 IOPS (while reaching 148,111 IOPS write). Looking at average latency, the Memblaze boasted an impressive 0.36ms read and 1.73ms write, while posting max latency and standard deviation read results of 6.9ms and 0.195ms, respectively. Performance remained very good during our 8k 70/30 workloads, as it boasted peak throughput that surpassed 166,250 IOPS as well as the best average latency of the tested drives; however, it performed poorly in max latency readings.
- Impressive performance during our main synthetic workloads
- Ranks high in application performance
- Slightly slower SQL Server performance
The Memblaze PBlaze4 NVMe SSD offers terrific performance in a 2.5” form factor that would be ideal for hyperscale data centers.