April 6th, 2018 by Lyle Smith
Intel SSD DC P4500 Review
The Intel SSD DC P4500 Series is part of the company’s Intel 3D NAND SSD line and is designed to handle software-defined cloud infrastructures across the multi-cloud environment, specifically to reduce server footprint. Available in 2.5-inch PCIe, HHHL PCIe, and “ruler” form factors, the P4500 offers a ton of flexibility and will suit a range of different physical use cases. The Intel drive also features NVMe queue pair-to-CPU core mapping, which supports both high drive counts and scaling across multiple SSDs on Intel platforms.
The P4500 SSD is powered by an all-new NVMe controller, and Intel indicates that it has been optimized for read-intensive workloads and to maximize CPU utilization. As such, the Intel line is quoted to reach up to 3,300MB/s and 1,900MB/s in sequential reads and writes, respectively, and upwards of 645,000 IOPS and 65,600 IOPS in 4K random read and write performance. We will be looking at the 2TB 2.5-inch model, which is quoted to deliver results in the lower part of the P4500 line's performance range: 3,200MB/s read and 1,050MB/s write in sequential performance, and a potential 490,000 IOPS read and 38,000 IOPS write in random performance.
As far as reliability features go, the Intel P4500 comes with some pretty useful built-in end-to-end data protection for extra peace of mind. This includes protection from silent data corruption and Power Loss Imminent (PLI) technology, the latter of which safeguards drive content from unforeseen power loss via its power management chips, capacitors, firmware algorithms, and a built-in PLI self-test.
Backed by a 5-year warranty, the 2.5-inch Intel P4500 comes in capacities of 1TB, 2TB, and 4TB.
Intel SSD DC P4500 Series Specifications
|Form factor||2.5-inch PCIe 3.1 x4|
|Capacity||1TB, 2TB, 4TB|
|NAND||3D NAND TLC|
|Random 4K Read||490,000 IOPS|
|Random 4K Write||38,000 IOPS|
|Mean Time Between Failures (MTBF)||2 million hours|
|Vibration operating||2.17 GRMS|
|Vibration non-operating||3.13 GRMS|
|Shock (Operating and Non-Operating)||1000 G/0.5 msec|
|Average power||Sequential Avg. 13.8W (Write), 9.5W (Read)|
|Warranty||5-year warranty with free technical support|
Our Enterprise SSD reviews leverage a Lenovo ThinkSystem SR850 for application tests and a Dell PowerEdge R740xd for synthetic benchmarks. The ThinkSystem SR850 is a well-equipped quad-CPU platform, offering CPU power well in excess of what's needed to stress high-performance local storage. Synthetic tests that don't require a lot of CPU resources use the more traditional dual-processor server. In both cases, the intent is to showcase local storage in the best light possible that aligns with storage vendor maximum drive specs.
Lenovo ThinkSystem SR850
- 4 x Intel Platinum 8160 CPU (2.1GHz x 24 Cores)
- 16 x 32GB DDR4-2666MHz ECC DRAM
- 2 x RAID 930-8i 12Gb/s RAID Cards
- 8 NVMe Bays
- VMware ESXi 6.5
Dell PowerEdge R740xd
- 2 x Intel Gold 6130 CPU (2.1GHz x 16 Cores)
- 16 x 16GB DDR4-2666MHz ECC DRAM
- 1 x PERC 730 2GB 12Gb/s RAID Card
- Add-in NVMe Adapter
Testing Background and Comparables
The StorageReview Enterprise Test Lab provides a flexible architecture for conducting benchmarks of enterprise storage devices in an environment comparable to what administrators encounter in real deployments. The Enterprise Test Lab incorporates a variety of servers, networking, power conditioning, and other network infrastructure that allows our staff to establish real-world conditions to accurately gauge performance during our reviews.
We incorporate these details about the lab environment and protocols into reviews so that IT professionals and those responsible for storage acquisition can understand the conditions under which we have achieved the following results. None of our reviews are paid for or overseen by the manufacturer of equipment we are testing. Additional details about the StorageReview Enterprise Test Lab and an overview of its networking capabilities are available on those respective pages.
Comparables for this review:
- Memblaze PBlaze5 3.2TB
- Memblaze PBlaze4 3.2TB
- Intel P3700 2TB
- Intel P4510 8TB, 2TB
- HGST SN100 3.2TB
- Toshiba PX04 1.6TB
Application Workload Analysis
In order to understand the performance characteristics of enterprise storage devices, it is essential to model the infrastructure and the application workloads found in live production environments. Our benchmarks for the Intel P4500 are therefore the MySQL OLTP performance via SysBench and Microsoft SQL Server OLTP performance with a simulated TPC-C workload. For our application workloads, each drive will be running 2-4 identically configured VMs.
The SysBench application benchmark consists of a Percona MySQL OLTP database. This test measures average TPS (transactions per second), average latency, and average 99th percentile latency.
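As a rough illustration of how these three metrics relate, the sketch below derives them from a list of per-transaction latencies. The sample values, the run length, and the function name `summarize` are hypothetical, and the nearest-rank percentile used here is an assumption; SysBench's own reporting may compute the percentile differently.

```python
import math

def summarize(latencies_ms, duration_s):
    """Return (average TPS, average latency in ms, 99th percentile latency in ms)."""
    ordered = sorted(latencies_ms)
    count = len(ordered)
    avg_tps = count / duration_s
    avg_latency = sum(ordered) / count
    # Nearest-rank percentile: the smallest sample that has at least
    # 99% of all samples at or below it.
    rank = math.ceil(99 * count / 100)
    p99_latency = ordered[rank - 1]
    return avg_tps, avg_latency, p99_latency

# Hypothetical data: 200 transactions over 2 seconds, three of them slow.
samples = [20.0] * 197 + [60.0, 70.0, 80.0]
tps, avg, p99 = summarize(samples, duration_s=2.0)
print(f"{tps:.1f} TPS, {avg:.2f} ms average, {p99:.1f} ms at the 99th percentile")
```

The average barely moves when a handful of transactions are slow, while the 99th percentile picks them up, which is why the review reports both.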
Each Sysbench VM is configured with three vDisks: one for boot (~92GB), one with the pre-built database (~447GB), and the third for the database under test (270GB). From a system resource perspective, we configured each VM with 16 vCPUs, 60GB of DRAM and leveraged the LSI Logic SAS SCSI controller.
Sysbench Testing Configuration (per VM)
- CentOS 6.3 64-bit
- Percona XtraDB 5.5.30-rel30.1
- Database Tables: 100
- Database Size: 10,000,000
- Database Threads: 32
- RAM Buffer: 24GB
- Test Length: 3 hours
  - 2 hours preconditioning 32 threads
  - 1 hour 32 threads
With the Sysbench transactional benchmark, we saw the Intel P4500 2TB place last with just 4,476.3 TPS.
Moving on to Sysbench average latency, the P4500 placed last again with 28.6ms.
Looking at our worst-case MySQL latency scenario (99th percentile latency), the P4500 placed at the bottom of the leaderboard with 53.7ms, just behind the HGST drive.
Houdini by SideFX
The Houdini test is specifically designed to evaluate storage performance as it relates to CGI rendering. The test bed for this application is a variant of the core Dell PowerEdge R740xd server type we use in the lab with dual Intel 6130 CPUs and 64GB DRAM. In this case, we installed Ubuntu Desktop (ubuntu-16.04.3-desktop-amd64) running bare metal. Output of the benchmark is measured in seconds to complete, with fewer being better.
The Maelstrom demo represents a section of the rendering pipeline that highlights the performance capabilities of storage by demonstrating its ability to effectively use the swap file as a form of extended memory. The test does not write out the result data or process the points in order to isolate the wall-time effect of the latency impact to the underlying storage component. The test itself is composed of five phases, three of which we run as part of the benchmark, which are as follows:
- Loads packed points from disk. This is the time to read from disk. This is single threaded, which may limit overall throughput.
- Unpacks the points into a single flat array in order to allow them to be processed. If the points do not have dependency on other points, the working set could be adjusted to stay in-core. This step is multi-threaded.
- (Not Run) Process the points.
- Repacks them into bucketed blocks suitable for storing back to disk. This step is multi-threaded.
- (Not Run) Write the bucketed blocks back out to disk.
In our Houdini workload, the Intel P4500 placed in the lower-mid range of the tested drives with an 8-frame rendering time of 3,067 seconds. In comparison, the P4510 posted a solid 2,595.7 seconds.
VDBench Workload Analysis
When it comes to benchmarking storage devices, application testing is best, and synthetic testing comes in second place. While not a perfect representation of actual workloads, synthetic tests do help to baseline storage devices with a repeatability factor that makes it easy to do apples-to-apples comparisons between competing solutions. These workloads offer a range of different testing profiles, from "four corners" tests and common database transfer size tests to trace captures from different VDI environments. All of these tests leverage the common vdBench workload generator, with a scripting engine to automate and capture results over a large compute testing cluster. This allows us to repeat the same workloads across a wide range of storage devices, including flash arrays and individual storage devices.
Our testing process for these benchmarks fills the entire drive surface with data, then partitions a drive section equal to 25% of the drive capacity to simulate how the drive might respond to application workloads. This is different from full entropy tests, which use 100% of the drive and take it into steady state. As a result, these figures will reflect higher sustained write speeds.
- 4K Random Read: 100% Read, 128 threads, 0-120% iorate
- 4K Random Write: 100% Write, 64 threads, 0-120% iorate
- 64K Sequential Read: 100% Read, 16 threads, 0-120% iorate
- 64K Sequential Write: 100% Write, 8 threads, 0-120% iorate
- Synthetic Database: SQL and Oracle
- VDI Full Clone and Linked Clone Traces
In our first VDBench Workload Analysis, we looked at random 4K read performance where all drives posted sub-millisecond latency throughout the benchmark. The Intel P4500 peaked at 463,745 IOPS with a latency of 275.1μs, placing it well behind the other tested drives.
Next, we looked at 4K write performance, and here the P4500 peaked at 107,159 IOPS and 1,191.2μs latency, again placing last.
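To put the two 4K results next to Intel's quoted figures for the 2TB model (490,000 IOPS read and 38,000 IOPS write), a quick ratio helps. The sketch below is simple arithmetic on numbers already given in this review; the dictionary layout and names are assumptions made for illustration.

```python
quoted = {"4K random read": 490_000, "4K random write": 38_000}
measured = {"4K random read": 463_745, "4K random write": 107_159}

def ratio_to_quoted(measured_iops, quoted_iops):
    """Measured peak expressed as a fraction of the vendor-quoted figure."""
    return measured_iops / quoted_iops

for workload in quoted:
    ratio = ratio_to_quoted(measured[workload], quoted[workload])
    print(f"{workload}: {ratio:.0%} of quoted")
```

A measured write figure well above the quoted one is consistent with a test footprint that covers 25% of the drive rather than a full steady-state run.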
Moving to sequential performance, we looked at our 64K benchmarks. With 64K sequential read, all drives posted sub-millisecond latency throughout the benchmark. The P4500 peaked at 24,009 IOPS or 1.5GB/s with a latency of 337μs.
For 64K sequential write, the P4500 fell well behind the pack again, peaking at just 8,399 IOPS or 524MB/s with a latency of 1,893μs.
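The IOPS and throughput figures in the two 64K results are the same measurement in different units, since throughput is IOPS multiplied by the transfer size. The sketch below checks that arithmetic. It assumes binary units (64KiB transfers, with MB and GB read as MiB and GiB), which is not stated in the test profile but is the interpretation that lands closest to the reported 1.5GB/s and 524MB/s.

```python
KIB = 1024

def throughput_mib_per_s(iops, transfer_kib=64):
    """Throughput in MiB/s for a given IOPS figure and transfer size."""
    return iops * transfer_kib * KIB / (KIB * KIB)

read_mib = throughput_mib_per_s(24_009)
write_mib = throughput_mib_per_s(8_399)
print(f"64K read:  {read_mib:.1f} MiB/s ({read_mib / KIB:.2f} GiB/s)")
print(f"64K write: {write_mib:.1f} MiB/s")
```

Read as decimal megabytes instead, the same IOPS figures would come out somewhat higher, so the unit convention matters when comparing against vendor specs.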
Next, we looked at our SQL workloads, where again no drive exceeded 1ms latency. The P4500 peaked at 113,441 IOPS with a latency of 280.9μs.
In our SQL 90-10, the Intel P4500 recorded a peak performance of just 95,208 IOPS and a latency of 335μs.
The SQL 80-20 showed the Intel P4500 with a peak score of 82,092 IOPS with a latency of 389μs. The P4510, on the other hand, performed far better, with a peak of 204,683 IOPS and 156μs, roughly 2.5 times the P4500's throughput.
The Oracle workloads again showed all drives with sub-millisecond performance. In the Oracle workload, the P4500 peaked at 74,764 IOPS with a latency of 480μs.
With Oracle 90-10, the P4500 straggled behind with 78,496 IOPS and a latency of 278μs.
In our last Oracle benchmark (80-20), the P4500 showed a peak score of 67,820 IOPS and a latency of 324μs.
Next, we switched over to our VDI clone test, Full and Linked (labeled FC and LC, respectively). For VDI Full Clone Boot, the P4500 once again fell well behind the comparables with a peak of 76,736 IOPS and a latency of 446μs.
For VDI FC Initial Login, the P4500 recorded a peak of just 30,102 IOPS and a latency of 991μs.
With VDI FC Monday Login, the P4500 had a peak performance of 24,678 IOPS and a latency of 639μs.
On the VDI LC Boot test, the P4500 recorded a peak of 40,181 IOPS and a latency of 395μs.
The VDI LC Initial Login saw just 14,386 IOPS and a latency of 546μs, placing it well behind the more performance-driven drives.
Our final test, the VDI LC Monday Login, had the P4500 peaking at 18,937 IOPS and 841μs latency.
The Intel P4500 series is an NVMe SSD that leverages the company’s new 3D NAND and comes in 2.5-inch PCIe, HHHL PCIe, and “ruler” form factors, the first of which we looked at for this review. The drive tops out at 4TB in the 2.5-inch models and 8TB in the ruler version, and the 2TB model we tested is quoted at 3,200MB/s in sequential reads and 490,000 IOPS in random reads. As was made obvious in the charts above, the P4500 has been improved upon with the release of the P4510.
Looking at the details of our benchmark results, the P4500 showed lower-range performance in virtually all categories, with the exception of our Houdini test, which is specifically designed to evaluate storage performance as it relates to CGI rendering. Here it had lower mid-range results with an 8-frame rendering time of 3,067 seconds. In Sysbench, the P4500 had 4,476.3 TPS in our transactional test, 28.6ms in average latency, and 53.7ms in our worst-case (99th percentile) latency scenario.
In our VDBench benchmarks, the P4500 fell behind in our tests by a noticeable margin. During the 4K tests, the Intel drive peaked at 463,745 IOPS with a latency of 275.1μs for read and 107,159 IOPS and 1,191.2μs in write. For 64K sequential, the drive hit 1.5GB/s with a latency of 337μs in read and 524MB/s with a latency of 1,893μs in write.
Again, it's worth noting that Intel pretty quickly updated the P4500 to the P4510, which is obviously a worthy upgrade. The P4500 SSDs are still quite widely available and show up in many places, like the VMware vSAN cluster we're reviewing, so understanding the underlying drive performance is important in that context. All things being equal, however, the P4510 is clearly the better buy for those interested in a more aggressive performance profile.