Inside the new SPEC storage solution benchmark

by Guest Author

The Standard Performance Evaluation Corp. (SPEC) has released a major new version of its storage solution benchmark, designed to help IT managers make better purchasing, configuration and optimization decisions.

by Don Capps, SPEC Storage subcommittee chair

The new SPECstorage Solution 2020 benchmark is based on actual applications and real-world scenarios. It includes new workloads for artificial intelligence (AI) and genomics, expanded custom workload capabilities, greatly improved scaling, and a statistical visualization mechanism for displaying benchmark results. There are also dozens of upgrades that make the benchmark easier to use and more efficient in delivering results.

Real-world workloads

Workloads within the SPECstorage Solution 2020 benchmark represent applications used within five different markets. Each workload has a set of pass-fail criteria that must be met in order to validate the benchmark results. Examples of criteria include:

  • Whether the storage solution under test maintained the requested sustained average op_rate.
  • Whether the solution maintained an average latency below a given threshold.
  • Whether the solution maintained an equal balance of operations across the subcomponents of the workload.
  • Whether the solution maintained a maximum latency below a specific threshold for any subcomponent in the workload.
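To make the four example criteria concrete, here is a minimal Python sketch of how such checks could be expressed. It is an illustration only, not the benchmark's own validation code: the class names, field names, tolerances and latency limits are all hypothetical, and the real thresholds are defined per workload by the benchmark.

```python
from dataclasses import dataclass


@dataclass
class ComponentResult:
    """Measured results for one workload subcomponent (hypothetical fields)."""
    name: str
    requested_op_rate: float  # ops/s requested for this subcomponent
    achieved_op_rate: float   # ops/s actually sustained
    avg_latency_ms: float     # average latency for this subcomponent


@dataclass
class Thresholds:
    """Illustrative limits only; not the values used by the benchmark."""
    op_rate_tolerance: float = 0.05       # allowed shortfall vs. requested rate
    max_avg_latency_ms: float = 10.0      # overall average latency limit
    balance_tolerance: float = 0.05       # allowed spread between subcomponents
    max_component_latency_ms: float = 20.0


def evaluate(components, limits=None):
    """Return a pass/fail flag for each of the four example criteria."""
    limits = limits or Thresholds()
    requested = sum(c.requested_op_rate for c in components)
    achieved = sum(c.achieved_op_rate for c in components)
    # Overall latency is weighted by how many operations each subcomponent did.
    weighted = sum(c.avg_latency_ms * c.achieved_op_rate for c in components)
    overall_latency = weighted / achieved if achieved else float("inf")
    # Fraction of its requested rate that each subcomponent delivered.
    ratios = [c.achieved_op_rate / c.requested_op_rate for c in components]
    return {
        "op_rate": achieved >= requested * (1 - limits.op_rate_tolerance),
        "avg_latency": overall_latency < limits.max_avg_latency_ms,
        "balance": max(ratios) - min(ratios) <= limits.balance_tolerance,
        "component_latency": all(
            c.avg_latency_ms < limits.max_component_latency_ms for c in components
        ),
    }


def run_is_valid(components, limits=None):
    """A run counts only if every criterion passes."""
    return all(evaluate(components, limits).values())
```

In this sketch a run is valid only when every flag is true, which mirrors the idea that all of a workload's criteria must be met before a result counts.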

The new AI workload represents AI TensorFlow image processing environments, with traces collected from systems running COCO, ResNet50 and CityScape datasets. Metrics are based on the number of simultaneous jobs that can be sustained under the defined pass-fail criteria, where a job is a set of subcomponents that encompasses one complete workflow.

The new genomics workload comes from traces of I/O operations used by commercial and research facilities to perform genetic analysis. The data has been sanitized so that it doesn’t contain any of the original genome data. Metrics are based on the number of simultaneous jobs that can be maintained under the workload’s pass-fail criteria.

Three workloads are carried over from the SPEC SFS 2014 SP2 benchmark, released in December 2017:

  • The electronic design automation (EDA) workload represents the typical behavior of a mixture of EDA applications, including front- and back-end processing. Metrics are based on the number of jobs that can be completed under the pass-fail criteria.
  • The software build environment workload comprises metadata-intensive tests derived from analysis and system traces of real-world applications. Metrics are based on the number of simultaneous builds that can be completed under the pass-fail criteria.
  • The video data acquisition workload (VDA) simulates applications that store data acquired from a time-sensitive source, such as a surveillance camera. Metrics are based on the number of video streams that can be captured under the pass-fail criteria.
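Each workload's metric is a count (jobs, builds or streams) that can be sustained while the pass-fail criteria still hold. The following Python sketch shows one simple way to derive such a number from a series of runs at increasing load. The function name and the input format are hypothetical, and the rule of stopping at the first failing load point is an assumption made for illustration rather than a description of how the benchmark reports results.

```python
def peak_sustained_load(results):
    """Return the highest load that passed before the first failure.

    `results` is an iterable of (load, passed) pairs, for example the
    number of simultaneous jobs and whether that run met the criteria.
    Returns None if even the lowest load point failed.
    """
    peak = None
    for load, passed in sorted(results):
        if not passed:
            break  # assumption: loads beyond the first failure do not count
        peak = load
    return peak


if __name__ == "__main__":
    runs = [(10, True), (20, True), (30, False), (40, True)]
    print(peak_sustained_load(runs))
```

Sorting first means the runs can be supplied in any order, and a pass at a load above the first failure is deliberately ignored.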

Customized workloads

The new benchmark enables users to customize existing workloads or create new ones to gain insight into storage performance issues.

Customized workloads can accommodate a combination of Unix and Windows load generators for applications that use this combination in their workflows. An example of this combination would be a Windows system that collects sensor data and stores it using SMB while a Unix-based compute farm is analyzing the data using NFS, Lustre, GPFS, or some other file system technology.

Massive scaling with fewer constraints

New internal interprocess communication (IPC) mechanisms in SPECstorage Solution 2020 significantly reduce the number of TCP ports and DNS lookups, improving scalability, reliability, and start-up and run-time performance. Scaling has been increased from 60,000 load-generating processes in previous versions to more than 4 million processes that can be distributed throughout the world.

A sophisticated synchronization mechanism keeps all of the geographically distributed load-generating processes in sync at a sub-millisecond resolution.

The new scaling efficiency is especially significant for testing cloud-based storage solutions. Among cloud providers, TCP ports are a scarce resource, and most virtual machines can accommodate only a few thousand at best. Because the new benchmark never requires more than 1,000 TCP ports, users can scale up to 4 million processes without running into cloud constraints.

Visualization for greater insights

The new statistical visualization mechanism within SPECstorage Solution 2020 allows users to extract runtime counters and connect them to monitoring tools such as Graphite, Carbon or Grafana for visualization. This provides greater insight into the behavior and operational details of the system under test.
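As a rough illustration of what feeding counters to such a tool involves, the Python sketch below formats a set of counters in Carbon's plaintext protocol (one "path value timestamp" line per metric, conventionally sent to TCP port 2003) so that Graphite can store them and Grafana can chart them. This is not the benchmark's own export mechanism: the counter names, the metric prefix and the host are hypothetical placeholders.

```python
import socket
import time


def format_carbon_lines(counters, prefix, timestamp=None):
    """Render counters as Carbon plaintext lines: '<path> <value> <timestamp>'."""
    ts = int(time.time()) if timestamp is None else int(timestamp)
    return "".join(
        f"{prefix}.{name} {value} {ts}\n"
        for name, value in sorted(counters.items())
    )


def send_to_carbon(payload, host="localhost", port=2003):
    """Send preformatted lines to a Carbon plaintext listener."""
    with socket.create_connection((host, port), timeout=5) as sock:
        sock.sendall(payload.encode("ascii"))


if __name__ == "__main__":
    # Hypothetical counter names; real names depend on what the run exports.
    sample = {"op_rate": 1500, "avg_latency_ms": 4.2}
    print(format_carbon_lines(sample, "specstorage.run1"), end="")
    # send_to_carbon(...) is left uncalled here because it needs a live listener.
```

Keeping formatting separate from sending makes the formatting step easy to check without a running Graphite instance.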

Available now for downloading

The SPECstorage Solution 2020 benchmark is available for immediate download on the SPEC website for $2,000, with discounts for qualified research and academic institutions. Results from the new benchmark are not comparable to those from past SPEC storage benchmarks.

SPEC corporate members active in the development of the benchmark include Dell, IBM, Intel, iX Systems, NetApp, Pure Storage, and WekaIO. Supporting individual contributors include Udayan Bapat, Sorin Faibish, and Brian Pawlowski.

###

Don Capps has more than 30 years of experience in performance engineering for storage systems. He is a performance engineer for NetApp; a founder of Iozone.org, a non-profit organization that produces and distributes free software for measuring the performance of computer storage systems; and chair of the Standard Performance Evaluation Corp. (SPEC) Storage subcommittee.
