The VMmark Virtualization Benchmark is a comprehensive multi-host datacenter virtualization benchmark designed to mimic the behavior of complex consolidation environments. Legacy benchmarking methodologies designed for single-workload performance and scalability are insufficient for server consolidation, which gathers a collection of workloads onto a virtualization platform consisting of a set of physical servers with access to shared storage and network infrastructure.
The ability to virtualize irregular workloads while load-balancing them and automating workload provisioning, combined with a wider range of administrative tasks, has revolutionized server usage. As such, VMmark benchmarking focuses on user-centric application performance and accounts for the effects of this infrastructure activity (which can impact CPU, network, storage, or other performance) on overall platform performance.
VMmark 2.x's benchmarking approach utilizes a series of sub-tests derived from commonly used load-generation tools and commonly initiated virtualization administration tasks. The benchmark implements a tile-based scheme for measuring application performance. The unit of work, known as a tile, is best defined as a collection of VMs that together run a diverse set of workloads.
VMmark 2.x also executes ubiquitous platform infrastructure workloads such as cloning and deploying of VMs, automatic VM load balancing across a datacenter, VM live migration (vMotion) and dynamic datastore relocation (storage vMotion). These operations complement the conventional application-level workloads. A data center's consolidation capacity, which measures scalability and individual application performance, is thus measured as the number of tiles that the data center platform can handle while at the same time supporting the required administrative operations. The performance of each workload within every tile that a multi-host platform can accommodate combined with the performance of the infrastructure operations determines the overall benchmark score.
Fully compliant VMmark benchmark tests are designed to run for a minimum of 3 hours, with workload metrics reported every minute. After a benchmark run, each tile's metrics are computed and aggregated into a score for that tile. For aggregation, the test first normalizes the metrics against a reference system, so that measurements in different units (such as MB/s and database commits/second) can be compared. A geometric mean of the normalized metrics is then computed as the final score for the tile, and all of the per-tile scores are added to create the application workload portion of the final metric. Infrastructure workloads go through a similar process for their portion of the metric. The difference is that the infrastructure workloads are scaled by the size of the underlying server cluster rather than explicitly by the user. As a result, the infrastructure workloads are compiled as a single group and no multi-tile sums are required. The final benchmark score is then computed as a weighted average in which application workloads account for 80% and infrastructure workloads for 20%. These weights reflect the relative contribution of infrastructure and application workloads to overall resource demands.
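The aggregation described above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not VMware's actual scoring harness: the function names, the metric names, and every number below are hypothetical, and the real benchmark uses its own workload list and reference values.

```python
from math import prod


def geometric_mean(values):
    """Geometric mean of a non-empty sequence of positive numbers."""
    values = list(values)
    if not values or any(v <= 0 for v in values):
        raise ValueError("need at least one value, all positive")
    return prod(values) ** (1.0 / len(values))


def group_score(raw, reference):
    """Normalize each metric against the reference system, then take the geometric mean."""
    ratios = [raw[name] / reference[name] for name in reference]
    return geometric_mean(ratios)


def overall_score(tiles, infra, app_ref, infra_ref, app_weight=0.8, infra_weight=0.2):
    """Sum the per-tile scores, score infrastructure once, then combine 80/20."""
    app_part = sum(group_score(tile, app_ref) for tile in tiles)
    infra_part = group_score(infra, infra_ref)  # a single group, no multi-tile sum
    return app_weight * app_part + infra_weight * infra_part


# Hypothetical reference-system values and measurements, for illustration only.
APP_REF = {"throughput_mb_s": 100.0, "db_commits_s": 2000.0}
INFRA_REF = {"vmotion": 10.0, "storage_vmotion": 5.0, "deploy": 4.0}

tiles = [
    {"throughput_mb_s": 110.0, "db_commits_s": 2100.0},
    {"throughput_mb_s": 95.0, "db_commits_s": 1900.0},
]
infra = {"vmotion": 11.0, "storage_vmotion": 5.5, "deploy": 4.2}

print(round(overall_score(tiles, infra, APP_REF, INFRA_REF), 3))
```

Because the per-tile scores are summed while the infrastructure score is a single value, the application portion grows with the number of tiles the platform can sustain, which is what lets the final figure track consolidation capacity.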
Running the VMmark Virtualization Benchmark carries some serious hardware requirements from the start, and these only increase as the number of tiles under test goes up.
VMmark Virtualization Benchmark Minimum Specifications
VMware's VMmark 2.5 utilizes a wide range of software and operating systems to fully reflect a real-world virtualized environment. Below is an overview of the VMs included in each VMmark tile and the applications and operating systems they use.
VMmark 2.5 Tile Configuration
VMware VMmark Testing Environment
Storage solutions are tested with the VMware VMmark benchmark in the StorageReview Enterprise Test Lab utilizing multiple servers connected over a high-speed network. We utilize Dell PowerEdge R730s for different segments of the VMware VMmark environment, including four for the VMmark 2.5.1 hosts, two for hosting multiple virtual clients, one operating as a physical prime client, one running a VMware vCenter Appliance, and one as a temporary staging ground for each tile leveraged in our VMmark test. The PowerEdge line also offers excellent hardware compatibility, which is an absolute must as we incorporate different forms of storage and networking technology into our testing platform. As with our other testing platforms, our goal is to show the realistic performance customers can expect from mid-range server platforms, versus the top-spec servers generally leveraged in most competitive benchmarks. Another advantage of this unique 4-host VMmark platform is that we can leverage more host-side resources in aggregate than a top-spec 2-host setup, putting the stress on the storage product under test without becoming CPU-bound.
For local storage in this VMmark environment, we went with a cost- and power-efficient SD boot card layout. These SD cards are utilized as hypervisor boot drives on the VM server and virtual client side of our VMmark testing layout. This removes the cost of an SSD or HDD from each server in this environment and cuts down on power consumption. Our Windows Server 2008 R2-based virtual client VMs reside on storage presented by a DotHill Ultra48 SAN, running off a pool of 10K HDDs with SSD tiering. This helps rule out the possibility of the hosts becoming I/O bound during this benchmark.
Mellanox 56Gb InfiniBand interconnects were used to provide the highest performance and greatest network efficiency on each ESXi vSphere host to ensure that the VMs connected are not network-limited. We use one single-port Mellanox ConnectX-3 NIC operating in IPoIB mode, with multiple VM networks running on a single vSwitch. This alleviates any network constraints and reduces the complexity of the environment in our multi-use testing infrastructure.
We are constantly growing our networking infrastructure to use the best and fastest equipment in our reviews. As such, we keep upgrading our lab and enterprise testing equipment to adapt to ever-changing technology.
First Generation VMmark Platform
First Generation VMware VMmark Virtualization Benchmark Equipment
Second Generation VMware VMmark Virtualization Benchmark Equipment
All VMmark result folders are available for download upon request. The Synology RackStation RS10613xs+'s 1-tile raw scores of 1.10 (Application) and 1.11 (VMmark2), measured with ten 15K SAS HDDs in RAID0, serve as our baseline of 1 against which results are normalized.
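As a rough sketch of that normalization, assuming it is a simple division of a raw score by the matching baseline score, the calculation looks like this. The function name and the sample raw scores are hypothetical; only the two baseline values come from the text above.

```python
# Baseline 1-tile raw scores from the Synology RackStation RS10613xs+ run.
BASELINE_APPLICATION = 1.10
BASELINE_VMMARK2 = 1.11


def normalize(raw_score, baseline):
    """Express a raw score relative to the baseline, so the baseline itself maps to 1."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return raw_score / baseline


# Made-up raw scores for an imaginary array, used only to show the arithmetic.
example_application_raw = 2.20
example_vmmark2_raw = 2.22

print(round(normalize(example_application_raw, BASELINE_APPLICATION), 2))
print(round(normalize(example_vmmark2_raw, BASELINE_VMMARK2), 2))
```

Under this assumption, a normalized value above 1 means the product under test outperformed the baseline array at the same tile count.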
VMware VMmark Virtualization Benchmark