
Supermicro Expands A100 GPU Capacity

by Adam Armstrong
Supermicro NVIDIA A100

Today, after NVIDIA made several GPU-related announcements at its annual GTC event, Supermicro did what it does best: announcing hardware support for the latest innovation to hit the market. In this case, Supermicro announced that it has expanded its new 4U server to support up to eight NVIDIA HGX A100 GPUs. The company also has a 2U server that supports up to four A100 GPUs.

Supermicro's GPU servers, now including models with the NVIDIA HGX A100 GPUs, run the gamut across 1U, 2U, 4U, and 10U rackmount systems. These solutions work from edge to cloud and are powered by either AMD EPYC or Intel Xeon processors. According to the company, the 1U GPU systems contain up to four NVIDIA GPUs with NVLink, including the NEBS Level 3 certified, 5G/Edge-ready SYS-1029GQ. Supermicro's 2U GPU systems, such as the SYS-2029GP-TR, can support up to six NVIDIA V100 GPUs with dual PCIe root complexes in one system. And finally, the 10U GPU servers, such as the SYS-9029GP-TNVRT, support up to 16 V100 SXM3 GPUs with dual Intel Xeon Scalable processors with built-in AI acceleration.

For the new servers, Supermicro is leveraging an advanced thermal design, with custom heatsinks and optional liquid cooling, to accommodate the NVIDIA HGX A100 4-GPU and 8-GPU baseboards, along with a new 4U server supporting eight NVIDIA A100 PCIe GPUs. The new servers utilize the company's Advanced I/O Module (AIOM) form factor for more flexibility in network communication. AIOM works with PCIe Gen 4 storage and networking devices that support NVIDIA GPUDirect RDMA and GPUDirect Storage with NVMe over Fabrics (NVMe-oF) on NVIDIA Mellanox InfiniBand. All of the above aims to eliminate bottlenecks feeding data into the GPUs.
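To give a sense of what GPUDirect Storage means in practice, here is a minimal sketch using NVIDIA's cuFile API, which DMAs a file from an NVMe device (or NVMe-oF target) directly into GPU memory, skipping the usual bounce buffer in host RAM. The file path and buffer size are hypothetical, and error handling is trimmed; this is an illustrative sketch, not Supermicro or NVIDIA reference code.

```cpp
// Minimal GPUDirect Storage sketch: read a file straight into GPU memory
// via NVIDIA's cuFile API. Hypothetical path/size; error checks omitted.
#include <cufile.h>
#include <cuda_runtime.h>
#include <fcntl.h>
#include <unistd.h>
#include <cstring>
#include <cstdio>

int main() {
    const size_t size = 1 << 20;                 // 1 MiB, arbitrary for the sketch
    int fd = open("/mnt/nvme/sample.bin", O_RDONLY | O_DIRECT);

    cuFileDriverOpen();                          // initialize the GDS driver

    CUfileDescr_t descr;
    std::memset(&descr, 0, sizeof(descr));
    descr.handle.fd = fd;
    descr.type = CU_FILE_HANDLE_TYPE_OPAQUE_FD;
    CUfileHandle_t handle;
    cuFileHandleRegister(&handle, &descr);       // register the file with cuFile

    void* devPtr = nullptr;
    cudaMalloc(&devPtr, size);
    cuFileBufRegister(devPtr, size, 0);          // pin the GPU buffer for DMA

    // DMA from storage directly into GPU memory, bypassing host RAM.
    ssize_t n = cuFileRead(handle, devPtr, size, /*file_offset=*/0, /*buf_offset=*/0);
    std::printf("read %zd bytes into GPU memory\n", n);

    cuFileBufDeregister(devPtr);
    cudaFree(devPtr);
    cuFileHandleDeregister(handle);
    cuFileDriverClose();
    close(fd);
    return 0;
}
```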

First up, the 2U system can house up to four NVIDIA GPUs thanks to the advanced thermal heatsink design. This system enables high GPU peer-to-peer communication via NVIDIA NVLink, up to 8TB of DDR4 3200MHz system memory, and five PCIe 4.0 I/O slots supporting GPUDirect RDMA, as well as four hot-swappable NVMe drives with GPUDirect Storage capability. Impressive in a 2U system.
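For a sense of what that GPU peer-to-peer communication looks like from software, below is a minimal CUDA runtime sketch that enables peer access between two GPUs and copies a buffer directly from one to the other; on an NVLink-connected pair the transfer goes over NVLink rather than through host memory. The buffer size is arbitrary and the snippet is illustrative only.

```cpp
// Minimal CUDA peer-to-peer sketch: copy a buffer from GPU 0 to GPU 1
// without staging through system RAM. Size is arbitrary for the sketch.
#include <cuda_runtime.h>
#include <cstdio>

int main() {
    const size_t bytes = 64 << 20;               // 64 MiB, arbitrary

    int canAccess = 0;
    cudaDeviceCanAccessPeer(&canAccess, 0, 1);   // can GPU 0 address GPU 1?
    if (!canAccess) { std::puts("no P2P path between GPU 0 and GPU 1"); return 1; }

    void *src = nullptr, *dst = nullptr;
    cudaSetDevice(0);
    cudaDeviceEnablePeerAccess(1, 0);            // map GPU 1's memory into GPU 0
    cudaMalloc(&src, bytes);

    cudaSetDevice(1);
    cudaDeviceEnablePeerAccess(0, 0);
    cudaMalloc(&dst, bytes);

    // Direct device-to-device copy; over NVLink when the GPUs are linked.
    cudaMemcpyPeer(dst, /*dstDevice=*/1, src, /*srcDevice=*/0, bytes);
    cudaDeviceSynchronize();
    std::puts("peer copy complete");
    return 0;
}
```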

On to the bigger system: the 4U server has the NVIDIA HGX A100 8-GPU baseboard, up to six NVMe U.2 and two NVMe M.2 drives, and ten PCIe 4.0 x16 slots. The system leverages the aforementioned AIOM, NVIDIA NVLink, and NVSwitch technology. The use cases for this big guy are large-scale deep learning training, neural network model applications for research or national laboratories, supercomputing clusters, and HPC cloud services.
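To make the deep learning training angle concrete, the sketch below uses NVIDIA's NCCL library to run an all-reduce across eight GPUs from a single process, the collective that keeps gradients in sync across an 8-GPU HGX baseboard (NCCL routes the traffic over NVLink/NVSwitch where available). The element count is a placeholder; this is an assumption-laden illustration, not vendor code.

```cpp
// Minimal NCCL sketch: single-process all-reduce across 8 GPUs, the core
// collective in data-parallel training. NCCL uses NVLink/NVSwitch if present.
#include <nccl.h>
#include <cuda_runtime.h>
#include <cstdio>

int main() {
    const int nDev = 8;                       // one HGX A100 8-GPU baseboard
    const size_t count = 1 << 20;             // elements per GPU, arbitrary
    int devs[nDev];
    for (int i = 0; i < nDev; i++) devs[i] = i;

    ncclComm_t comms[nDev];
    ncclCommInitAll(comms, nDev, devs);       // one communicator per GPU

    float* buf[nDev];
    cudaStream_t streams[nDev];
    for (int i = 0; i < nDev; i++) {
        cudaSetDevice(i);
        cudaMalloc(&buf[i], count * sizeof(float));
        cudaStreamCreate(&streams[i]);
    }

    // Sum each element across all 8 GPUs in place (e.g., gradient averaging).
    ncclGroupStart();
    for (int i = 0; i < nDev; i++)
        ncclAllReduce(buf[i], buf[i], count, ncclFloat, ncclSum,
                      comms[i], streams[i]);
    ncclGroupEnd();

    for (int i = 0; i < nDev; i++) {
        cudaSetDevice(i);
        cudaStreamSynchronize(streams[i]);
        cudaFree(buf[i]);
        ncclCommDestroy(comms[i]);
    }
    std::puts("all-reduce complete");
    return 0;
}
```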

For maximum GPU density, Supermicro also has an 8U SuperBlade enclosure. This server can support up to 20 nodes and 40 GPUs with two single-width GPUs per node, or one NVIDIA A100 Tensor Core PCIe GPU per node. Fitting up to 20 NVIDIA A100s in a single 8U footprint can actually save costs, since only 8U of rack space needs to be powered and cooled, leaving room for other devices in the rack. This SuperBlade provides a 100% non-blocking HDR 200Gb/s InfiniBand networking infrastructure to accelerate deep learning and enable real-time analysis and decision making.

Supermicro
