Supermicro Unveils Advanced Generative AI Solutions at GTC 2024

Supermicro has launched a new set of solutions for deploying generative AI at scale, aimed at the infrastructure needs of large language models (LLMs). The company positions these SuperCluster solutions as core building blocks for current and future AI workloads.

The release includes three SuperCluster configurations tailored for generative AI workloads. A 4U liquid-cooled system and an 8U air-cooled system are engineered for intensive LLM training and high-capacity LLM inference, while a 1U air-cooled variant built on Supermicro NVIDIA MGX systems targets cloud-scale inference. According to Supermicro, the systems are built for LLM training performance and for high-volume, large-batch LLM inference.

Expanding Capacity for AI Clusters

With the capacity to produce up to 5,000 racks per month, Supermicro says it can deliver complete generative AI clusters to customers quickly. A 64-node cluster, for example, incorporates 512 NVIDIA HGX H200 GPUs (eight per node), connected by high-speed NVIDIA Quantum-2 InfiniBand or Spectrum-X Ethernet networking. Combined with NVIDIA AI Enterprise software, this configuration targets enterprise and cloud infrastructures that train LLMs with trillions of parameters.

Innovating Cooling and Performance

The new Supermicro 4U NVIDIA HGX H100/H200 8-GPU systems use liquid cooling to double the density of the 8U air-cooled alternative, which Supermicro says lowers energy consumption and total cost of ownership for data centers. The systems also support next-generation NVIDIA Blackwell architecture-based GPUs, with cooling designed to hold temperatures at levels that allow sustained maximum performance.
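
The density claim follows from simple arithmetic on chassis height. The sketch below is illustrative only: it assumes both chassis hold eight GPUs (as the "8-GPU" naming indicates) and ignores rack space taken by switches, coolant distribution, and power. The function name is hypothetical.

```python
def gpus_per_rack_unit(gpus_per_system: int, system_height_u: int) -> float:
    """Return the GPU density of one chassis in GPUs per rack unit (U)."""
    return gpus_per_system / system_height_u


# Both systems are HGX 8-GPU designs; only the chassis height differs.
liquid_4u = gpus_per_rack_unit(8, 4)  # 4U liquid-cooled system
air_8u = gpus_per_rack_unit(8, 8)     # 8U air-cooled system
density_ratio = liquid_4u / air_8u

print(f"4U liquid-cooled: {liquid_4u} GPUs per U")
print(f"8U air-cooled:    {air_8u} GPUs per U")
print(f"Density ratio:    {density_ratio:.0f}x")
```

Halving the chassis height at the same GPU count gives twice the GPUs per rack unit, which is the doubling described above.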

SuperCluster Specifications

The Supermicro SuperClusters are scalable solutions for training massive foundation models and building cloud-scale LLM inference infrastructure. Their network architecture scales from 32 nodes to thousands. Supermicro pairs this with advanced liquid cooling and comprehensive testing processes intended to ensure operational efficiency and effectiveness.

Supermicro details two primary configurations: the SuperCluster with 4U liquid-cooled systems, which supports up to 512 GPUs in a compact footprint, and the SuperCluster with 1U air-cooled NVIDIA MGX systems, designed for high-volume, low-latency inference. Both emphasize high network performance, which is essential for LLM training and inference.

Here is a quick rundown of their specifications:

SuperCluster with 4U Liquid-cooled System in 5 Racks or 8U Air-cooled System in 9 Racks

256 NVIDIA H100/H200 Tensor Core GPUs in one scalable unit
Liquid cooling enabling 512 GPUs across 64 nodes in the same footprint as the air-cooled 256-GPU, 32-node solution
20TB of HBM3 with NVIDIA H100 or 36TB of HBM3e with NVIDIA H200 in one scalable unit
1:1 networking delivers up to 400 Gbps to each GPU to enable GPUDirect RDMA and Storage for training large language models with up to trillions of parameters
400G InfiniBand or 400GbE Ethernet switch fabrics with highly scalable spine-leaf network topology, including NVIDIA Quantum-2 InfiniBand and NVIDIA Spectrum-X Ethernet Platform.
Customizable AI data pipeline storage fabric with industry-leading parallel file system options
NVIDIA AI Enterprise 5.0 software, which brings support for new NVIDIA NIM inference microservices that accelerate the deployment of AI models at scale
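
The memory and networking totals in this list can be checked with a short calculation. This is a rough sketch, not Supermicro's own sizing method. It assumes 80 GB of HBM3 per H100 and 141 GB of HBM3e per H200 (figures from NVIDIA's public specifications, not stated in this article), eight GPUs per node, and decimal terabytes. All names are illustrative.

```python
GPUS_PER_NODE = 8      # HGX 8-GPU systems
NODES_PER_UNIT = 32    # 256 GPUs per scalable unit / 8 GPUs per node
LINK_GBPS = 400        # 1:1 networking: up to 400 Gbps per GPU

# Assumption: per-GPU HBM capacity from NVIDIA's public specifications.
HBM_GB_PER_GPU = {"H100": 80, "H200": 141}


def unit_gpus(nodes: int = NODES_PER_UNIT) -> int:
    """Total GPUs for a given node count."""
    return nodes * GPUS_PER_NODE


def unit_hbm_tb(gpu: str, nodes: int = NODES_PER_UNIT) -> float:
    """Aggregate HBM in decimal terabytes for a given GPU model."""
    return unit_gpus(nodes) * HBM_GB_PER_GPU[gpu] / 1000


def unit_network_tbps(nodes: int = NODES_PER_UNIT) -> float:
    """Aggregate per-GPU network bandwidth in Tbps, one link per GPU."""
    return unit_gpus(nodes) * LINK_GBPS / 1000


print(f"GPUs per scalable unit: {unit_gpus()}")
print(f"H100 HBM3 total:  {unit_hbm_tb('H100'):.2f} TB")
print(f"H200 HBM3e total: {unit_hbm_tb('H200'):.2f} TB")
print(f"Aggregate GPU network bandwidth: {unit_network_tbps():.1f} Tbps")
print(f"GPUs in a 64-node liquid-cooled cluster: {unit_gpus(64)}")
```

Under these assumptions, one scalable unit works out to roughly 20.5 TB of HBM3 with H100 and 36.1 TB of HBM3e with H200, consistent with the rounded 20TB and 36TB figures in the list.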

SuperCluster with 1U Air-cooled NVIDIA MGX System in 9 Racks

256 GH200 Grace Hopper Superchips in one scalable unit
Up to 144GB of HBM3e + 480GB of LPDDR5X unified memory suitable for cloud-scale, high-volume, low-latency, and high batch size inference, able to fit a 70B+ parameter model in one node
400G InfiniBand or 400G Ethernet switch fabrics with highly scalable spine-leaf network topology
Up to 8 built-in E1.S NVMe storage devices per node
Customizable AI data pipeline storage fabric with NVIDIA BlueField-3 DPUs and industry-leading parallel file system options to deliver high-throughput and low-latency storage access to each GPU
NVIDIA AI Enterprise 5.0 software
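
The claim that a 70B+ parameter model fits in one node can be sanity-checked by estimating the size of the model weights. The sketch below assumes 16-bit weights (2 bytes per parameter) and counts weights only; KV cache, activations, and runtime overhead are ignored, so real deployments need additional headroom. The function names and the placement rule are hypothetical.

```python
HBM3E_GB = 144    # per node, from the specification list
LPDDR5X_GB = 480  # per node, from the specification list


def weights_gb(params_billions: float, bytes_per_param: float = 2.0) -> float:
    """Weight size in decimal GB: (params_billions * 1e9 * bytes) / 1e9."""
    return params_billions * bytes_per_param


def placement(params_billions: float, bytes_per_param: float = 2.0) -> str:
    """Report the smallest memory pool that can hold the weights alone."""
    need = weights_gb(params_billions, bytes_per_param)
    if need <= HBM3E_GB:
        return "HBM3e"
    if need <= HBM3E_GB + LPDDR5X_GB:
        return "HBM3e + LPDDR5X"
    return "does not fit"


for size in (70, 175):
    print(f"{size}B params: {weights_gb(size):.0f} GB -> {placement(size)}")
```

At 2 bytes per parameter, a 70B model needs about 140 GB for weights, just under the 144GB of HBM3e, while larger models would have to spill into the LPDDR5X portion of the unified memory.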

Supermicro Expands AI Portfolio with New Systems and Racks Using NVIDIA Blackwell Architecture

Supermicro is also announcing the expansion of its AI system offerings, including the latest in NVIDIA’s data center innovations aimed at large-scale generative AI. Among these new technologies are the NVIDIA GB200 Grace Blackwell Superchip and the B200 and B100 Tensor Core GPUs.

To accommodate these advancements, Supermicro is upgrading its existing NVIDIA HGX H100/H200 8-GPU systems to integrate the NVIDIA HGX B100 8-GPU and HGX B200 8-GPU. The NVIDIA HGX lineup will also gain new models featuring the NVIDIA GB200, including a complete rack-level solution equipped with 72 NVIDIA Blackwell GPUs. In addition, Supermicro is introducing a new 4U NVIDIA HGX B200 8-GPU liquid-cooled system, which uses direct-to-chip liquid cooling to handle the increased thermal demands of the latest GPUs and unlock the full performance of NVIDIA’s Blackwell technology.

Supermicro’s new GPU-optimized systems will soon be available, fully compatible with the NVIDIA Blackwell B200 and B100 Tensor Core GPUs and certified for the latest NVIDIA AI Enterprise software. The lineup ranges from NVIDIA HGX B100 and B200 8-GPU systems to SuperBlades capable of housing up to 20 B100 GPUs, covering a wide range of AI applications. It includes first-to-market NVIDIA HGX B200 and B100 8-GPU models featuring NVIDIA NVLink interconnect technology. Supermicro says these systems will deliver 3x faster training results for LLMs and support scalable clustering for demanding AI workloads.

Supermicro Liquid Cooling Technology

Supermicro NVIDIA Solutions


1 month ago

Lyle Smith

Lyle is a staff writer for StorageReview, covering a broad set of end user and enterprise IT topics.
