DDN Launches AI400X3M and Dynamo-Integrated KV Cache Acceleration at ISC 2026

by Harold Fritts on June 24, 2026


At ISC 2026, DDN announced a broad expansion of its AI and HPC data platform portfolio, introducing new hardware, software, and platform capabilities designed to improve GPU utilization, accelerate AI inference, and simplify large-scale AI infrastructure deployments.

The announcements include the AI400X3M storage appliance, the general availability of DDN’s distributed KV Cache acceleration technology integrated with NVIDIA Dynamo, and new security, observability, and infrastructure management capabilities across the company’s AI data platforms.

DDN said the updates are intended to address bottlenecks throughout the AI data pipeline, including data preparation, model training, inference, retrieval-augmented generation (RAG), reasoning models, and agentic AI workloads. The company is focusing on improving infrastructure efficiency by increasing GPU utilization while lowering power consumption and infrastructure costs.

DDN CEO and Co-Founder Alex Bouzari said modern AI infrastructure requires more than compute resources alone. He emphasized that efficient data management, security, and operational visibility have become critical components of production AI deployments, with the company’s latest releases aimed at improving GPU efficiency, reducing inference costs, and accelerating time-to-first-token.

AI400X3M Targets AI and HPC Performance Density

Leading the announcement is the AI400X3M, the latest appliance built on DDN’s EXAScaler parallel file system.

AI400X3M Specifications

Performance
Sequential Read Performance	Up to 190 GB/s
Sequential Write Performance	Up to 110 GB/s
Random Read IOPS	8M
System Features
Platform	Turnkey EXAScaler shared parallel file system appliance for AI with Active/Active storage controllers
Host Ports Per Appliance	4x XDR/400GbE OSFP or 8x NDR200/200GbE QSFP112
Drive Support	24x 2.5″ dual-port, hot-swappable PCIe Gen5 NVMe
Capacity Options	120/250/500TB and 1PB/2PB usable
Standard Firmware Features	LUN mapping and masking, intelligent write striping, read QoS, data integrity check/correction, interface options (CLI, GUI, Python API), outbound syslog including RELP support, state change messages (via email and SNMP traps)
Safety & Compliance
Safety	IEC/EN/UL/CSA 62368-1, GB4943
EMC	EN 55032 Class A, EN 55035, EN 61000-3-2, EN 61000-3-3, FCC Part 15 Class A, VCCI Class A, ICES-003 Class A, GB/T9254 Class A, BSMI Class A
Environmental	RoHS, REACH, WEEE, ERP Lot9X
Physical Attributes
Form Factor	2U rack mount
Dimensions	Height: 3.43″ (87mm); Width: 17.56″ (446mm); Depth: 33.7″ (856mm) without bezel, 35.2″ (895mm) overall; Adjustable rack rail: 26.75″ to 34.0″ (680–860mm)
Weight	98 lbs (44kg) empty; 108 lbs (49kg) max
Power & Cooling
Input Voltage	200–240V 50/60 Hz
Power Supply	2x hot swappable, redundant, IEC60320-C20 inlet
Operating Environment
Temperature Range	5°C to 30°C (41°F to 86°F)
Relative Humidity	8%–80% non-condensing
Altitude	3,117 ft (950m) @ 30°C to 10,000 ft (3,048m) @ 23°C

The platform delivers up to 190GB/s of sequential read throughput and, according to DDN, up to 35% higher read performance than the previous generation. DDN also supports configurations that scale to 30PB within a single rack and adds hybrid disk support that combines flash and HDD storage to balance performance, capacity, and cost.

The appliance is designed for AI model training, inference, checkpointing, HPC simulation, and other highly parallel workloads where storage throughput can directly impact GPU utilization. General availability is expected by the end of the third quarter of 2026.
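To put the peak figures in context for checkpointing, here is a back-of-envelope sketch in Python. It uses the "up to" sequential rates from the specification table and assumes ideal, sustained transfers with no protocol, network, or file system overhead. The checkpoint sizes and the `transfer_seconds` helper are hypothetical, chosen only for illustration, so real-world times would be longer.

```python
# Back-of-envelope transfer times from the article's peak figures.
# Peak rates are "up to" numbers; checkpoint sizes are hypothetical.

READ_GBPS = 190.0   # sequential read, GB/s (from the spec table)
WRITE_GBPS = 110.0  # sequential write, GB/s (from the spec table)


def transfer_seconds(size_gb: float, rate_gb_per_s: float) -> float:
    """Ideal time to move size_gb at a sustained rate, ignoring all overheads."""
    if rate_gb_per_s <= 0:
        raise ValueError("rate must be positive")
    return size_gb / rate_gb_per_s


if __name__ == "__main__":
    for size_gb in (500, 1_000, 5_000):  # hypothetical checkpoint sizes
        save = transfer_seconds(size_gb, WRITE_GBPS)
        load = transfer_seconds(size_gb, READ_GBPS)
        print(f"{size_gb:>5} GB: save ~{save:.1f}s, load ~{load:.1f}s")
```

The point of the arithmetic is that GPUs typically sit idle while a synchronous checkpoint is written, so the write rate, not only the headline read rate, bounds how much training time each save costs.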

Distributed KV Cache Now Generally Available

Following an earlier technology preview at NVIDIA GTC 2026, DDN officially launched its distributed KV Cache acceleration architecture, which is integrated with NVIDIA Dynamo. The capability is available across both the Infinia object storage platform and the EXAScaler file system.

Rather than relying on local GPU memory for model context, the distributed architecture stores and retrieves KV cache data from DDN’s data platform, reducing memory bottlenecks during inference.

DDN said the platform supports NVIDIA Dynamo, vLLM, and other modern inference frameworks while enabling shared distributed KV cache across multiple inference nodes. The company claims up to 55 times faster KV cache loading for large-scale inference workloads, along with improved GPU utilization, faster token generation, and lower infrastructure costs for large language model deployments.
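The general idea of a shared KV cache can be illustrated with a toy Python sketch. This is not DDN's implementation and does not use the NVIDIA Dynamo or vLLM APIs. The `InferenceNode` class, the `prefix_key` helper, and the plain dictionary standing in for the shared data platform are all hypothetical. It only shows the lookup order described above: check local memory first, then the shared store, and recompute the prefill only when both miss.

```python
import hashlib
from collections import OrderedDict


def prefix_key(token_ids):
    """Stable key for a token prefix, so any node derives the same key."""
    raw = ",".join(str(t) for t in token_ids).encode()
    return hashlib.sha256(raw).hexdigest()


class InferenceNode:
    """Hypothetical node: small local LRU cache backed by a shared store."""

    def __init__(self, shared_store, local_capacity=2):
        self.shared = shared_store   # stand-in for the shared data platform
        self.local = OrderedDict()   # stand-in for limited GPU memory
        self.capacity = local_capacity
        self.prefills = 0            # number of expensive recomputations

    def _prefill(self, token_ids):
        # Placeholder for the expensive prefill pass that builds KV data.
        self.prefills += 1
        return [t * 2 for t in token_ids]

    def get_kv(self, token_ids):
        """Return (kv_data, source), where source says which tier served it."""
        key = prefix_key(token_ids)
        if key in self.local:
            self.local.move_to_end(key)  # mark as most recently used
            return self.local[key], "local"
        if key in self.shared:
            kv, source = self.shared[key], "shared"
        else:
            kv, source = self._prefill(token_ids), "computed"
            self.shared[key] = kv        # publish for other nodes
        self.local[key] = kv
        if len(self.local) > self.capacity:
            self.local.popitem(last=False)  # evict least recently used
        return kv, source


if __name__ == "__main__":
    store = {}
    a, b = InferenceNode(store), InferenceNode(store)
    prompt = [101, 7, 42]
    print(a.get_kv(prompt)[1])  # computed
    print(b.get_kv(prompt)[1])  # shared
    print(b.get_kv(prompt)[1])  # local
```

In this sketch node `b` never runs a prefill for a prompt that node `a` has already processed, which is the kind of reuse a shared cache across inference nodes is meant to enable.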

The technology is intended to benefit long-context inference, retrieval-augmented generation, reasoning models, and agentic AI workflows that repeatedly access large model contexts.

Unified AI Data Pipeline

DDN also outlined its broader strategy of combining Infinia object storage for inference with EXAScaler for model training and checkpointing under a single AI data infrastructure.

The company said Infinia provides low-latency metadata operations, high concurrency, and object performance optimized for inference-heavy workloads, while EXAScaler continues to serve high-performance training environments. Together, the platforms are intended to eliminate data silos and maintain high GPU utilization throughout the AI lifecycle.

New Security and Infrastructure Features

Additional platform enhancements focus on enterprise deployment requirements, including bare-metal multi-tenancy, KMIP-based encryption and key management, VictoriaLogs integration for operational visibility, and expanded multi-tenant APIs.

On the storage side, DDN introduced intelligent file pinning and NAND-accelerated Hot Pools, allowing frequently accessed data to remain on flash while automatically tiering colder data to lower-cost hard drives. The company said these features are designed to improve storage efficiency while reducing flash capacity requirements.
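A placement rule of this kind can be sketched in a few lines of Python. DDN has not published the policy details in this announcement, so everything here is an assumption: the `FileInfo` record, the `place` function, and the seven-day recency window are hypothetical stand-ins for whatever heuristics the product actually uses.

```python
from dataclasses import dataclass

FLASH, HDD = "flash", "hdd"
SEVEN_DAYS_S = 7 * 24 * 3600  # hypothetical "hot" window, in seconds


@dataclass
class FileInfo:
    path: str
    last_access: float   # seconds since epoch
    pinned: bool = False  # True if an administrator pinned the file to flash


def place(file: FileInfo, now: float, hot_window_s: float = SEVEN_DAYS_S) -> str:
    """Pick a tier: pinned or recently accessed files stay on flash."""
    if file.pinned:
        return FLASH
    age = now - file.last_access
    return FLASH if age <= hot_window_s else HDD


if __name__ == "__main__":
    now = 1_000_000_000.0
    files = [
        FileInfo("/data/tokenizer.json", now - 90 * 24 * 3600, pinned=True),
        FileInfo("/data/shard-0001.bin", now - 3600),
        FileInfo("/data/shard-0417.bin", now - 30 * 24 * 3600),
    ]
    for f in files:
        print(f.path, "->", place(f, now))
```

Pinning takes priority over recency in this sketch, so a rarely read but latency-sensitive file stays on flash while cold, unpinned data moves to hard drives.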

Continued Cloud AI Expansion

DDN also highlighted continued adoption of its cloud AI infrastructure offerings, including Managed Lustre developments announced with Google Cloud and a Salesforce deployment that uses Google Cloud Managed Lustre powered by EXAScaler.

According to DDN, the Salesforce implementation achieved 1.5 times faster model training, reduced I/O latency by 75%, and lowered training costs by 42%, demonstrating the impact of reducing storage bottlenecks in large-scale AI environments.
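These figures mix a speedup factor with percentage reductions, so it can help to convert them to a common form. The short Python sketch below does that, assuming "1.5 times faster" means the same training job finishes in 1/1.5 of the original wall-clock time. That reading is an assumption, since the article does not define the baseline, and the helper names are hypothetical.

```python
def time_saved_from_speedup(speedup: float) -> float:
    """Fraction of wall-clock time saved by a given speedup factor."""
    if speedup <= 0:
        raise ValueError("speedup must be positive")
    return 1.0 - 1.0 / speedup


def remaining_fraction(percent_reduction: float) -> float:
    """Fraction of the original value left after a percentage reduction."""
    return 1.0 - percent_reduction / 100.0


if __name__ == "__main__":
    # 1.5x faster training: about one third less time per run.
    print(f"training time saved: {time_saved_from_speedup(1.5):.1%}")
    # 75% lower I/O latency: one quarter of the original latency remains.
    print(f"latency remaining: {remaining_fraction(75):.0%}")
    # 42% lower training cost: 58% of the original cost remains.
    print(f"cost remaining: {remaining_fraction(42):.0%}")
```

Under that reading, the training run takes roughly 33% less time, I/O latency drops to a quarter of its previous value, and cost falls to 58% of the original.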

With the latest announcements, DDN continues to expand its portfolio beyond parallel storage into AI data orchestration, inference acceleration, and infrastructure management, positioning its platforms to support enterprise, cloud, research, and sovereign AI deployments operating at large GPU scale.

Harold Fritts

I have been in the tech industry since IBM created Selectric. My background, though, is writing. So I decided to get out of the pre-sales biz and return to my roots, doing a bit of writing but still being involved in technology.
