StorageReview.com

DDN Launches AI400X3M and Dynamo-Integrated KV Cache Acceleration at ISC 2026

At ISC 2026, DDN announced a broad expansion of its AI and HPC data platform portfolio, introducing new hardware, software, and platform capabilities designed to improve GPU utilization, accelerate AI inference, and simplify large-scale AI infrastructure deployments.

The announcements include the AI400X3M storage appliance, the general availability of DDN’s distributed KV Cache acceleration technology integrated with NVIDIA Dynamo, and new security, observability, and infrastructure management capabilities across the company’s AI data platforms.

DDN said the updates are intended to address bottlenecks throughout the AI data pipeline, including data preparation, model training, inference, retrieval-augmented generation (RAG), reasoning models, and agentic AI workloads. The company is focusing on improving infrastructure efficiency by increasing GPU utilization while lowering power consumption and infrastructure costs.

DDN CEO and Co-Founder Alex Bouzari said modern AI infrastructure requires more than compute resources alone. He emphasized that efficient data management, security, and operational visibility have become critical components of production AI deployments, with the company’s latest releases aimed at improving GPU efficiency, reducing inference costs, and accelerating time-to-first-token.

AI400X3M Targets AI and HPC Performance Density

Leading the announcement is the AI400X3M, the latest appliance built on DDN’s EXAScaler parallel file system.

AI400X3M Specifications

Performance
Sequential Read Performance: Up to 190 GB/s
Sequential Write Performance: Up to 110 GB/s
Random Read IOPS: 8M

System Features
Platform: Turnkey EXAScaler shared parallel file system appliance for AI with Active/Active storage controllers
Host Ports Per Appliance: 4x XDR/400GbE OSFP or 8x NDR200/200GbE QSFP112
Drive Support: 24x 2.5″ dual port, hot swappable PCIe Gen5 NVMe
Capacity Options: 120/250/500TB and 1PB/2PB usable
Standard Firmware Features: LUN mapping and masking, intelligent write striping, read QoS, data integrity check/correction, interface options (CLI, GUI, Python API), outbound syslog including RELP support, state change messages (via email and SNMP traps)

Safety & Compliance
Safety: IEC/EN/UL/CSA 62368-1, GB4943
EMC: EN 55032 Class A, EN 55035, EN 61000-3-2, EN 61000-3-3, FCC Part 15 Class A, VCCI Class A, ICES-003 Class A, GB/T9254 Class A, BSMI Class A
Environmental: RoHS, REACH, WEEE, ERP Lot9X

Physical Attributes
Form Factor: 2U rack mount
Height: 3.43″ (87mm)
Width: 17.56″ (446mm)
Depth: 33.7″ (856mm) without bezel; 35.2″ (895mm) overall
Adjustable Rack Rail: 26.75″ to 34.0″ (680–860mm)
Weight: 98 lbs (44kg) empty; 108 lbs (49kg) max

Power & Cooling
Input Voltage: 200–240V, 50/60 Hz
Power Supply: 2x hot swappable, redundant, IEC60320-C20 inlet

Operating Environment
Temperature Range: 5°C to 30°C (41°F to 86°F)
Relative Humidity: 8%–80% non-condensing
Altitude: 3,117 ft (950m) at 30°C; 10,000 ft (3,048m) at 23°C

The platform delivers up to 190 GB/s of sequential read throughput, which DDN puts at up to 35% higher read performance than the previous generation. Configurations scale to 30PB within a single rack, and new hybrid disk support combines flash and HDD storage to balance performance, capacity, and cost.

The appliance is designed for AI model training, inference, checkpointing, HPC simulation, and other highly parallel workloads where storage throughput can directly impact GPU utilization. General availability is expected by the end of the third quarter of 2026.
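To put the throughput figures in context, here is a rough back-of-the-envelope sketch of how long a checkpoint write or restore would take at the appliance's quoted peak rates. The checkpoint size is a hypothetical value chosen for illustration, and the calculation assumes a single appliance sustaining its peak sequential rate, which real workloads may not reach.

```python
def transfer_seconds(size_gb: float, throughput_gb_per_s: float) -> float:
    """Best-case time to move size_gb at a sustained throughput."""
    if throughput_gb_per_s <= 0:
        raise ValueError("throughput must be positive")
    return size_gb / throughput_gb_per_s


# Peak sequential figures quoted in the AI400X3M specifications.
PEAK_WRITE_GB_S = 110.0
PEAK_READ_GB_S = 190.0

# Hypothetical checkpoint size (assumption, not a DDN figure).
checkpoint_gb = 2200.0

write_s = transfer_seconds(checkpoint_gb, PEAK_WRITE_GB_S)
read_s = transfer_seconds(checkpoint_gb, PEAK_READ_GB_S)
print(f"checkpoint write: {write_s:.1f} s, restore: {read_s:.1f} s")
```

Any time GPUs spend stalled on a synchronous checkpoint is time not spent training, which is why storage throughput feeds directly into utilization.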

Distributed KV Cache Now Generally Available

Following an earlier technology preview at NVIDIA GTC 2026, DDN officially launched its distributed KV Cache acceleration architecture, which is integrated with NVIDIA Dynamo. The capability is available across both the Infinia object storage platform and the EXAScaler file system.

Rather than relying on local GPU memory for model context, the distributed architecture stores and retrieves KV cache data from DDN’s data platform, reducing memory bottlenecks during inference.

DDN said the platform supports NVIDIA Dynamo, vLLM, and other modern inference frameworks while enabling shared distributed KV cache across multiple inference nodes. The company claims up to 55 times faster KV cache loading for large-scale inference workloads, along with improved GPU utilization, faster token generation, and lower infrastructure costs for large language model deployments.

The technology is intended to benefit long-context inference, retrieval-augmented generation, reasoning models, and agentic AI workflows that repeatedly access large model contexts.
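A quick sizing sketch shows why long contexts strain GPU memory. It uses the standard estimate for a transformer's KV cache: two tensors (key and value) per layer, per KV head, per token. The model shape and context length below are hypothetical values chosen for illustration; they are not figures from DDN or NVIDIA.

```python
def kv_bytes_per_token(layers: int, kv_heads: int, head_dim: int,
                       bytes_per_value: int) -> int:
    """KV cache bytes needed for one token of context."""
    # Factor of 2 covers the separate key and value tensors in each layer.
    return 2 * layers * kv_heads * head_dim * bytes_per_value


def kv_cache_gib(tokens: int, per_token_bytes: int) -> float:
    """Total KV cache size in GiB for a single sequence."""
    return tokens * per_token_bytes / 2**30


# Hypothetical model shape (assumption): 80 layers, 8 KV heads,
# head dimension 128, 2-byte (FP16) values.
per_token = kv_bytes_per_token(layers=80, kv_heads=8, head_dim=128,
                               bytes_per_value=2)

# Hypothetical 128K-token context for one sequence.
total_gib = kv_cache_gib(tokens=131072, per_token_bytes=per_token)
print(f"{per_token} bytes/token, {total_gib:.1f} GiB per sequence")
```

Under these assumptions a single long-context sequence needs tens of gibibytes of cache, so keeping many such contexts resident in GPU memory quickly becomes impractical, and recomputing them on every request wastes GPU cycles.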

Unified AI Data Pipeline

DDN also outlined its broader strategy of combining Infinia object storage for inference with EXAScaler for model training and checkpointing under a single AI data infrastructure.

The company said Infinia provides low-latency metadata operations, high concurrency, and object performance optimized for inference-heavy workloads, while EXAScaler continues to serve high-performance training environments. Together, the platforms are intended to eliminate data silos and maintain high GPU utilization throughout the AI lifecycle.

New Security and Infrastructure Features

Additional platform enhancements focus on enterprise deployment requirements, including bare-metal multi-tenancy, KMIP-based encryption and key management, VictoriaLogs integration for operational visibility, and expanded multi-tenant APIs.

On the storage side, DDN introduced intelligent file pinning and NAND-accelerated Hot Pools, allowing frequently accessed data to remain on flash while automatically tiering colder data to lower-cost hard drives. The company said these features are designed to improve storage efficiency while reducing flash capacity requirements.
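The general idea behind pinning plus hot/cold tiering can be sketched in a few lines. This is a generic illustration under simple assumptions, not DDN's algorithm: the `FileInfo` type, the `choose_tier` function, and the seven-day hot window are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class FileInfo:
    path: str
    days_since_access: int
    pinned: bool = False  # explicitly pinned files never leave flash


def choose_tier(f: FileInfo, hot_window_days: int = 7) -> str:
    """Return 'flash' or 'hdd' for a file under a simple recency policy."""
    if f.pinned:
        return "flash"
    # Recently accessed data stays on flash; colder data moves to HDD.
    return "flash" if f.days_since_access <= hot_window_days else "hdd"


files = [
    FileInfo("/data/tokenizer.bin", days_since_access=90, pinned=True),
    FileInfo("/data/shard-0001", days_since_access=2),
    FileInfo("/data/old-run/ckpt-0001", days_since_access=45),
]
placement = {f.path: choose_tier(f) for f in files}
for path, tier in placement.items():
    print(f"{path} -> {tier}")
```

A production system would likely weigh access frequency, file size, and pool capacity as well; the sketch only shows how pinning overrides an otherwise recency-based decision.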

Continued Cloud AI Expansion

DDN also highlighted continued adoption of its cloud AI infrastructure offerings, including Managed Lustre developments announced with Google Cloud and a Salesforce deployment that uses Google Cloud Managed Lustre powered by EXAScaler.

According to DDN, the Salesforce implementation achieved 1.5 times faster model training, reduced I/O latency by 75%, and lowered training costs by 42%, demonstrating the impact of reducing storage bottlenecks in large-scale AI environments.

With the latest announcements, DDN continues to expand its portfolio beyond parallel storage into AI data orchestration, inference acceleration, and infrastructure management, positioning its platforms to support enterprise, cloud, research, and sovereign AI deployments operating at large GPU scale.

Harold Fritts

I have been in the tech industry since IBM created Selectric. My background, though, is writing. So I decided to get out of the pre-sales biz and return to my roots, doing a bit of writing but still being involved in technology.