StorageReview.com
AI  ◇  Enterprise

Inference Providers Leverage NVIDIA Blackwell to Drive 10x Reduction in Token Costs

The fundamental unit of intelligence in modern AI interactions is the token. Whether powering clinical diagnostics, interactive gaming dialogue, or autonomous customer service agents, the scalability of these applications depends heavily on tokenomics. Recent MIT data indicate that advances in infrastructure and algorithmic efficiency are reducing inference costs by up to 10x annually. Leading inference…
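As a rough illustration of how that claim compounds, here is a minimal sketch of a 10x annual decline applied to per-token pricing; the starting price is a hypothetical figure for arithmetic only, not a quoted market rate or MIT's data:

```python
# Hypothetical illustration: compounding a 10x annual reduction in
# inference cost. The $10/M-token starting point is an assumption.
start_price_per_m_tokens = 10.00  # USD per million tokens, year 0

for year in range(4):
    price = start_price_per_m_tokens / (10 ** year)
    print(f"Year {year}: ${price:,.4f} per million tokens")
```

Under this assumption, a workload priced at $10 per million tokens in year 0 would cost one cent per million tokens three years later, which is why providers frame token economics as the gating factor for scaling these applications.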

AI  ◇  Enterprise

Maia 200 Signals Microsoft’s Push Toward Custom Silicon for AI Inference

Microsoft has introduced Maia 200, a new custom inference accelerator designed to improve the economics of AI token generation at scale. It is positioned as the company’s first silicon and system platform optimized specifically for AI inference. Microsoft frames AI inference around an “efficient frontier” that balances capability and accuracy against cost, latency, and energy. In practice,…

AI  ◇  DPU  ◇  Enterprise  ◇  Networking  ◇  Server  ◇  Server Rack

NVIDIA Launches Vera Rubin Architecture at CES 2026: The VR NVL72 Rack

At CES 2026, NVIDIA unveiled the Rubin platform, anchored by the Vera Rubin NVL72 rack-scale system. This is NVIDIA’s third-generation rack-scale architecture, combining six co-designed chips into a single unified system. The platform will be available from partners in the second half of 2026, with all six chips already back from fabrication and currently undergoing…

AI  ◇  Enterprise

AMD Introduces Ryzen AI Embedded P100 and X100 Series for Edge Inference

AMD has introduced its new Ryzen AI Embedded Processor lineup. This portfolio targets AI workloads at the edge for automotive, industrial automation, and emerging physical AI platforms, including humanoid robotics. It launches with the Ryzen AI Embedded P100 Series and the forthcoming X100 Series. These processors combine Zen 5 CPU cores, RDNA 3.5 graphics, and…

AI  ◇  Enterprise

VAST Data Introduces DPU-Native Inference Architecture for Shared KV Cache and Long-Lived Agentic AI

VAST Data has launched a new inference architecture that supports the NVIDIA Inference Context Memory Storage Platform. This system focuses on AI applications that involve ongoing, multi-turn agent-driven sessions. VAST presents this platform as a storage class designed for AI, enhancing access to key-value (KV) cache, enabling fast sharing of inference context between nodes, and…
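For context on what sharing a KV cache buys in a multi-turn session, here is a minimal sketch of the reuse pattern; the dictionary-backed store, function name, and stand-in K/V values are illustrative assumptions, not VAST's or NVIDIA's API:

```python
# Toy sketch of KV-cache reuse: attention keys/values for an
# already-processed conversation prefix are stored once and fetched on
# later turns instead of being recomputed. All names are hypothetical.
kv_store: dict[str, list[tuple[float, float]]] = {}

def get_or_compute_kv(session_id: str, prefix_tokens: list[str]):
    if session_id in kv_store:           # fast path: shared cache hit,
        return kv_store[session_id]      # no prefill recompute needed
    # Stand-in for real attention K/V tensors produced during prefill.
    kv = [(hash(t) % 100 / 100, 0.5) for t in prefix_tokens]
    kv_store[session_id] = kv            # persist for other turns/nodes
    return kv
```

On a second turn with the same session ID, the prefix's K/V pairs come from the shared store rather than being recomputed, which is the prefill latency and compute savings this class of platform targets for long-lived agentic workloads.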

AI  ◇  Enterprise

NVIDIA DGX Spark Achieves 2.5× Performance and 8× Video Speed in CES 2026 Enterprise Update

At CES 2026, NVIDIA outlined a software update for DGX Spark that expands its role as a compact, on-premises AI system. The changes focus on new runtimes, quantization formats, and deployment playbooks that make it easier to run open-source models, automated workflows, and creator and 3D pipelines locally. The update is intended for teams already…

AI  ◇  Enterprise

Supermicro Expands Liquid Cooling for NVIDIA Vera Rubin NVL72 and HGX Rubin NVL8 Platforms

Supermicro announced expanded rack-scale manufacturing capacity and upgraded direct liquid cooling capabilities to support the upcoming NVIDIA Vera Rubin NVL72 and NVIDIA HGX Rubin NVL8 platforms. The company positions the move as a way to shorten deployment timelines for high-density AI infrastructure by combining US-based in-house design and manufacturing with its Data Center Building Block Solutions (DCBBS)…

AI  ◇  Enterprise

NVIDIA RTX PRO 5000 72GB Blackwell GPU Expands Desktop Options for Agentic AI

The NVIDIA RTX PRO 5000 72GB Blackwell GPU is now generally available, bringing Blackwell-class compute to a broader range of professional desktops. Targeted at technical users building and deploying agentic and generative AI, the new configuration addresses memory-constrained workflows that have outgrown prior-generation workstation GPUs. By complementing the existing RTX PRO 5000 48GB model, the…

AI  ◇  Enterprise

NVIDIA Extends Open HPC and AI Stack With SchedMD Acquisition and Nemotron 3

NVIDIA has announced two significant moves targeting high-performance computing and AI: the acquisition of SchedMD, the company behind the Slurm workload manager, and the launch of the Nemotron 3 family of open models, data, and tools for multi-agent AI. Together, these efforts reinforce NVIDIA’s position across HPC, generative AI, and enterprise AI infrastructure while maintaining…

AI  ◇  Enterprise

Vultr, AMD, and NetApp Build a Hybrid Cloud Blueprint for AI and Sovereign Data

Vultr, AMD, and NetApp, all members of the Vultr Cloud Alliance, have developed a new reference architecture focused on data-intensive and AI workloads across hybrid and sovereign cloud environments. The design targets organizations that need to consolidate distributed data, accelerate AI, and enforce data locality and compliance controls without adding operational complexity. The architecture combines…