StorageReview.com
AI  ◇  Enterprise

NVIDIA GTC 2026: Rubin GPUs, Groq LPUs, Vera CPUs, and What NVIDIA Is Building for Trillion-Parameter Inference

At GTC 2026 in San Jose, NVIDIA CEO Jensen Huang delivered a keynote outlining the company’s next-generation AI infrastructure platform and a sweeping set of announcements spanning silicon, systems, software, and ecosystem partnerships. With more than 30,000 attendees from over 190 countries, GTC 2026 served as the stage for NVIDIA’s most comprehensive platform refresh since

AI  ◇  Enterprise

Meta’s MTIA Roadmap: Four Chip Generations in Two Years Put GenAI Inference First

Meta outlined the rapid development of its Meta Training and Inference Accelerator (MTIA) program, describing four generations of chips created over roughly two years. These chips address an infrastructure challenge Meta considers central to deploying AI at its scale: adapting quickly to evolving model architectures at low cost, without depending on long silicon development cycles.

AI  ◇  Enterprise

Tenstorrent QuietBox 2 Brings RISC‑V AI Inference to the Desktop

Tenstorrent has introduced TT‑QuietBox 2 (Blackhole), a liquid‑cooled AI workstation designed to run models up to 120 billion parameters entirely on the desktop. The system combines a fully open‑source software stack with RISC‑V–based silicon and is positioned as a teraflop‑class inference platform that does not require racks, a server room, or specialized power. Inference as
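The 120-billion-parameter claim is easy to sanity-check with back-of-envelope arithmetic. The figures below are our own illustration, not Tenstorrent's published numbers, and they cover weight storage only; real deployments also need memory for the KV cache and activations.

```python
# Weight-memory footprint for a 120B-parameter model at several
# common precisions. Illustrative arithmetic only.

PARAMS = 120e9  # 120 billion parameters

def weights_gb(params: float, bits_per_param: int) -> float:
    """Return weight storage in gigabytes (1 GB = 1e9 bytes)."""
    return params * bits_per_param / 8 / 1e9

for bits in (16, 8, 4):
    print(f"{bits}-bit weights: {weights_gb(PARAMS, bits):.0f} GB")
    # 16-bit -> 240 GB, 8-bit -> 120 GB, 4-bit -> 60 GB
```

The arithmetic shows why quantization matters for desktop-class inference: at 16-bit precision the weights alone exceed what any workstation holds today, while 4-bit quantization brings the same model under 64 GB.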

AI  ◇  Enterprise

MinIO Introduces AIStor Table Sharing for Direct On-Premises Data Access from Databricks

MinIO has launched AIStor Table Sharing, a new feature in MinIO AIStor that enables businesses to make on-premises data directly accessible to the Databricks platform via the Delta Sharing open protocol. With it, Databricks can query live on-premises datasets directly for analytics and AI workloads, without traditional data movement or replication.
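Under the Delta Sharing open protocol, a recipient connects using a small credentials profile issued by the sharing server. A minimal sketch of such a profile is shown below; the endpoint URL and token are hypothetical placeholders, not values from MinIO's announcement.

```json
{
  "shareCredentialsVersion": 1,
  "endpoint": "https://aistor.example.com/delta-sharing",
  "bearerToken": "<token issued by the sharing server>"
}
```

Clients such as the open-source `delta-sharing` Python library can then reference shared tables through this profile (for example via `delta_sharing.load_as_pandas(...)`), which is what lets Databricks read on-premises tables in place.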

AI  ◇  Enterprise

MWC 2026: AMD Targets Telco-Grade AI from Core to Edge

At MWC 2026 in Barcelona, AMD highlighted how its portfolio is helping telecom operators move from AI pilots to production deployments as they transition from traditional RAN to open, virtualized architectures. The company is positioning a combination of open software stacks, GPUs, CPUs, networking, and adaptive computing to support distributed, telco-grade AI across core, edge,

AI  ◇  Enterprise

VAST Data Unveils Agentic AI OS and Advances Its Thinking Machine Vision

During VAST Forward 2026, VAST Data introduced multiple updates, ranging from a full-stack agentic computing platform to a secure, scalable thinking machine. The VAST Data PolicyEngine and VAST Data TuningEngine are two new computing services that will enable the next generation of the VAST AI OS to meet key requirements for organizations looking to scale

AI  ◇  Enterprise

Inference Providers Leverage NVIDIA Blackwell to Drive 10x Reduction in Token Costs

The fundamental unit of intelligence in modern AI interactions is the token. Whether powering clinical diagnostics, interactive gaming dialogue, or autonomous customer service agents, the scalability of these applications depends heavily on tokenomics. Recent MIT data indicate that advances in infrastructure and algorithmic efficiency are reducing inference costs by up to 10x annually. Leading inference
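A 10x annual reduction compounds quickly. The sketch below illustrates the trajectory with a hypothetical starting price; the starting figure is our own placeholder, and only the 10x-per-year rate comes from the article.

```python
# Compounding effect of a ~10x annual drop in inference cost.
# The starting price is a hypothetical figure for illustration.

start_price = 10.0   # $ per million tokens, hypothetical
factor = 10          # 10x annual reduction cited above

for year in range(4):
    price = start_price / factor ** year
    print(f"year {year}: ${price:g} per million tokens")
```

At that rate, a workload priced at $10 per million tokens today would cost a cent per million tokens three years out, which is why providers frame Blackwell-era efficiency gains in terms of token economics rather than raw throughput.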

AI  ◇  Enterprise

Maia 200 Signals Microsoft’s Push Toward Custom Silicon for AI Inference

Microsoft has introduced Maia 200, a new custom inference accelerator designed to improve the economics of AI token generation at scale, positioned as the company’s first silicon and system platform optimized specifically for AI inference. Microsoft frames AI inference around an “efficient frontier” that balances capability and accuracy against cost, latency, and energy. In practice,
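Microsoft's "efficient frontier" framing is essentially a Pareto frontier over deployment operating points: a configuration belongs on the frontier only if no alternative is both cheaper and at least as accurate. A minimal sketch, using invented (cost, accuracy) data points rather than anything Microsoft published:

```python
# Pareto-frontier sketch of the "efficient frontier" idea.
# A config is dominated if another config costs no more and is
# at least as accurate. All data points below are invented.

points = {
    "small-model": (0.2, 0.78),   # ($ per 1M tokens, accuracy)
    "mid-model":   (0.6, 0.85),
    "large-model": (2.0, 0.91),
    "overpriced":  (2.5, 0.84),   # dominated by mid-model
}

def frontier(pts):
    keep = {}
    for name, (cost, acc) in pts.items():
        dominated = any(
            c <= cost and a >= acc and (c, a) != (cost, acc)
            for c, a in pts.values()
        )
        if not dominated:
            keep[name] = (cost, acc)
    return keep

print(sorted(frontier(points)))  # the three non-dominated configs
```

The same dominance test extends naturally to more axes (latency, energy per token), which is the multi-dimensional trade-off Maia 200 is built to target.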

AI  ◇  DPU  ◇  Enterprise  ◇  Networking  ◇  Server  ◇  Server Rack

NVIDIA Launches Vera Rubin Architecture at CES 2026: The VR NVL72 Rack

At CES 2026, NVIDIA unveiled the Rubin platform, anchored by the Vera Rubin NVL72 rack-scale system. This is NVIDIA’s third-generation rack-scale architecture, combining six co-designed chips into a single unified system. The platform will be available from partners in the second half of 2026, with all six chips already back from fabrication and currently undergoing

AI  ◇  Enterprise

AMD Introduces Ryzen AI Embedded P100 and X100 Series for Edge Inference

AMD has introduced its new Ryzen AI Embedded Processor lineup. This portfolio targets AI workloads at the edge for automotive, industrial automation, and emerging physical AI platforms, including humanoid robotics. It launches with the Ryzen AI Embedded P100 Series and the forthcoming X100 Series. These processors combine Zen 5 CPU cores, RDNA 3.5 graphics, and