NVIDIA Unveils Vera Rubin Platform, Marking a New Era in AI Infrastructure
Las Vegas, CES 2026 — NVIDIA has officially launched its next-generation AI computing platform, Vera Rubin, signaling a major leap forward in artificial intelligence infrastructure and reinforcing the company’s dominance in the global data-center market.
Announced by NVIDIA founder and CEO Jensen Huang during his CES 2026 keynote, the Rubin platform succeeds the Blackwell architecture and is already in full production, with broader rollout to cloud providers and enterprise partners expected in the second half of 2026. Huang described Rubin as a response to the explosive growth in AI computing demand, noting that the amount of computation required for modern AI systems is “skyrocketing.”
From Chips to Rack-Scale AI Supercomputers
Unlike traditional product launches, which focus on individual processors, the Vera Rubin launch centers on a rack-scale computing architecture in which an entire rack functions as a single unit of computation. The platform is built on what NVIDIA calls extreme co-design across six tightly integrated components: the Vera CPU, Rubin GPU, NVLink 6 interconnect, ConnectX-9 SuperNIC, BlueField-4 DPU and Spectrum-6 Ethernet switch.
At the heart of the system is the Rubin GPU, capable of delivering up to 50 petaflops of AI inference performance, alongside the new Vera CPU, which is optimized for agentic reasoning and large-scale AI factories. Together, these components are designed to eliminate bottlenecks in compute, networking and storage that increasingly constrain advanced AI workloads.
Built for Agentic AI and Long-Context Reasoning
NVIDIA says AI workloads are rapidly evolving beyond model training toward inference-heavy, always-on systems that require long-context reasoning and multi-step decision-making. Rubin directly targets these demands with major gains in efficiency and scalability.
According to NVIDIA, the platform can cut the cost per inference token to as little as one-tenth of Blackwell's and train mixture-of-experts (MoE) models with one-quarter as many GPUs. It also introduces a new Inference Context Memory Storage Platform, powered by BlueField-4, to better manage the growing key-value cache requirements of agentic AI systems.
Performance gains are matched by efficiency improvements. NVIDIA reports that Rubin delivers up to eight times more inference compute per watt, while new Spectrum-X Ethernet Photonics systems offer five times better power efficiency and improved uptime for large AI data centers.
Broad Industry Adoption
Major cloud providers and AI labs have already committed to the Rubin platform. Companies including Amazon Web Services, Microsoft, Google Cloud, Oracle, OpenAI, Anthropic and Meta plan to deploy Rubin-based systems to power next-generation AI services. Microsoft, in particular, will integrate Vera Rubin NVL72 rack-scale systems into its future Fairwater AI superfactories, scaling to hundreds of thousands of GPUs.
Hardware partners such as Dell, HPE, Lenovo, Cisco and Supermicro are also preparing servers based on Rubin, while Red Hat has expanded its collaboration with NVIDIA to deliver a complete AI software stack optimized for the new platform.
A Strategic Bet on AI Infrastructure
The launch of Vera Rubin underscores NVIDIA’s strategy of offering end-to-end AI infrastructure rather than standalone chips. As rivals race to develop custom silicon, NVIDIA is betting that tightly integrated systems optimized for scale, security and efficiency will remain the preferred choice for enterprises and cloud providers.
With Huang estimating that $3 trillion to $4 trillion will be spent on AI infrastructure over the next five years, Rubin positions NVIDIA at the center of what the company calls the next wave of “AI factories.” As AI models grow more complex and ubiquitous, the Rubin platform is designed to serve as the backbone for the coming decade of artificial intelligence.
08-01-2026






