Highlights
- Rubin platform cuts inference token costs by up to 10x and needs 4x fewer GPUs to train MoE models, compared with Blackwell.
- Major cloud providers including Microsoft, AWS, Google Cloud, and OCI to deploy Rubin systems in 2026.
- Six new chips integrate CPU, GPU, networking, and storage for large-scale AI workloads.
NVIDIA (NASDAQ:NVDA) introduced its Rubin platform, a next-generation AI computing system designed to support training, inference, and reasoning at scale. Named after astronomer Vera Rubin, the platform integrates six specialized chips — NVIDIA Vera CPU, Rubin GPU, NVLink 6 Switch, ConnectX-9 SuperNIC, BlueField-4 DPU, and Spectrum-6 Ethernet Switch — enabling extreme codesign across hardware and software. Compared with the previous Blackwell platform, Rubin reduces inference token costs by up to 10x and requires 4x fewer GPUs for mixture-of-experts (MoE) model training.
Extreme Codesign for AI Efficiency
Rubin incorporates advanced NVLink interconnects, a third-generation Transformer Engine, Confidential Computing, and a second-generation RAS Engine. The NVIDIA Vera CPU offers 88 custom Olympus cores with full Armv9.2 compatibility and NVLink-C2C connectivity. The Rubin GPU delivers 50 petaflops of NVFP4 compute, while NVLink 6 provides 3.6 TB/s of GPU-to-GPU bandwidth per GPU and 260 TB/s at rack scale.
AI-Native Storage and Secure Infrastructure
The platform introduces NVIDIA Inference Context Memory Storage with BlueField-4 storage processors. This infrastructure lets AI workloads share and reuse key-value (KV) cache data, which improves throughput and helps agentic AI scale. BlueField-4 also provides Advanced Secure Trusted Resource Architecture (ASTRA), enabling secure provisioning, isolation, and operation of large-scale AI environments without performance compromise.
Platform Configurations for Diverse Workloads
Rubin-based deployments include the NVIDIA Vera Rubin NVL72 rack-scale system, integrating 72 GPUs, 36 CPUs, NVLink 6, ConnectX-9 SuperNICs, and BlueField-4 DPUs. The HGX Rubin NVL8 server board links eight Rubin GPUs for x86-based AI platforms. NVIDIA DGX SuperPOD serves as a deployment reference, integrating NVL72 or NVL8 systems with BlueField-4 DPUs, ConnectX-9 SuperNICs, InfiniBand networking, and Mission Control software.
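As a rough sanity check, the per-chip figures quoted in this article can be multiplied out to rack scale. The short Python sketch below uses only numbers stated in the article (72 GPUs and 36 CPUs per NVL72, 3.6 TB/s NVLink bandwidth and 50 petaflops of NVFP4 compute per GPU, 88 cores per Vera CPU). The function name and the derived totals are illustrative assumptions based on simple multiplication, not official NVIDIA specifications.

```python
# Per-chip and per-rack figures as quoted in the article.
GPUS_PER_NVL72 = 72
CPUS_PER_NVL72 = 36
NVLINK_TB_S_PER_GPU = 3.6    # NVLink 6 bandwidth per GPU, TB/s
NVFP4_PFLOPS_PER_GPU = 50    # NVFP4 compute per Rubin GPU, petaflops
CORES_PER_VERA_CPU = 88      # Olympus cores per Vera CPU


def nvl72_totals():
    """Aggregate the quoted per-chip figures across one NVL72 rack.

    Hypothetical helper: plain multiplication, assuming every chip
    contributes its full quoted figure.
    """
    return {
        "nvlink_tb_s": GPUS_PER_NVL72 * NVLINK_TB_S_PER_GPU,
        "nvfp4_pflops": GPUS_PER_NVL72 * NVFP4_PFLOPS_PER_GPU,
        "cpu_cores": CPUS_PER_NVL72 * CORES_PER_VERA_CPU,
        "gpus_per_cpu": GPUS_PER_NVL72 // CPUS_PER_NVL72,
    }


if __name__ == "__main__":
    totals = nvl72_totals()
    print(f"NVLink bandwidth: {totals['nvlink_tb_s']:.1f} TB/s")
    print(f"NVFP4 compute:    {totals['nvfp4_pflops']} petaflops")
    print(f"Vera CPU cores:   {totals['cpu_cores']}")
    print(f"GPUs per CPU:     {totals['gpus_per_cpu']}")
```

The NVLink total works out to about 259.2 TB/s, which is consistent with the 260 TB/s rack-scale figure quoted earlier once rounded.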
Ecosystem Adoption
Cloud providers expected to deploy Rubin include Microsoft, AWS, Google Cloud, OCI, CoreWeave, Lambda, and Nscale. AI labs such as Anthropic, Meta, OpenAI, xAI, and Cohere anticipate using Rubin for long-context, multimodal systems and large-scale reasoning workloads. Infrastructure partners including Cisco, Dell, HPE, Lenovo, Supermicro, and Red Hat are collaborating to deliver Rubin-based AI solutions.
Share Performance
NVDA shares closed at USD 188.12 on 5 January 2026.