NVIDIA has unveiled the Rubin platform, a major step in the evolution of large-scale artificial intelligence systems. The launch introduces a unified AI supercomputing architecture built from six tightly integrated chips, aimed at accelerating both AI training and inference while reducing overall deployment costs. The design targets mainstream AI adoption by simplifying how massive systems are built, secured, and scaled.
At the core of Rubin lies an extreme codesign approach: the NVIDIA Vera CPU, Rubin GPU, NVLink 6 Switch, ConnectX-9 SuperNIC, BlueField-4 DPU, and Spectrum-6 Ethernet Switch operate as a single optimized system. This integration reduces training time and lowers inference token costs across complex workloads. Compared with earlier platforms, Rubin delivers higher efficiency while requiring fewer GPUs for large mixture-of-experts models.
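The "fewer GPUs" claim comes down to fitting model weights into GPU memory. A minimal sizing sketch makes the arithmetic concrete; the parameter counts, precision, memory capacities, and overhead factor below are illustrative assumptions, not Rubin specifications.

```python
import math

# Back-of-envelope GPU count for serving a large mixture-of-experts model.
# All numeric figures here are illustrative assumptions, not Rubin specs.

def gpus_needed(total_params_b: float, bytes_per_param: float,
                gpu_mem_gb: float, overhead: float = 1.2) -> int:
    """Minimum GPUs to hold the weights, with a fudge factor
    (overhead) for activations and KV cache."""
    weight_gb = total_params_b * bytes_per_param  # params in billions * bytes
    return math.ceil(weight_gb * overhead / gpu_mem_gb)

# Hypothetical 1-trillion-parameter MoE model served in FP8 (1 byte/param):
print(gpus_needed(1000, 1.0, 192))  # on an assumed 192 GB GPU
print(gpus_needed(1000, 1.0, 288))  # a larger-memory GPU needs fewer devices
```

The point of the sketch is the direction of the trend: larger per-GPU memory and lower-precision formats shrink the device count for a fixed model size, which is where a new GPU generation typically claims its efficiency gains.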
Architecture Designed for Scale and Efficiency
Rubin introduces several architectural advances that redefine AI infrastructure performance. Sixth-generation NVLink provides ultra-fast GPU communication, delivering massive bandwidth within rack-scale systems. The Vera CPU supports agentic reasoning with high efficiency and modern Arm compatibility, while the Rubin GPU advances transformer-based workloads through adaptive compression and higher compute density.
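Why interconnect bandwidth matters can be seen in a standard cost model for distributed training: the ring all-reduce used to synchronize gradients moves roughly 2·(N−1)/N of the buffer per GPU, so step time scales inversely with link bandwidth. The bandwidth figure below is a placeholder, not an NVLink 6 specification.

```python
# Estimated time for a ring all-reduce across N GPUs, given per-GPU
# interconnect bandwidth. The bandwidth value used in the example is
# an assumption for illustration, not an NVLink 6 spec.

def allreduce_seconds(size_gb: float, n_gpus: int, bw_gb_s: float) -> float:
    """Ring all-reduce moves ~2*(N-1)/N of the buffer through each link."""
    traffic_gb = 2 * (n_gpus - 1) / n_gpus * size_gb
    return traffic_gb / bw_gb_s

# 10 GB of gradients across a 72-GPU rack at an assumed 900 GB/s per GPU:
t = allreduce_seconds(10, 72, 900)
```

Because this synchronization happens every training step, doubling link bandwidth roughly halves communication time, which is the mechanism behind rack-scale NVLink domains.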
The platform also integrates next-generation confidential computing and advanced reliability features, keeping large AI models protected across CPU, GPU, and interconnect domains. Real-time system health monitoring improves uptime and simplifies maintenance. Together, these features allow AI factories to operate at larger scale with greater stability and predictability.
Ecosystem Readiness and Deployment Outlook
Rubin supports multiple deployment formats to address diverse workloads. The NVL72 rack-scale system combines CPUs, GPUs, networking, and security into a unified architecture, while the HGX Rubin NVL8 platform targets server-based deployments for generative AI and high-performance computing. Organizations can therefore align Rubin systems with their specific infrastructure needs.
Beyond hardware, Rubin introduces AI-native storage designed to manage large inference contexts efficiently, improving inference performance while keeping power consumption in check. As AI workloads move toward multi-tenant and bare-metal environments, the platform also strengthens isolation and system-level trust.
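The "large inference contexts" that such storage would manage are dominated by the transformer KV cache, whose size follows directly from model shape and sequence length. The model dimensions below are hypothetical, chosen only to show the scale involved.

```python
# Size of the transformer KV cache for a long-context request.
# Model dimensions are illustrative, not tied to any Rubin-era model.

def kv_cache_gb(layers: int, kv_heads: int, head_dim: int,
                seq_len: int, batch: int, bytes_per_elem: int = 2) -> float:
    """Keys + values: 2 tensors per layer of [batch, seq, kv_heads, head_dim]."""
    elems = 2 * layers * batch * seq_len * kv_heads * head_dim
    return elems * bytes_per_elem / 1e9

# Hypothetical 80-layer model, 8 KV heads of dim 128, one 128K-token context:
print(round(kv_cache_gb(80, 8, 128, 131072, 1), 1))
```

At tens of gigabytes per long-context request, caches of this kind quickly outgrow GPU memory in multi-tenant serving, which is the motivation for tiering them into dedicated, inference-aware storage.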
Production of Rubin-based systems is underway, with partner availability expected in the second half of 2026. Cloud providers, system builders, and AI labs are preparing deployments. Overall, the Rubin platform establishes a new foundation for scalable, efficient, and secure AI computing.