Nvidia Unveils Next-Generation Rubin Chip Architecture


At the Consumer Electronics Show, Nvidia introduced its new Rubin computing architecture, positioning it as a major advance in AI hardware. The architecture is already in production and is expected to scale further in the second half of the year.

First announced in 2024, the Rubin architecture is the product of a rapid, continuous hardware development cycle. It replaces the Blackwell architecture, which itself succeeded the Hopper and Lovelace platforms, reflecting an accelerated pace of innovation.

Cloud adoption and system design

Rubin chips are already scheduled for deployment by nearly every major cloud provider and figure in large-scale partnerships across the AI ecosystem. Rubin-based systems will power HPE’s Blue Lion supercomputer as well as the upcoming Doudna supercomputer at Lawrence Berkeley National Laboratory.

Named after astronomer Vera Florence Cooper Rubin, the architecture integrates six chips designed to operate together. The Rubin GPU sits at the core, but the platform also addresses storage and interconnection challenges, introducing improvements to the BlueField and NVLink systems and adding a new Vera CPU optimized for agentic reasoning and advanced AI workflows.

To support modern AI workloads, the architecture introduces a new storage tier that lets external storage connect directly to the compute device, allowing large memory pools to scale more efficiently for long-term and agent-based tasks.


Performance gains amid rising competition

As expected, the new architecture delivers substantial improvements in speed and power efficiency. According to performance tests, Rubin trains models three and a half times faster than Blackwell and delivers up to five times faster inference, reaching 50 petaflops in peak scenarios while supporting eight times more inference compute per watt.

These advancements arrive amid intensifying competition in AI infrastructure, as AI labs and cloud providers race for high-performance chips and the facilities to run them. Global investment in AI infrastructure is projected to reach several trillion dollars over the next five years, underscoring the scale and urgency of that race.

