Nvidia Launches Nemotron 3 Ultra to Power Next-Generation Enterprise AI

Nvidia CEO Jensen Huang introduced Nemotron 3 Ultra, a 550-billion-parameter open-weights AI model, during his Computex 2026 keynote in Taipei. With this launch, Nvidia takes a major step deeper into enterprise AI software and autonomous agent development.

The model uses a mixture-of-experts architecture with approximately 55 billion active parameters per token, or 90% sparsity. Each token therefore activates only about a tenth of the model’s weights, so per-token compute is far lower than the total parameter count suggests. According to Artificial Analysis, Nemotron 3 Ultra scored 48 on the Intelligence Index, ahead of every U.S.-based open-weights model, including Google’s Gemma 4 31B at 39. It still trails China’s Kimi K2.6, which scored 54.
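The sparsity figure follows directly from the two parameter counts quoted above. As a rough check, the short Python sketch below recomputes it; the function name `moe_sparsity` is illustrative and not part of any Nvidia tooling.

```python
def moe_sparsity(total_params: float, active_params: float) -> float:
    """Return the fraction of parameters that sit idle for a given token."""
    return 1.0 - active_params / total_params


# Figures quoted in the article: 550B total, roughly 55B active per token.
total = 550e9
active = 55e9

print(f"active fraction: {active / total:.0%}")           # active fraction: 10%
print(f"sparsity: {moe_sparsity(total, active):.0%}")      # sparsity: 90%
```

This only counts parameters. Real-world efficiency also depends on memory bandwidth, routing overhead, and hardware, none of which the quoted figures describe.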

Nvidia also stated that the model generates more than 300 output tokens per second, compared with the 50 to 100 tokens per second typical of competing models from DeepSeek and Moonshot. The company further claims the model lowers costs by roughly 30% for complex agentic workloads.
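To put those throughput figures in perspective, the sketch below estimates how long a single response would take to generate at each rate. The 3,000-token response length is a hypothetical chosen for illustration, and the calculation ignores prompt processing and network latency.

```python
def generation_seconds(output_tokens: int, tokens_per_second: float) -> float:
    """Estimate decode time for a response at a steady output rate."""
    return output_tokens / tokens_per_second


# Hypothetical response length for a long agentic step (an assumption, not from Nvidia).
response_tokens = 3000

# Rates quoted in the article: Nvidia's claimed 300 tok/s versus the 50-100 tok/s range.
for rate in (300, 100, 50):
    seconds = generation_seconds(response_tokens, rate)
    print(f"{rate} tok/s: {seconds:.0f} s")
```

At these rates the same response would take about 10 seconds instead of 30 to 60, a gap that compounds when an agent chains many steps.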

The model focuses on multi-step reasoning, planning, and self-correction, and is designed to support AI agents that can manage sophisticated workflows with minimal human involvement.

Nvidia Expands Its Full-Stack AI Ecosystem

Alongside Nemotron 3 Ultra, Nvidia launched the broader Nemotron 3 family. The lineup includes the mid-tier Super model and Nano Omni, a lightweight multimodal model built for edge devices. Nano Omni combines vision, audio, and language capabilities to power on-device AI agents.

At the same event, Huang introduced NemoClaw, an orchestration framework for agent planning and task delegation, and OpenShell, a runtime layer focused on security, governance, and operational control.

The company also showcased the Vera CPU, a processor designed specifically for agentic AI workloads. Nvidia claims the chip delivers twice the efficiency of traditional x86 server processors. The RTX Spark, meanwhile, combines an Arm CPU with a Blackwell GPU and supports up to 128GB of unified memory.

Nvidia Strengthens Position in the AI Race

These announcements highlight Nvidia’s ambition to become more than a hardware provider. The company is building a full-stack AI platform that competes with OpenAI, Google, and Meta in model development while maintaining its leadership in computing infrastructure.

Meanwhile, Nvidia’s Vera Rubin platform continues to gain traction among major cloud providers. Early adopters now include AWS, Google Cloud, and Microsoft, reflecting growing industry interest in Nvidia’s AI ecosystem.

Computex 2026, held from June 1 to June 5 under the theme “AI Together,” provides the backdrop for this vision. Through its latest products and platforms, Nvidia aims to enable AI agents across data centers, enterprise systems, and personal devices while supplying the technology stack that powers them.