
Google has revealed its new seventh-generation Tensor Processing Unit (TPU), named Ironwood, claiming it offers more than 24 times the computing power of the world’s fastest supercomputer when deployed at scale. This custom AI accelerator, introduced at Google Cloud Next ’25, marks a significant shift in Google’s AI chip strategy. While previous TPUs were designed for both training and inference, Ironwood is purpose-built for inference, the process by which deployed AI models generate predictions and responses.
The specs are impressive. Ironwood delivers 42.5 exaflops of computing power when scaled to 9,216 chips per pod, far outpacing the current fastest supercomputer, El Capitan, which offers 1.7 exaflops. Each Ironwood chip provides 4,614 teraflops of peak compute, paired with 192GB of High Bandwidth Memory (HBM) and 7.2 terabytes per second of memory bandwidth. This substantial upgrade over previous generations makes Ironwood not just more powerful but also more energy-efficient: it delivers twice the performance per watt of Google’s previous TPU and is nearly 30 times more efficient than the 2018 Cloud TPU.
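The pod-level figure follows directly from the per-chip spec. A quick back-of-envelope check in Python, using only the numbers quoted in this article (note that peak-FLOPs comparisons across different machines may involve different numeric precisions, so treat the supercomputer ratio as indicative rather than apples-to-apples):

```python
# Sanity-check the quoted Ironwood pod figures from per-chip specs.

TFLOPS_PER_CHIP = 4_614        # peak teraflops per Ironwood chip (as quoted)
CHIPS_PER_POD = 9_216          # chips in a full-scale pod (as quoted)
EL_CAPITAN_EXAFLOPS = 1.7      # quoted figure for El Capitan

# 1 exaflop = 1,000,000 teraflops
pod_exaflops = TFLOPS_PER_CHIP * CHIPS_PER_POD / 1_000_000
print(f"Pod peak compute: {pod_exaflops:.1f} exaflops")  # ≈ 42.5 exaflops

ratio = pod_exaflops / EL_CAPITAN_EXAFLOPS
print(f"Ratio vs. El Capitan: {ratio:.0f}x")             # ≈ 25x, i.e. "more than 24 times"
```

The multiplication reproduces both headline claims: roughly 42.5 exaflops per pod, and a ratio of about 25x over El Capitan’s 1.7 exaflops.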
The Shift to Inference: Why It Matters for AI’s Future
The pivot to inference is crucial for the next phase of AI development. Google sees this as the beginning of what it terms the “age of inference,” in which AI systems will proactively gather and generate insights instead of just responding to queries. This shift follows a decade of building large AI models with a primary focus on training. Now, the industry’s attention is turning to deployment, efficiency, and reasoning.
Inference operations are central to AI’s real-world application. Unlike training, which happens once per model, inference happens billions of times daily as AI systems interact with users, making inference optimization essential. Google’s Ironwood chip is designed with this new era in mind, delivering the performance and efficiency needed to meet growing demand for AI computing power. Google has already seen a 10x year-over-year increase in AI compute demand, which Ironwood is poised to meet.
Ironwood and Google’s Ambitious AI Ecosystem
Ironwood is not just about hardware. Google’s comprehensive AI strategy includes software and networking solutions to support its chip advancements. At the same event, Google unveiled its Cloud WAN service, offering improved network performance, and announced plans to expand its software offerings for AI workloads. Google’s deep integration of hardware, software, and network infrastructure sets it apart from competitors like Microsoft and Amazon, which rely on partnerships for parts of their AI hardware stacks.
Moreover, Google’s push for multi-agent systems, including the introduction of an Agent Development Kit (ADK), showcases its long-term vision for AI. These systems will enable AI agents to collaborate seamlessly across different frameworks and platforms, breaking down silos and fostering greater interoperability in enterprise applications.
Ironwood, paired with Google’s other innovations, is positioning the company to lead the next wave of AI advancements, focusing not only on performance but also on the infrastructure that will allow AI to reach its full potential across industries.