Google today announced the general availability on Google Cloud Platform of three products built on custom silicon designed for inference and agentic workloads:
– Ironwood, Google’s seventh generation Tensor Processing Unit, will be generally available in the coming weeks. The company said it is built for large-scale model training and complex reinforcement learning, as well as high-volume, low-latency AI inference and model serving.
It offers a 10X peak performance improvement over TPU v5p and more than 4X better performance per chip for both training and inference workloads compared to TPU v6e (Trillium), “making Ironwood our most powerful and energy-efficient custom silicon to date,” the company said in an announcement blog.
– New Arm-based Axion instances. The N4A, an N series virtual machine, is now in preview. N4A offers up to 2x better price-performance than comparable current-generation x86-based VMs, Google said. The company also announced that C4A metal, its first Arm-based bare metal instance, will enter preview soon.
Google said Anthropic plans to access up to 1 million TPUs for training its Claude models.
“Our customers, from Fortune 500 companies to startups, depend on Claude for their most critical work,” said James Bradbury, head of compute at Anthropic. “As demand continues to grow exponentially, we’re increasing our compute resources as we push the boundaries of AI research and product development. Ironwood’s improvements in both inference performance and training scalability will help us scale efficiently while maintaining the speed and reliability our customers expect.”
Google said its TPUs are a key component of AI Hypercomputer, the company’s integrated supercomputing system for compute, networking, storage, and software. According to a recent IDC report cited by Google, AI Hypercomputer customers achieved on average a 353 percent three-year ROI, 28 percent lower IT costs, and 55 percent more efficient IT teams.
With TPUs, the system connects each chip to every other chip, creating a pod that allows the interconnected TPUs to work as a single unit.
“With Ironwood, we can scale up to 9,216 chips in a superpod linked with breakthrough Inter-Chip Interconnect (ICI) networking at 9.6 Tb/s,” Google said. “This massive connectivity allows thousands of chips to quickly communicate with each other and access a staggering 1.77 Petabytes of shared High Bandwidth Memory (HBM), overcoming data bottlenecks for even the most demanding models.”
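Those two pod-level figures imply a per-chip memory capacity, which is worth sanity-checking. A quick back-of-the-envelope calculation (assuming decimal units, i.e. 1 PB = 10^15 bytes and 1 GB = 10^9 bytes; the per-chip figure is derived here, not stated by Google):

```python
# Pod-level figures quoted in the announcement
chips_per_superpod = 9_216
shared_hbm_bytes = 1.77e15  # 1.77 PB of shared HBM across the superpod

# Derived: HBM capacity attributable to each chip
hbm_per_chip_gb = shared_hbm_bytes / chips_per_superpod / 1e9
print(f"~{hbm_per_chip_gb:.0f} GB of HBM per chip")  # → ~192 GB
```

So the quoted 1.77 PB of shared HBM works out to roughly 192 GB per chip across the 9,216-chip superpod.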
The N4A, now in preview, is Google’s second general-purpose Axion VM, built for microservices, containerized applications, open-source databases, batch and data analytics workloads, development environments, experimentation, data preparation, and web serving for AI applications.