NVIDIA says its first standalone Vera CPU systems have reached customers, a small delivery story that points to a larger shift in AI hardware: the bottleneck is no longer only the accelerator.
According to NVIDIA, Ian Buck, its Vice President of Hyperscale and High-Performance Computing, hand-delivered early Vera CPU systems to Anthropic, OpenAI and SpaceXAI on Friday, followed by Oracle Cloud Infrastructure on Monday. The company frames Vera as its first custom CPU built specifically for “agentic AI”: workloads in which models do not simply return text but call tools, run code, search files, manage context, orchestrate subtasks and coordinate with other systems.
That distinction matters because GPUs do not run an AI factory alone. In a modern inference or training environment, CPUs handle scheduling, sandboxing, data movement, retrieval, orchestration and the glue logic around accelerators. NVIDIA’s pitch is that agent-heavy workloads create a new CPU moment: large volumes of concurrent, latency-sensitive work that traditional server CPUs were not designed for.
The company says Vera contains 88 custom NVIDIA-designed Olympus cores, offers 1.2 TB/s of memory bandwidth and delivers 50% faster per-core performance under full load. It also says Vera is the host processor for Vera Rubin NVL72 systems, pairing with Rubin GPUs through second-generation NVLink-C2C and a unified memory architecture intended to keep accelerators fed more efficiently.
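For a rough sense of scale, the two headline figures can be combined into a per-core number. The sketch below is back-of-envelope arithmetic only: it assumes bandwidth is shared evenly across all 88 cores and that 1 TB equals 1,000 GB, neither of which NVIDIA states, and the function name is purely illustrative.

```python
# Figures quoted by NVIDIA in the announcement.
CORES = 88
MEMORY_BANDWIDTH_TB_S = 1.2


def bandwidth_per_core_gb_s(total_tb_s: float, cores: int) -> float:
    """Evenly split total memory bandwidth across cores, in GB/s.

    Assumption (not from NVIDIA): an even split, with 1 TB = 1,000 GB.
    Real per-core bandwidth depends on the memory subsystem and workload.
    """
    return total_tb_s * 1000 / cores


per_core = bandwidth_per_core_gb_s(MEMORY_BANDWIDTH_TB_S, CORES)
print(f"{per_core:.1f} GB/s per core")
```

Under those assumptions the split comes to roughly 13.6 GB/s per core, a figure useful only for orientation until independent measurements exist.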
Oracle Cloud Infrastructure supplied the most concrete deployment language in the announcement. NVIDIA quoted OCI product executive Karan Batta saying OCI plans to deploy “hundreds of thousands” of Vera CPUs beginning in 2026 for high-throughput reasoning workloads. NVIDIA also said OCI is the first cloud provider to deploy Vera at hyperscale.
The customer list is designed to signal seriousness: Anthropic and OpenAI for frontier-model demand, SpaceXAI for reinforcement-learning and simulation pipelines, and Oracle for cloud-scale enterprise deployment. But the source is still a vendor blog, and the things to watch are independent benchmarks, power efficiency in real clusters, software maturity and whether Vera systems measurably raise GPU utilisation in production.
The larger trend is clear even before those answers arrive. The AI hardware market is becoming a full-stack systems contest. GPUs remain central, but memory bandwidth, CPU orchestration, interconnects, DPUs and rack-scale design are now part of the same story.