Compute

OpenAI's first custom chip, Jalapeño, lands at 50% the cost of GPUs

OpenAI and Broadcom on June 24 unveiled Jalapeño, an inference ASIC taped out in nine months with help from OpenAI's own models, with initial gigawatt-scale deployment planned by year-end alongside Microsoft.

By Felix Aroyan · Compute & infrastructure · June 25, 2026

OpenAI and Broadcom on June 24 unveiled Jalapeño, OpenAI’s first custom inference ASIC, with Broadcom CEO Hock Tan telling Bloomberg the chip delivers roughly 50% cost savings versus typical GPUs while landing on par with Nvidia’s Blackwell line and Google’s TPUs. Tan and Charlie Kawwas, president of Broadcom Semiconductor Solutions, handed the first part to Sam Altman and Greg Brockman in a joint ceremony staged to look exactly like what it is: the moment OpenAI stopped being purely a customer of someone else’s silicon roadmap.

The joint announcement calls the nine-month sprint from initial design to manufacturing tape-out “what may be the fastest ASIC development cycle ever achieved in high-performance advanced semiconductors.” Brockman told CNBC the chip was designed end-to-end with help from OpenAI’s own models, and that “the degree to which our models have been able to accelerate it was very surprising to us.” Engineering samples are already running ML workloads, including GPT-5.3-Codex-Spark, in the lab at production frequency and power. Richard Ho, who leads OpenAI’s hardware program, says the design pushes “our most important workloads close to the hardware’s theoretical limits.”

ASICs are less flexible than Nvidia GPUs but cheaper and tunable for specific tasks, and Jalapeño is aimed squarely at inference, the ChatGPT-serving workloads where OpenAI’s bill scales with usage rather than with research ambition. Pre-training will likely still run on Nvidia hardware, per TechCrunch. OpenAI has been one of Nvidia’s largest customers since 2022, and that doesn’t change here. What changes is the marginal economics of every token served.

The deployment curve is the real story. Tan had previously projected 1.3 gigawatts of chip deployments next year, but the partnership announced last October floated a 10-gigawatt envelope, and Broadcom’s release commits to deploying racks of accelerator and networking systems from the second half of 2026 through the end of 2029. Microsoft is the named launch partner. Celestica handles board, rack, and system integration. Broadcom supplies the silicon and its Tomahawk networking fabric. Tan framed the cadence to CNBC as “small prototype development” in late 2026, ramping in 2027, “going full tilt in first half ’28.”

“By co-developing our industry-leading silicon directly with OpenAI, we are enabling the deployment of gigawatt scale data centers with Microsoft and other partners beginning in 2026,” Tan said. On Nvidia’s margins, his line to Bloomberg was shorter: “we like to think we can do better.”

The market has already priced the thesis. Broadcom shares are up 10% in 2026 and have multiplied nearly sevenfold since the end of 2022, a re-rating that tracks the same hyperscaler-ASIC pattern Google’s TPU program established a decade ago. The difference is the timeline. Google took years to make its accelerators load-bearing for its own products. OpenAI is attempting it in eighteen months, using the models the chip is meant to serve to design the chip itself.

Sources