Compute

OpenAI and Broadcom unveil Jalapeño, the lab's first custom inference chip

The ASIC, co-designed with Broadcom and built with Celestica, taped out in nine months and is showing roughly 50% cost savings versus AI GPUs, per Broadcom CEO Hock Tan. Initial deployment is targeted for late 2026.

By Felix Aroyan · Compute & infrastructure · June 25, 2026

On Wednesday, OpenAI and Broadcom unveiled Jalapeño, the lab’s first in-house ASIC for LLM inference. Broadcom CEO Hock Tan told Bloomberg’s Dina Bass that early silicon is showing roughly 50% cost savings against typical AI GPUs. The chip went from initial design to tape-out in nine months, a cycle the companies describe as possibly the fastest ever achieved in high-performance semiconductors.

The reveal was staged with the choreography these announcements now require. Tan and Charlie Kawwas, president of Broadcom Semiconductor Solutions, handed a physical sample to Sam Altman and OpenAI president Greg Brockman. The lab says engineering samples are already running GPT-5.3-Codex-Spark internally. Richard Ho, who leads OpenAI’s hardware program, describes early testing as running “close to the hardware’s theoretical limits” with performance per watt “substantially better than current state-of-the-art.” A detailed technical report is promised in the coming months.

What’s interesting isn’t the cost claim itself but the speed. Nine-month ASIC cycles aren’t normal, and Brockman gestured at why: “The degree to which our models have been able to accelerate it was very surprising to us.” That’s the recursive loop the AI maximalists have been forecasting since 2023, with the lab’s own models compressing the design of the silicon that runs the lab’s own models. Whether it generalizes beyond OpenAI’s particular workload is the open question.

Broadcom contributed the silicon implementation along with Tomahawk networking, and Celestica handled board, rack, and system integration. Tan placed Jalapeño on par with Nvidia’s Blackwell line and Google’s TPUs, which is the kind of claim that needs the promised technical report behind it before it travels.

Initial deployment is targeted for the end of 2026, a phase Tan called “small prototype development” before scale. “We will start seeing it really ramp up in ’27 and really going full tilt in first half ’28,” he told CNBC’s David Faber. The October partnership had already telegraphed a 10-gigawatt build-out, and Tan’s earlier 1.3-gigawatt projection for next year may, in his words, prove conservative. The joint release pointed to “the deployment of gigawatt scale data centers with Microsoft and other partners beginning in 2026.”

Jalapeño doesn’t displace anyone. OpenAI’s pre-training will likely keep running on Nvidia hardware, TechCrunch’s Russell Brandom notes, and the lab’s recent AWS deal includes Trainium, with separate agreements covering AMD and Cerebras, which IPO’d in May. The strategy is supplier multiplicity, not substitution, which is what every hyperscaler eventually arrives at once the bills start scaling with the models.

The market has been ahead of the news. Broadcom shares are up 10% year-to-date in 2026 and have multiplied nearly sevenfold since the end of 2022, repricing the company as the indispensable second source in a market most observers still describe as Nvidia’s alone.

Sources