OpenAI just gave its coding assistant Codex a speed boost.
On Thursday, the company introduced GPT-5.3-Codex-Spark, a lightweight version of the latest Codex model released earlier this month. Spark is built to deliver faster responses, making it better suited for real-time collaboration and rapid iteration.
What makes this launch particularly notable isn’t just the smaller model; it’s the hardware behind it.
Powered by Cerebras’ Wafer-Scale Chip
To achieve faster inference speeds, OpenAI is tapping into a dedicated chip from hardware partner Cerebras. The move signals deeper integration between OpenAI’s software models and the physical computing infrastructure they run on.
The partnership between the two companies was announced last month as part of a multi-year agreement reportedly worth over $10 billion. At the time, OpenAI said bringing Cerebras into its compute stack would help dramatically improve responsiveness. With Spark, that collaboration reaches its first major milestone.
Spark runs on Cerebras’ Wafer Scale Engine 3 (WSE-3), the company’s third-generation wafer-scale processor, packed with an eye-popping 4 trillion transistors. Cerebras’ architecture is specifically designed for ultra-low latency workloads, which aligns with Spark’s goal: near-instant responses inside the Codex app.
Built for Speed, Not Heavy Lifting
OpenAI positions Spark as a “daily productivity driver.” Instead of handling long, compute-heavy tasks that require deep reasoning (the kind the full GPT-5.3 Codex excels at), Spark focuses on:
- Rapid prototyping
- Real-time collaboration
- Quick code iteration
- Low-latency responses
In other words, it’s meant for the flow state — when developers want immediate feedback while building.
Spark is currently available in research preview for ChatGPT Pro users within the Codex app.
A Hint from Sam Altman
Ahead of the announcement, CEO Sam Altman teased the release on X (formerly Twitter), writing:
“We have a special thing launching to Codex users on the Pro plan later today. It sparks joy for me.”
The name makes sense now.
Two Modes for the Future of Codex
OpenAI says Spark represents the first step toward a dual-mode Codex experience:
- Real-time mode for fast collaboration and iteration
- Long-running mode for deeper reasoning and complex execution
The idea is to give developers the best of both worlds: instant responsiveness when speed matters, and sustained compute when complexity demands it.
Cerebras’ Growing Role in AI Infrastructure
Cerebras, founded more than a decade ago, has become increasingly prominent in the AI boom. Just last week, the company announced it raised $1 billion in new funding at a $23 billion valuation. It has also signaled plans to pursue an IPO.
Cerebras CTO and co-founder Sean Lie described the collaboration as an opportunity to rethink how developers interact with AI:
“What excites us most about GPT-5.3-Codex-Spark is partnering with OpenAI and the developer community to discover what fast inference makes possible: new interaction patterns, new use cases, and a fundamentally different model experience. This preview is just the beginning.”
If Spark delivers on its promise, it could mark a shift toward AI tools that feel less like batch processors and more like live collaborators.