The company unveiled Rubin CPX, a GPU optimized for workloads whose context exceeds 1 million tokens.
The chip is designed for “disaggregated inference,” an approach in which different GPUs handle different parts of the same inference task. This is expected to make models more efficient at video generation, programming, and other long-context workloads.
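To make the idea concrete, here is a minimal Python sketch of disaggregated inference. It assumes the common split into a context phase (reading the whole prompt) and a generation phase (emitting output token by token); that split, and the names `ContextWorker`, `GenerationWorker`, `KVCache`, and `run_disaggregated`, are illustrative assumptions and not the company's API. No real model or GPU is involved.

```python
from dataclasses import dataclass, field


@dataclass
class KVCache:
    """Stand-in for the state a model builds while reading the prompt."""
    tokens: list = field(default_factory=list)


class ContextWorker:
    """Hypothetical worker for the context phase: ingest the full prompt once."""

    def prefill(self, prompt_tokens):
        return KVCache(tokens=list(prompt_tokens))


class GenerationWorker:
    """Hypothetical worker for the generation phase: emit tokens one by one."""

    def decode(self, cache, max_new_tokens):
        output = []
        for step in range(max_new_tokens):
            # Toy rule: each token records how much context it could see.
            token = f"<tok{step}:ctx={len(cache.tokens)}>"
            cache.tokens.append(token)
            output.append(token)
        return output


def run_disaggregated(prompt_tokens, max_new_tokens):
    # In a real deployment each phase would run on a different GPU pool,
    # with the cache transferred between them; here both run in-process.
    cache = ContextWorker().prefill(prompt_tokens)
    return GenerationWorker().decode(cache, max_new_tokens)
```

Calling `run_disaggregated(["a", "b", "c"], 2)` hands a three-token cache from the first worker to the second, which then extends it by two tokens. The point of the split is that each phase can be served by hardware suited to it.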
Rubin CPX will be released in late 2026.