
I'm figuring out the best way to lower the token cost of my LLM-based news bot. In particular, I'm working on splitting the workload between different models: giving simpler tasks to lighter models and hard ones, like translations, to GPT-5.
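For the curious, here's roughly what that split looks like. This is a minimal sketch, assuming a Python bot on the official openai client; the model names, the task labels, and the run_task helper are illustrative, not my actual bot code.

```python
# Minimal routing sketch: cheap model for simple chores, GPT-5 for hard ones.
# Model names and the task-to-model mapping here are assumptions, not gospel.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

MODEL_FOR_TASK = {
    "summarize": "gpt-5-mini",  # assumed lighter-model name
    "classify": "gpt-5-mini",
    "translate": "gpt-5",       # the hard task stays on the big model
}

def run_task(task: str, text: str) -> str:
    """Send `text` to whichever model is assigned to `task`."""
    model = MODEL_FOR_TASK.get(task, "gpt-5-mini")
    response = client.chat.completions.create(
        model=model,
        messages=[
            {"role": "system", "content": f"You are a news bot. Task: {task}."},
            {"role": "user", "content": text},
        ],
    )
    return response.choices[0].message.content
```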
When GPT-5 dropped I rushed to switch to it, only to find out that the new model is incredibly token-hungry. I covered this in more detail here: #1075448.
I often run my ideas by GPT-5 in Microsoft's Copilot chatbot app, since I use the OpenAI API to power my news bot. Copilot lets you explicitly use GPT-5 and not have it fall back to lighter models whenever it "wants". And I noticed something pretty cool: you can use natural language to ask GPT-5 to pretend it is a lighter model and test how your prompts or other approaches would work if you fed them to those non-reasoning models.
In this example GPT-5 processed the text as if it were GPT-Mini and shared the results with me.
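If you'd rather run the same trick through the API instead of the Copilot chat window, it boils down to one system prompt. A rough sketch, again assuming the openai Python client; the wording of emulation_prompt is my own paraphrase of the kind of instruction I typed, not an exact quote.

```python
# Ask GPT-5 to act like a lighter, non-reasoning model so you can preview
# how a prompt would fare on the cheap tier. Prompt wording is illustrative.
from openai import OpenAI

client = OpenAI()

emulation_prompt = (
    "Pretend you are a much lighter, non-reasoning model. Process the user's "
    "text exactly as that model would: no extended reasoning, keep the output "
    "short and literal."
)

article = "..."  # the news text you want to test cheap-model behaviour on

response = client.chat.completions.create(
    model="gpt-5",
    messages=[
        {"role": "system", "content": emulation_prompt},
        {"role": "user", "content": article},
    ],
)
print(response.choices[0].message.content)
```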
Will try this new split approach tomorrow and share my further findings here on SN.