Interesting how they claim to boost accuracy over the most accurate model they use, just by mixing models?
I've been trying something similar on Roo Code, where I let Claude do the architecture and use self-hosted models for everything else. qwen3-coder isn't as good as claude-4-sonnet in coding, but it's still decent enough to let it slug it out.
I've been trying to build a special "hard-problem debug" mode, but since I haven't found a single model that can fix concurrency issues without constant manual intervention (including all of the commercial closed models), I've put that on hold. But this makes me think: if I take a model that's good at staying determined and let it guide / judge a coder model... this might work?
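Roughly the loop I'm imagining, as a sketch. Everything here (the `call_model` helper, the model names, the prompts) is a placeholder for however you wire up your backends, not any particular framework's API:

```python
def call_model(model: str, prompt: str) -> str:
    # Placeholder: wire this to your inference endpoint (OpenAI-compatible,
    # Ollama, etc.).
    raise NotImplementedError

def debug_loop(bug_report: str, max_rounds: int = 5) -> str:
    """Alternate a 'judge' model that diagnoses and reviews with a
    'coder' model that writes the actual patch."""
    context = bug_report
    patch = ""
    for _ in range(max_rounds):
        # Judge model: diagnose and propose one concrete step, no code.
        plan = call_model(
            "judge-model",
            f"Diagnose this concurrency bug and give one concrete next step:\n{context}",
        )
        # Coder model: implement exactly the step the judge proposed.
        patch = call_model(
            "coder-model",
            f"Apply this fix plan as a patch:\n{plan}\n\nContext:\n{context}",
        )
        # Judge model again: accept, or send back with a critique.
        verdict = call_model(
            "judge-model",
            f"Does this patch actually fix the race? Answer ACCEPT or critique:\n{patch}",
        )
        if verdict.strip().startswith("ACCEPT"):
            break
        context = f"{context}\n\nRejected patch:\n{patch}\n\nCritique:\n{verdict}"
    return patch
```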
I think the big thing is this: picking at random performs the same no matter the delay, while picking the fastest of 2 is much better than random and not that much worse than picking the fastest of 3. It seems similar to this strategy: https://www.tiktok.com/t/ZP8BD7XQp/
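A toy simulation of that intuition; the exponential latency distribution with mean 1.0 is just an assumption for illustration, not anything measured:

```python
import random
import statistics

# Race k requests and keep the fastest. With exponential latencies,
# E[min of k] = 1/k, so the jump from k=1 to k=2 halves the mean latency,
# while k=2 to k=3 only shaves off another sixth.
def fastest_of(k: int, trials: int = 100_000) -> float:
    return statistics.mean(
        min(random.expovariate(1.0) for _ in range(k)) for _ in range(trials)
    )

for k in (1, 2, 3):
    print(f"k={k}: mean latency ~ {fastest_of(k):.3f}")
# Expected roughly: k=1 ~ 1.000, k=2 ~ 0.500, k=3 ~ 0.333
```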
It's a good theory, but the reason I say gamble is the randomization going on, even in MoE, where it's been reduced a lot. I'm not sure how this works in gpt-5 or claude-4 though, so maybe that's worth testing too.