
Jan-v3-4B-base-instruct is a 4B-parameter model obtained via post-training distillation from a larger teacher, transferring capabilities while preserving general-purpose performance on standard benchmarks. The result is a compact, ownable base that is straightforward to fine-tune, broadly applicable, and minimizes the usual capacity–capability trade-offs.

Building on this base, Jan-Code, a code-tuned variant, will be released soon.
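
The blurb doesn't spell out the distillation recipe, but the textbook version of logit distillation (training the student to match a softened copy of the teacher's output distribution) looks roughly like this. Purely illustrative PyTorch; the temperature and toy shapes are arbitrary placeholders, not anything Jan has published:

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """Soft-label KL loss: push the student's distribution toward the teacher's."""
    t = temperature
    student_log_probs = F.log_softmax(student_logits / t, dim=-1)
    teacher_probs = F.softmax(teacher_logits / t, dim=-1)
    # The t^2 factor keeps gradient magnitudes comparable across temperatures.
    return F.kl_div(student_log_probs, teacher_probs, reduction="batchmean") * (t * t)

# Toy shapes: a batch of 4 token positions over a 32k-entry vocabulary.
student_logits = torch.randn(4, 32000, requires_grad=True)
teacher_logits = torch.randn(4, 32000)
loss = distillation_loss(student_logits, teacher_logits)
loss.backward()
```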

Last week I needed something to replace Qwen3-8B in a low-memory, local-only RAG workflow I was developing, and while A/B testing multiple alternatives I found that this had recently been released. I had been using its predecessor, Jan-v2, for MCP since mid last year (#1016343), before the second Qwen3 update made Qwen3 the better choice.

It beat other, larger models like Nemotron-3 in consistency and instruction following (areas Nvidia advertises it to be great at). Where most of the smaller open-weight models (including Qwen3, Nemotron-3, gpt-oss, Mistral 3.1 small, and so on) have a pretty awful habit of making stuff up when quoting and editing source text (which is unacceptable in zero-fail RAG), Jan-v3 shone. I can even run it on my old MacBook M1 with a 20k context window, without it needing any memory compression.
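
If you want to try a similar setup, a minimal sketch with llama-cpp-python looks something like this. The GGUF filename, quant, and exact context size are placeholders; adjust them to whatever build you pulled:

```python
# Minimal sketch: load a local GGUF of Jan-v3-4B with a ~20k context window.
from llama_cpp import Llama

llm = Llama(
    model_path="./jan-v3-4b-base-instruct-Q4_K_M.gguf",  # placeholder filename/quant
    n_ctx=20480,        # ~20k-token context window
    n_gpu_layers=-1,    # offload all layers (Metal on an M1)
)

# Ask it to answer only from retrieved context, quoting verbatim.
resp = llm.create_chat_completion(
    messages=[
        {"role": "system", "content": "Answer only from the provided context. Quote verbatim."},
        {"role": "user", "content": "Context:\n<retrieved chunks here>\n\nQuestion: ..."},
    ],
    temperature=0.0,
)
print(resp["choices"][0]["message"]["content"])
```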

Thus far, this model has been a blessing to my project, but I want to test it further. Perhaps it can serve as a pleb-claude?

Wouldn't it be awesome if "we can actually have nice things"?