Am I right in thinking this is a tiny model which has been trained well to reason, and that's it? Makes me think of a smart person who doesn't know anything about a given topic, but with the right tools will go and research the heck out of it. I really like the sound of this... why have models train on learning anything when you can just train them how to learn and let them get on with it from something as small as a Pi Zero and an internet connection?
secretslol
Looks like we are seeing small but mighty model breakthroughs, outpacing the pure capital firepower of SOTA providers. I love rooting for the little guy, but is it too soon to call it? To play devil's advocate, could it just be that the benchmarks are not good enough to capture success in real developer workflows?
rbbydotdev
There is some base level of intelligence any model needs to be useful, even in narrow tasks.
Could you teach a 5 year old to drive a car? A 10 year old? A 12 year old? To drive a car requires being able to read, to have judgement about ice or rainy conditions, to anticipate a child running after a ball. By the time a human is in their mid-teens they have acquired the base knowledge...
Small models need to have enough base knowledge to be able to be good enough -- even in a seemingly narrow regime. Where is that? Obviously they don't need all the obscure knowledge of a frontier model but there is some base level which is probably more than it would first seem.
deftio
Note that these are Python-only results; the model will not do as well with other languages.
I'm glad to see more domain-focused SLMs, we need more of them! A programming focused MoE should work well across many languages.
gslepak
The interesting thing about models this small is they should be able to be put on a single Taalas chip (the HC1 already runs a Llama 3.1 8B model). We're already at the point where half-decent reasoning could be run on an ASIC (and at mind-boggling speeds).
NotSuspicious
So I went ahead and quickly vibecoded a working harness with a barebones tool interface and some constraints on output (credit to noperator for the idea).
github: https://github.com/NickalasLight/VibeHarness.git
It's meant for a Windows machine using ollama, but I'm sure anyone who wants to mess with it can point claude code at it to convert it for their own operating system and requirements. After install you can ask it to do something with "vibe 'create me a poem about cheese in cheese.txt'". Its workspace is by default the directory you were in when you called the CLI.
nickalaso
Having some success while testing this model out as a replacement for GPT-5 nano in source code security review. Running on RTX 3090 (24 GB VRAM) via vLLM. It's not great on structured output (as noted in the model card) but I'm working around that in my harness.
noperator
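One way a harness could work around weak structured output, as noperator mentions above, is to pull the first parseable JSON object out of whatever text the model returns. This is only a minimal sketch, not noperator's actual harness: the function name `extract_json_object` and the `severity`/`summary` fields in the usage note are hypothetical, and it uses only the Python standard library.

```python
import json
from typing import Optional


def extract_json_object(text: str) -> Optional[dict]:
    """Return the first parseable JSON object found in `text`, or None.

    Hypothetical helper: small models often wrap JSON in prose or
    markdown fences, so we scan for '{' and try to decode from there.
    """
    decoder = json.JSONDecoder()
    start = text.find("{")
    while start != -1:
        try:
            # raw_decode parses one JSON value starting at `start`
            # and ignores any trailing text after it.
            obj, _end = decoder.raw_decode(text, start)
        except json.JSONDecodeError:
            obj = None
        if isinstance(obj, dict):
            return obj
        # Not valid JSON from this brace; try the next one.
        start = text.find("{", start + 1)
    return None
```

A harness could call this on each reply, for example checking that the returned dict has the keys it expects (say `severity` and `summary`) and re-prompting the model when the result is `None`.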
Really enjoying seeing these really capable SLMs.
Note that on HF they state:
"This model was not trained on tool-calling or agent-based programming data. We therefore do not recommend using it for tasks that involve function calling, API orchestration, or autonomous coding agents." - https://huggingface.co/WeiboAI/VibeThinker-3B
So we can't just hook it up to a coding harness like pi.dev or something.
mvitorino
Sounds like something that could be pretty useful as a 'validation' subagent. Provide it the details/context related to a larger LLM's run or turn in a harness and have it act as a gatekeeper. At this size and speed it looks like it could be economical to have it run every turn or even every tool call and inform the main agent about the result and success/failure.
troglodytetrain
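The 'validation' subagent idea above could look roughly like this. It is a sketch under assumptions, not a tested design: `ask_small_model` stands in for whatever function sends a prompt to the small model and returns its text reply, and the PASS/FAIL reply convention, the `Verdict` type, and `validate_turn` are all hypothetical names made up for illustration.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Verdict:
    ok: bool
    reason: str


def validate_turn(
    task: str,
    tool_call: str,
    result: str,
    ask_small_model: Callable[[str], str],
) -> Verdict:
    """Ask a small model whether one tool call moved the task forward.

    `ask_small_model` is an assumed callable (prompt in, text out);
    plug in whatever client the harness already uses.
    """
    prompt = (
        "You are checking one step of another agent's work.\n"
        f"Task: {task}\n"
        f"Tool call: {tool_call}\n"
        f"Result: {result}\n"
        "Answer PASS or FAIL on the first line, then one line of reasoning."
    )
    reply = ask_small_model(prompt).strip()
    first_line, _, rest = reply.partition("\n")
    # Fail closed: anything that does not start with PASS counts as a failure.
    ok = first_line.strip().upper().startswith("PASS")
    return Verdict(ok=ok, reason=rest.strip() or first_line.strip())
```

The main agent would then receive `Verdict.ok` and `Verdict.reason` after each turn or tool call. Treating any unexpected reply as a failure keeps a confused validator from silently approving a step.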
I tried generating the classic pelican SVG, but it failed horribly, just showing me a rectangle and a black circle...
aero2146