We've seen LLMs being able to call tools for a long time, and we've even seen protocols like MCP show up to make it more standard.
It's simple to understand how it works at the surface level, but for me the big question was: how does a large LANGUAGE model call a tool? It was obvious that it has to SAY it and somehow ask the client to do it on its behalf, but model output is very dynamic, so how do we force it to always respond in a known format? Based on this video, I now understand that models are fine-tuned to learn how to call a tool correctly in a standard form.
You may have known this already, but it was always a question for me, and after a long time I finally had enough time to look into it.
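To make it concrete, here's a rough sketch of what that standard form buys the client: the model only SAYS the call in a known tag/JSON shape, and the client parses it and runs the tool on the model's behalf. Everything here (the get_weather tool, the regex, the example model output) is made up for illustration, not taken from any particular library:

```python
import json
import re

# Hypothetical tool the client exposes; the model never runs this itself.
def get_weather(city: str) -> str:
    return f"22°C and sunny in {city}"

TOOLS = {"get_weather": get_weather}

# Example of what a tool-calling model might emit: it *says* the call
# inside <tool_call> tags, in a fixed JSON shape it was fine-tuned on.
model_output = (
    "Let me check that for you.\n"
    "<tool_call>\n"
    '{"name": "get_weather", "arguments": {"city": "Lisbon"}}\n'
    "</tool_call>"
)

# The client recognizes the known format and performs the call for the model.
match = re.search(r"<tool_call>\s*(\{.*?\})\s*</tool_call>", model_output, re.DOTALL)
if match:
    call = json.loads(match.group(1))
    result = TOOLS[call["name"]](**call["arguments"])
    # The result would then go back into the conversation as a tool message
    # and the model is asked to continue from there.
    print(result)
```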
110 sats \ 0 replies \ @optimism 6h
It basically gets handled in the chat_template (the template that composes the text presented to the tokenizer after you press enter). For Qwen3's GGUF, the tool injection looks like this:
{%- if tools %}
    {{- '<|im_start|>system\n' }}
    {%- if messages[0].role == 'system' %}
        {{- messages[0].content + '\n\n' }}
    {%- endif %}
    {{- "# Tools\n\nYou may call one or more functions to assist with the user query.\n\nYou are provided with function signatures within <tools></tools> XML tags:\n<tools>" }}
    {%- for tool in tools %}
        {{- "\n" }}
        {{- tool | tojson }}
    {%- endfor %}
    {{- "\n</tools>\n\nFor each function call, return a json object with function name and arguments within <tool_call></tool_call> XML tags:\n<tool_call>\n{\"name\": <function-name>, \"arguments\": <args-json-object>}\n</tool_call><|im_end|>\n" }}
{%- else %}
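Essentially the same Jinja template ships with the Hugging Face checkpoint, so if you want to see the injected text for yourself, a minimal sketch (assuming transformers' apply_chat_template and Qwen/Qwen3-0.6B as the example checkpoint; the get_weather schema is made up) would be:

```python
from transformers import AutoTokenizer

# JSON-schema style tool definition; this is what ends up inside <tools>...</tools>.
tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get the current weather for a city",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

messages = [{"role": "user", "content": "What's the weather in Lisbon?"}]

tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen3-0.6B")

# Render the chat template without tokenizing so the injected system text is visible.
prompt = tokenizer.apply_chat_template(
    messages,
    tools=tools,
    tokenize=False,
    add_generation_prompt=True,
)
print(prompt)
```

The printed prompt contains the # Tools section and the <tool_call> instructions exactly as the template above lays them out, so the model only has to complete text in that shape, which is what the fine-tuning teaches it to do.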