Michael Neale
Principal Engineer

Goose and Qwen3 for Local Execution

· 3 min read
Michael Neale
Principal Engineer

A couple of weeks back, Qwen3 launched in a range of sizes with a raft of capabilities. The model showed promise: even in a very compact form, at 8B parameters with 4-bit quantization, it handled tool calling with goose successfully, including multi-turn tool calling.

I haven't seen this work with such a scaled-down model before, so this is really impressive and bodes well both for this model and for future open-weight models, large and small. I would expect the larger Qwen3 models to work quite well on various tasks, but I found even this small one useful.

Finetuning Toolshim Models for Tool Calling

· 6 min read
Alice Hau
Machine Learning Engineer
Michael Neale
Principal Engineer

Our recently published Goose benchmark revealed significant performance limitations in models that do not straightforwardly support tool calling (e.g., Gemma3, Deepseek-r1, phi4). These models often fail to invoke tools at appropriate times, or they produce malformed or inconsistently formatted tool calls. With the most recent releases of Llama4 and Deepseek v3 (0324), we are again observing challenges with tool calling performance, even on these flagship open-weight models.
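To make "malformed or inconsistently formatted tool calls" concrete, here is a minimal sketch of the kind of check an agent might run on raw model output before executing anything. It is not how goose parses tool calls: the JSON shape (`name` plus `arguments`), the `parse_tool_call` helper, and the `read_file` tool are all assumptions made up for illustration.

```python
import json


def parse_tool_call(raw, known_tools):
    """Return (name, arguments) for a well-formed call, otherwise None.

    Assumed (hypothetical) wire format: {"name": str, "arguments": {...}}.
    known_tools maps each tool name to the set of argument names it requires.
    """
    try:
        payload = json.loads(raw)
    except json.JSONDecodeError:
        return None  # not JSON at all, e.g. free text or a Python-style call
    if not isinstance(payload, dict):
        return None
    name = payload.get("name")
    args = payload.get("arguments")
    if not isinstance(name, str) or name not in known_tools:
        return None  # missing or unknown tool name
    if not isinstance(args, dict):
        return None  # arguments must be a JSON object
    if not known_tools[name].issubset(args):
        return None  # a required argument is missing
    return name, args


# Hypothetical tool registry with a single tool.
tools = {"read_file": {"path"}}

good = '{"name": "read_file", "arguments": {"path": "notes.txt"}}'
bad = "read_file(path=notes.txt)"

print(parse_tool_call(good, tools))
print(parse_tool_call(bad, tools))
```

A check like this only catches formatting failures. It says nothing about the other failure mode described above, where the model does not attempt a tool call when it should.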