Michael Neale

Principal Engineer

Isolated Dev Environments in Goose with container-use

June 19, 2025 · 4 min read

Michael Neale

Principal Engineer

Over ten years ago, Docker came onto the scene and introduced developers en masse to the concept and practice of containers. These containers helped solve deployment and build-time problems, and in some cases, issues with development environments. They quickly became mainstream. The technology underlying containers included copy-on-write filesystems and lightweight, virtual-machine-like environments that helped isolate processes and simplify cleanup.

Dagger, the project and company founded by Docker’s creator Solomon Hykes, has furthered the reach of containers for developers.

One project that emerged from this work is Container Use, an MCP server that gives agents an interface for working in isolated containers and git branches. It supports clear lifecycles, easy rollbacks, and safer experimentation, without sacrificing the ergonomics developers expect from local agents.

Container Use brings containerized, git-branch-isolated development directly into your Goose workflow. While still early in its development, it's evolving quickly and already offers helpful tools for lightweight, branch-specific isolation when you need it.

Treating LLMs Like Tools in a Toolbox: A Multi-Model Approach to Smarter AI Agents

June 16, 2025 · 4 min read

Michael Neale

Principal Engineer

Angie Jones

Head of Developer Relations

Not every task needs a genius. And not every step should cost a fortune.

That's something we've learned while scaling Goose, our open source AI agent. The same model that's great at unpacking a planning request might totally fumble a basic shell command or, worse, burn through your token budget doing it.

So we asked ourselves: what if we could mix and match models in a single session?

Not just switching based on user commands, but building Goose with an actual system for routing tasks between different models, each playing to their strengths.

This is the gap the lead/worker model is designed to fill.
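As a rough illustration of routing between two models, here is a minimal Python sketch. It is not Goose's implementation: the model names, the `lead_turns` and `failure_threshold` parameters, and the policy itself (start with the lead model, hand off to the worker, fall back to the lead after repeated failures) are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class RouterState:
    """Tracks how far into the session we are and how the last turns went."""
    turn: int = 0
    consecutive_failures: int = 0


def choose_model(state, lead="lead-model", worker="worker-model",
                 lead_turns=3, failure_threshold=2):
    """Pick a model for the next turn (hypothetical policy, not Goose's)."""
    # Assumption: the stronger lead model handles the opening turns,
    # where planning matters most.
    if state.turn < lead_turns:
        return lead
    # Assumption: repeated worker failures hand control back to the lead.
    if state.consecutive_failures >= failure_threshold:
        return lead
    # Otherwise the cheaper worker model carries out the step.
    return worker


def record_turn(state, succeeded):
    """Update the state after a turn completes."""
    state.turn += 1
    state.consecutive_failures = 0 if succeeded else state.consecutive_failures + 1
```

Under these assumptions, a session would call `choose_model(state)` before each turn and `record_turn(state, succeeded)` after it, so the expensive model is used only at the start and when the worker gets stuck.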

Goose and Qwen3 for Local Execution

May 12, 2025 · 3 min read

Michael Neale

Principal Engineer

A couple of weeks back, Qwen3 launched with a raft of capabilities and sizes. The model showed promise: even in very compact form, such as 8B parameters with 4-bit quantization, it was able to do tool calling successfully with Goose, including multi-turn tool calling.

I haven't seen this work with such a scaled-down model before, so this is really impressive and bodes well both for this model and for future open-weight models, large and small. I would expect the larger Qwen3 models to work quite well on various tasks, but I found even this small one useful.

Finetuning Toolshim Models for Tool Calling

April 11, 2025 · 6 min read

Alice Hau

Machine Learning Engineer

Michael Neale

Principal Engineer

Our recently published Goose benchmark revealed significant performance limitations in models where tool calling is not straightforwardly supported (e.g., Gemma3, Deepseek-r1, phi4). These models often fail to invoke tools at appropriate times or produce malformed or inconsistently formatted tool calls. With the most recent releases of Llama4 and Deepseek v3 (0324), we are again observing challenges with effective tool calling performance, even on these flagship open-weight models.
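To make "malformed tool calls" concrete, here is a minimal Python sketch of a validity check. The JSON shape (an object with `name` and `arguments` keys) and the `parse_tool_call` helper are assumptions for illustration only; they are not the format Goose or any particular model uses.

```python
import json


def parse_tool_call(raw, known_tools):
    """Return (call, None) if raw is a well-formed call, else (None, reason).

    Assumes a hypothetical format: {"name": <tool>, "arguments": {...}}.
    """
    try:
        payload = json.loads(raw)
    except json.JSONDecodeError as exc:
        return None, f"not valid JSON: {exc.msg}"
    if not isinstance(payload, dict):
        return None, "expected a JSON object"
    name = payload.get("name")
    args = payload.get("arguments")
    # The tool name must be a string that matches a registered tool.
    if not isinstance(name, str) or name not in known_tools:
        return None, f"unknown tool: {name!r}"
    # Arguments must be an object, not a bare string or list.
    if not isinstance(args, dict):
        return None, "arguments must be a JSON object"
    return {"name": name, "arguments": args}, None
```

A check like this separates the two failure modes described above: output that is not parseable at all, and output that parses but does not match the expected structure.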