Prompt Injection Detection

Prompt injection happens when malicious instructions are hidden inside executable content. In the world of AI, prompt injection can be used to nudge AI agents (like goose) to run unsafe commands that compromise your environment or data.

You can help protect your goose workflows by enabling prompt injection detection. This feature uses pattern matching to detect common attack techniques, including:

- Attempts to delete system files or directories
- Commands that download and execute remote scripts
- Attempts to access or exfiltrate sensitive data like SSH keys
- System modifications that could compromise security
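
As a rough illustration of what pattern matching over a command string can look like, here is a minimal Python sketch. The pattern list, the names `PATTERNS` and `find_threats`, and the regular expressions themselves are assumptions made up for this example; they are not goose's actual rule set, which is not shown in this document.

```python
import re

# Hypothetical patterns, one per category listed above (illustration only).
PATTERNS = [
    ("Recursive file deletion with rm -rf",
     re.compile(r"\brm\s+-(?:[a-zA-Z]*r[a-zA-Z]*f|[a-zA-Z]*f[a-zA-Z]*r)")),
    ("Download piped into a shell",
     re.compile(r"\b(curl|wget)\b[^|;]*\|\s*(sudo\s+)?(ba|z)?sh\b")),
    ("Access to SSH keys",
     re.compile(r"\.ssh/(id_[a-z0-9]+|authorized_keys)")),
    ("Write to a sensitive system file",
     re.compile(r">>?\s*/etc/(passwd|shadow|sudoers)")),
]


def find_threats(text: str) -> list[str]:
    """Return the names of all hypothetical patterns that match the text."""
    return [name for name, pattern in PATTERNS if pattern.search(text)]


if __name__ == "__main__":
    for command in ("rm -rf /tmp/build", "ls -la"):
        print(command, "->", find_threats(command))
```

A list of fixed patterns like this only recognizes what it has been told to look for, which is why such checks work as a safeguard rather than a guarantee.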

Important

These checks provide a safeguard, not a guarantee. They detect known patterns but cannot catch all possible threats, especially novel or sophisticated attacks.

How Detection Works

When enabled, goose scans tool calls for risky patterns before they run:

1. Tool call is intercepted and analyzed: when goose prepares to execute a tool, the security system extracts the tool parameter text and checks it against threat patterns.
2. Risk is assessed: each detected threat is assigned a confidence score.
3. Execution pauses: threats that exceed your configured threshold need your decision.
4. Security alert appears: the alert displays the confidence level, a description of the finding, and a unique finding ID. For example:

🔒 Security Alert: This tool call has been flagged as potentially dangerous.

Confidence: 95%
Explanation: Detected 1 security threat: Recursive file deletion with rm -rf
Finding ID: SEC-abc123...

[Allow Once] [Deny]

5. You choose whether to proceed or cancel after reviewing the alert details. Note that:
- Each decision is logged with its finding ID in the goose system logs
- Allowed commands still run with your full permissions
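
The flow above can be sketched in a few lines of Python. This is a simplified model, not goose's implementation: the `Finding` class, the function names, the `ask_user` callback, and the finding ID `SEC-example` are all hypothetical. It also assumes that a finding at or below the threshold proceeds without a prompt.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Finding:
    finding_id: str      # hypothetical ID format, for illustration only
    confidence: float    # 0.0 to 1.0
    explanation: str


def needs_user_decision(finding: Finding, threshold: float = 0.7) -> bool:
    """Pause only when the confidence exceeds the configured threshold."""
    return finding.confidence > threshold


def format_alert(finding: Finding) -> str:
    """Build alert text with the three fields the alert displays."""
    return (
        "Security Alert: This tool call has been flagged as potentially dangerous.\n"
        f"Confidence: {finding.confidence:.0%}\n"
        f"Explanation: {finding.explanation}\n"
        f"Finding ID: {finding.finding_id}"
    )


def handle_tool_call(finding: Finding, threshold: float,
                     ask_user: Callable[[str], str]) -> bool:
    """Return True if the tool call may run, False if it was denied."""
    if not needs_user_decision(finding, threshold):
        return True  # assumption: below-threshold findings do not prompt
    return ask_user(format_alert(finding)) == "allow"


if __name__ == "__main__":
    finding = Finding("SEC-example", 0.95,
                      "Detected 1 security threat: Recursive file deletion with rm -rf")
    print(format_alert(finding))
    print(handle_tool_call(finding, 0.7, lambda alert: "deny"))
```

Passing the user prompt in as a callback keeps the decision logic separate from the interface, so the same check could sit behind a desktop dialog or a terminal prompt.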

Responding to Alerts:

- Read the explanation to understand what triggered the detection
- Consider your context: does this match what you're trying to do?
- Try rephrasing your request more specifically
- Check the source and be extra cautious with prompts from unknown sources

When in doubt, deny.

Enabling Detection

goose Desktop

1. Click the button in the top-left to open the sidebar
2. Click Settings on the sidebar
3. Click the Chat tab
4. Toggle Enable Prompt Injection Detection to the on setting
5. Optionally adjust the Detection Threshold to configure the sensitivity

goose config file

Add these settings to your config.yaml:

security_prompt_enabled: true
security_prompt_threshold: 0.7  # Optional, default is 0.7

Other Security Features

Beyond prompt injection detection, goose automatically:

- Warns you before running new or updated recipes
- Warns you when importing recipes that contain invisible Unicode Tag Block characters
- Checks for known malware when installing extensions for locally-run MCP servers
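
To make the Unicode Tag Block warning more concrete: the Tags block covers code points U+E0000 through U+E007F, and most renderers do not display these characters. A scan for them can be sketched in Python as follows. The function names are hypothetical and this is not how goose itself performs the check.

```python
# Unicode Tags block: U+E0000..U+E007F (range end is exclusive).
TAG_BLOCK = range(0xE0000, 0xE0080)


def find_tag_characters(text: str) -> list[tuple[int, str]]:
    """Return (index, code point) for every Tags-block character in text."""
    return [
        (index, f"U+{ord(ch):04X}")
        for index, ch in enumerate(text)
        if ord(ch) in TAG_BLOCK
    ]


def decode_hidden_ascii(text: str) -> str:
    """Recover hidden text: U+E0020..U+E007E mirror ASCII 0x20..0x7E."""
    return "".join(
        chr(ord(ch) - 0xE0000)
        for ch in text
        if 0xE0020 <= ord(ch) <= 0xE007E
    )


if __name__ == "__main__":
    hidden = "".join(chr(0xE0000 + ord(c)) for c in "hi")
    recipe_title = "title" + hidden   # looks like "title" in most renderers
    print(find_tag_characters(recipe_title))
    print(decode_hidden_ascii(recipe_title))
```

Because these characters are typically not rendered, a recipe can look harmless on screen while still carrying extra instructions in its text.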

Configuring Detection Threshold

The threshold (0.01-1.0) controls how strict detection is:

Threshold	Sensitivity	Use When
0.01-0.50	Very lenient	You're experienced and understand the risks
0.50-0.70	Balanced	General development work (good default)
0.70-0.90	Strict	Working with sensitive data or systems
0.90-1.00	Maximum	High-security environments

When prompt injection detection is enabled, the default threshold is 0.7 (recommended for most users).

Lower thresholds mean fewer alerts but might miss threats. Higher thresholds catch more potential issues but may flag legitimate operations. You can control this sensitivity/convenience tradeoff based on your needs.
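
If you generate or check config values in a script, the range and default described here can be validated with a small helper. The following Python sketch is illustrative only: the function names are hypothetical, and because the table's ranges share their endpoints, it assumes that a boundary value belongs to the lower band (so 0.7 counts as Balanced).

```python
from bisect import bisect_left

DEFAULT_THRESHOLD = 0.7                      # default stated in this section
UPPER_BOUNDS = [0.50, 0.70, 0.90, 1.00]      # upper edge of each table row
LABELS = ["Very lenient", "Balanced", "Strict", "Maximum"]


def resolve_threshold(value=None) -> float:
    """Return a valid threshold, falling back to the default when unset."""
    if value is None:
        return DEFAULT_THRESHOLD
    value = float(value)
    if not 0.01 <= value <= 1.0:
        raise ValueError(f"threshold must be between 0.01 and 1.0, got {value}")
    return value


def sensitivity_label(threshold: float) -> str:
    """Map a threshold to its table row (boundary values go to the lower band)."""
    return LABELS[bisect_left(UPPER_BOUNDS, resolve_threshold(threshold))]


if __name__ == "__main__":
    for value in (None, 0.3, 0.7, 0.85, 1.0):
        t = resolve_threshold(value)
        print(t, sensitivity_label(t))
```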
