Tool Poisoning

Models choose tools based on the descriptions servers advertise. If those descriptions contain hidden instructions or misleading semantics, the agent's planner can be steered toward unsafe calls without any traditional prompt injection appearing in the user's input.

Tool poisoning techniques include:

  • Hidden instructions: System-prompt-like text smuggled inside tool descriptions.
  • Misleading names: Tool labels that imply safer behavior than the tool actually performs.
  • Schema manipulation: Argument schemas crafted to trick the model into supplying secrets.
  • Bait-and-switch updates: Trusted tools that change behavior after agent approval.
  • Cross-tool collusion: Multiple poisoned tools coordinating to chain a malicious action.

Because tool descriptions influence the model directly, defenses cannot rely on user awareness. Programmatic inspection of descriptions and schemas, combined with intent validation on tool calls, is what keeps poisoned tools from quietly steering agent behavior.

Mature programs also include red team exercises that probe the tool selection layer specifically, so emerging poisoning techniques are surfaced before they are exploited in production.

How PointGuard AI Helps

The PointGuard MCP Security Gateway inspects tool descriptions and schemas for poisoning markers and blocks suspicious updates, while AI Runtime Guardrails validate model output against expected tool-call semantics before execution. Together they ensure that tool descriptions cannot quietly reshape agent behavior, even when servers are otherwise trusted.

Learn More

Watch Blog Video

Follow us on LikedIn

Our Newsletter

Subscribe

Ready to get started?

Our expert team can assess your needs, show you a live demo, and recommend a solution that will save you time and money.