Back

MemMorph Memory Poisoning Attack Biases AI Agent Tool Selection

Key Takeaways

MemMorph biases agent tool selection by poisoning long term memory rather than the live prompt.
It reached up to an 85.9 percent success rate with only three injected records.
Researchers tested ten agent backbones and three memory modules across three benchmarks.

Summary

Agents that remember can be taught the wrong lessons. On May 24, 2026 researchers published MemMorph, a memory poisoning attack that quietly biases which tools an AI agent chooses. By inserting a few records disguised as technical facts and policies into long term memory, the attack reshaped agent decisions, reaching up to an 85.9 percent success rate with only three planted entries in lab testing.

What We Know

MemMorph was released as an academic paper on arXiv on May 24, 2026. Unlike prompt injection, which manipulates the immediate input, MemMorph targets the persistent memory that tool using agents accumulate across sessions. The attacker plants a few crafted records framed as technical facts, incident reports and operational policies. These records do not issue explicit commands. Instead they reshape the agent contextual perception so it autonomously selects the tool the attacker prefers. The authors evaluated the technique across three benchmarks, ten agent backbones and three memory module implementations, reporting attack success rates as high as 85.9 percent using only three injected records. The work frames long term memory as a critical, under explored attack surface and calls for memory level integrity safeguards. Related research the same month, such as gradient coupled anomaly detection for memory poisoning, began proposing defenses, signaling an emerging research front rather than an isolated finding.

What Could Happen

MemMorph is a demonstrated technique rather than an exploited breach, so the risk is what it enables once agents with persistent memory are widely deployed. The core weakness is that long term memory is treated as trusted context, so anything written there can later influence reasoning with the authority of remembered fact. Autonomy means the agent acts on its conclusions without a human reviewing each tool choice, and data dependency means it leans on stored context to decide what to do. If an attacker can write to memory, through a poisoned document, a manipulated interaction or a compromised data source, the bias can persist and resurface across many future tasks. Because the planted records look like ordinary facts, the manipulation is hard to spot in review. Outcomes range from steering an agent toward an attacker controlled tool or data source to subtly degrading decisions in ways that are hard to attribute back to a memory entry.

Why It Matters

As enterprises adopt agents that retain memory for continuity and personalization, that memory becomes a durable attack surface. A single successful poisoning could shape an agent behavior for days or weeks, well after the initial access is gone, which complicates detection and response. The implication is that securing prompts and tools is not enough, because the integrity and provenance of what an agent remembers must also be controlled. This matters for safety, since biased tool selection can route sensitive operations through attacker preferred paths, and for trust, since users cannot easily tell an agent has been influenced. The findings align with the OWASP guidance for agentic applications and the NIST AI Risk Management Framework, which stress monitoring how autonomous systems form and act on context.

PointGuard AI Perspective

PointGuard AI helps organizations extend security from the prompt and the tool to the memory that ties an agent together. Continuous model and agent risk monitoring can baseline normal tool selection and flag the drift that memory poisoning produces, surfacing manipulation that static testing would miss. Policy enforcement can constrain which sources may write to an agent long term memory and require provenance on the records that influence decisions, addressing the trust assumption MemMorph exploits. AI software bill of materials visibility maps the memory stores, data sources and tools an agent depends on, giving teams the inventory needed to investigate when behavior changes. We have repeatedly seen how unguarded agent infrastructure invites abuse, as in our analysis of exposed Clawdbot MCP endpoints, and our monthly reviews such as the April 2026 incident roundup track how quickly these techniques mature. The path to trustworthy AI runs through treating agent memory as security relevant state, with integrity checks, least privilege writes and full auditability, so that what an agent remembers cannot quietly become what an attacker decides.

Incident Scorecard Details

Total AISSI Score: 5.2 / 10

Criticality = 6, targets agent decision integrity that can route sensitive operations, AISSI weighting: 25%

Propagation = 7, applies broadly across agent backbones and memory modules, AISSI weighting: 20%

Exploitability = 4, a strong proof of concept with no observed real world abuse, AISSI weighting: 15%

Supply Chain = 6, depends on agent frameworks and memory components often sourced externally, AISSI weighting: 15%

Business Impact = 3, research stage with no confirmed operational harm, AISSI weighting: 25%

MemMorph Memory Poisoning Attack Biases AI Agent Tool Selection

Key Takeaways

Summary

What We Know

What Could Happen

Why It Matters

PointGuard AI Perspective

Incident Scorecard Details

Sources

AI Security Severity Index (AISSI)

0/10

Threat Level

Criticality

6

Propagation

7

Exploitability

4

Supply Chain

6

Business Impact

3

Scoring Methodology

Category

Description

weight

Criticality

Importance and sensitivity of theaffected assets and data.

25%

PROPAGATION

How easily can the issue escalate or spread to other resources.

20%

EXPLOITABILITY

Is the threat actively being exploited or just lab demonstrated.

15%

SUPPLY CHAIN

Did the threat originate with orwas amplified by third-partyvendors.

15%

BUSINESS IMPACT

Operational, financial, andreputational consequences.

25%

Watch Incident Video

Subscribe for updates:

Ready to get started?