Semantic Kernel Lets a Prompt Open a Shell (CVE-2026-25592, CVE-2026-26030)

Key Takeaways

  • Microsoft disclosed two critical Semantic Kernel vulnerabilities on May 7, 2026.
  • CVE-2026-26030 (CVSS 9.8) routes attacker-controlled vector store fields into a Python eval() call.
  • CVE-2026-25592 exposes a host-side file download method as a callable kernel function.
  • One retrieved document is enough to launch a process on the host running the agent.
  • Patches shipped in semantic-kernel 1.39.4 (Python) and 1.71.0 (.NET).

Summary

Microsoft disclosed two critical remote code execution flaws in the Semantic Kernel agent framework. Both escalate indirect prompt injection from a content problem into a host compromise, as documented in the Microsoft Security Blog research post. The flaws affect agents that use the default in-memory vector store or the Azure Container Apps Python plugin, and patched versions shipped on the day of disclosure.

What We Know

On May 7, 2026, the Microsoft Security Response Center published advisories for CVE-2026-26030 and CVE-2026-25592 alongside a companion research post titled "When prompts become shells." Semantic Kernel is one of the most widely used agent frameworks in the Microsoft and Azure ecosystem, shipped by the same team behind Copilot Studio.

CVE-2026-26030 affects the Python SDK below version 1.39.4 and resides in the InMemoryVectorStore component, where the default filter expression is built as a Python lambda and executed via eval(). The independent technical writeup by Particula walks through the lambda construction step by step and shows how a single field can break out of the expression.
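A quick way to check a given environment against that version boundary is sketched below. It assumes the PyPI distribution name is semantic-kernel and that versions follow a MAJOR.MINOR.PATCH scheme; the helper names are illustrative and are not part of any Microsoft tooling.

```python
import re
from importlib import metadata

# First fixed Python release, as stated in the advisory summary above.
PATCHED = (1, 39, 4)


def parse_version(text: str) -> tuple[int, int, int]:
    """Read the leading MAJOR.MINOR.PATCH numbers.

    Assumption: suffixes such as 'rc1' are ignored, so a pre-release of a
    fixed version is treated the same as the final release.
    """
    match = re.match(r"(\d+)\.(\d+)\.(\d+)", text)
    if match is None:
        raise ValueError(f"unrecognized version string: {text!r}")
    major, minor, patch = (int(part) for part in match.groups())
    return (major, minor, patch)


def is_patched(text: str) -> bool:
    """True when the version is at or above the first fixed release."""
    return parse_version(text) >= PATCHED


def installed_status(package: str = "semantic-kernel") -> str:
    """Describe the installed package, if any, relative to the fix."""
    try:
        found = metadata.version(package)
    except metadata.PackageNotFoundError:
        return f"{package} is not installed"
    state = "patched" if is_patched(found) else "VULNERABLE"
    return f"{package} {found}: {state}"


if __name__ == "__main__":
    print(installed_status())
```

Comparing integer tuples rather than raw strings matters here, because a string comparison would rank "1.9.0" above "1.39.4".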

The .NET counterpart, CVE-2026-25592, affects releases prior to 1.71.0 and lives in the SessionsPythonPlugin used to run model-generated code inside Azure Container Apps dynamic sessions. Microsoft credits its internal AI red teams with reproducible exploit code that turned a single retrieved document into a process launch on the host.

What Happened

Both flaws stem from the same architectural mistake: model-routed input is trusted so deeply that it reaches a code-evaluation primitive. In the Python SDK, the city field of a vector store filter is interpolated into a Python lambda and passed to eval(), so an attacker who controls that field can break out of the string literal and run arbitrary code in the agent process.
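The breakout is easier to see in a small reproduction. The sketch below is a hypothetical reconstruction of the pattern, not Semantic Kernel's actual code: the function names, the record layout, and the payload are assumptions made for illustration, and the payload only reads the process ID where a real attacker would launch a process.

```python
import os


def build_filter_unsafe(city: str):
    # Vulnerable pattern: the field value becomes part of Python source text.
    source = f"lambda record: record['city'] == '{city}'"
    return eval(source)


def build_filter_safe(city: str):
    # The value is captured by a closure and only ever compared as data.
    return lambda record: record["city"] == city


record = {"city": "Seattle"}

# Harmless stand-in payload: it closes the quoted string, then calls
# os.getpid() through __import__, then reopens a string so the source
# still parses.
payload = "' or __import__('os').getpid() or '"

unsafe_result = build_filter_unsafe(payload)(record)
safe_result = build_filter_safe(payload)(record)

# The unsafe filter returns the process ID, which shows the injected call ran.
print("injected call executed:", unsafe_result == os.getpid())
# The safe filter compares the payload as an ordinary string and finds no match.
print("safe filter matched:", safe_result)
```

The closure version needs no escaping logic at all, because the attacker-controlled value is never parsed as source code.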

In the .NET SDK, the SessionsPythonPlugin exposed a sandbox-to-host file download method as a kernel function that the model could choose to call. A prompt injection routed through the planner can select that tool and write attacker-chosen files outside the Azure Container Apps sandbox.
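One common defense for a host-side write helper is to confine every destination to a fixed directory. The sketch below illustrates that idea in Python under stated assumptions: confined_path and save_download are hypothetical names, and nothing here reflects how Microsoft's patch or the SessionsPythonPlugin is implemented.

```python
import tempfile
from pathlib import Path


def confined_path(base_dir: str, requested: str) -> Path:
    """Resolve `requested` under `base_dir`, or raise if it lands elsewhere."""
    base = Path(base_dir).resolve()
    # resolve() collapses ".." segments and follows symlinks before the check.
    target = (base / requested).resolve()
    if not target.is_relative_to(base):
        raise PermissionError(f"{requested!r} escapes {base}")
    return target


def save_download(base_dir: str, requested: str, data: bytes) -> Path:
    """Write `data` only if the destination stays inside `base_dir`."""
    target = confined_path(base_dir, requested)
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_bytes(data)
    return target


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as downloads:
        print("saved:", save_download(downloads, "reports/summary.txt", b"ok"))
        for attempt in ("../escape.txt", "../../etc/passwd"):
            try:
                save_download(downloads, attempt, b"x")
            except PermissionError as error:
                print("blocked:", error)
```

Path confinement limits where a write can land. It does not address the separate question of whether the model should be able to call the helper at all.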

The AI-specific failure pattern is that the boundary between retrieved content and executable code dissolves when retrieval feeds tool arguments and tool arguments feed an interpreter. Traditional input sanitization sits outside the agent's runtime, so it never sees the payload. The NVD entry for CVE-2026-26030 confirms a CVSS 9.8 rating for the eval() path.

Why It Matters

Semantic Kernel powers production agents inside Microsoft 365 Copilot environments, enterprise RAG applications on Azure, and a long tail of internal automation across regulated industries. Any organization that built on the default vector store or the Container Apps plugin and did not promptly upgrade is exposed, since the prerequisite is only an attacker-influenced field reaching the index.

Affected data ranges from internal documentation and RAG corpora to system credentials accessible from the agent process, with the worst case being lateral movement from the agent host into Azure tenant resources. The blast radius for a single compromised retrieval source can include every downstream agent that queries it.

Regulators are watching this exact pattern. The EU AI Act high-risk obligations and the NIST AI Risk Management Framework both anchor on input integrity and traceable tool-call governance, neither of which a prompt-injection-to-RCE path preserves.

PointGuard AI Perspective

The Semantic Kernel disclosures are precisely the threat model that the PointGuard Agent Governance Mesh is built to neutralize. The Mesh sits between agent intent and action and intercepts every tool call at sub-millisecond latency, so a model that has been steered toward eval() or an unsafe file-write helper never reaches the host primitive.

Where retrieval feeds the model, PointGuard AI Runtime Guardrails inspect content for known prompt-injection markers and block templates that pull tool calls into unsafe code paths. Policy decisions stay enforceable across both Python and .NET runtimes without requiring code changes in the agent itself.

For Microsoft 365 Copilot and Copilot Studio environments, the MCP Security Gateway brokers every tool call with per-agent identity and tool-level least privilege, so even a successful injection cannot reach plugins the agent was never authorized to use. The forward-looking lesson is straightforward. Agent frameworks are now application servers in their own right, and applying the same vulnerability management and runtime defense discipline as the rest of the stack is the price of admission for trustworthy AI.

Incident Scorecard

Total AISSI Score: 7.1/10

Criticality: 8/10. Semantic Kernel runs inside Microsoft 365 Copilot, Azure agent stacks, and enterprise RAG; reachable assets are sensitive. AISSI weighting: 25%.

Propagation: 8/10. Single SDK family across Python and .NET reused by the default vector store and Container Apps plugin patterns. AISSI weighting: 20%.

Exploitability: 6/10. Reproducible proof-of-concept published by Microsoft; no widespread in-the-wild campaigns confirmed at disclosure. AISSI weighting: 15%.

Supply Chain: 7/10. Microsoft-published agent SDK used as upstream by a large universe of downstream agents. AISSI weighting: 15%.

Business Impact: 6/10. Credible potential for harm in production deployments; no widely reported customer breaches confirmed yet. AISSI weighting: 25%.
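The total can be reproduced from the five category scores and their weights. The sketch below assumes the total is a plain weighted sum rounded half up to one decimal place; the scorecard does not state the rounding rule, so treat that part as an assumption.

```python
from decimal import Decimal, ROUND_HALF_UP

# (score out of 10, weight in percent), copied from the scorecard above.
SCORECARD = {
    "Criticality": (8, 25),
    "Propagation": (8, 20),
    "Exploitability": (6, 15),
    "Supply Chain": (7, 15),
    "Business Impact": (6, 25),
}


def weighted_total(scorecard: dict) -> Decimal:
    """Weighted sum of the category scores, computed exactly."""
    total_weight = sum(weight for _, weight in scorecard.values())
    if total_weight != 100:
        raise ValueError(f"weights sum to {total_weight}, expected 100")
    # Integer arithmetic avoids binary floating-point error on values like 7.05.
    points = sum(score * weight for score, weight in scorecard.values())
    return Decimal(points) / Decimal(100)


def rounded(total: Decimal) -> Decimal:
    """Assumed rule: round half up to one decimal place."""
    return total.quantize(Decimal("0.1"), rounding=ROUND_HALF_UP)


total = weighted_total(SCORECARD)
print("weighted sum:", total)
print("published form:", rounded(total))
```

The exact weighted sum is 7.05, which matches the published 7.1 only under half-up rounding.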

AI Security Severity Index (AISSI)

Scoring Methodology

Criticality (weight 25%): Importance and sensitivity of the affected assets and data.

Propagation (weight 20%): How easily the issue can escalate or spread to other resources.

Exploitability (weight 15%): Whether the threat is being actively exploited or has only been demonstrated in a lab.

Supply Chain (weight 15%): Whether the threat originated with, or was amplified by, third-party vendors.

Business Impact (weight 25%): Operational, financial, and reputational consequences.
