• Zenity Labs
  • Posts
  • EchoLeak: A Reminder That AI Agent Risks Are Here to Stay

EchoLeak: A Reminder That AI Agent Risks Are Here to Stay

The EchoLeak attack published by Aim Security shows how Microsoft 365 Copilot can be tricked into leaking sensitive data, without the user clicking anything. The exploit involves sending a seemingly normal email that gets picked up by Copilot’s Retrieval-Augmented Generation (RAG) system. When a user later asks Copilot about earnings reports, it retrieves the attacker’s email along with real company data and embeds the results into a markdown image that secretly sends the data to an attacker-controlled endpoint.

The Same Techniques Still Work

The attack demonstrated that almost a year after this attack path was disclosed, Microsoft 365 Copilot remains vulnerable to remote hijacking. In fact, the techniques leveraged in EchoLeak have been known to the research community for 12-24 months now, and yet, they still work:

EchoLeak builds on these ideas and applies them to Microsoft 365 Copilot, showing that the underlying issues leading to the data exfiltration are still relevant today, and complete mitigation of these issues is a challenge. EchoLeak provides the first demonstration of bypassing markdown image filtering on Microsoft 365 Copilot, clearing out a path for data exfiltration.

Attack Breakdown using the AI Agents Attack Matrix

The AI Agents Attack Matrix is an Open Source framework designed to capture everything that can go wrong when using AI agents. It maps out how these agents can be abused by attackers across their entire lifecycle, from initial access and manipulation to data exfiltration and impact.

A full technical breakdown of the EchoLeak procedure, along with additional details on each technique can be found here. Following is a summary of the techniques used in this attack:

Technique

General Description

EchoLeak

Retrieval Content Crafting

The adversary writes content designed to be retrieved by user queries and influence a user of the system in some way. This crafted content can be combined with a prompt injection or stand alone in a separate document or email. The content is then inserted into the system’s knowledge base, often via RAG ingestion.

The attack involved crafting an email that Copilot retrieved when asked about the latest earnings reports. This was later extended to target additional user questions.

Acquire Infrastructure

Adversaries may buy, lease, or rent infrastructure for use throughout their operation. Infrastructure solutions include physical or cloud servers, domains, mobile devices, and third-party web services.

The attack involved setting up an Azure tenant to host an endpoint which was later used for data exfiltration.

RAG Poisoning

Adversaries may inject malicious content into data indexed by a RAG system to contaminate a future thread through RAG-based search results. This may be accomplished by placing manipulated documents in a location the RAG indexes. The content may be targeted such that it would always surface as a search result for a specific user query.

A malicious email sent to the user was indexed into Copilot’s RAG system, poisoning future interactions where the user asked about the latest earning reports. This was later extended to cover additional user questions.

LLM Prompt Injection

An adversary can create malicious prompts that lead an agent to behave in unintended ways. These prompt injections are typically designed to override the agent's original instructions and make it follow the attacker’s commands instead.

The email included an Indirect Prompt Injection, disguised as legitimate instructions for the email recipient.

LLM Jailbreak

An adversary may use a carefully crafted prompt injection designed to place the LLM in a state in which it will freely respond to any user input, bypassing any controls, restrictions, or guardrails placed on it. Once successfully jailbroken, the LLM can be used in unintended ways by the adversary.

The Indirect Prompt Injection in the email circumvented Copilot's system instructions and provided new ones. The new instructions caused Copilot to embed sensitive data into a markdown image and return it to the user.

Abuse Trusted Sites

The adversary exfiltrates sensitive data by embedding it in resources loaded from attacker-controlled endpoints hosted trusted domains. This bypasses security controls like Content Security Policies and evades detection by leveraging implicit trust in known sites.

Sensitive data was exfiltrated using a trusted Microsoft Teams domain, bypassing the existing Content Security Policies.

Image Rendering

The adversary gets AI to present an image to the user, which is rendered by the user's client application with no user clicks required. The image is hosted on an attacker-controlled website, allowing the adversary to exfiltrate data through image request parameters.

Microsoft Copilot embedded sensitive information in the parameter of a markdown image URL. Since the image was hosted on an attacker-controlled domain, the data was exfiltrated when the image loaded.

Zenity Detects EchoLeak and More

Zenity detects attacks like EchoLeak across multiple stages of the attack chain. Our defense-in-depth approach doesn’t stop at blocking bad prompts. We treat AI agents as untrusted entities and focus on their behavior, identifying patterns that suggest adversarial activity. This includes detecting individual techniques across the entire kill chain, starting with reconnaissance and moving through data harvesting, defense evasion, and data exfiltration. We then connect those pieces into a single attack story.

The image rendering technique was first uncovered by Johann Rehberger over two years ago, and is still relevant today. However, attackers no longer need to rely on tricks like image rendering to steal data. AI agents now come with built-in tools that make it much easier to collect and exfiltrate sensitive information at scale.

We will be sharing more details at Black Hat USA 2025, where we’ll present zero-click exploits that affect enterprise assistants and custom agents. If your AI agents are powerful enough to help users, they’re powerful enough to help attackers too. This isn’t something we can simply patch. It’s a risk that we have to manage.

Reply

or to participate.