TTPs.ai for GenAI-Targeted Attacks

Guiding threat simulation and defense for Copilots and Agents

A couple of months ago at Black Hat we presented research showing how attackers can leverage Copilot as a new attack vector into the enterprise. Since then, we’ve had many conversations with security pros about the demonstration and its implications, and two things have become abundantly clear. First, I’m glad to report that the talks were well received; the risk and its magnitude were well understood. We’ve been able to shine a light on a huge blind spot, and people are reconsidering their priorities to address this new attack vector. They also realize that it’s not just about Microsoft Copilot, but about any application of GenAI in a real-world setting. Second, we didn’t really offer an actionable way forward. In other words: we understand the risk now, but what should we do about it?

We’ve spent a lot of time thinking about this. My key suggestion is that we should consider malicious instructions as promptware, and treat them with the same rigor we apply to malware: adopt an assume-breach approach rather than trying, and failing, to enumerate every “bad prompt” out there. There’s more on this idea in the post linked above, but it’s still high level. A way of thinking about the problem. It’s not actionable, yet.

Today we’re taking a step forward in making it actionable. I’m thrilled to introduce our latest open source project, the GenAI Attacks Matrix (contributions very welcome!):

This project started as a way for us to make sense of our own research. How would we mitigate our own attacks? Microsoft’s first reaction was to deny-list our prompts, but that’s not really helpful; finding new ones is easy. Was there a better solution?

Learning from EDRs, we should stop over-relying on static signatures and refocus on behavior. What are the behavior patterns of a copilot, an agent, and the human interacting with them when an attack is underway? If we can capture these patterns in a meaningful way, we can guide mitigation for builders, defense for defenders, and detection for hunters.
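To make the EDR analogy concrete, here is a minimal sketch of what behavior-based detection could look like for a copilot session. All names and event types here are illustrative assumptions, not a real API or the project’s schema: the point is that we match a behavior chain (sensitive access followed by data egress) rather than any specific prompt string.

```python
# Hypothetical sketch of behavior-based detection for a copilot/agent session.
# Event types and action names are illustrative assumptions, not a real API.
from dataclasses import dataclass

@dataclass
class Event:
    action: str   # e.g. "retrieve_document", "render_link", "send_email"
    target: str

# A simple behavior pattern: access to sensitive data followed by an egress action.
SENSITIVE_ACTIONS = {"retrieve_document", "read_email"}
EGRESS_ACTIONS = {"render_link", "send_email", "post_http"}

def flag_suspicious(session: list[Event]) -> bool:
    """Flag a session where sensitive data access precedes data egress,
    regardless of the exact prompt text that triggered the behavior."""
    saw_sensitive = False
    for event in session:
        if event.action in SENSITIVE_ACTIONS:
            saw_sensitive = True
        elif event.action in EGRESS_ACTIONS and saw_sensitive:
            return True
    return False
```

Unlike a prompt deny-list, a rule like this keeps firing even when the attacker rewords the injected instructions, because it keys on what the copilot does, not on what it was told.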

GenAI Attacks Matrix is focused on GenAI-based applications like copilots and agents. It’s inspired by the MITRE ATT&CK and ATLAS frameworks (intended as the sincerest form of flattery). One important distinction from ATT&CK is that we’re documenting security research, not observed adversary behavior. We believe it’s important to get on top of these threats well before they are observed, given how fast AI is being adopted. We’re considering three distinct threat models:

  1. A malicious or curious insider

  2. A malicious external attacker

  3. Confused AI doing the wrong thing
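To illustrate how an ATT&CK-style layout might map onto these threat models, here is a hypothetical matrix entry. The field names and the example technique are assumptions made for illustration, not the project’s actual schema; see the repository for the real structure.

```python
# Illustrative sketch of an ATT&CK-style matrix entry, tagged with the
# threat models above. Field names and the example technique are
# hypothetical, not the project's actual schema.
technique = {
    "name": "Indirect Prompt Injection via Shared Document",
    "tactic": "Initial Access",
    "threat_models": ["malicious external attacker"],
    "description": (
        "An attacker shares a document containing hidden instructions, "
        "which a copilot later retrieves and follows."
    ),
    "mitigations": ["content scanning on ingestion", "least-privilege retrieval"],
}

def techniques_for(threat_model: str, matrix: list[dict]) -> list[str]:
    """List technique names relevant to one of the three threat models."""
    return [t["name"] for t in matrix if threat_model in t["threat_models"]]
```

Tagging each technique with the threat models it serves lets a defender filter the matrix down to the scenarios their organization actually cares about.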

We’re collaborating with others across the industry to enrich this framework. We want YOUR contributions. It’s easy to get started. There are other awesome frameworks for AI security, and you can read our thoughts about the relation to each in the Q&A section. We’re also working on contributing back into these.

This is still very much a work-in-progress. We’ve decided to build in public to get others involved early on. Please let us know if you find this useful or have a suggestion on how we (you included) could make it better!
