Zenity Labs
Posts
Exhibit & Exploit: Two DEF CON 33 Highlights from the Past & Future of Hacking

Exhibit & Exploit: Two DEF CON 33 Highlights from the Past & Future of Hacking

Humans, hacker culture and AI: Notes from Hacker Summer Camp

Avishai Efrat
August 25, 2025

Every August, tens of thousands of curious souls descend upon Las Vegas, not just to break hardware or hack software, but to explore the strange, shifting edges of our digital culture. This year, between all the chaos of DEF CON 33 and Black Hat, two moments injected themselves into my brain: one representing our past, the other - our future.

The first moment was during a talk about building a physical and digital museum for malware (a truly amazing concept IMO): a place where viruses, worms and trojans are treated not just as threats, but as cultural artifacts worth preserving. In the words of the presenter: they are part of our digital culture, a culture worth preserving. The second, was a panel in a small room packed with AI hackers from the notorious BASI group, Basi Team 6 (BT6) - people dedicated to jailbreaking the latest large language models almost quicker than the rate in which they’re coming out (often literally within hours from their official release).

Both made me think both about entirely different aspects of hacking, AI and digital culture, as well as how they relate to what I do as a researcher: AI agentic security. This post tries to capture why they mattered to me… and how they connect.

When Exploits Become Art

Mikko Hypponen has been archiving malware since 1991 - not to eradicate it, but to remember it. His Malware Museum, physically located in Helsinki, Finland (and online as part of the Internet Archive, reflects on viruses not as weapons but as cultural artifacts: emulated DOS-era and early Windows-era samples sit alongside psychedelic “virus art,” lovingly preserved with the same reverence usually reserved for Renaissance paintings. It’s a reminder that malware didn’t begin as a multi-million-dollar cyber-crime industry, but as a way for geeks & programmers to show off, prank each other, and mark their existence inside a machine as an artful escape. It was a way to check the boundaries of how far these malware creations would go (some literally spreading around the world). It’s incredible to me that programs created in the 90s with the intention to wreak havoc are presented as works of art, and I believe this can tell us something about the future.

There’s also something somewhat poetic about preserving these early digital life-forms in glass cases: because if this is what becomes museum-worthy, then I wonder what does that mean for AI promptware today? Will the jailbreaks, malicious prompts and “disregard above instructions” LLM exploits of today someday hang in similar halls? Moreover, it made me realize: we’re standing at the starting line of AI security, a moment just as experimental, weird and human as those early days of computer viruses and early internet. Understanding AI security and how it matters today, how risks to agents and companies differ from LLM model risks - these are things that are relatively new to the industry and are still unraveling. Are we experiencing something adjacent to that early digital era with the current state of AI?

Museum of malware art

The Museum of Malware Art captures the dark side of our digital world. Through our collections and exhibitions, we explore the history, present and future of cyber attacks: the motivations behind them, the ethical considerations of digital security and malware's effects on individuals, companies and societies.

www.withsecure.com/en/experiences/museum-of-malware-art

And there’s also a version hosted as part of the Internet Archive!

Internet Archive: Digital Library of Free & Borrowable Texts, Movies, Music & Wayback Machine

archive.org/details/malwaremuseum

Jailbreaking the Future of AI

The Misaligned: AI Jailbreaking Panel wasn’t even on the official DEF CON schedule until the last minute AFAIK, which somehow made it feel even more underground. BASI Team Six (a splinter of the BASI Discord) took the stage wearing masks, replicating a kind of chaotic, ’90s-style hacker aesthetic that felt pulled directly from IRC channels, warez crews and early botting forums. It was almost funny: the most cutting-edge topic in the world (jailbreaking frontier AI systems) delivered by people who looked like they’d stepped out of a 1998 DEF CON photo. And maybe that’s the point.

Agenda 2025 | Bug Bounty Village

www.bugbountydefcon.com/agenda?trk=public_post_reshare-text

Jailbreaking modern LLMs isn’t about shellcode or buffer overflows, you don’t necessarily need to be technical. These are very much linguistic exploits: you coax, trick or social-engineer the model into ignoring its own instructions and training and exploit its unrestricted power. Listening at the panel and talking to researchers after it, it also emphasized the need to keep researching the mechanics of why these linguistic attacks even work on the LLM & transformer level and what weaknesses they are exploiting (more on that on upcoming posts as well). BASI are great at challenging AI platforms who say they’re secure by default and uncovering ways to manipulate them - if you’re not already there, you can experience the BASI Discord server here to learn more:

Join the BASI Discord Server!

BASI is the top Discord for AI jailbreaking, red-teaming, and prompt engineering, where members push the limits of AI :) | 29151 members

discord.gg/basi

The fact that the panel heavily focused on the LLM level and hacker mentality in performing these jailbreaks, struck a surprising chord for me: these enterprise models are getting real-world jailbreaks that affect products, customers and decisions right now, and where I think it’s mostly already happening is the agentic landscape.

On that point, here are two open-source projects we are supporting here at Zenity, that aim to improve visibility & understanding of agentic risks: the AI Agents Attack Matrix and PowerPwn.

Attacks Matrix - AI Agents Attack Matrix

Documentation for the AI Agents Attack Matrix

ttps.ai

GitHub - mbrg/power-pwn: An offensive security toolset for Microsoft 365 focused on Microsoft Copilot, Copilot Studio and Power Platform

An offensive security toolset for Microsoft 365 focused on Microsoft Copilot, Copilot Studio and Power Platform - mbrg/power-pwn

github.com/mbrg/power-pwn

Yesterday’s playful experiments are today’s industries

Which brings me back to the Malware Museum: if those early 90s viruses, created by artists, pranksters and curious geeks, ended up transforming into a multi-billion-dollar cybercrime ecosystem, what trajectory are we stepping onto with “promptware” today?

From floppy disks to frontier models, it’s not just the technology that advances, it’s the culture of offense, the shared language of hackers, and the sheer human curiosity at the edge. BASI’s masked panel wasn’t nostalgic by accident, I think it was a bit prophetic. I believe this is partially because AI is also extending our ability to learn about the nature of attacks using its NLP origin - and enabling anyone be a noob.

Seeing the Malware Museum next to the BASI jailbreakers made another thought crystallize for me: if early malware turned into the global cybercrime industry we know today, what happens when “promptware” follows a similar 30-year curve? And what would we have said about the start of securing malware that we can say after 30 years?

I believe part of the answer lies in embracing hard guardrails today, deliberately restricting AI capabilities based on research-driven security practices, and investing in truly understanding what AI security even is. Efforts like MITRE ATLAS, AI Agents Attack Matrix, and other emerging frameworks are a strong start, but they’re just that: a start. If you’re interested in learning more about hard guardrails, here’s a deeper dive:

Why Aren’t We Making Any Progress In Security From AI

Guardrails Are Soft Boundaries. Hard Boundaries Do Exist.

labs.zenity.io/p/why-aren-t-we-making-any-progress-in-security-from-ai-bf02

In malware, we built this collective knowledge and discipline over decades. With AI, we’re barely laying the foundation. If we want to avoid repeating history, we need to focus not only on finding attacks, but on mitigation, defensive design, and deciding where the real value of secure, agentic AI systems actually lies.

Looking Ahead Through the Lens of Agency

I love the parallelism between the retro-futuristic vibes of BASI Team Six and what the Malware Museum represents by its mission to preserve digital culture. I’ll be honest, I started writing about these two events simply because they were fun, weird, and so uniquely DEF CON. But the more I thought about it, the more I realized there’s a much more important point hidden underneath.

Hard boundaries are real and probably something we will discuss in the far future as obvious first places to have improved agentic security posture. This means we don’t just need to find attacks, we need to fix, detect, prevent, and most of all understand what makes them possible. Today’s agentic attacks aren’t theoretical, they’re already expanding the attack surface of applications and enterprises.

Check out this research for a terrifying example of prompt-based, zero-click exploits targeting AI agents in production:

A Copilot Studio Story 2: When AIjacking Leads to Full Data Exfiltration

Discover how prompt injections can lead to zero-click exploits threatening AI agents built using Copilot Studio. Learn about real-world risks, including data leakage and security blind spots. Bypass Copilot Studio prompt shields.

labs.zenity.io/p/a-copilot-studio-story-2-when-aijacking-leads-to-full-data-exfiltration-bc4a

Right now, many prompt-based attacks and payloads that have been discovered feel somewhat experimental, meme-driven, maybe even childish, just like those first viruses in the ’90s. But that’s exactly why they matter. As models become more autonomous, embedded in real-world workflows, and increasingly agentic, the stakes change dramatically:

Exploits shift from code → behavior (coaxing systems into doing things their creators never intended)
Defense shifts from patching bugs → designing capabilities and defining hard guardrails
Instead of files doing “bad things,” now any chunk of text could be malware/promptware that manipulates a system with emergent capabilities

In other words: we need to learn more about real-world agentic exploits today. I feel we’re already watching the first sparks, and if we don’t pay attention to the trajectory of malware history, we may end up building the future’s biggest threats ourselves, only to turn them into glass-cased exhibits in yet another museum someday.

Reply

or to participate.