Archive | Zenity Labs

Enabling Safety in AI Agents via Choice Architecture

How adding a single safety-labeled tool to an LLM's toolset can sharply improve its defenses

Tomer Wetzler

Tools of the Trade

0-click indirect prompt injection with tool use - a look through attribution graphs

Max Fomin

Modeling LLMs via Structured Self-Modeling (SSM)

How structured prompts surface findings of self-modeling in LLMs, which may benefit both attackers and defenders

Tomer Wetzler

Data-Structure Injection (DSI) in AI Agents

How controlling the structure of the prompt, not just the semantics, can exploit your AI agents and their tools

Tomer Wetzler

AgentFlayer: Spanish Version

Inbar Raz

Security research

Exploring the Risks of ChatGPT’s Atlas Browser

Tamir Ishay Sharbat

Raul Klugman-Onitza

Security research

Appendix: Interpreting Jailbreaks and Prompt Injections with Attribution Graphs

Max Fomin

Security research

Interpreting Jailbreaks and Prompt Injections with Attribution Graphs

Max Fomin

Security research

Breaking down AgentKit's Guardrails

A deep dive into OpenAI's AgentKit guardrails, how they are implemented, and where they fail

Stav Cohen

Security research

Analyzing The Security Risks of OpenAI's AgentKit

Stav Cohen

Raul Klugman-Onitza

Exhibit & Exploit: Two DEF CON 33 Highlights from the Past & Future of Hacking

Humans, hacker culture and AI: Notes from Hacker Summer Camp

Avishai Efrat

Security research

Prompt Mines: 0-Click Data Corruption In Salesforce Einstein

Tamir Ishay Sharbat