The coming AI security crisis (and what to do about it) | Sander Schulhoff
1. Strictly Limit AI Agent Permissions
Ensure any AI agent or system capable of taking actions (e.g., sending emails, modifying databases) is granted only the absolute minimum necessary permissions, because malicious users can trick it into performing any action it is allowed to perform. This is classical cybersecurity's principle of least privilege applied to AI.
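A minimal sketch of what deny-by-default tool permissioning can look like in an agent framework. All names here (`ToolRegistry`, `search_faq`, `send_email`) are illustrative assumptions, not from any particular library:

```python
# Least-privilege tool gating for an AI agent: tools may be registered,
# but only explicitly allow-listed ones are callable for this deployment.

class ToolNotPermitted(Exception):
    pass

class ToolRegistry:
    def __init__(self, allowed_tools):
        self._tools = {}
        self._allowed = set(allowed_tools)  # deny by default

    def register(self, name, fn):
        self._tools[name] = fn

    def call(self, name, *args, **kwargs):
        if name not in self._allowed:
            raise ToolNotPermitted(f"tool {name!r} not permitted for this agent")
        return self._tools[name](*args, **kwargs)

# An FAQ bot gets read-only search; email stays registered but unusable.
registry = ToolRegistry(allowed_tools={"search_faq"})
registry.register("search_faq", lambda q: f"results for {q!r}")
registry.register("send_email", lambda to, body: "sent")

print(registry.call("search_faq", "refund policy"))
try:
    registry.call("send_email", "a@b.com", "hi")
except ToolNotPermitted as e:
    print("blocked:", e)
```

The key design choice is that the allow-list is set at deployment time by the operator, not by anything the model outputs, so a successful injection still cannot reach tools outside the list.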
2. Invest in AI-Cybersecurity Expertise
Develop or hire expertise that bridges classical cybersecurity and AI security, as AI systems present fundamentally different security challenges compared to traditional software. This combined knowledge is vital for identifying unique vulnerabilities and implementing effective, AI-aware security measures.
3. Adopt the “Angry God in a Box” Mindset
When designing and securing AI systems, particularly agents, approach them with the mindset that the AI is a malicious entity trying to cause harm and escape control. This proactive mental model helps identify and mitigate risks by focusing on containing and controlling potentially dangerous AI.
4. Implement Context-Aware Permissioning
Utilize frameworks like Google DeepMind’s CaMeL to dynamically restrict an agent’s permissions based on the user’s specific request, granting only the read/write capabilities needed for the task at hand. This limits the damage a prompt injection can do by constraining the agent’s potential actions from the outset.
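A hypothetical sketch of the underlying idea, not CaMeL itself: derive a capability set from the user's request alone, before the agent touches any untrusted content, and check every action against it. The task names and capability strings are invented for illustration:

```python
# Context-aware permissioning: capabilities are granted per task,
# based only on the trusted user request, never on model output.

TASK_CAPABILITIES = {
    "read_calendar": {"calendar:read"},
    "summarize_inbox": {"email:read"},
    "schedule_meeting": {"calendar:read", "calendar:write"},
}

def capabilities_for(task: str) -> set:
    # Unknown tasks get no capabilities at all.
    return TASK_CAPABILITIES.get(task, set())

def guarded_action(granted: set, required: str) -> str:
    if required not in granted:
        raise PermissionError(f"capability {required!r} not granted for this task")
    return "ok"

granted = capabilities_for("summarize_inbox")
print(guarded_action(granted, "email:read"))  # within scope: permitted
try:
    # An instruction injected via an email body asks the agent to send mail;
    # the capability was never granted for this task, so it fails closed.
    guarded_action(granted, "email:send")
except PermissionError as e:
    print("blocked:", e)
```

Because the grant happens before untrusted data enters the loop, even a fully hijacked model can only act within the scope the original request implied.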
5. Avoid AI Guardrails & Red Teaming
Do not rely on AI guardrails or automated red teaming tools as primary defenses against prompt injection and jailbreaking. Guardrails are easily bypassed and ineffective against determined attackers, while automated red teaming offers little novel insight as all current models are vulnerable.
6. Do Not Deploy Prompt-Based Defenses
Refrain from using prompt engineering (e.g., adding explicit instructions within the prompt) as a defense mechanism for AI systems. These defenses are known to be highly ineffective and offer minimal protection against adversarial attacks.
7. Understand Simple Chatbot Limitations
If your AI system is merely a chatbot for FAQs or information retrieval, with no action-taking capabilities or access to sensitive data, extensive defensive measures are likely unnecessary. The primary risk is reputational harm from coerced outputs, which determined users can usually produce through other means anyway.
8. Educate Your Team on AI Security
Prioritize educating your team, including decision-makers, about the realities of AI security, prompt injection, and jailbreaking. Increased awareness helps prevent poor deployment decisions and fosters a deeper understanding of AI’s unique risks.
9. Monitor AI System Inputs/Outputs
Implement logging for all inputs and outputs of your AI systems. This practice allows for later review to understand user interaction, identify potential misuse, and continuously improve the system, even if it doesn’t directly prevent attacks.
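A minimal sketch of such logging, assuming an append-only JSONL file; `call_model` is a stand-in for whatever inference API you actually use:

```python
# Log every model input/output pair as one JSON line, with an ID and
# timestamp, so interactions can be reviewed and audited later.

import json
import time
import uuid

def log_interaction(logfile: str, user_input: str, model_output: str) -> dict:
    record = {
        "id": str(uuid.uuid4()),
        "ts": time.time(),
        "input": user_input,
        "output": model_output,
    }
    with open(logfile, "a") as f:
        f.write(json.dumps(record) + "\n")  # append-only, one record per line
    return record

def call_model(prompt: str) -> str:
    # Stand-in for a real LLM call.
    return f"echo: {prompt}"

prompt = "What is prompt injection?"
rec = log_interaction("interactions.jsonl", prompt, call_model(prompt))
print(rec["output"])
```

JSONL keeps each record independent, so logs can be tailed, grepped, or bulk-loaded for review without parsing the whole file. Note this records attacks; it does not block them.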
10. Beware Guardrail Overconfidence
Be aware that deploying AI guardrails can create a false sense of security regarding your AI systems’ robustness. This overconfidence is a significant problem, especially as agentic AI capabilities increase the potential for real-world damage.
11. Avoid Offensive AI Security Research
Researchers and practitioners should refrain from publishing new methods for jailbreaking or prompt injection. The community already understands these vulnerabilities, and further offensive research primarily provides more attack vectors without aiding defensive progress.