Security

Agent Security

Definition

The discipline of securing autonomous AI agents against adversarial attacks, data leakage, unauthorized actions, and unintended behavior. Agent security encompasses: input security (prompt injection defense), output security (preventing information leakage), action security (constraining tool access and permissions), data security (protecting sensitive information in context), and operational security (securing the agent's runtime, credentials, and deployment). Agent security is harder than traditional application security because the agent's behavior is non-deterministic and its attack surface includes natural language.

Builder Context

Apply the principle of least privilege to everything: tools (minimum permissions), context (minimum data), actions (minimum scope). Never store credentials in prompts or tool descriptions — use environment variables and credential managers. Implement an audit log for all agent actions (who triggered, what was done, what data was accessed). The biggest agent security gap in practice: tool results that contain more data than the agent needs, leaking information through the model's context.