PII Detection & Redaction
Definition
The process of identifying and masking personally identifiable information (names, emails, phone numbers, addresses, SSNs, financial data) in text before it is processed by an AI model or stored in logs. PII detection is a critical compliance requirement (GDPR, CCPA, HIPAA) and a trust requirement for agent systems that handle user data. Detection methods include: regex patterns, named entity recognition (NER), and purpose-built classification models.
Builder Context
Implement PII detection at the boundary — before data enters the agent's context and before agent outputs are stored or transmitted. For input: redact PII from user messages and retrieved documents before they reach the model. For output: scan agent responses for accidentally included PII. For logs: never log full prompts or responses in production without PII redaction. The most common violation: logging tool call results that contain user data. Use a PII-aware logging wrapper, not ad-hoc redaction.