Agent Firewall Policy Specification v0.1#
Status: Draft
Version: 0.1.0
Authors: Pipelock maintainers
This specification defines a minimal, portable policy format for agent firewalls. The goal is to standardize how egress rules, DLP actions, audit events, and MCP scan hooks are expressed, so policies can be shared across tools and organizations.
Pipelock implements this spec natively. Other agent firewall implementations can adopt it to enable policy portability.
Design Principles#
- Minimal. Only fields that affect security decisions. No UI preferences, no deployment config, no vendor-specific extensions in the core spec.
- Declarative. Policies describe what to allow/deny, not how to implement it. Enforcement is up to the runtime.
- Composable. Multiple policies can be merged. Later rules override earlier ones by name.
- Auditable. Every policy decision should produce a structured log entry that references the rule that triggered it.
Policy Document#
A policy document is a YAML file with these top-level keys:
policy_version: "0.1.0"
name: "production-agents"
description: "Policy for production AI agent fleet"
egress: { ... }
dlp: { ... }
response: { ... }
mcp: { ... }
audit: { ... }
All sections are optional. Missing sections mean "no opinion" (defer to runtime defaults).
Egress Rules#
Control which domains and IP ranges the agent can reach.
egress:
default: deny # deny or allow
rules:
- name: "LLM providers"
domains:
- "*.anthropic.com"
- "*.openai.com"
- "api.together.xyz"
action: allow
- name: "Known exfiltration targets"
domains:
- "*.pastebin.com"
- "*.transfer.sh"
- "file.io"
- "requestbin.net"
action: deny
- name: "Internal networks"
cidrs:
- "10.0.0.0/8"
- "172.16.0.0/12"
- "192.168.0.0/16"
- "169.254.0.0/16"
action: deny
Egress Rule Fields#
| Field | Type | Required | Description |
|---|---|---|---|
name | string | yes | Human-readable rule identifier |
domains | string[] | no | Domain patterns (supports * wildcard prefix) |
cidrs | string[] | no | CIDR ranges (IPv4 and IPv6) |
action | string | yes | allow or deny |
Rules are evaluated top-to-bottom. First match wins. If no rule matches, default applies.
Domain matching is case-insensitive. *.example.com matches api.example.com but not example.com itself. Use both example.com and *.example.com to cover both.
DLP Rules#
Define patterns for detecting secrets and sensitive data in URLs, request bodies, and tool arguments.
dlp:
scan_environment: true # Check env vars for leaked values
min_env_length: 16 # Min env var value length to flag
patterns:
- name: "Anthropic API Key"
regex: 'sk-ant-[a-zA-Z0-9\-_]{10,}'
severity: critical
action: block
- name: "AWS Access Key"
regex: '(AKIA|ASIA)[A-Z0-9]{16,}'
severity: critical
action: block
- name: "Credential in URL"
regex: '(password|token|secret|api_?key)=[^\s&]{8,}'
severity: high
action: warn
DLP Pattern Fields#
| Field | Type | Required | Description |
|---|---|---|---|
name | string | yes | Pattern identifier (used in audit events) |
regex | string | yes | Case-insensitive regex |
severity | string | yes | critical, high, medium, low |
action | string | no | block or warn (overrides global DLP action) |
Regexes are always applied case-insensitive. Implementations should support at minimum: PCRE-compatible syntax, character classes, alternation, and quantifiers.
Encoding Handling#
Implementations MUST attempt to decode content before pattern matching:
- Base64 (standard and URL-safe, with and without padding)
- Hex encoding
- URL encoding (iterative, handling multi-layer encoding)
This is not optional. Without encoding-aware matching, DLP is trivially bypassed.
Response Scanning#
Detect prompt injection in fetched content and MCP tool results.
response:
action: warn # block, strip, warn, or ask
patterns:
- name: "Prompt Injection"
regex: '(?i)(ignore|disregard)\s+(all\s+)?(previous|prior)\s+(instructions|prompts)'
- name: "System Override"
regex: '(?i)(you\s+are|act\s+as)\s+(now\s+)?(a|an|my)\s+'
- name: "Jailbreak Attempt"
regex: '(?i)(DAN|developer)\s+mode'
Response Actions#
| Action | Behavior |
|---|---|
block | Reject the response entirely |
strip | Remove matched text, return the rest |
warn | Log the finding, return content unchanged |
ask | Pause for human approval (requires interactive terminal) |
Normalization Requirements#
Implementations SHOULD apply these normalizations before pattern matching (in order):
- Strip zero-width and invisible Unicode characters
- Apply NFKC Unicode normalization
- Map confusable characters to ASCII equivalents
- Strip combining marks
Without normalization, injection detection is trivially bypassed with Unicode tricks.
MCP Hooks#
Rules for MCP tool call scanning, tool description validation, and chain detection.
mcp:
input_scanning:
enabled: true
action: warn
on_parse_error: block # What to do with malformed JSON-RPC
tool_scanning:
enabled: true
action: warn
detect_drift: true # Alert on tool description changes
tool_policy:
action: warn
rules:
- name: "Block shell execution"
tool_pattern: "execute_command|run_terminal|bash"
action: block
- name: "Warn on file writes"
tool_pattern: "write_file|create_file"
arg_pattern: '/etc/.*|/usr/.*|~/.ssh/.*'
action: warn
session_binding:
enabled: true
unknown_tool_action: warn
chain_detection:
enabled: true
action: warn
window_size: 20
window_seconds: 300
max_gap: 3
Tool Policy Rule Fields#
| Field | Type | Required | Description |
|---|---|---|---|
name | string | yes | Rule identifier |
tool_pattern | string | yes | Regex matching tool name |
arg_pattern | string | no | Regex matching argument values |
arg_key | string | no | Regex scoping arg_pattern to top-level argument keys. Requires arg_pattern; specifying arg_key without arg_pattern is a validation error. |
action | string | no | block or warn |
Audit Event Format#
Every policy decision produces a structured audit event. This is the canonical format:
{
"timestamp": "2026-02-28T15:04:05.000Z",
"level": "warn",
"event": "blocked",
"scanner": "dlp",
"rule": "Anthropic API Key",
"severity": "critical",
"method": "GET",
"url": "https://example.com/?key=sk-ant-...",
"agent": "my-bot",
"client_ip": "127.0.0.1",
"request_id": "abc123",
"mitre_technique": "T1048",
"instance_id": "prod-agent-1"
}
Required Audit Fields#
| Field | Type | Description |
|---|---|---|
timestamp | ISO 8601 | When the event occurred |
level | string | info, warn, or critical |
event | string | Event type: allowed, blocked, anomaly, etc. |
scanner | string | Which scanner triggered the event |
rule | string | Pattern/rule name that matched |
Optional Audit Fields#
| Field | Type | Description |
|---|---|---|
severity | string | Finding severity (from DLP/response pattern) |
method | string | HTTP method |
url | string | Target URL (may be truncated) |
agent | string | Agent identifier |
client_ip | string | Source IP (HTTP transports only, omitted for stdio MCP) |
request_id | string | Unique request identifier (HTTP transports only, omitted for stdio MCP) |
mitre_technique | string | MITRE ATT&CK technique ID |
instance_id | string | Pipelock instance identifier |
score | number | Threat score (if adaptive enforcement is enabled) |
MITRE ATT&CK Mapping#
Implementations SHOULD include MITRE technique IDs in audit events where applicable:
| Detection | Technique ID | Name |
|---|---|---|
| DLP / secret exfiltration | T1048 | Exfiltration Over Alternative Protocol |
| SSRF / private IP access | T1046 | Network Service Discovery |
| Prompt injection | T1059 | Command and Scripting Interpreter |
| Tool poisoning | T1195.002 | Supply Chain: Software Supply Chain |
| Session anomaly | T1078 | Valid Accounts |
| Domain blocklist violation | T1071.001 | Application Layer Protocol: Web |
| Rate limit / data budget | T1030 | Data Transfer Size Limits |
Policy Merging#
When multiple policies are combined (org-wide + team + project), merge rules:
- Later policies override earlier ones by section
- Within
egress.rules,dlp.patterns, etc., rules merge byname - If two rules share a name, the later one wins entirely (no field-level merge)
egress.defaultis overridden by the last policy that sets it
This allows organizations to set a baseline policy and teams to customize it.
Validation#
Implementations MUST validate policies at load time:
- All regex patterns compile without error
- All
actionvalues are recognized - All
severityvalues are recognized - Strict mode (
egress.default: deny) has at least oneallowrule - CIDR values parse correctly
Invalid policies MUST be rejected entirely (fail-closed). Partial application of a broken policy is a security risk.
Versioning#
The policy_version field uses semver. Implementations MUST reject policies with a major version they don't support. Minor version bumps add optional fields. Patch version bumps are documentation-only.
Example: Minimal Production Policy#
policy_version: "0.1.0"
name: "minimal-production"
egress:
default: deny
rules:
- name: "LLM APIs"
domains: ["*.anthropic.com", "*.openai.com"]
action: allow
- name: "Package registries"
domains: ["registry.npmjs.org", "pypi.org", "pkg.go.dev"]
action: allow
dlp:
scan_environment: true
patterns:
- name: "API Keys"
regex: 'sk-[a-zA-Z0-9\-_]{20,}'
severity: critical
response:
action: block
mcp:
input_scanning:
enabled: true
action: block
tool_policy:
action: warn
rules:
- name: "No shell"
tool_pattern: "execute_command|bash|shell"
action: block
audit: {}
Relationship to Pipelock Config#
Pipelock's YAML config (pipelock.yaml) is a superset of this spec. It includes deployment fields (listen, timeout_seconds, etc.) alongside policy fields. The policy fields map directly:
| Policy Spec | Pipelock Config |
|---|---|
egress.default | mode: strict + api_allowlist |
egress.rules[].domains | api_allowlist + monitoring.blocklist |
dlp.patterns | dlp.patterns |
response.action | response_scanning.action |
mcp.* | mcp_input_scanning, mcp_tool_scanning, etc. |
audit | logging + emit |
See Configuration Reference for the full pipelock config documentation.