Type: External
Status: Published
Created: Mar 25, 2026
Updated: Apr 6, 2026
Updated by: Dosu Bot

Configuration Reference

Pipelock uses a single YAML config file. Generate a starter config:

pipelock generate config --preset balanced > pipelock.yaml
pipelock run --config pipelock.yaml

Or scan your project and get a tailored config:

pipelock audit ./my-project -o pipelock.yaml

Hot Reload

Config changes are picked up automatically via file watcher or SIGHUP signal (100ms debounce). Most fields reload without restart. Fields that require a restart are marked below.

On reload, the scanner and session manager are atomically swapped. Kill switch state (all 4 sources) is preserved. Existing MCP sessions retain the old scanner until the next request.

If a reload fails validation (invalid regex, security downgrade), the old config is retained and a warning is logged.

Top-Level Fields

version: 1 # Config schema version (currently 1)
mode: balanced # "strict", "balanced", or "audit"
enforce: true # false = detect without blocking (warning-only)
explain_blocks: false # true = include fix hints in block responses

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| version | int | 1 | Config schema version |
| mode | string | "balanced" | Operating mode (see Modes) |
| enforce | bool | true | When false, all blocks become warnings |
| explain_blocks | bool | false | Include actionable hints in block responses |

Block Hints (explain_blocks)

When enabled, blocked responses include a hint explaining why the request was blocked and how to fix it. Fetch proxy responses get a hint field in the JSON body. CONNECT and WebSocket rejections get an X-Pipelock-Hint response header.

explain_blocks: true

Security note: Hints expose scanner names and config field names (e.g., "Add to api_allowlist", "Add a suppress entry"). This is useful for debugging but reveals your security policy to the agent. Default: false (opt-in). Enable when you trust your agent or need easier debugging. Leave disabled in production where untrusted agents could use hints to craft bypasses.

Modes

| Mode | Behavior | Use Case |
| --- | --- | --- |
| strict | Allowlist-only. Only api_allowlist domains pass. | Regulated industries, high-security |
| balanced | Blocks known-bad, detects suspicious. All domains reachable. | Most developers (default) |
| audit | Logs everything, blocks nothing. | Evaluation before enforcement |

API Allowlist

Domains that are always allowed in strict mode. In balanced/audit mode, these are exempt from the domain blocklist.

api_allowlist:
  - "*.anthropic.com"
  - "*.openai.com"
  - "*.discord.com"
  - "github.com"
  - "api.slack.com"

Wildcards are supported: *.example.com matches api.example.com and the apex example.com itself. Matching is case-insensitive.
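
The matching rule described above can be sketched in a few lines of Python. This is an illustration only, not pipelock's implementation; the function name domain_matches is hypothetical.

```python
def domain_matches(pattern, host):
    """Illustrative allowlist match: exact or '*.' wildcard, case-insensitive."""
    pattern = pattern.lower()
    host = host.lower()
    if pattern.startswith("*."):
        apex = pattern[2:]
        # The wildcard covers the apex itself and any subdomain of it.
        return host == apex or host.endswith("." + apex)
    # Without a wildcard, only the exact hostname matches.
    return host == pattern
```

Note that the suffix check includes the leading dot, so badexample.com does not match *.example.com.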

Fetch Proxy

The HTTP fetch proxy listens for requests on /fetch?url=... and returns extracted text content.
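
As a minimal sketch of how a client might build a request URL for this endpoint, assuming the default listen address shown below and that the target is passed percent-encoded in the url query parameter (the helper name fetch_url is hypothetical):

```python
from urllib.parse import urlencode


def fetch_url(target, proxy="http://127.0.0.1:8888"):
    """Build a /fetch URL; the target URL is percent-encoded as one parameter."""
    return f"{proxy}/fetch?{urlencode({'url': target})}"
```

Encoding the target keeps its own query string (for example ?page=2) from being parsed as parameters of the proxy request.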

fetch_proxy:
  listen: "127.0.0.1:8888"
  timeout_seconds: 30
  max_response_mb: 10
  user_agent: "Pipelock Fetch/1.0"
  monitoring:
    max_url_length: 2048
    entropy_threshold: 4.5
    max_requests_per_minute: 60
    max_data_per_minute: 0 # bytes/min per domain (0 = disabled)
    blocklist:
      - "*.pastebin.com"
      - "*.hastebin.com"
      - "*.transfer.sh"
      - "file.io"
      - "requestbin.net"

| Field | Default | Description |
| --- | --- | --- |
| listen | 127.0.0.1:8888 | Listen address |
| timeout_seconds | 30 | HTTP request timeout |
| max_response_mb | 10 | Max response body size |
| user_agent | Pipelock Fetch/1.0 | User-Agent header sent upstream |
| monitoring.max_url_length | 2048 | URLs longer than this are blocked |
| monitoring.entropy_threshold | 4.5 | Shannon entropy threshold for path segments |
| monitoring.max_requests_per_minute | 60 | Per-domain rate limit |
| monitoring.max_data_per_minute | 0 | Per-domain byte budget (0 = disabled) |
| monitoring.blocklist | 5 domains | Blocked exfiltration targets |
| monitoring.subdomain_entropy_exclusions | [] | Domains excluded from subdomain and path entropy checks (query entropy still checked) |

Entropy guidance:

  • English text: 3.5-4.0 bits/char
  • Hex/commit hashes: ~4.0
  • Base64-encoded data: 4.0-4.5
  • Random/encrypted: 5.5-8.0

The default threshold (4.5) allows commit hashes and base64-encoded filenames while flagging encrypted blobs. Lower it (3.5) for strict mode. Raise it (5.0) for development environments where base64 URLs are common.
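
To get a feel for these numbers, the sketch below computes Shannon entropy in bits per character and flags path segments above a threshold. It is illustrative only: the function names are hypothetical, and whether pipelock compares with > or >= is an assumption here.

```python
import math
from collections import Counter


def shannon_entropy(s):
    """Shannon entropy of a string in bits per character."""
    if not s:
        return 0.0
    n = len(s)
    return -sum((c / n) * math.log2(c / n) for c in Counter(s).values())


def flag_segments(path, threshold=4.5):
    """Return the path segments whose entropy exceeds the threshold (assumed '>')."""
    return [seg for seg in path.split("/") if seg and shannon_entropy(seg) > threshold]
```

A segment drawn only from the 16 hex characters can never exceed 4.0 bits/char, which is why commit hashes pass under the default threshold.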

Subdomain entropy exclusions skip subdomain and path entropy checks for specific domains, but query parameter entropy is still checked. Useful for APIs that embed tokens in URL paths (e.g., Telegram bot API). Supports wildcard matching (*.example.com).

fetch_proxy:
  monitoring:
    subdomain_entropy_exclusions:
      - "api.telegram.org"

Forward Proxy

Standard HTTP CONNECT tunneling. Agents set HTTPS_PROXY=http://127.0.0.1:8888 and all traffic flows through pipelock. Zero code changes needed.

forward_proxy:
  enabled: false # Requires restart to change
  max_tunnel_seconds: 300
  idle_timeout_seconds: 120
  sni_verification: true # Verify TLS SNI matches CONNECT target
  redirect_websocket_hosts: [] # Redirect WS hosts to /ws proxy

| Field | Default | Restart? | Description |
| --- | --- | --- | --- |
| enabled | false | Yes | Enable CONNECT tunnel proxy |
| max_tunnel_seconds | 300 | No | Max tunnel lifetime |
| idle_timeout_seconds | 120 | No | Kill idle tunnels |
| sni_verification | true | No | Verify TLS ClientHello SNI matches the CONNECT target hostname. Blocks domain fronting (MITRE T1090.004). Set to false to disable. |
| redirect_websocket_hosts | [] | No | Redirect matching hosts to /ws |

TLS Interception

Enables TLS MITM on CONNECT tunnels, allowing pipelock to decrypt, scan, and re-encrypt HTTPS traffic. When enabled, request bodies and headers are scanned for secret exfiltration, and responses are scanned for prompt injection, closing the CONNECT tunnel body-blindness gap.

Requires a CA certificate trusted by the agent. Generate one with pipelock tls init and install it with pipelock tls install-ca.

tls_interception:
  enabled: false
  ca_cert: "" # path to CA cert PEM (default: ~/.pipelock/ca.pem)
  ca_key: "" # path to CA key PEM (default: ~/.pipelock/ca-key.pem)
  passthrough_domains: # domains to splice (not intercept)
    - "*.anthropic.com"
  cert_ttl: "24h"
  cert_cache_size: 10000
  max_response_bytes: 5242880 # 5MB; responses larger than this are blocked

| Field | Default | Description |
| --- | --- | --- |
| enabled | false | Enable TLS interception on CONNECT tunnels |
| ca_cert | "" | Path to CA certificate PEM. Empty resolves to ~/.pipelock/ca.pem |
| ca_key | "" | Path to CA private key PEM. Empty resolves to ~/.pipelock/ca-key.pem |
| passthrough_domains | [] | Domains to splice (pass through without interception). Supports *.example.com wildcards (also matches apex example.com). |
| cert_ttl | "24h" | TTL for forged leaf certificates (Go duration string) |
| cert_cache_size | 10000 | Max cached leaf certificates. Evicts oldest when full. |
| max_response_bytes | 5242880 | Max response body to buffer for scanning. Responses exceeding this are blocked (fail-closed). |

Setup:

# Generate a CA key pair
pipelock tls init

# Install the CA into the system trust store (macOS/Linux)
pipelock tls install-ca

# Or export the CA cert for manual installation
pipelock tls show-ca

Scanning behavior: When a CONNECT tunnel is intercepted, pipelock terminates TLS with the client using a forged certificate, then opens a separate TLS connection to the upstream server. Inner HTTP requests are served via Go's http.Server, enabling:

  • Request body DLP: same scanning as request_body_scanning (JSON, form, multipart extraction + DLP patterns)
  • Request header DLP: same scanning as request_body_scanning.scan_headers
  • Authority enforcement: the Host header must match the CONNECT target. Mismatches are blocked (prevents domain fronting inside encrypted tunnels).
  • Response injection scanning: buffered responses scanned through the response_scanning pipeline before forwarding to the agent
  • Compressed response blocking: responses with non-identity Content-Encoding are blocked (fail-closed, since compressed bytes evade regex DLP)

Fail-closed behaviors:

  • Responses exceeding max_response_bytes are blocked
  • Compressed responses (gzip, deflate, br) are blocked
  • Response read errors are blocked
  • Authority mismatch (Host header differs from CONNECT target) is blocked

Passthrough domains: Domains in passthrough_domains are spliced (bidirectional byte copy) without interception, preserving end-to-end TLS. Use this for domains where certificate pinning prevents interception or where you trust the destination. Supports exact match and wildcard prefix (*.example.com matches sub.example.com and the apex example.com).

Best practice for package registries and LLM providers: Always add package registries (npm, PyPI, Go proxy) and LLM API endpoints to passthrough_domains, not just exempt_domains. With exempt_domains alone, pipelock still intercepts (MITMs) the connection, which breaks large downloads (response size limit), causes TLS handshake errors with clients that reject the generated certificate, and wastes CPU on certificate generation for traffic you don't intend to scan. Passthrough skips interception entirely.

passthrough_domains:
  - "registry.npmjs.org" # npm packages
  - "pypi.org" # Python packages
  - "*.pypi.org"
  - "files.pythonhosted.org" # pip downloads
  - "proxy.golang.org" # Go modules
  - "*.anthropic.com" # LLM provider
  - "*.openai.com" # LLM provider

Request Body Scanning

Scans request bodies and headers on the forward proxy path for secret exfiltration. Catches secrets in POST/PUT bodies and Authorization/Cookie headers that bypass URL-level scanning.

Scope: Forward HTTP proxy (HTTPS_PROXY absolute-URI requests), fetch handler headers, and intercepted CONNECT tunnels (when tls_interception.enabled is true).

request_body_scanning:
  enabled: false
  action: warn # warn or block (no strip for bodies)
  max_body_bytes: 5242880 # 5MB; fail-closed above this
  scan_headers: true # scan request headers for DLP
  header_mode: sensitive # "sensitive" (listed headers) or "all" (everything except ignore list)
  sensitive_headers:
    - Authorization
    - Cookie
    - X-Api-Key
    - X-Token
    - Proxy-Authorization
    - X-Goog-Api-Key

| Field | Default | Description |
| --- | --- | --- |
| enabled | false | Enable request body and header DLP scanning |
| action | warn | warn logs only, block rejects (requires enforce mode) |
| max_body_bytes | 5242880 | Max body size to buffer; bodies exceeding this are always blocked (fail-closed) |
| scan_headers | true | Scan request headers for DLP patterns |
| header_mode | sensitive | sensitive: scan only listed headers. all: scan all headers except ignore list |
| sensitive_headers | (see above) | Headers to scan in sensitive mode |
| ignore_headers | (hop-by-hop + structural) | Headers to skip in all mode |

Content-type dispatch: JSON bodies have string values extracted recursively. Form-urlencoded bodies are parsed as key-value pairs. Multipart form data has text fields extracted (binary parts skipped, max 100 parts). Text/* and XML bodies are scanned as raw text. Unknown content types get a fallback raw-text scan (never skipped, prevents Content-Type spoofing bypass).
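
The recursive extraction of JSON string values can be pictured with the sketch below. It is a simplified illustration, not pipelock's code: the function names are hypothetical, and only string values (not object keys) are collected, following the wording above.

```python
import json


def extract_strings(value):
    """Collect every string value nested anywhere inside parsed JSON."""
    if isinstance(value, str):
        return [value]
    if isinstance(value, dict):
        return [s for v in value.values() for s in extract_strings(v)]
    if isinstance(value, list):
        return [s for v in value for s in extract_strings(v)]
    return []  # numbers, booleans, and null carry no text to scan


def scannable_text(body):
    """Parse a JSON body and return the strings a DLP scanner would inspect.

    json.loads raises ValueError on invalid JSON; a caller would treat that
    as a fail-closed condition.
    """
    return extract_strings(json.loads(body))
```

Walking the whole tree means a secret hidden several levels deep in an array or nested object is still seen by the scanner.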

Fail-closed behaviors (always blocked regardless of action setting):

  • Bodies exceeding max_body_bytes
  • Compressed bodies (Content-Encoding: gzip/deflate/br): compressed bytes evade regex DLP
  • Body read errors: prevents forwarding empty/corrupt bodies
  • Invalid JSON bodies
  • Invalid form-urlencoded bodies: prevents parser differential attacks
  • Multipart missing boundary parameter
  • Multipart with more than 100 parts
  • Multipart part exceeding max_body_bytes
  • Multipart filename exceeding 256 bytes: prevents secret exfiltration via long filenames

Header scanning: Headers are scanned regardless of destination host. An agent can exfiltrate secrets via Authorization: Bearer <secret> to any host, including allowlisted ones. The URL allowlist controls URL-level blocking, not header DLP bypass.

Note on scan_headers: The config default is true, but omitting the field from your YAML file gives false (Go's zero value overrides the default). Always set scan_headers: true explicitly in your config if you want header scanning enabled.

WebSocket Proxy

Bidirectional WebSocket scanning via /ws?url=ws://upstream:9090/path. Text frames are scanned through the full DLP + injection pipeline. Fragment reassembly handles split messages.

websocket_proxy:
  enabled: false # Requires restart to change
  max_message_bytes: 1048576 # 1MB
  max_concurrent_connections: 128
  scan_text_frames: true
  allow_binary_frames: false
  strip_compression: true # Required for scanning
  max_connection_seconds: 3600
  idle_timeout_seconds: 300
  origin_policy: rewrite # rewrite, forward, or strip
  forward_cookies: false

| Field | Default | Restart? | Description |
| --- | --- | --- | --- |
| enabled | false | Yes | Enable /ws endpoint |
| max_message_bytes | 1048576 | No | Max assembled message size |
| max_concurrent_connections | 128 | No | Connection limit |
| scan_text_frames | true | No | DLP + injection on text frames |
| allow_binary_frames | false | No | Allow binary frames (not scanned) |
| strip_compression | true | No | Force uncompressed (required for scanning) |
| max_connection_seconds | 3600 | No | Max connection lifetime |
| idle_timeout_seconds | 300 | No | Idle timeout |
| origin_policy | "rewrite" | No | Origin header: rewrite, forward, or strip |
| forward_cookies | false | No | Forward client Cookie headers to upstream |

DLP (Data Loss Prevention)

Scans URLs for secrets and sensitive data using regex patterns. Built-in patterns cover API keys, tokens, credentials, and prompt injection indicators. Runs before DNS resolution to prevent exfiltration via DNS queries. Matching is always case-insensitive.

dlp:
  scan_env: true
  secrets_file: "" # path to known-secrets file
  min_env_secret_length: 16
  include_defaults: true # merge user patterns with built-in patterns
  patterns:
    - name: "Custom Token"
      regex: 'myapp_[a-zA-Z0-9]{32}'
      severity: critical
    - name: "Telegram Bot Token"
      regex: '[0-9]{8,10}:[A-Za-z0-9_-]{35}'
      severity: critical
      exempt_domains: # skip this pattern for these destinations
        - "api.telegram.org"

| Field | Default | Description |
| --- | --- | --- |
| scan_env | true | Scan environment variables for leaked values |
| secrets_file | "" | Path to file with known secrets (one per line) |
| min_env_secret_length | 16 | Min env var value length to consider |
| include_defaults | true | Merge your patterns with the 46 built-in patterns |
| patterns | 46 built-in | DLP credential detection patterns |
| patterns[].validator | "" | Post-match checksum validator: luhn, mod97, aba, or wif |
| patterns[].exempt_domains | [] | Domains where this pattern is not enforced (wildcard supported) |

Validated Patterns (Financial DLP)

Some patterns include a validator field for post-match checksum verification. When set, regex matches are passed through a checksum algorithm before being flagged. This eliminates false positives from random numbers that happen to match the pattern format.

Built-in validated patterns:

  • Credit Card Number (validator: luhn) — Visa, Mastercard (including 2-series), Amex, Discover, JCB. Luhn checksum rejects ~90% of false positives.
  • IBAN (validator: mod97) — International Bank Account Numbers. Validates ISO 13616 country codes and ISO 7064 mod-97 checksum. Rejects ~99% of false positives.
  • Bitcoin WIF Private Key (validator: wif) — Base58Check decoding with SHA-256d checksum verification. Validates mainnet version byte (0x80) and 32/33-byte payload. Eliminates false positives from text that happens to contain 51-52 characters of the base58 alphabet.
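
For intuition, here is a minimal sketch of the two simplest checksum checks. These are the standard Luhn and ISO 7064 mod-97 algorithms, not pipelock's code; the function names are hypothetical, and the IBAN sketch checks only the checksum, not the country code.

```python
def luhn_valid(number):
    """Luhn check: double every second digit from the right, sum, test mod 10."""
    digits = [int(c) for c in number if c.isdigit()]
    if len(digits) < 2:
        return False
    total = 0
    for i, d in enumerate(reversed(digits)):
        if i % 2 == 1:
            d *= 2
            if d > 9:
                d -= 9  # same as adding the two digits of the product
        total += d
    return total % 10 == 0


def iban_mod97_valid(iban):
    """mod-97 check: move the first 4 chars to the end, map letters, test mod 97."""
    s = iban.replace(" ", "").upper()
    if len(s) < 5 or not (s.isascii() and s.isalnum()):
        return False
    rearranged = s[4:] + s[:4]
    # int(ch, 36) maps digits to 0..9 and letters to 10..35 (A=10, Z=35).
    numeric = "".join(str(int(ch, 36)) for ch in rearranged)
    return int(numeric) % 97 == 1
```

A random digit string passes Luhn about one time in ten and mod-97 about one time in 97, which matches the ~90% and ~99% rejection figures above.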

To add ABA routing numbers (not in defaults due to higher false positive rate):

dlp:
  patterns:
    - name: "ABA Routing Number"
      regex: '\b\d{9}\b'
      severity: low
      validator: aba
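
The standard ABA routing-number checksum shows why this pattern is noisier. The sketch below is the public algorithm, not pipelock's code, and the function name is hypothetical.

```python
def aba_valid(routing):
    """ABA check: weighted digit sum (weights 3, 7, 1 repeating) must be 0 mod 10."""
    if len(routing) != 9 or not (routing.isascii() and routing.isdigit()):
        return False
    d = [int(c) for c in routing]
    total = (
        3 * (d[0] + d[3] + d[6])
        + 7 * (d[1] + d[4] + d[7])
        + (d[2] + d[5] + d[8])
    )
    return total % 10 == 0
```

Because any 9-digit number matches the regex and roughly one in ten passes the checksum, the validator narrows matches but cannot eliminate false positives.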

Pattern Merging

When include_defaults is true (default), your patterns are merged with the built-in set by name. If you define a pattern with the same name as a built-in, yours overrides it. New built-in patterns added in future versions are automatically included.

Set include_defaults: false to use only your patterns.
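
The merge-by-name behavior can be sketched as follows. This is an illustration under assumptions: patterns are modeled as plain dicts, the function name is hypothetical, and name comparison is assumed to be exact.

```python
def merge_patterns(builtin, user, include_defaults=True):
    """Merge user patterns over built-ins by name; user entries win on conflict."""
    if not include_defaults:
        return list(user)
    merged = {p["name"]: p for p in builtin}
    for p in user:
        merged[p["name"]] = p  # same name replaces the built-in; new names are added
    return list(merged.values())
```

For example, a user entry named "Anthropic API Key" replaces the built-in of that name, while a "Custom Token" entry is added alongside the remaining built-ins.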

Per-Pattern Domain Exemptions

Use exempt_domains to skip a specific DLP pattern for specific destination domains. Other patterns still fire, and response scanning remains active. Supports wildcard matching (*.example.com matches sub.example.com and example.com).

Scope: exempt_domains applies to URL-based scanning only (fetch proxy, forward proxy, WebSocket, TLS intercept). It does not apply to MCP input scanning (which has no destination domain) or environment variable leak detection (scan_env). To suppress those, use the suppress section.

This is useful for APIs that embed credentials in URL paths by design (e.g., Telegram bot API uses /bot<token>/sendMessage). The token should be allowed when talking to Telegram but blocked if it appears in requests to other domains.

To exempt a built-in pattern, override it by name and add exempt_domains:

dlp:
  patterns:
    - name: "Anthropic API Key" # same name as built-in — overrides it
      regex: 'sk-ant-[a-zA-Z0-9\-_]{10,}'
      severity: critical
      exempt_domains:
        - "*.anthropic.com"

Built-in DLP Patterns (46)

| Pattern | Regex Prefix | Severity |
| --- | --- | --- |
| Anthropic API Key | `sk-ant-` | critical |
| OpenAI API Key | `sk-proj-` | critical |
| OpenAI Service Key | `sk-svcacct-` | critical |
| Fireworks API Key | `fw_` | critical |
| AWS Access Key ID | `AKIA\|A3T\|AGPA\|AIDA\|AROA\|AIPA\|ANPA\|ANVA\|ASIA` | critical |
| Google API Key | `AIza` | critical |
| Google OAuth Client Secret | `GOCSPX-` | critical |
| Google OAuth Token | `ya29.` | high |
| Google OAuth Client ID | `*.apps.googleusercontent.com` | medium |
| Stripe Key | `[sr]k_live\|test_` | critical |
| Stripe Webhook Secret | `whsec_` | critical |
| GitHub Token | `gh[pousr]_` | critical |
| GitHub Fine-Grained PAT | `github_pat_` | critical |
| GitLab PAT | `glpat-` | critical |
| Slack Token | `xox[bpras]-` | critical |
| Slack App Token | `xapp-` | critical |
| Discord Bot Token | `[MN][A-Za-z0-9]{23,}` | critical |
| Twilio API Key | `SK[a-f0-9]{32}` | critical |
| SendGrid API Key | `SG.` | critical |
| Mailgun API Key | `key-[a-zA-Z0-9]{32}` | critical |
| New Relic API Key | `NRAK-` | critical |
| Hugging Face Token | `hf_` | critical |
| Databricks Token | `dapi` | critical |
| Replicate API Token | `r8_` | critical |
| Together AI Key | `tok_` | critical |
| Pinecone API Key | `pcsk_` | critical |
| Groq API Key | `gsk_` | critical |
| xAI API Key | `xai-` | critical |
| DigitalOcean Token | `dop_v1_` | critical |
| HashiCorp Vault Token | `hvs.` | critical |
| Vercel Token | `vercel_\|vc[piark]_` | critical |
| Supabase Service Key | `sb_secret_` | critical |
| npm Token | `npm_` | critical |
| PyPI Token | `pypi-` | critical |
| Linear API Key | `lin_api_` | high |
| Notion API Key | `ntn_` | high |
| Sentry Auth Token | `sntrys_` | high |
| JWT Token | `ey...\..*\.` | high |
| Private Key Header | `-----BEGIN.*PRIVATE KEY-----` | critical |
| Bitcoin WIF Private Key | `[5KL]` + base58 | critical |
| Extended Private Key | `[xyzt]prv` + base58 | critical |
| Ethereum Private Key | `0x` + 64 hex | critical |
| Social Security Number | `\b\d{3}-\d{2}-\d{4}\b` | critical |
| Credit Card Number | BIN prefix + Luhn checksum | medium |
| IBAN | `[A-Z]{2}\d{2}` + mod-97 checksum | medium |
| Credential in URL | `password\|token\|secret=value` | high |
| Prompt Injection | `(ignore\|disregard\|forget)...previous...instructions` | high |
| System Override | `system:` | high |
| Role Override | `you are now (DAN\|evil\|unrestricted)` | high |
| New Instructions | `(new\|updated) (instructions\|directives)` | high |
| Jailbreak Attempt | `DAN\|developer mode\|sudo mode` | high |
| Hidden Instruction | `do not reveal this to the user` | high |
| Behavior Override | `from now on you (will\|must)` | high |
| Encoded Payload | `decode this from base64 and execute` | high |
| Tool Invocation | `you must (call\|execute) the (function\|tool)` | high |
| Authority Escalation | `you have (admin\|root) (access\|privileges)` | high |
| Instruction Downgrade | `treat previous instructions as (outdated\|optional)` | high |
| Instruction Dismissal | `set the previous instructions aside` | high |
| Priority Override | `prioritize the (task\|current) (request\|input)` | high |

Environment Variable Leak Detection

When scan_env: true, pipelock reads all environment variables at startup and flags URLs containing any env value that meets both criteria:

  • 16+ characters (configurable via min_env_secret_length)
  • Shannon entropy > 3.0 bits/char

Each qualifying value is checked in raw form and in base64, hex, and base32 encodings.

This catches leaked API keys even without a specific DLP pattern for that provider.
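
A rough sketch of this check is shown below. It is not pipelock's implementation: the function names are hypothetical, and details such as padding handling, URL-safe base64, and letter case in hex are simplifying assumptions.

```python
import base64
import math
from collections import Counter


def entropy(s):
    """Shannon entropy in bits per character."""
    n = len(s)
    if n == 0:
        return 0.0
    return -sum((c / n) * math.log2(c / n) for c in Counter(s).values())


def env_secret_candidates(env, min_len=16, min_entropy=3.0):
    """Keep only env values long and random enough to look like secrets."""
    return [v for v in env.values() if len(v) >= min_len and entropy(v) > min_entropy]


def encoded_forms(secret):
    """Raw, base64, hex, and base32 renderings of a secret (standard alphabets)."""
    raw = secret.encode()
    return [
        secret,
        base64.b64encode(raw).decode(),
        raw.hex(),
        base64.b32encode(raw).decode(),
    ]


def url_leaks_secret(url, env):
    """True if any encoded form of any candidate secret appears in the URL."""
    return any(
        form in url
        for secret in env_secret_candidates(env)
        for form in encoded_forms(secret)
    )
```

The length and entropy filters keep ordinary values such as short paths or repeated characters from being treated as secrets.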

Seed Phrase Detection

Detects BIP-39 mnemonic seed phrases in URLs, request bodies, headers, MCP tool arguments, WebSocket frames, and cross-request fragment reassembly. Seed phrase compromise is permanent and irreversible, making this a critical detection layer for crypto-adjacent deployments.

seed_phrase_detection:
  enabled: true # default: true (security default)
  min_words: 12 # minimum consecutive BIP-39 words to trigger (12, 15, 18, 21, or 24)
  verify_checksum: true # default: true (validates BIP-39 SHA-256 checksum, eliminates FPs)

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| enabled | bool | true | Enable BIP-39 seed phrase detection |
| min_words | int | 12 | Minimum consecutive BIP-39 words to trigger. Must be 12, 15, 18, 21, or 24. |
| verify_checksum | bool | true | Validate the BIP-39 SHA-256 checksum. Reduces false positives by 16x for 12-word phrases, 256x for 24-word. |

The detector uses a dedicated scanner (not regex). It tokenizes text, slides a window across the tokens looking for runs of consecutive words from the 2048-word BIP-39 English wordlist, and validates the checksum of each candidate run. Detection covers varied separators (spaces, commas, newlines, dashes, tabs, pipes).
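
The 16x and 256x figures for verify_checksum follow from the BIP-39 layout: each word encodes 11 bits, and one checksum bit is appended per 32 bits of entropy. A small worked calculation (function names are hypothetical):

```python
VALID_WORD_COUNTS = (12, 15, 18, 21, 24)


def checksum_bits(n_words):
    """Number of checksum bits in a BIP-39 mnemonic of n_words words."""
    if n_words not in VALID_WORD_COUNTS:
        raise ValueError("BIP-39 mnemonics have 12, 15, 18, 21, or 24 words")
    total_bits = n_words * 11  # entropy bits + checksum bits
    return total_bits // 33    # 1 checksum bit per 32 entropy bits


def false_positive_reduction(n_words):
    """A random run of dictionary words passes the checksum 1 time in 2**bits."""
    return 2 ** checksum_bits(n_words)
```

For 12 words this gives 132 total bits, 4 of them checksum, so a random 12-word run passes one time in 16; for 24 words there are 8 checksum bits, or one in 256.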

Action follows the transport-level DLP action: URL scan always blocks, MCP input uses mcp_input_scanning.action, body/header uses request_body_scanning.action.

Response Scanning

Scans fetched content for prompt injection before returning to the agent. Uses a 6-pass normalization pipeline: zero-width stripping, word boundary reconstruction, leetspeak folding, optional-whitespace matching, vowel folding, and encoding detection.

response_scanning:
  enabled: true
  action: warn # block, strip, warn, or ask
  ask_timeout_seconds: 30 # HITL approval timeout
  include_defaults: true
  exempt_domains: # skip injection scanning for these hosts
    - "api.openai.com"
    - "*.anthropic.com"
  patterns:
    - name: "Custom Injection"
      regex: 'override system prompt'

| Field | Default | Description |
| --- | --- | --- |
| enabled | true | Enable response scanning |
| action | "warn" | block, strip, warn, or ask (HITL) |
| ask_timeout_seconds | 30 | Timeout for human-in-the-loop approval |
| include_defaults | true | Merge with 25 built-in patterns |
| exempt_domains | [] | Hosts to skip injection scanning for (DLP still applies on outbound). Supports *.example.com wildcards (also matches the apex example.com). |
| patterns | 25 built-in | Injection and state/control poisoning patterns |

Built-in patterns (25): 13 prompt injection patterns (jailbreak phrases, system overrides, role overrides, instruction manipulation, encoded payloads, tool invocation commands, authority escalation), 6 state/control poisoning patterns (credential solicitation, credential path directives, auth material requirements, memory persistence directives, preference poisoning, silent credential handling), and 4 CJK-language override patterns (Chinese, Japanese, Korean instruction overrides and jailbreak mode). All patterns use DOTALL mode to match across newlines in multiline tool output.

Actions:

  • block: reject the response entirely, agent gets an error
  • strip: redact matched text, return cleaned content
  • warn: log the match, return content unchanged
  • ask: pause and prompt the operator for approval (requires TTY)

Exempt domains: LLM provider APIs (OpenAI, Anthropic, etc.) return instruction-like text as part of normal operation, which can trigger false positives. Use exempt_domains to skip injection scanning for trusted providers. DLP scanning on the outbound request still runs — only the response injection scan is skipped. Applies to fetch proxy, forward proxy, CONNECT (TLS intercept), WebSocket, and reverse proxy. Does not affect MCP response scanning (tool results use a separate trust model).

MCP Input Scanning

Scans JSON-RPC requests from agent to MCP server for DLP leaks and injection in tool arguments.

mcp_input_scanning:
  enabled: true
  action: warn
  on_parse_error: block # block or forward

| Field | Default | Description |
| --- | --- | --- |
| enabled | false | Enable input scanning |
| action | "warn" | warn or block |
| on_parse_error | "block" | What to do with malformed JSON-RPC |

Auto-enabled when running pipelock mcp proxy.

MCP Tool Scanning

Scans tools/list responses for poisoned tool definitions and detects mid-session description changes (rug pulls). Extracts text from all schema fields that an LLM might ingest: description, title, default, const, enum, examples, pattern, $comment, and vendor extensions (x-*). Recurses through composition keywords (allOf, anyOf, oneOf, $defs, if/then/else) and extracts string leaves from nested objects and arrays.

mcp_tool_scanning:
  enabled: true
  action: warn
  detect_drift: true

| Field | Default | Description |
| --- | --- | --- |
| enabled | false | Enable tool description scanning |
| action | "warn" | warn or block |
| detect_drift | false | Alert on tool description changes |

MCP Tool Policy

Pre-execution rules that block or warn before tool calls reach the MCP server. Ships with 17 built-in rules covering destructive operations, credential access, network exfiltration, persistence mechanisms, and encoded command execution.

mcp_tool_policy:
  enabled: true
  action: warn
  rules:
    - name: "Block shell execution"
      tool_pattern: "execute_command|run_terminal"
      action: block
    - name: "Warn on sensitive writes"
      tool_pattern: "write_file"
      arg_pattern: '/etc/.*|/usr/.*'
      action: warn
    - name: "Block shadow file reads"
      tool_pattern: "read_file"
      arg_pattern: '/etc/shadow'
      arg_key: '^(file_?path|target)$'
      action: block

| Field | Default | Description |
| --- | --- | --- |
| enabled | false | Enable tool policy |
| action | "warn" | Default action for rules without override |
| rules | 17 built-in | Policy rule list |

Rule fields:

  • name: rule identifier
  • tool_pattern: regex matching tool name
  • arg_pattern: regex matching argument values (optional; omit for tool-name-only rules)
  • arg_key: regex scoping arg_pattern to specific top-level argument keys (optional; requires arg_pattern). Without arg_key, arg_pattern checks values from all argument keys. Values under matching keys are extracted recursively.
  • action: per-rule override (warn, block, or redirect)
  • redirect_profile: reference to a named redirect profile (required when action: redirect)

Shell obfuscation detection is built-in: backslash escapes, $IFS substitution, brace expansion, and octal/hex escapes are decoded before matching. See Redirect Action (v2.0) for redirect profile configuration.
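
How tool_pattern, arg_pattern, and arg_key interact can be sketched as below. This is an illustration under assumptions, not pipelock's matcher: the function names are hypothetical, patterns are applied with an unanchored regex search, and shell-obfuscation decoding is not modeled.

```python
import re


def collect_strings(value):
    """Recursively gather string leaves from nested argument values."""
    if isinstance(value, str):
        return [value]
    if isinstance(value, dict):
        return [s for v in value.values() for s in collect_strings(v)]
    if isinstance(value, list):
        return [s for v in value for s in collect_strings(v)]
    return []


def rule_matches(rule, tool, args):
    """Return True if a policy rule applies to this tool call."""
    if not re.search(rule["tool_pattern"], tool):
        return False
    arg_pattern = rule.get("arg_pattern")
    if arg_pattern is None:
        return True  # tool-name-only rule
    arg_key = rule.get("arg_key")
    values = []
    for key, value in args.items():
        # Without arg_key every top-level key is checked; with it, only matching keys.
        if arg_key is None or re.search(arg_key, key):
            values.extend(collect_strings(value))
    return any(re.search(arg_pattern, v) for v in values)
```

With the "Block shadow file reads" rule from the example config, a read_file call with file_path set to /etc/shadow matches, while the same string under an unrelated key such as note does not.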

MCP Session Binding

Pins tool inventory on the first tools/list response. Subsequent tool calls are validated against this baseline. Unknown tools trigger the configured action.

mcp_session_binding:
  enabled: true
  unknown_tool_action: warn
  no_baseline_action: warn

| Field | Default | Description |
| --- | --- | --- |
| enabled | false | Enable session binding |
| unknown_tool_action | "warn" | Action on tools not in baseline |
| no_baseline_action | "warn" | Action if no baseline exists |

Tool baseline caps at 10,000 tools per session to prevent memory exhaustion.

MCP WebSocket Listener

Controls inbound WebSocket connections when the MCP proxy runs in listener mode with a ws:// or wss:// upstream. Loopback origins are always allowed.

mcp_ws_listener:
  allowed_origins:
    - "https://example.com"
  max_connections: 100

| Field | Default | Description |
| --- | --- | --- |
| allowed_origins | [] | Additional browser origins to allow (loopback always allowed) |
| max_connections | 100 | Max concurrent inbound WebSocket connections |

Session Profiling

Per-session behavioral analysis that detects domain bursts and volume spikes.

session_profiling:
  enabled: true
  anomaly_action: warn
  domain_burst: 5
  window_minutes: 5
  volume_spike_ratio: 3.0
  max_sessions: 1000
  session_ttl_minutes: 30
  cleanup_interval_seconds: 60

| Field | Default | Description |
| --- | --- | --- |
| enabled | false | Enable profiling |
| anomaly_action | "warn" | warn or block on anomaly |
| domain_burst | 5 | New unique domains in window to flag |
| window_minutes | 5 | Rolling window duration |
| volume_spike_ratio | 3.0 | Spike threshold (ratio of avg) |
| max_sessions | 1000 | Hard cap on concurrent sessions |
| session_ttl_minutes | 30 | Idle session eviction |
| cleanup_interval_seconds | 60 | Background cleanup interval |
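
One way to picture domain-burst detection is a rolling window of first-seen domains. The sketch below is only a model of the idea: the class name is hypothetical, and it assumes a burst is flagged once the number of unique new domains in the window reaches domain_burst.

```python
from collections import deque


class DomainBurstDetector:
    """Track new unique domains seen within a rolling time window."""

    def __init__(self, domain_burst=5, window_minutes=5):
        self.limit = domain_burst
        self.window = window_minutes * 60  # seconds
        self.seen = deque()  # (first_seen_timestamp, domain), oldest first

    def observe(self, domain, now):
        """Record a request; return True if the window holds a burst."""
        # Drop sightings that have aged out of the rolling window.
        while self.seen and now - self.seen[0][0] > self.window:
            self.seen.popleft()
        if domain not in {d for _, d in self.seen}:
            self.seen.append((now, domain))
        return len(self.seen) >= self.limit
```

With the defaults, four distinct domains in five minutes pass quietly and the fifth triggers the anomaly; repeat requests to an already-seen domain do not count again.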

Adaptive Enforcement

Per-session threat score that accumulates across scanner hits and decays on clean requests. When the score exceeds the threshold, the session escalates through levels (elevated → high → critical). At each level, the levels configuration upgrades warn and ask actions to block, or denies all traffic.

adaptive_enforcement:
  enabled: true
  escalation_threshold: 5.0
  decay_per_clean_request: 0.5
  levels:
    elevated:
      upgrade_warn: block # warn→block when session is elevated
    high:
      upgrade_warn: block
      upgrade_ask: block # ask→block when session is high risk
    critical:
      upgrade_warn: block
      upgrade_ask: block
      block_all: true # deny all requests when session is critical

| Field | Default | Description |
| --- | --- | --- |
| enabled | false | Enable adaptive enforcement |
| escalation_threshold | 5.0 | Score before first escalation. Lower values escalate faster. |
| decay_per_clean_request | 0.5 | Score reduction per clean request. Lower values slow trust recovery. |
| levels | (see below) | Per-level enforcement upgrades |

Escalation Levels

Sessions progress through three levels as the threat score passes successive multiples of escalation_threshold. Each level can independently upgrade action severity.

| Level | Trigger | Description |
| --- | --- | --- |
| elevated | Score ≥ threshold × 1 | First escalation. Session shows suspicious behavior. |
| high | Score ≥ threshold × 2 | Second escalation. Session is actively concerning. |
| critical | Score ≥ threshold × 3 | Third escalation. Session is high-confidence threat. |
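
The trigger column maps directly to a small function. This sketch is illustrative; the function name and the "normal" label for an unescalated session are assumptions.

```python
LEVELS = ["normal", "elevated", "high", "critical"]


def escalation_level(score, threshold=5.0):
    """Map a threat score to a level using multiples of the threshold."""
    if threshold <= 0:
        raise ValueError("threshold must be positive")
    multiples = int(score // threshold)
    # Clamp: below 1x is normal, 3x and above is critical.
    return LEVELS[max(0, min(multiples, 3))]
```

With the default threshold of 5.0, a score of 4.9 stays normal, 5.0 is elevated, 10.0 is high, and anything from 15.0 up is critical.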

Level Actions

Each level accepts the following fields. All fields use pointer semantics:

  • Omit the field (or omit levels entirely) to apply the default behavior.
  • Set to "block" to upgrade that action class at this level.
  • Set to "" (empty string) to explicitly disable an upgrade (softening from a parent config).

Monotonic enforcement: higher levels must never be weaker than lower levels. If elevated.upgrade_warn: block, then high and critical must also have upgrade_warn: block (or omit it for the default, which is block). Pipelock validates this at config load time and rejects violations.

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| upgrade_warn | *string | nil ("block" at all levels) | Upgrade warn actions to block at this level |
| upgrade_ask | *string | nil ("" at elevated; "block" at high and critical) | Upgrade ask (HITL) actions to block at this level |
| block_all | *bool | nil (false at elevated and high; true at critical) | Deny all traffic for this session regardless of action |

Default behavior when levels is omitted:

| Level | upgrade_warn | upgrade_ask | block_all |
| --- | --- | --- | --- |
| elevated | block | "" (no upgrade) | false |
| high | block | block | false |
| critical | block | block | true |
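
The defaults above can be read as a lookup from (configured action, session level) to the action actually taken. The sketch below is a model of that table, not pipelock's code; the names are hypothetical, and actions other than warn and ask are assumed to pass through unchanged unless block_all is set.

```python
DEFAULT_LEVELS = {
    "elevated": {"upgrade_warn": "block", "upgrade_ask": "", "block_all": False},
    "high": {"upgrade_warn": "block", "upgrade_ask": "block", "block_all": False},
    "critical": {"upgrade_warn": "block", "upgrade_ask": "block", "block_all": True},
}


def effective_action(action, level, levels=DEFAULT_LEVELS):
    """Resolve the action taken for a scanner hit at a given session level."""
    cfg = levels.get(level)
    if cfg is None:
        return action  # session is not escalated
    if cfg["block_all"]:
        return "block"
    # An empty string means no upgrade is configured for this action class.
    upgrade = cfg.get("upgrade_" + action, "")
    return upgrade or action
```

So a warn hit becomes block as soon as the session is elevated, while an ask hit still prompts the operator until the session reaches high.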

De-escalation

Sessions at block_all recover autonomously via a background sweep that runs every 30 seconds. If a session has been at its current escalation level for longer than 5 minutes, it is automatically stepped down one level. Recovery also triggers on the next incoming request, WebSocket frame, or MCP message after the timer expires (on-entry fast path). The session must accumulate new real signals to re-escalate.

De-escalation drops one level per 5-minute period. A session at critical with no activity takes 15 minutes (3 periods) to return to normal. Each de-escalation resets the threat score to half the current threshold to prevent immediate re-escalation from stale points.
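
The 15-minute figure can be worked out as a simple schedule. This sketch assumes no new signals arrive and ignores the up-to-30-second sweep granularity; the names are hypothetical.

```python
LEVELS = ["normal", "elevated", "high", "critical"]
STEP_DOWN_MINUTES = 5


def step_down_schedule(level):
    """List (minutes_elapsed, new_level) for each step until the session is normal."""
    idx = LEVELS.index(level)
    return [((i + 1) * STEP_DOWN_MINUTES, LEVELS[idx - i - 1]) for i in range(idx)]
```

For a session at critical this yields high after 5 minutes, elevated after 10, and normal after 15.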

When a session is at a block_all level, blocked retries do not refresh the session's idle timer. This allows idle eviction to eventually clean up sessions that are no longer generating traffic, preventing zombie sessions from persisting indefinitely.

Domain Burst Scoring

Session profiling detects domain bursts (many unique domains in a short window). When the burst threshold is crossed, the anomaly is signaled once per window with the configured score. Subsequent requests in the same window still trigger the configured anomaly_action (block or warn) but do not add further adaptive score, preventing burst detection from driving sessions to critical on its own.

Kill Switch

Emergency deny-all with four independent activation sources. Any one active blocks all traffic (OR-composed). See Kill Switch for operational details.

kill_switch:
  enabled: false
  sentinel_file: /tmp/pipelock-kill # example path; default is "" (disabled)
  message: "Emergency deny-all active"
  health_exempt: true
  metrics_exempt: true
  api_exempt: true
  api_token: "" # Required for API source
  api_listen: "" # Requires restart. Separate port for operator API.
  allowlist_ips: [] # IPs that bypass kill switch

| Field | Default | Restart? | Description |
|---|---|---|---|
| enabled | false | No | Config-based activation |
| sentinel_file | "" | No | File presence activates kill switch |
| message | "Emergency deny-all active" | No | Rejection message |
| health_exempt | true | No | /health bypasses kill switch |
| metrics_exempt | true | No | /metrics bypasses kill switch |
| api_exempt | true | No | /api/v1/* bypasses kill switch |
| api_token | "" | No | Bearer token for API endpoints. Can be overridden by PIPELOCK_KILLSWITCH_API_TOKEN env var. |
| api_listen | "" | Yes | Separate listen address for API |
| allowlist_ips | [] | No | IPs always allowed through |

Port isolation: When api_listen is set, the kill switch and session admin APIs run on a dedicated port. The main proxy port has no API routes, preventing agents from deactivating their own kill switch or resetting their own sessions.

Environment variable override: Set PIPELOCK_KILLSWITCH_API_TOKEN to override api_token from the config file. This is useful for Kubernetes deployments where the config file lives in a ConfigMap (plaintext in etcd) but the token should come from a Secret:

env:
  - name: PIPELOCK_KILLSWITCH_API_TOKEN
    valueFrom:
      secretKeyRef:
        name: pipelock-secrets
        key: killswitch-api-token

Session Admin API#

When kill_switch.api_token is configured, the session admin API is available alongside the kill switch endpoints. Uses the same bearer token authentication and port isolation.

| Endpoint | Method | Description |
|---|---|---|
| /api/v1/sessions | GET | List all tracked sessions with escalation state |
| /api/v1/sessions/{key}/reset | POST | Reset enforcement state for a client identity |

The {key} parameter is URL-encoded. For example, my-agent|10.0.0.1 becomes my-agent%7C10.0.0.1.
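
A minimal sketch of building the reset path with the key percent-encoded, using Python's standard library. The helper name reset_path is hypothetical; only the endpoint shape and the example key come from this page.

```python
from urllib.parse import quote


def reset_path(session_key: str) -> str:
    """Build the reset endpoint path for a session key."""
    # safe="" makes quote() encode every reserved character, including "|" and "/".
    return f"/api/v1/sessions/{quote(session_key, safe='')}/reset"


print(reset_path("my-agent|10.0.0.1"))
# /api/v1/sessions/my-agent%7C10.0.0.1/reset
```

The resulting path is sent as a POST with the bearer token in the Authorization header, against the api_listen address when port isolation is configured.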

Reset scope: identity-family scoped. Resetting a session clears the session's threat score, escalation level, and block_all flag. It also clears shared IP-level burst tracking for the client IP and cross-request exfiltration (CEE) state. Other sessions on the same IP will have their burst state cleared as a side effect.

Rate limiting: only the POST /reset endpoint is rate-limited (10 requests/minute). GET /sessions is not rate-limited.

Sessions are classified as identity (operator-targetable, e.g. my-agent|10.0.0.1) or invocation (internal MCP sessions, e.g. mcp-stdio-42). Only identity sessions can be reset.

Event Emission#

Forward audit events to external systems. Three independent sinks (webhook, syslog, OTLP), each with its own severity filter. Emission is fire-and-forget and never blocks the proxy.

emit:
  instance_id: "prod-agent-1"
  webhook:
    url: "https://your-siem.example.com/webhook"
    min_severity: warn
    auth_token: ""
    timeout_seconds: 5
    queue_size: 64
  syslog:
    address: "udp://syslog.example.com:514"
    min_severity: warn
    facility: local0
    tag: pipelock
  otlp:
    endpoint: "http://otel-collector:4318"
    min_severity: warn
    headers:
      Authorization: "Bearer <token>"
    timeout_seconds: 10
    queue_size: 256
    gzip: false

| Field | Default | Description |
|---|---|---|
| instance_id | hostname | Identifies this instance in events |
| webhook.url | "" | Webhook endpoint URL |
| webhook.min_severity | "warn" | info, warn, or critical |
| webhook.auth_token | "" | Bearer token for webhook |
| webhook.timeout_seconds | 5 | HTTP timeout |
| webhook.queue_size | 64 | Async buffer size (overflow = drop + metric) |
| syslog.address | "" | Syslog address (e.g., udp://host:514) |
| syslog.min_severity | "warn" | info, warn, or critical |
| syslog.facility | "local0" | Syslog facility |
| syslog.tag | "pipelock" | Syslog tag |
| otlp.endpoint | "" | OTLP collector base URL (e.g., http://collector:4318). /v1/logs appended automatically. |
| otlp.min_severity | "warn" | info, warn, or critical |
| otlp.headers | {} | Custom HTTP headers (authentication, tenant routing) |
| otlp.timeout_seconds | 10 | Per-request HTTP timeout |
| otlp.queue_size | 256 | Async buffer size (overflow = drop) |
| otlp.gzip | false | Compress request bodies with gzip |

OTLP events are sent as log records over HTTP/protobuf. Each pipelock audit event maps to one OTLP LogRecord with service.name=pipelock as a resource attribute. Failed sends are retried on 429, 502, 503, 504, and network errors with bounded exponential backoff (3 attempts, 1s/2s/4s); 500 and 501 are not retried. There is no gRPC transport and no batching timer.
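
The retry rule can be sketched as follows. This is a model of the policy as described, not the exporter's code: the function names are hypothetical, and "3 attempts" is read here as up to three retries, one per listed delay.

```python
from typing import List, Optional

RETRYABLE_STATUSES = {429, 502, 503, 504}
BACKOFF_SECONDS = [1, 2, 4]  # one delay per retry, as listed in the text


def should_retry(status: Optional[int]) -> bool:
    """status is None when a network error occurred (no HTTP response)."""
    return status is None or status in RETRYABLE_STATUSES


def retry_delays(outcomes: List[Optional[int]]) -> List[int]:
    """Given the outcome of each successive send, return the delays slept."""
    delays = []
    for attempt, status in enumerate(outcomes):
        # Stop on a non-retryable outcome or when the retry budget is used up.
        if not should_retry(status) or attempt >= len(BACKOFF_SECONDS):
            break
        delays.append(BACKOFF_SECONDS[attempt])
    return delays


print(retry_delays([503, 503, 200]))  # [1, 2]
print(retry_delays([500]))            # []
```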

Severity levels (hardcoded per event type, not configurable):

  • critical: kill switch deny, adaptive escalation to critical level (enforcement upgraded across all transports)
  • warn: blocked requests, anomalies, session events, MCP unknown tools, scan hits
  • info: allowed requests, tunnel open/close, WebSocket open/close, config reload

Tool Chain Detection#

Detects attack patterns in sequences of MCP tool calls using subsequence matching with gap tolerance.

tool_chain_detection:
  enabled: true
  action: warn
  window_size: 20
  window_seconds: 60
  max_gap: 3
  tool_categories: {} # map tool names to categories
  pattern_overrides: {} # per-pattern action overrides
  custom_patterns: []

| Field | Default | Description |
|---|---|---|
| enabled | false | Enable chain detection |
| action | "warn" | warn or block |
| window_size | 20 | Tool calls retained in history |
| window_seconds | 60 | Time-based history eviction |
| max_gap | 3 | Max innocent calls between pattern steps |
| tool_categories | {} | Map tool names to built-in categories |
| pattern_overrides | {} | Per-pattern action override |
| custom_patterns | [] | Custom attack sequences |

Ships with 10 built-in patterns covering reconnaissance, credential theft, data staging, persistence, and exfiltration chains.

Cross-Request Exfiltration Detection#

Detects secrets split across multiple requests within a session. Two independent mechanisms (entropy budget and fragment reassembly) can run together or separately. Both feed into adaptive enforcement scoring.

cross_request_detection:
  enabled: false
  action: block # default for sub-features that don't override
  entropy_budget:
    enabled: false
    bits_per_window: 4096
    window_minutes: 5
    action: warn
  fragment_reassembly:
    enabled: false
    max_buffer_bytes: 65536
    window_minutes: 5

| Field | Default | Description |
|---|---|---|
| enabled | false | Enable cross-request detection |
| action | "block" | Default action for sub-features that don't override |

Entropy Budget#

Tracks cumulative Shannon entropy of all outbound payloads (URLs, request bodies, MCP JSON-RPC payloads, WebSocket frames) per session within a sliding time window. When total entropy bits exceed the budget, the configured action fires.

| Field | Default | Description |
|---|---|---|
| entropy_budget.enabled | false | Enable entropy budget tracking |
| entropy_budget.bits_per_window | 4096 | Max entropy bits allowed per session per window before triggering |
| entropy_budget.window_minutes | 5 | Sliding window duration in minutes |
| entropy_budget.action | "warn" | Action when budget is exceeded (warn or block) |
| entropy_budget.exempt_domains | [] | Domains excluded from entropy budget recording. DLP pattern matching still runs on exempt domains. Supports exact hostnames and *.example.com wildcards (also matches apex example.com). |

Tuning: The default 4096 bits per 5-minute window allows roughly 500 characters of random data across URL query parameters and path segments. This is appropriate when scanning URL-level traffic only.
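
To get a feel for the numbers, here is a minimal sketch of a sliding-window entropy budget. How pipelock estimates entropy internally is not specified on this page; the estimator below (per-character Shannon entropy times length) and the EntropyBudget class are assumptions for illustration.

```python
import math
from collections import Counter


def entropy_bits(payload: str) -> float:
    """Per-character Shannon entropy multiplied by length (total bits)."""
    if not payload:
        return 0.0
    n = len(payload)
    counts = Counter(payload)
    per_char = -sum((c / n) * math.log2(c / n) for c in counts.values())
    return per_char * n


class EntropyBudget:
    """Hypothetical per-session tracker mirroring bits_per_window/window_minutes."""

    def __init__(self, bits_per_window: float = 4096, window_seconds: float = 300):
        self.bits_per_window = bits_per_window
        self.window_seconds = window_seconds
        self.events = []  # list of (timestamp, bits)

    def record(self, payload: str, now: float) -> bool:
        """Record a payload; return True when the window total exceeds the budget."""
        # Drop events that have slid out of the window.
        self.events = [(t, b) for t, b in self.events if now - t < self.window_seconds]
        self.events.append((now, entropy_bits(payload)))
        return sum(b for _, b in self.events) > self.bits_per_window


print(entropy_bits("abcd"))  # 8.0 (4 distinct characters, 2 bits each)
```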

With TLS interception enabled, request bodies are also scanned for entropy. A single LLM API call body (conversation context) can contain 100,000+ bits of entropy. Set bits_per_window to 500000 or higher when using tls_interception with cross-request detection, and add your LLM provider to exempt_domains:

cross_request_detection:
  enabled: true
  entropy_budget:
    enabled: true
    bits_per_window: 500000
    exempt_domains:
      - "*.anthropic.com"
      - "*.openai.com"
      - "*.minimax.io"

Fragment Reassembly#

Buffers outbound payloads (URLs, request bodies, MCP JSON-RPC payloads, WebSocket frames) per session and re-scans the concatenated content against DLP patterns on every request (synchronous, pre-forward). Catches secrets split across multiple requests that individually look clean.

| Field | Default | Description |
|---|---|---|
| fragment_reassembly.enabled | false | Enable fragment reassembly |
| fragment_reassembly.max_buffer_bytes | 65536 | Max buffer size per session (64 KB). Older fragments are evicted when exceeded. |
| fragment_reassembly.window_minutes | 5 | Fragment retention window in minutes. Fragments older than this are pruned. |
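
The idea behind fragment reassembly can be sketched with a small buffer that re-scans the concatenation on every request. The FragmentBuffer class and the single regex (shaped like an AWS access key ID) are hypothetical stand-ins for the real DLP pattern set.

```python
import re
from collections import deque

# Hypothetical DLP pattern for illustration only.
DLP_PATTERN = re.compile(r"AKIA[0-9A-Z]{16}")


class FragmentBuffer:
    def __init__(self, max_buffer_bytes: int = 65536, window_seconds: float = 300):
        self.max_buffer_bytes = max_buffer_bytes
        self.window_seconds = window_seconds
        self.fragments = deque()  # (timestamp, text), oldest first
        self.size = 0

    def add_and_scan(self, fragment: str, now: float) -> bool:
        """Buffer a fragment, prune, then scan the joined content. True = hit."""
        self.fragments.append((now, fragment))
        self.size += len(fragment.encode())
        # Evict oldest fragments while they are expired or the buffer is over its cap.
        while self.fragments and (
            now - self.fragments[0][0] > self.window_seconds
            or self.size > self.max_buffer_bytes
        ):
            _, old = self.fragments.popleft()
            self.size -= len(old.encode())
        joined = "".join(text for _, text in self.fragments)
        return DLP_PATTERN.search(joined) is not None


buf = FragmentBuffer()
print(buf.add_and_scan("key=AKIAIOSFODNN", now=0))  # False
print(buf.add_and_scan("7EXAMPLE", now=1))          # True
```

Neither fragment matches on its own; only the concatenation does, which is the case this mechanism is meant to catch.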

Memory: Each tracked session uses up to max_buffer_bytes. With 10,000 concurrent sessions (hard cap), the worst-case memory is max_buffer_bytes * 10000 (about 655 MB, or 625 MiB, at defaults). Reduce max_buffer_bytes in memory-constrained environments.
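
The worst case is a plain multiplication, and the headline number depends on whether MB is read as 10^6 or 2^20 bytes. A quick check with the defaults (the helper name is hypothetical):

```python
MAX_BUFFER_BYTES = 65536  # default max_buffer_bytes
MAX_SESSIONS = 10_000     # hard cap on concurrent sessions


def worst_case_bytes(max_buffer_bytes: int = MAX_BUFFER_BYTES,
                     sessions: int = MAX_SESSIONS) -> int:
    """Upper bound on fragment buffer memory across all sessions."""
    return max_buffer_bytes * sessions


total = worst_case_bytes()
print(total)          # 655360000
print(total / 2**20)  # 625.0 (MiB)
```

Halving max_buffer_bytes to 32768 halves the bound.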

Scope note: Cross-request detection scans all outbound content visible to the proxy: URLs, request bodies, MCP JSON-RPC payloads, and WebSocket frames. CONNECT tunnels without TLS interception only expose the target hostname (entropy tracking only). Enable tls_interception for full cross-request coverage on tunneled traffic.

Finding Suppression#

Suppress known false positives by rule name and path/URL pattern.

suppress:
  - rule: "Jailbreak Attempt"
    path: "*/robots.txt"
    reason: "robots.txt content triggers developer mode regex"

| Field | Description |
|---|---|
| rule | Pattern/rule name to suppress (required) |
| path | Exact path, glob, or URL suffix (required) |
| reason | Human-readable justification |

Path matching: exact (foo.txt), glob (*.txt, vendor/**), directory prefix (vendor/), basename glob (*.txt matches dir/foo.txt).

See Finding Suppression Guide for the full reference.

Git Protection#

Git-aware scanning for pre-push secret detection and branch restrictions.

git_protection:
  enabled: false
  allowed_branches: ["feature/*", "fix/*", "main"]
  pre_push_scan: true

| Field | Default | Description |
|---|---|---|
| enabled | false | Enable git protection |
| allowed_branches | ["feature/*", "fix/*", "main", "master"] | Branch name patterns |
| pre_push_scan | true | Scan diffs before push |

Logging#

Structured audit logging to stdout and/or file.

logging:
  format: json
  output: stdout
  file: ""
  include_allowed: true
  include_blocked: true

| Field | Default | Description |
|---|---|---|
| format | "json" | json or text |
| output | "stdout" | stdout, file, or both |
| file | "" | Log file path |
| include_allowed | true | Log allowed requests |
| include_blocked | true | Log blocked requests |

Internal Networks (SSRF Protection)#

Private/reserved IP ranges blocked from agent access. Post-DNS check prevents SSRF via DNS rebinding.

internal:
  - "0.0.0.0/8"
  - "127.0.0.0/8"
  - "10.0.0.0/8"
  - "100.64.0.0/10"
  - "172.16.0.0/12"
  - "192.168.0.0/16"
  - "169.254.0.0/16"
  - "::1/128"
  - "fc00::/7"
  - "fe80::/10"
  - "224.0.0.0/4"
  - "ff00::/8"

All RFC 1918, RFC 4193, link-local, loopback, CGN (Tailscale/Carrier-Grade NAT), multicast, and cloud metadata ranges are blocked by default. IPv6 zone IDs (e.g. ::1%eth0) are stripped before IP parsing to prevent bypass.
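
A minimal sketch of the post-DNS check using Python's ipaddress module. The range list is copied from the defaults above; the function name is hypothetical, and the zone ID is stripped by hand before parsing, as the text describes.

```python
import ipaddress

# Default internal ranges, copied from the list above.
INTERNAL_CIDRS = [
    "0.0.0.0/8", "127.0.0.0/8", "10.0.0.0/8", "100.64.0.0/10",
    "172.16.0.0/12", "192.168.0.0/16", "169.254.0.0/16",
    "::1/128", "fc00::/7", "fe80::/10", "224.0.0.0/4", "ff00::/8",
]
INTERNAL_NETS = [ipaddress.ip_network(c) for c in INTERNAL_CIDRS]


def is_internal(resolved_ip: str) -> bool:
    """Check a resolved address against the internal ranges."""
    # Strip an IPv6 zone ID ("::1%eth0" -> "::1") before parsing.
    bare = resolved_ip.split("%", 1)[0]
    addr = ipaddress.ip_address(bare)
    # Membership is always False when address and network versions differ.
    return any(addr in net for net in INTERNAL_NETS)


print(is_internal("169.254.169.254"))  # True (link-local, cloud metadata)
print(is_internal("::1%eth0"))         # True
print(is_internal("8.8.8.8"))          # False
```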

Trusted Domains#

Domains exempt from SSRF internal-IP checks. Use this when a domain legitimately resolves to a private IP (e.g., an internal API behind a VPN) and you want pipelock to allow the connection.

trusted_domains:
  - "internal-api.example.com"
  - "*.corp.example.com"

| Field | Default | Description |
|---|---|---|
| trusted_domains | [] | Top-level list. Supports *.example.com wildcards (also matches apex example.com). |

Important: This is a top-level config field, not nested under forward_proxy. Placing it under forward_proxy will silently do nothing. DLP and other content scanning still runs on trusted domains -- only the SSRF IP check is bypassed.

Strict mode: trusted_domains does not override api_allowlist. In strict mode, a domain must be in both api_allowlist (to be reachable) and trusted_domains (to resolve to internal IPs). If a domain is only in api_allowlist and resolves internally, pipelock blocks it with a hint to add it to trusted_domains.

Per-agent trusted_domains overrides are available in agent profiles (Pro license).

SSRF IP Allowlist#

Exempt specific IP ranges from SSRF blocking. Use this when your internal services resolve to known IP ranges and you want to allow connections by IP rather than by hostname.

ssrf:
  ip_allowlist:
    - "192.168.1.0/24"
    - "10.0.0.5/32"

| Field | Default | Description |
|---|---|---|
| ssrf.ip_allowlist | [] | CIDR ranges exempt from SSRF blocking. IPs in these ranges are still "internal" but explicitly trusted. |

Complementary to trusted_domains: trusted_domains is hostname-based trust (the domain resolves to a private IP, but you trust the domain). ssrf.ip_allowlist is IP-based trust (you trust the IP range regardless of which domain resolves to it). Either one exempts from SSRF blocking.

Validation: Entries must be canonical CIDRs (network address, not host address). 10.0.0.5/24 is rejected because the host bits are set (use 10.0.0.0/24 instead). Catch-all prefixes (0.0.0.0/0, ::/0) are rejected because they would disable SSRF protection entirely.
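
The two validation rules can be reproduced with Python's ipaddress module, which rejects set host bits when strict=True. This is a sketch of equivalent checks, not pipelock's validator; the function name is hypothetical.

```python
import ipaddress


def validate_allowlist_entry(cidr: str):
    """Return the parsed network, or raise ValueError if the entry is not acceptable."""
    # strict=True raises ValueError when host bits are set (e.g. 10.0.0.5/24).
    net = ipaddress.ip_network(cidr, strict=True)
    # A zero-length prefix matches every address of that IP version.
    if net.prefixlen == 0:
        raise ValueError(f"{cidr}: catch-all prefix would disable SSRF protection")
    return net


for entry in ["192.168.1.0/24", "10.0.0.5/32", "10.0.0.5/24", "0.0.0.0/0"]:
    try:
        validate_allowlist_entry(entry)
        print(entry, "ok")
    except ValueError:
        print(entry, "rejected")
```

The first two entries are accepted and the last two are rejected.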

Presets#

Seven starter configs in configs/:

| Preset | Mode | Response Action | MCP Policy | Best For |
|---|---|---|---|---|
| balanced.yaml | balanced | warn | warn | General purpose |
| strict.yaml | strict | block | block | High-security |
| audit.yaml | audit | warn | warn | Log-only monitoring |
| claude-code.yaml | balanced | block | warn | Claude Code (unattended) |
| cursor.yaml | balanced | block | warn | Cursor IDE |
| generic-agent.yaml | balanced | warn | warn | New agents (tuning) |
| hostile-model.yaml | strict | block | block | Uncensored/abliterated models |

Key differences between presets:

| Setting | Balanced | Strict | Claude Code |
|---|---|---|---|
| Max URL Length | 2048 | 500 | 4096 |
| Entropy Threshold | 4.5 | 3.5 | 5.0 |
| Rate Limit | 60/min | 30/min | 120/min |
| API Allowlist | LLM + comms | LLM + comms | LLM + dev tools |

Hostile-Model Preset#

The hostile-model preset is for agents running uncensored, abliterated, or jailbroken models where the model itself has zero safety guardrails. It assumes the model will comply with any instruction, including exfiltrating secrets or executing injected prompts.

Use this preset for:

  • Red-team testing: exercising agent attack paths against the network layer
  • Self-hosted uncensored models: weight-ablated models (e.g. OBLITERATUS variants) with safety refusals removed
  • Jailbroken agents: any model that can be trivially steered past its own guardrails

What it enables beyond strict:

  • Every defense layer active: forward proxy, request body scanning, WebSocket scanning, MCP input/tool/policy scanning, session binding, session profiling, adaptive enforcement, tool chain detection
  • Aggressive entropy threshold (3.0): catches more encoded secrets at the cost of higher false-positive rates
  • Lower rate limit (15/min): constrains exfiltration bandwidth
  • Shorter URL limit (300 chars): reduces data budget per request
  • All MCP tool policy rules enabled: blocks shell obfuscation, file writes outside allowed paths, and network access patterns
  • TLS interception pre-configured (disabled by default; enable and generate a CA to activate)

The core principle: the model won't protect you, so the network layer must.

Agent Profiles#

Per-agent policy overrides. When multiple agents share one pipelock instance, each agent can have its own mode, allowlist, DLP patterns, rate limits, and request budgets. Override semantics differ by field:

  • Scalar fields (mode, enforce) inherit from the base config when unset.
  • mcp_tool_policy replaces the base section entirely when set on an agent profile (no deep merge).
  • session_profiling replaces the per-agent fields (domain_burst, anomaly_action, volume_spike_ratio) unconditionally while preserving global-only fields (max_sessions, session_ttl_minutes, cleanup_interval_seconds).
  • rate_limit overrides individual rate limit fields (non-zero values win).
  • DLP merging follows separate rules (see below).

agents:
  claude-code:
    listeners: [":8889"]
    source_cidrs: ["10.42.3.0/24"]
    mode: strict
    api_allowlist: ["github.com", "*.githubusercontent.com"]
    dlp:
      include_defaults: true
      patterns:
        - name: "Internal Token"
          regex: 'internal_[a-zA-Z0-9]{32}'
          severity: critical
    rate_limit:
      max_requests_per_minute: 30
    session_profiling:
      domain_burst: 3
      anomaly_action: block
    mcp_tool_policy:
      enabled: true
      action: block
      rules:
        - name: "Block shell"
          tool_pattern: "bash|shell"
          action: block
    budget:
      max_requests_per_session: 500
      max_bytes_per_session: 52428800
      max_unique_domains_per_session: 50
      window_minutes: 60

  rook:
    listeners: [":8890"]
    mode: balanced
    enforce: false
    budget:
      max_unique_domains_per_session: 200

  _default:
    mode: balanced

Agent Resolution#

Pipelock resolves the agent name for each request using this priority order:

  1. Listener binding: matched by the port the request arrived on (injected as a context override, spoof-proof)
  2. Source CIDRs: matched by client IP against source_cidrs ranges defined on each agent profile
  3. Header (X-Pipelock-Agent): set by the calling agent or orchestrator
  4. Query parameter (?agent=name): appended to fetch/WebSocket URLs
  5. Fallback: _default profile if defined, otherwise base config

Listener-based resolution is the only method that cannot be spoofed by the agent. It injects a context override that takes priority over header and query param. Header and query param methods are convenient but trust the caller. Use listeners when isolation matters.
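
The priority order can be sketched as a single function that returns the first match. All names here are hypothetical, and the sketch does not model what happens when a header or query parameter names a profile that does not exist.

```python
import ipaddress
from typing import Dict, List, Optional, Set


def resolve_agent(listener_agent: Optional[str],
                  client_ip: str,
                  source_cidrs: Dict[str, List[str]],
                  header: Optional[str],
                  query: Optional[str],
                  profiles: Set[str]) -> Optional[str]:
    """Return the agent name, or None to mean 'use the base config'."""
    # 1. Listener binding: decided by the port the request arrived on.
    if listener_agent:
        return listener_agent
    # 2. Source CIDRs: match the client IP against each agent's ranges.
    addr = ipaddress.ip_address(client_ip)
    for name, cidrs in source_cidrs.items():
        if any(addr in ipaddress.ip_network(c) for c in cidrs):
            return name
    # 3. X-Pipelock-Agent header, then 4. ?agent= query parameter.
    if header:
        return header
    if query:
        return query
    # 5. Fallback: _default profile if defined, otherwise base config.
    return "_default" if "_default" in profiles else None


cidrs = {"claude-code": ["10.42.3.0/24"]}
profiles = {"claude-code", "rook", "_default"}
print(resolve_agent(None, "10.42.3.7", cidrs, "rook", None, profiles))  # claude-code
```

In the example the header asks for rook, but the source CIDR match ranks higher and wins.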

For MCP proxy mode, the --agent flag resolves the profile directly at startup (not through the HTTP resolution chain).

Override Fields#

Each agent profile can override these fields:

| Field | Type | Description |
|---|---|---|
| listeners | []string | Dedicated listen addresses (e.g., ":8889"). Pipelock opens extra ports for these. |
| source_cidrs | []string | Client IP ranges that identify this agent (e.g., ["10.42.3.0/24"]). |
| mode | string | strict, balanced, or audit |
| enforce | bool | Override global enforce setting |
| api_allowlist | []string | Replaces the base allowlist entirely |
| dlp | object | DLP pattern overrides (see below) |
| rate_limit | object | Per-agent rate limits |
| session_profiling | object | Per-agent profiling thresholds |
| mcp_tool_policy | object | Per-agent MCP tool policy |
| trusted_domains | []string | Per-agent SSRF-exempt domains (overrides global list) |
| budget | object | Request budgets (see below) |

DLP Merge Behavior#

Agent DLP overrides follow the same include_defaults pattern as the global DLP section:

  • include_defaults: true (or omitted): agent patterns are appended to the base config patterns. If an agent pattern shares a name with a base pattern, the agent version wins.
  • include_defaults: false: agent patterns replace the base patterns entirely.
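
The two merge modes can be sketched like this. Patterns are modeled as plain dicts keyed by name, and the resulting order (a colliding agent pattern takes the base pattern's position) is an assumption of the sketch, not documented behavior.

```python
def merge_dlp(base_patterns, agent_patterns, include_defaults=True):
    """Combine base and agent DLP patterns according to include_defaults."""
    if not include_defaults:
        # Agent patterns replace the base patterns entirely.
        return list(agent_patterns)
    merged = {p["name"]: p for p in base_patterns}
    for p in agent_patterns:
        merged[p["name"]] = p  # agent version wins on a name collision
    return list(merged.values())


base = [{"name": "AWS Key", "severity": "high"},
        {"name": "GitHub Token", "severity": "high"}]
agent = [{"name": "AWS Key", "severity": "critical"},
         {"name": "Internal Token", "severity": "critical"}]

print([p["name"] for p in merge_dlp(base, agent)])
# ['AWS Key', 'GitHub Token', 'Internal Token']
print([p["name"] for p in merge_dlp(base, agent, include_defaults=False)])
# ['AWS Key', 'Internal Token']
```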

Budget Config#

Budgets cap what an agent can do within a rolling time window. All numeric fields default to 0 (unlimited or disabled); dow_action defaults to "block".

| Field | Type | Default | Description |
|---|---|---|---|
| max_requests_per_session | int | 0 | Max HTTP requests per window |
| max_bytes_per_session | int | 0 | Max response bytes per window |
| max_unique_domains_per_session | int | 0 | Max distinct domains per window |
| window_minutes | int | 0 | Rolling window duration in minutes. 0 means the budget never resets. |
| max_tool_calls_per_session | int | 0 | Max MCP tool calls per session (0 = unlimited). Enforced. |
| max_retries_per_tool | int | 0 | Max times the same tool+args can be called (0 = unlimited, default 5 when set). Detects retry storms. Enforced. |
| loop_detection_window | int | 0 | Number of recent tool calls to track for loop/cycle detection (0 = disabled, default 20 when set). Enforced. |
| max_wall_clock_minutes | int | 0 | Max session duration in minutes (0 = unlimited). Enforced. |
| dow_action | string | "block" | Action when a denial-of-wallet limit is exceeded: "block" (reject the tool call) or "warn" (log and allow) |
| max_concurrent_tool_calls | int | 0 | Max parallel in-flight tool calls (0 = unlimited). Enforced. |
| max_retries_per_endpoint | int | 0 | Max calls to the same domain+path (0 = unlimited, default 20 when set). Enforced. |
| fan_out_limit | int | 0 | Max unique endpoints within the fan-out window (0 = unlimited). Enforced. |
| fan_out_window_seconds | int | 0 | Sliding window for fan-out detection (0 = disabled). Enforced. |

When a budget limit is reached:

  • Request count and domain limits are checked before the outbound request. Exceeding either returns 429 Too Many Requests.
  • Byte limit (fetch proxy): the response body read is capped at the remaining byte budget. If the response exceeds the limit, it is discarded and a 429 is returned.
  • Byte limit (CONNECT/WebSocket): streaming connections track bytes after close. The byte budget is enforced on the next admission check, not mid-stream, because tunnel data cannot be recalled after transmission.
  • DoW limits (MCP proxy): tool call budgets are checked before each tools/call dispatch. When dow_action is "block", the call is rejected with a JSON-RPC error. When "warn", the call is logged and allowed through. Currently enforced: total tool call count, per-tool retry storms, loop/cycle detection, and wall-clock duration.

Listener Binding#

Each agent can bind to one or more dedicated ports via the listeners field. Pipelock opens these ports at startup alongside the main proxy port. Requests arriving on an agent's listener are automatically resolved to that agent without relying on headers or query params.

This is the only spoof-proof resolution method. The agent process connects to its assigned port, and pipelock knows which profile to apply based on the port alone.

agents:
  trusted-agent:
    listeners: [":8889"]
    mode: balanced
  untrusted-agent:
    listeners: [":8890"]
    mode: strict
    budget:
      max_requests_per_session: 100

Note: Listener bindings are set at startup. Changing listeners requires a process restart (not hot-reloadable).

Source CIDR Matching#

Each agent can define one or more source_cidrs entries. Pipelock matches the client IP of every incoming request against these CIDRs. This works for all traffic types including CONNECT tunnels, where header-based identification is not possible.

In Kubernetes, each pod has a unique IP. In Docker Compose, each container has its own. Source CIDR matching maps those IPs to agent profiles with zero agent-side configuration.

agents:
  claude-code:
    source_cidrs: ["10.42.3.0/24"]
    mode: strict
  cursor:
    source_cidrs: ["10.42.5.0/24", "10.42.6.0/24"]
    mode: balanced

Resolution priority: listener binding > source CIDR > header > query param > _default.

CIDRs must not overlap between different agents (containment and exact matches are both rejected). Overlapping CIDRs within the same agent are allowed.

The _default Profile#

If defined, _default applies to any request that does not match a named agent. Without _default, unmatched requests use the base config directly.

License Key#

Multi-agent profiles (the agents: section) require a signed license token. The token is an Ed25519-signed JWT-like string issued by pipelock license issue. At startup, pipelock verifies the signature, checks expiration, and confirms the token includes the agents feature. If any check fails, agent profiles are disabled with a warning. All single-agent protection remains active.

Loading Sources#

Pipelock checks three sources for the license token, in priority order:

| Priority | Source | Use case |
|---|---|---|
| 1 (highest) | PIPELOCK_LICENSE_KEY env var | Containers, CI, Kubernetes Secrets |
| 2 | license_file config field (file path) | Secret volume mounts, file-based workflows |
| 3 (lowest) | license_key config field (inline) | Simple single-machine setups |

The first non-empty source wins. Later sources are not checked. PIPELOCK_LICENSE_KEY values containing only whitespace are treated as empty and fall through to lower-priority sources. If license_file is configured but the file is empty or contains only whitespace, pipelock fails with an error rather than falling back to inline license_key. This is fail-closed by design: a misconfigured Secret mount should not silently downgrade to an inline fallback.
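
The lookup order, including the fail-closed case for an empty license file, can be sketched as below. The function and parameter names are hypothetical; file contents are passed in as a string so the sketch stays self-contained.

```python
from typing import Optional


def resolve_license(env_value: Optional[str],
                    file_configured: bool,
                    file_contents: Optional[str],
                    inline_key: Optional[str]) -> Optional[str]:
    """Return the token from the highest-priority non-empty source."""
    # 1. Env var; whitespace-only values count as empty and fall through.
    if env_value and env_value.strip():
        return env_value
    # 2. license_file; an empty file is an error, not a fall-through.
    if file_configured:
        token = (file_contents or "").strip()
        if not token:
            raise ValueError("license_file is configured but empty")
        return token
    # 3. Inline license_key.
    if inline_key:
        return inline_key
    return None


print(resolve_license("   ", True, "tok-from-file\n", "tok-inline"))  # tok-from-file
```

With an empty file configured, the function raises instead of returning the inline key, which mirrors the fail-closed design described above.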

Env var (recommended for containers):

export PIPELOCK_LICENSE_KEY="pipelock_lic_v1_eyJ..."
pipelock run --config pipelock.yaml

File path:

license_file: /etc/pipelock/license.token # absolute path
license_file: license.token # relative to config file directory

The file should contain only the license token string. Leading and trailing whitespace is trimmed. The file must have owner-only permissions (0600); group- or world-readable files are rejected. The file is read at startup. Adding or changing a license requires a restart to take effect; a config-triggered reload will detect the change but will not apply it until restart. Removing the currently active license source takes effect immediately on reload (for example, unsetting PIPELOCK_LICENSE_KEY or removing the active license_file/license_key entry).

Inline (simplest):

license_key: "pipelock_lic_v1_eyJ..."

Full example with all license fields:

license_key: "pipelock_lic_v1_eyJ..." # inline token (lowest priority)
license_file: "/etc/pipelock/license.token" # file path (medium priority)
license_public_key: "a1b2c3d4..." # hex-encoded Ed25519 public key (dev builds only)

Kubernetes Secret Example#

Mount a license key from a Kubernetes Secret as an env var:

env:
  - name: PIPELOCK_LICENSE_KEY
    valueFrom:
      secretKeyRef:
        name: pipelock-license
        key: token

Or mount the Secret as a file and reference it in config:

license_file: /etc/pipelock/license/token

Key Verification#

Official release builds embed the signing public key at compile time via ldflags. The embedded key takes priority over license_public_key and cannot be overridden by config, preventing self-signing bypasses. The license_public_key config field is only used in development builds where no key is embedded.

CLI Commands#

pipelock license keygen # generates ~/.config/pipelock/license.key + license.pub
pipelock license issue --email customer@company.com --expires 2027-03-07
pipelock license inspect TOKEN # decode without verifying

A _default profile without any named agents does not require a license key.

Installing a License#

Use pipelock license install to write a license token to a file:

pipelock license install <TOKEN> # writes to ~/.config/pipelock/license.token
pipelock license install --path /etc/pipelock/license.token <TOKEN> # custom path

The command validates the token format, writes it atomically (temp file + rename), and prints setup instructions. Point your config at the file:

license_file: /etc/pipelock/license.token

Then restart pipelock to activate Pro features.

Renewal#

License tokens have a fixed expiry (typically 45 days). When your subscription renews, you receive a new token by email. To update:

  1. Run pipelock license install <NEW_TOKEN> (overwrites the existing file)
  2. Restart pipelock

The new token activates on restart. Your current token continues working until its expiry date, so there is no rush to update immediately. A config reload detects the changed license inputs but does not apply them until restart (activation requires restart; revocation is immediate).

Scan API#

Evaluation-plane HTTP listener for programmatic scanning. Disabled by default. When enabled, serves POST /api/v1/scan on a dedicated port with independent auth, rate limiting, and timeouts.

scan_api:
  listen: "127.0.0.1:9090"
  auth:
    bearer_tokens:
      - "your-secret-token"
  rate_limit:
    requests_per_minute: 600 # per token
    burst: 50
  max_body_bytes: 1048576 # 1MB
  field_limits:
    url: 8192
    text: 524288 # 512KB
    content: 524288
    arguments: 524288
  timeouts:
    read: "2s"
    write: "2s"
    scan: "5s"
  connection_limit: 100
  kinds:
    url: true
    dlp: true
    prompt_injection: true
    tool_call: true

| Field | Default | Description |
|---|---|---|
| listen | "" (disabled) | Bind address. Listener only starts when set and at least one bearer token is configured. |
| auth.bearer_tokens | [] | Bearer tokens for Authorization header. Compared in constant time. Required when listen is set. |
| rate_limit.requests_per_minute | 600 | Per-token rate limit. |
| rate_limit.burst | 50 | Burst allowance above steady-state rate. |
| max_body_bytes | 1048576 (1MB) | Maximum request body size. |
| field_limits.url | 8192 | Max bytes for input.url field. |
| field_limits.text | 524288 (512KB) | Max bytes for input.text field. |
| field_limits.content | 524288 (512KB) | Max bytes for input.content field. |
| field_limits.arguments | 524288 (512KB) | Max bytes for input.arguments field. |
| timeouts.read | "2s" | HTTP read timeout. |
| timeouts.write | "2s" | HTTP write timeout. |
| timeouts.scan | "5s" | Per-scan deadline. Exceeded = scan_deadline_exceeded error, never partial allow. |
| connection_limit | 100 | Max concurrent connections. |
| kinds.url | true | Enable url scan kind. |
| kinds.dlp | true | Enable dlp scan kind. |
| kinds.prompt_injection | true | Enable prompt_injection scan kind. |
| kinds.tool_call | true | Enable tool_call scan kind. |

All kinds are enabled by default. Set any to false to disable. Full API reference: docs/scan-api.md.

Address Protection#

Detects blockchain address poisoning attacks. Compares outbound addresses against a user-supplied allowlist of known-good destinations and flags similar-looking addresses using prefix/suffix fingerprinting. This is destination verification, not secret detection, and is separate from DLP.

Disabled by default. Users opt in explicitly.

address_protection:
  enabled: true
  action: block
  unknown_action: warn
  allowed_addresses:
    - "0x742d35Cc6634C0532925a3b844Bc9e7595f2bD18"
    - "bc1qxy2kgdygjrsqtzq2n0yrf2493p83kkfjhx0wlh"
  chains:
    eth: true
    btc: true
    sol: false
    bnb: true
  similarity:
    prefix_length: 4
    suffix_length: 4

| Field | Default | Description |
|---|---|---|
| enabled | false | Enable address protection. |
| action | "block" | Action for poisoning/lookalike findings: block or warn. |
| unknown_action | "allow" | Action for valid addresses not in allowlist: allow, warn, or block. |
| allowed_addresses | [] | Known-good destination addresses (any supported chain format). |
| chains.eth | true | Detect Ethereum addresses (0x-prefixed, EIP-55 checksum validated). |
| chains.btc | true | Detect Bitcoin addresses (P2PKH, P2SH, Bech32/Bech32m). |
| chains.sol | false | Detect Solana addresses (base58, 32-44 chars). Disabled by default due to higher false positive risk from base58 regex. |
| chains.bnb | true | Detect BNB Smart Chain addresses (0x-prefixed, same format as ETH). |
| similarity.prefix_length | 4 | Characters to compare at the start of the address payload. |
| similarity.suffix_length | 4 | Characters to compare at the end of the address payload. |

At least one chain must be enabled when address_protection.enabled is true. All-chains-disabled with the feature enabled is rejected at validation (silent no-op prevention).
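
Prefix/suffix fingerprinting can be sketched as follows. What counts as the "address payload" is an assumption here: for 0x-prefixed addresses the sketch uses the hex after "0x", compared case-insensitively, and any other address is used as-is. Function names are hypothetical and no checksum validation is performed.

```python
def payload(address: str) -> str:
    """Assumed payload: hex digits after '0x' (lowercased), else the address itself."""
    if address[:2].lower() == "0x":
        return address[2:].lower()
    return address


def classify(address, allowed_addresses, prefix_length=4, suffix_length=4):
    """Return 'allowed', 'lookalike', or 'unknown' for an outbound address."""
    target = payload(address)
    known = [payload(a) for a in allowed_addresses]
    if target in known:
        return "allowed"
    for good in known:
        # Same first and last characters as a known-good address, different middle.
        if (target[:prefix_length] == good[:prefix_length]
                and target[-suffix_length:] == good[-suffix_length:]):
            return "lookalike"
    return "unknown"


allowed = ["0x742d35Cc6634C0532925a3b844Bc9e7595f2bD18"]
spoof = "0x742d" + "0" * 32 + "bD18"  # same 4-char prefix and suffix
print(classify(spoof, allowed))             # lookalike
print(classify("0x" + "1" * 40, allowed))   # unknown
```

In config terms, a lookalike result corresponds to action and an unknown result to unknown_action.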

Hot reload: disabling address protection triggers a reload warning. Re-enabling takes effect immediately.

File Sentry#

Real-time filesystem monitoring for agent subprocesses. Detects secrets written to disk that bypass the MCP tool call path. Applies to subprocess MCP mode only (pipelock mcp proxy -- COMMAND).

file_sentry:
  enabled: false
  watch_paths:
    - "."
  scan_content: true
  ignore_patterns:
    - "node_modules/**"
    - ".git/**"
    - "*.o"
    - "*.so"

| Field | Default | Description |
|---|---|---|
| enabled | false | Enable filesystem monitoring. Opt-in. |
| watch_paths | [] | Directories to monitor recursively. Relative paths are resolved against the config file directory (not CWD). Required when enabled. |
| scan_content | true | Run DLP scanner on modified file content. |
| ignore_patterns | [] | Glob patterns for files and directories to skip. |

File sentry is alert-only in the current release. Findings are reported as stderr warnings and Prometheus metrics (pipelock_file_sentry_findings_total). Structured audit log emission (file_sentry_dlp event type) is defined but not yet wired to the webhook/syslog pipeline. On Linux, process lineage tracking attributes file writes to the agent's process tree via PR_SET_CHILD_SUBREAPER and /proc walking.

Files larger than 10MB are skipped. Write events are debounced (50ms quiet window) to avoid scanning partial writes.

Community Rules#

Optional signed rule bundles that extend built-in detection patterns. See docs/rules.md for the full user guide.

rules:
  rules_dir: ~/.local/share/pipelock/rules # default ($XDG_DATA_HOME/pipelock/rules)
  min_confidence: medium # skip low-confidence (experimental) rules
  include_experimental: false # only load stable rules by default
  trusted_keys: # additional signing keys (beyond embedded keyring)
    - name: "acme-security"
      public_key: "64-char-hex-encoded-ed25519-public-key"

| Field | Default | Description |
|---|---|---|
| rules_dir | ~/.local/share/pipelock/rules | Directory for installed bundles ($XDG_DATA_HOME/pipelock/rules) |
| min_confidence | "" (all) | Skip rules below this confidence level |
| include_experimental | false | Include experimental rules from bundles |
| trusted_keys | [] | Additional Ed25519 public keys to trust for signature verification |

Hot reload: changes to the rules directory are not detected. Restart pipelock after installing or updating bundles.

Sandbox#

Process containment for agent commands using Linux kernel primitives. The agent runs in a restricted environment with controlled filesystem access, no direct network, and a filtered syscall set.

sandbox:
  enabled: true
  best_effort: false # degrade gracefully when namespace isolation unavailable
  strict: false # error if any layer unavailable (mutually exclusive with best_effort)
  workspace: /home/user/project # agent working directory (default: CWD)
  filesystem: # optional Landlock overrides (default policy works for most agents)
    allow_read:
      - /usr/share/data
      - /app/ # application code in containers
    allow_write:
      - /tmp/agent-work
| Field | Default | Description |
| --- | --- | --- |
| enabled | false | Enable sandbox containment |
| best_effort | false | Skip namespace isolation when unavailable (e.g. containers). Landlock + seccomp still apply. |
| strict | false | Error if any containment layer is unavailable. Mutually exclusive with best_effort. |
| workspace | CWD | Agent working directory (resolved to absolute at startup) |
| filesystem.allow_read | [] | Additional read-only filesystem paths |
| filesystem.allow_write | [] | Additional writable paths (workspace is always writable) |

If filesystem is omitted, the default Landlock policy is used, which works for Python, Node, and Go agents without extra configuration. Read access also grants execute permission (Landlock bundling), and write paths are executable as well.

Containment layers:

  • Landlock LSM: Restricts filesystem access to declared paths using an allowlist model. Protected directories (~/.ssh, ~/.aws, ~/.kube, etc.) are denied. Only directories that exist on the system are checked.
  • Network namespaces: The agent runs in an isolated network namespace. All traffic is kernel-forced through pipelock's bridge proxy, so raw sockets cannot bypass it. For MCP (stdio), no network is needed.
  • Seccomp BPF: Syscall allowlist (~130 safe syscalls for Go/Python/Node.js). ptrace, mount, module loading, and kexec are blocked with the KILL action. io_uring returns EPERM, which lets runtimes like Node.js 22 fall back to epoll. Clone flags are filtered to prevent namespace escape.

Usage:

# Sandbox an MCP server
pipelock mcp proxy --sandbox --config pipelock.yaml -- npx server

# Sandbox a standalone command
pipelock sandbox --config pipelock.yaml -- python agent.py

# Pass environment variables to sandboxed process
pipelock sandbox --env API_KEY --env HOME=/app -- node server.js

# Best-effort mode for containers (Landlock + seccomp, no namespace)
pipelock sandbox --best-effort -- python agent.py

# Check sandbox capabilities without launching
pipelock sandbox --dry-run --json -- python agent.py

Environments:

| Environment | Layers | Notes |
| --- | --- | --- |
| Bare metal / VM (Linux) | 3/3 | Full containment: Landlock + seccomp + network namespace |
| Containers (--best-effort) | 2/3 | Landlock + seccomp. Network via HTTP_PROXY + NetworkPolicy. |
| macOS | sandbox-exec | Apple SBPL profiles for filesystem + network restriction |

Requirements: Linux 5.13+ (Landlock ABI v1). Runs unprivileged on bare metal. macOS 13+ for sandbox-exec. Containers may need --best-effort if the default seccomp profile blocks CLONE_NEWUSER.

Config Audit Scoring (v2.0)#

Score a pipelock configuration for security posture. Evaluates 12 categories and produces a 0-100 score with letter grade and actionable recommendations.

pipelock audit score --config pipelock.yaml
pipelock audit score --config pipelock.yaml --json

Categories scored: DLP (pattern count, env scanning, entropy), response scanning (enabled, action, pattern count), MCP tool scanning, MCP tool policy (rule count, blocking rules, overpermission), MCP input scanning, MCP session binding, kill switch (source count), enforcement mode, domain blocklist, adaptive enforcement, tool chain detection, sandbox.

Tool policy overpermission audit: flags wildcard arg_pattern values, high-risk tool patterns with non-blocking actions, and policies with no effective blocking rules. Respects section-level default action inheritance.

Redirect Action (v2.0)#

A policy action that rewrites dangerous tool execution to a safer target instead of blocking outright.

mcp_tool_policy:
  enabled: true
  action: warn
  redirect_profiles:
    fetch_proxy:
      exec: ["/proc/self/exe", "internal-redirect", "fetch-proxy"]
      preserve_argv: true
      reason: "Route outbound fetches through audited proxy"
  rules:
    - name: shell-egress
      tool_pattern: '(?i)^(bash|shell|exec)$'
      arg_pattern: '(?i)\b(curl|wget)\b'
      action: redirect
      redirect_profile: fetch_proxy
| Field | Description |
| --- | --- |
| redirect_profiles | Named redirect targets with exec command and reason |
| redirect_profile | Per-rule reference to a named profile |
| action: redirect | New action alongside block, warn, ask, strip, forward |

Redirect failure falls through to block (fail-closed). Every redirect emits a structured audit event with the original command, redirect target, policy rule, and reason.
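
The rule in the example config can be read as the following decision sketch. It is a hedged illustration only: `decide`, `RULES`, and `PROFILES` are hypothetical names, tool arguments are assumed to arrive as one flat string, unmatched calls are assumed to pass through, and only one kind of redirect failure (a missing profile) is modelled.

```python
import re

# Mirrors the example config above; the dict layout itself is illustrative.
PROFILES = {
    "fetch_proxy": {"exec": ["/proc/self/exe", "internal-redirect", "fetch-proxy"]},
}
RULES = [
    {
        "name": "shell-egress",
        "tool_pattern": re.compile(r"(?i)^(bash|shell|exec)$"),
        "arg_pattern": re.compile(r"(?i)\b(curl|wget)\b"),
        "action": "redirect",
        "redirect_profile": "fetch_proxy",
    },
]


def decide(tool: str, args: str, rules=RULES, profiles=PROFILES):
    """Return (action, detail) for one tool call."""
    for rule in rules:
        if rule["tool_pattern"].search(tool) and rule["arg_pattern"].search(args):
            if rule["action"] != "redirect":
                return rule["action"], rule["name"]
            profile = profiles.get(rule.get("redirect_profile"))
            if profile is None:
                # Fail-closed: a redirect that cannot be resolved becomes a block.
                return "block", rule["name"]
            return "redirect", profile["exec"]
    return "allow", None  # assumption: calls that match no rule pass through


print(decide("bash", "curl https://example.com/install.sh"))
print(decide("bash", "ls -la"))
```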

Canary Tokens (v2.1)#

Synthetic secrets injected into the agent's environment. If pipelock detects a canary in any outbound request, that is proof of compromise rather than a heuristic signal: the value is known to be fake and should never appear in traffic.

canary_tokens:
  enabled: true
  tokens:
    - name: "aws_canary"
      value: "canary-aws-trap-value-0x42a7"
      env_var: "AWS_ACCESS_KEY_ID" # optional: inject as env var
    - name: "db_canary"
      value: "postgres://canary:trap@honeypot.internal/fake"
    - name: "api_canary"
      value: "sk_test_CANARY_4eC39HqLyjWDarjtT1zdp7dc"
| Field | Default | Description |
| --- | --- | --- |
| enabled | false | Enable canary token detection |
| tokens[].name | (required) | Human-readable name for the canary |
| tokens[].value | (required) | The exact string to detect in outbound traffic |
| tokens[].env_var | (optional) | Environment variable to inject the canary into |

Canary checks run after DLP as a safety net (exact string match, O(1) per token). If a DLP pattern already matched, the canary check is skipped. Detection emits a high-severity event with full request context. Use pipelock canary generate to create sample configurations.
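
As a rough sketch of the check described above, exact-match canary detection amounts to a substring test per configured token. The function name `find_canaries` and the `dlp_matched` flag are hypothetical and only illustrate the ordering relative to DLP; pipelock's internal data structures are not shown in this document.

```python
def find_canaries(payload: str, tokens: list[dict], dlp_matched: bool = False) -> list[str]:
    """Return names of canary tokens whose exact value appears in payload."""
    if dlp_matched:
        return []  # the canary check is skipped when DLP already matched
    return [t["name"] for t in tokens if t["value"] in payload]


tokens = [
    {"name": "aws_canary", "value": "canary-aws-trap-value-0x42a7"},
    {"name": "db_canary", "value": "postgres://canary:trap@honeypot.internal/fake"},
]
body = '{"note": "key=canary-aws-trap-value-0x42a7"}'
print(find_canaries(body, tokens))
```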

Flight Recorder (v2.1)#

Hash-chained, tamper-evident evidence log. Every scanner verdict, tool call, and session event is recorded to JSONL with SHA-256 hash chains and optional Ed25519 signed checkpoints.

flight_recorder:
  enabled: true
  dir: /var/lib/pipelock/evidence
  checkpoint_interval: 1000
  retention_days: 90
  redact: true
  sign_checkpoints: true
  signing_key_path: "/path/to/signing-key"
  max_entries_per_file: 10000
  raw_escrow: false
  escrow_public_key: ""
| Field | Default | Description |
| --- | --- | --- |
| enabled | false | Enable evidence recording |
| dir | (required if enabled) | Directory for evidence files |
| checkpoint_interval | 1000 | Entries between signed checkpoints |
| retention_days | 0 | Auto-expire files after N days (0 = keep forever) |
| redact | true | DLP-redact evidence content before writing. Receipt entries get field-level redaction (target/pattern scrubbed, signature preserved). |
| sign_checkpoints | true | Ed25519 sign checkpoint entries |
| signing_key_path | (empty) | Ed25519 private key for action receipts. When set, every proxy decision produces a signed receipt. Generate a key with pipelock keygen <name>. Verify receipts with pipelock verify-receipt <file>. Hot-reloadable: add, remove, or rotate keys via SIGHUP. |
| max_entries_per_file | 10000 | Rotate to a new file after this many entries |
| raw_escrow | false | Encrypt raw (pre-redaction) detail to sidecar files |
| escrow_public_key | (required if raw_escrow) | X25519 public key (hex) for escrow encryption |

Evidence files are named evidence-<session>-<seq>.jsonl. Each entry contains a SHA-256 hash of its predecessor, forming a tamper-evident chain. Action receipts form a second chain within the evidence log (each receipt links to the previous receipt via chain_prev_hash). Breaking either chain is detectable by pipelock integrity verify.
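
To illustrate why a broken chain is detectable, here is a minimal hash-chain sketch. The entry layout (`prev_hash`, `event`), the all-zero genesis value, and the canonical JSON encoding are assumptions made for this example and are not pipelock's actual on-disk format.

```python
import hashlib
import json

GENESIS = "0" * 64  # assumption: placeholder predecessor hash for the first entry


def entry_hash(entry: dict) -> str:
    """SHA-256 over a canonical JSON encoding of the entry."""
    data = json.dumps(entry, sort_keys=True, separators=(",", ":")).encode()
    return hashlib.sha256(data).hexdigest()


def append(chain: list[dict], event: dict) -> None:
    """Append an entry that records the hash of its predecessor."""
    prev = entry_hash(chain[-1]) if chain else GENESIS
    chain.append({"prev_hash": prev, "event": event})


def verify(chain: list[dict]) -> bool:
    """Recompute every link; any edited or removed entry breaks the chain."""
    prev = GENESIS
    for entry in chain:
        if entry["prev_hash"] != prev:
            return False
        prev = entry_hash(entry)
    return True


log: list[dict] = []
for verdict in ("allow", "block", "warn"):
    append(log, {"verdict": verdict})
print(verify(log))
```

In this sketch a change to the final entry is only caught once a later entry (or some external signature over the tail) covers it, which is one reason a design like this pairs the chain with periodic signed checkpoints.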

A2A Scanning (v2.1)#

Scanning for Google A2A (Agent-to-Agent) protocol traffic. Detects A2A messages in forward proxy and MCP HTTP proxy paths. Applies field-aware content inspection with URL/text/secret classification.

a2a_scanning:
  enabled: true
  action: block
  scan_agent_cards: true
  detect_card_drift: true
  session_smuggling_detection: true
  max_context_messages: 100
  max_contexts: 1000
  scan_raw_parts: true
  max_raw_size: 1048576
| Field | Default | Description |
| --- | --- | --- |
| enabled | false | Enable A2A protocol detection and scanning |
| action | block | Action on findings: block or warn |
| scan_agent_cards | true | Scan Agent Card skill descriptions for injection |
| detect_card_drift | true | Detect Agent Card modification mid-session (rug-pull) |
| session_smuggling_detection | true | Track contextId to detect session smuggling |
| max_context_messages | 100 | Per-context message cap |
| max_contexts | 1000 | Total tracked contexts |
| scan_raw_parts | true | Decode and scan text-like Part.raw fields |
| max_raw_size | 1048576 | Max encoded size for Part.raw decoding (bytes) |

A2A detection works on the forward proxy (CONNECT and plain HTTP) and MCP HTTP proxy paths. Agent Cards are scanned for skill description poisoning. Card drift detection tracks cards by URL + auth fingerprint and alerts on mid-session changes.
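
Card drift detection as described above can be pictured as remembering a digest per (URL, auth fingerprint) key and comparing later sightings against it. The class below is a hypothetical sketch under that assumption; `CardDriftTracker` and its method names are not part of pipelock, and the real fingerprinting and hashing details are not documented here.

```python
import hashlib
import json


class CardDriftTracker:
    """Remembers the first Agent Card seen per (url, auth fingerprint)."""

    def __init__(self) -> None:
        self._seen: dict[tuple[str, str], str] = {}

    def observe(self, url: str, auth_fingerprint: str, card: dict) -> bool:
        """Return True if the card differs from the one first seen for this key."""
        key = (url, auth_fingerprint)
        digest = hashlib.sha256(
            json.dumps(card, sort_keys=True).encode()
        ).hexdigest()
        first = self._seen.setdefault(key, digest)
        return first != digest


tracker = CardDriftTracker()
card_url = "https://agent.example/.well-known/agent.json"  # illustrative URL
card = {"name": "helper", "skills": [{"description": "summarize text"}]}
print(tracker.observe(card_url, "fp-1", card))
```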

MCP Binary Integrity (v2.1)#

Pre-spawn SHA-256 hash verification for MCP server subprocesses. Prevents tampered or substituted binaries from being executed.

mcp_binary_integrity:
  enabled: true
  manifest_path: /etc/pipelock/binary-manifest.json
  action: warn
| Field | Default | Description |
| --- | --- | --- |
| enabled | false | Enable binary hash verification before spawn |
| manifest_path | (required if enabled) | Path to JSON hash manifest |
| action | warn | Action on hash mismatch: block or warn |

The manifest is a JSON file mapping binary paths to expected SHA-256 hashes. Pipelock resolves shebangs and versioned interpreters (e.g., python3.11) before hashing.
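
A pre-spawn check of this kind can be sketched as follows. The flat `{path: sha256-hex}` manifest layout is inferred from the description above, and treating an unlisted binary as a mismatch is an assumption; `sha256_file` and `check_binary` are hypothetical helper names, and shebang or interpreter resolution is left out.

```python
import hashlib
import os
import tempfile


def sha256_file(path: str) -> str:
    """Stream the file through SHA-256 and return the hex digest."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            h.update(chunk)
    return h.hexdigest()


def check_binary(path: str, manifest: dict, action: str = "warn") -> str:
    """Return "ok" on a hash match, otherwise the configured action."""
    expected = manifest.get(path)
    if expected is not None and sha256_file(path) == expected.lower():
        return "ok"
    return action  # assumption: unlisted binaries count as a mismatch


# Demo with a throwaway file standing in for an MCP server binary.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"#!/bin/sh\necho server\n")
demo_path = tmp.name
demo_manifest = {demo_path: sha256_file(demo_path)}
print(check_binary(demo_path, demo_manifest))
```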

Validation Rules#

The following are enforced at startup:

  • Strict mode requires a non-empty api_allowlist
  • All DLP and response patterns must compile as valid regex
  • secrets_file must exist and not be world-readable (mode 0600 or stricter)
  • MCP tool policy requires at least one rule if enabled
  • Kill switch api_listen must differ from the main proxy listen address
  • WebSocket strip_compression must be true when scanning is enabled
  • Reverse proxy upstream must be a valid http:// or https:// URL when enabled
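
A few of these startup checks can be sketched as a plain validation function. This is only an illustration: `validate` and its flat parameters are hypothetical and do not reflect pipelock's config schema, and Python's `re` dialect is used as a stand-in for whatever regex engine pipelock compiles patterns with.

```python
import re
from urllib.parse import urlparse


def validate(mode, api_allowlist, patterns, proxy_listen,
             kill_switch_listen=None, upstream=None) -> list[str]:
    """Return a list of validation errors (empty when the config passes)."""
    errors = []
    if mode == "strict" and not api_allowlist:
        errors.append("strict mode requires a non-empty api_allowlist")
    for pattern in patterns:
        try:
            re.compile(pattern)
        except re.error as exc:
            errors.append(f"invalid regex {pattern!r}: {exc}")
    if kill_switch_listen is not None and kill_switch_listen == proxy_listen:
        errors.append("kill switch api_listen must differ from the proxy listen address")
    if upstream is not None:
        parsed = urlparse(upstream)
        if parsed.scheme not in ("http", "https") or not parsed.netloc:
            errors.append("reverse proxy upstream must be an http:// or https:// URL")
    return errors


print(validate("balanced", [], [r"\d{16}"], ":8888", ":9090", "http://localhost:7899"))
```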

Reverse Proxy#

Generic HTTP reverse proxy mode that sits in front of any service and scans traffic bidirectionally.

reverse_proxy:
  enabled: false
  listen: ":8890"
  upstream: "http://localhost:7899"
| Field | Default | Description |
| --- | --- | --- |
| enabled | false | Enable reverse proxy mode |
| listen | (required) | Listen address for the reverse proxy |
| upstream | (required) | Upstream service URL to forward to |

CLI flags#

pipelock run --reverse-proxy --reverse-upstream http://localhost:7899 --reverse-listen :8890

Scanning behavior#

  • Request bodies: Scanned for DLP patterns (secret exfiltration) using the request_body_scanning config
  • Request headers: Scanned when request_body_scanning.scan_headers is enabled
  • Response bodies: Scanned for prompt injection using the response_scanning config
  • Binary content: Image, audio, and video content types skip scanning
  • Compressed bodies: Fail-closed (blocked) on both request and response
  • Oversized bodies: Bodies larger than 1MB pass through without scanning
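
The body-handling rules in this list can be summarized as one decision function. The sketch below is an assumption-laden illustration: `body_disposition` is a hypothetical name, "1MB" is taken as 1 MiB, and the order in which the checks are applied is a guess rather than documented behavior.

```python
MAX_BODY_BYTES = 1024 * 1024  # assumption: "1MB" is treated as 1 MiB
BINARY_PREFIXES = ("image/", "audio/", "video/")


def body_disposition(content_type: str, content_encoding: str, size: int) -> str:
    """Return "skip", "block", "pass", or "scan" for one request or response body."""
    if content_type.lower().startswith(BINARY_PREFIXES):
        return "skip"  # binary media is not scanned
    if content_encoding.lower() not in ("", "identity"):
        return "block"  # compressed bodies fail closed
    if size > MAX_BODY_BYTES:
        return "pass"  # oversized bodies pass through unscanned
    return "scan"


print(body_disposition("application/json", "", 512))
print(body_disposition("application/json", "gzip", 512))
```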

Hot-reload#

The listen, enabled, and upstream fields cannot be changed via hot-reload; changing them requires a restart. All other scanning config (DLP patterns, response patterns, action, header mode) updates on reload.