Sanitizer Field Pattern Maintenance
Type: Topic
Status: Published
Created: Mar 22, 2026
Updated: Apr 19, 2026
Created by: Dosu Bot
Updated by: Dosu Bot

Sanitizer Field Pattern Maintenance#

Sanitizer Field Pattern Maintenance is a development maintenance pattern for the opnDossier project. It keeps credential detection rules synchronized across two code files as multi-device support is added. The sanitizer operates on raw XML element names via pattern matching, not on CommonDevice field names, so a device-specific credential field is silently missed unless its element name is explicitly cataloged in both pattern lists.

The sanitizer uses a dual-phase detection strategy: field-name pattern matching via FieldPatterns arrays in rule definitions (Phase 1), followed by value content analysis using detector functions (Phase 2). This architecture requires maintaining pattern lists in two separate locations, internal/sanitizer/rules.go and internal/sanitizer/patterns.go, to ensure comprehensive credential detection across OPNsense, pfSense, and future device types. Failing to update both files when adding new patterns can leak credentials when configuration files are sanitized for public sharing or support diagnostics.

The Two-File Update Requirement#

Core Maintenance Rule#

When adding credential field patterns for new device types or newly discovered credential fields, update both files in the same change:

  1. internal/sanitizer/rules.go: update the relevant rule's FieldPatterns array (used by ShouldRedactField)
  2. internal/sanitizer/patterns.go: update the corresponding keyword slices such as passwordKeywords (used by LooksLikePassword and related detection functions)

Rationale: The sanitizer's field-name matching has priority over value detection. If a credential field's XML element name doesn't substring-match any existing pattern, it will only be caught if the value content triggers a ValueDetector—but most credential rules (password, secret, psk, snmp_community) rely solely on field-name patterns without value detectors.

Pattern Matching Mechanics#

Field-Name Matching Algorithm#

The fieldNameMatches function implements the pattern matching logic:

  • Default behavior: Case-insensitive substring matching via containsIgnoreCase
  • Exact match exception: Patterns in exactMatchPatterns (["key", "from", "to"]) require exact case-insensitive matches to prevent false positives on compound field names like sshkey, apikey, authkey
  • ASCII-only case folding: Uses custom toLower function instead of strings.ToLower for performance

Examples:

  • Pattern "password" matches: password, Password, userPassword, mypassword123
  • Pattern "bcrypt" would match: bcrypt-hash, bcrypt_hash, mybcrypt, bcryptPassword
  • Pattern "key" (exact match) matches only: key, Key, KEY (not sshkey or apikey)
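The matching rules above can be sketched in a few lines of Go. This is a minimal illustration, not the project's implementation: it assumes exactMatchPatterns can be modeled as a map, and it uses strings.ToLower where the real code is described as using a custom ASCII-only toLower.

```go
package main

import "strings"

// exactMatchPatterns mirrors the exception list described above.
// Modeling it as a map is an assumption made for this sketch.
var exactMatchPatterns = map[string]bool{"key": true, "from": true, "to": true}

// fieldNameMatches reports whether fieldName matches pattern.
// Patterns in exactMatchPatterns must equal the whole field name;
// every other pattern matches as a case-insensitive substring.
func fieldNameMatches(fieldName, pattern string) bool {
	name := strings.ToLower(fieldName) // simplification: the real code uses a custom toLower
	pat := strings.ToLower(pattern)
	if exactMatchPatterns[pat] {
		return name == pat
	}
	return strings.Contains(name, pat)
}
```

With this sketch, fieldNameMatches("userPassword", "password") is true, while fieldNameMatches("sshkey", "key") is false because "key" is an exact-match pattern.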

Detection Execution Flow#

ShouldRedactValue coordinates the two-phase detection:

Phase 1: ShouldRedactField(fieldName)
  ├─ Iterate all active rules
  ├─ For each rule's FieldPatterns
  │ └─ Call fieldNameMatches(fieldName, pattern)
  └─ Return true + Rule on first match

Phase 2: Value Detection (only if Phase 1 fails)
  ├─ Iterate all active rules with ValueDetector != nil
  ├─ Call ValueDetector(value)
  └─ Return true + Rule on first match

Critical behavior: If a field name matches any FieldPattern, the ValueDetector is never consulted for that field. This makes field-name patterns the primary detection mechanism.
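The flow above can be approximated as follows. This is a simplified sketch under stated assumptions: the Rule type, the lowercase function names, and the looksLikePEM detector are stand-ins invented for illustration, and the exact-match exception is omitted for brevity.

```go
package main

import "strings"

// Rule is a simplified stand-in for the sanitizer's rule type (assumed shape).
type Rule struct {
	Name          string
	FieldPatterns []string
	ValueDetector func(value string) bool // nil when the rule has no value detector
}

// looksLikePEM is a hypothetical detector used only by this sketch.
func looksLikePEM(value string) bool {
	return strings.Contains(value, "PRIVATE KEY-----")
}

// shouldRedactField is Phase 1: scan every rule's FieldPatterns.
func shouldRedactField(rules []Rule, fieldName string) (*Rule, bool) {
	name := strings.ToLower(fieldName)
	for i := range rules {
		for _, p := range rules[i].FieldPatterns {
			if strings.Contains(name, strings.ToLower(p)) {
				return &rules[i], true
			}
		}
	}
	return nil, false
}

// shouldRedactValue runs Phase 1 first and falls back to Phase 2
// (value detectors) only when no field pattern matched.
func shouldRedactValue(rules []Rule, fieldName, value string) (*Rule, bool) {
	if r, ok := shouldRedactField(rules, fieldName); ok {
		return r, true
	}
	for i := range rules {
		if rules[i].ValueDetector != nil && rules[i].ValueDetector(value) {
			return &rules[i], true
		}
	}
	return nil, false
}
```

Note how a field such as bcrypt-hash falls through both phases when the password rule has no matching pattern and no ValueDetector, which is the failure mode this page is about.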

Credential Rules Catalog#

Current FieldPatterns in rules.go#

| Rule Name | FieldPatterns | Active Modes | Redacted Value |
|---|---|---|---|
| password | ["password", "passwd", "pass", "pwd"] | All | [REDACTED-PASSWORD] |
| secret | ["secret", "token", "apikey", "api_key", "api-key", "accesskey", "secretkey", "authkey", "auth_key", "otp_seed", "otpseed"] | All | [REDACTED-SECRET] |
| psk | ["psk", "preshared", "pre-shared", "ipsecpsk"] | All | [REDACTED-PSK] |
| snmp_community | ["community", "rocommunity", "rwcommunity"] | All | [REDACTED-SNMP-COMMUNITY] |
| private_key | ["privatekey", "private_key", "prv", "privkey", "key", "openvpn.tls", "openvpn-server.tls", "openvpn-client.tls", "openvpn.statickeys", "statickeys", "tls_crypt", "tls_auth"] | All | [REDACTED-PRIVATE-KEY] |
| ssh_authorized_keys | ["authorizedkeys", "authorized_keys", "sshkey", "ssh_key"] | All | [REDACTED-SSH-KEY] |

Current passwordKeywords in patterns.go#

The passwordKeywords slice contains 16 keywords used by the LooksLikePassword function:

var passwordKeywords = []string{
    "password", "passwd", "pass", "secret", "key",
    "token", "credential", "auth", "prv", "private", "bindpw",
    "bcrypt-hash", "sha512-hash",
    "statickeys", "tls_crypt", "tls_auth",
}

The "bindpw" keyword was added to properly classify ldap_bindpw fields in authserver configurations as credentials, ensuring they are treated with appropriate security in the sanitizer's credential detection logic.

The "statickeys", "tls_crypt", and "tls_auth" keywords correspond to OpenVPN HMAC key material documented in GOTCHAS.md §11.3.

Note: While LooksLikePassword exists and could be used as a ValueDetector, the current rule engine does not directly use it. The actual credential detection relies on the FieldPatterns in rules.go.

Device-Specific Credential Field Catalogs#

OpenVPN Pattern Selection Strategy#

OpenVPN's <tls> and <StaticKeys> elements hold --tls-auth / --tls-crypt HMAC key material that must be redacted. The sanitizer uses path-anchored patterns (openvpn.tls, openvpn-server.tls, openvpn-client.tls, openvpn.statickeys, statickeys) to avoid false-positive collisions with unrelated <tls> elements in the schema:

  • Suricata IDS: opnsense.OPNsense.IDS.general.eveLog.tls.* wraps boolean enable/extended/sessionResumption configuration (not secrets)
  • IPsec strongSwan: opnsense.OPNsense.IPsec.charon.syslog.daemon.tls carries a log-level enum 0–5 (not a secret)

The substring tls alone would false-positive on these non-credential paths. Path anchoring (full lowercased element path includes parent container) ensures only OpenVPN HMAC keys are redacted. The unambiguous aliases tls_crypt and tls_auth are safe as bare patterns because they never appear in non-OpenVPN contexts.

This strategy is documented in GOTCHAS.md §11.3 and verified by TestSanitizeXML_OpenVPN_TLS_NoFalsePositives in internal/sanitizer/sanitizer_test.go.
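The effect of path anchoring can be illustrated with a short Go sketch. The matchesAnyPattern helper and the OpenVPN element path used in the usage note are assumptions made for illustration; the Suricata and IPsec paths are the ones listed above.

```go
package main

import "strings"

// openVPNPatterns lists the path-anchored and bare patterns named above.
var openVPNPatterns = []string{
	"openvpn.tls", "openvpn-server.tls", "openvpn-client.tls",
	"openvpn.statickeys", "statickeys", "tls_crypt", "tls_auth",
}

// matchesAnyPattern reports whether the lowercased element path
// contains any of the given patterns as a substring.
// (Hypothetical helper; not a function from the project.)
func matchesAnyPattern(elementPath string, patterns []string) bool {
	path := strings.ToLower(elementPath)
	for _, p := range patterns {
		if strings.Contains(path, p) {
			return true
		}
	}
	return false
}
```

Under these assumptions, an illustrative path such as opnsense.openvpn.openvpn-server.tls matches, while opnsense.OPNsense.IDS.general.eveLog.tls.enabled and opnsense.OPNsense.IPsec.charon.syslog.daemon.tls do not. Passing the bare pattern "tls" instead would match all three, which is the false positive the anchored patterns avoid.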

ValueDetector: IsOpenVPNStaticKey#

The private_key rule's ValueDetector now includes IsOpenVPNStaticKey, which recognizes the PEM envelope format used by OpenVPN static keys:

-----BEGIN OpenVPN Static key V1-----
<hexadecimal key material>
-----END OpenVPN Static key V1-----

This detector complements the path-anchored FieldPatterns. The standard PEM detector (IsPrivateKey) looks for PRIVATE KEY labels and would miss OpenVPN's custom label. By chaining IsOpenVPNStaticKey into the IsPrivateKey function, the sanitizer catches OpenVPN HMAC keys through both the field-name path (Phase 1) and the value-content path (Phase 2).

See TestIsOpenVPNStaticKey and TestIsPrivateKey_OpenVPNStaticKey in internal/sanitizer/patterns_test.go for detector coverage.
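A detector of this kind might look roughly like the sketch below. The function bodies are assumptions written for illustration (the real IsPrivateKey and IsOpenVPNStaticKey may validate more than the envelope markers); only the PEM labels come from the text above.

```go
package main

import "strings"

const (
	openVPNStaticKeyBegin = "-----BEGIN OpenVPN Static key V1-----"
	openVPNStaticKeyEnd   = "-----END OpenVPN Static key V1-----"
)

// isOpenVPNStaticKey reports whether value contains the OpenVPN static key
// envelope, with the BEGIN marker appearing before the END marker.
func isOpenVPNStaticKey(value string) bool {
	begin := strings.Index(value, openVPNStaticKeyBegin)
	end := strings.Index(value, openVPNStaticKeyEnd)
	return begin >= 0 && end > begin
}

// isPrivateKey checks for a standard "PRIVATE KEY" PEM label first and then
// chains the OpenVPN detector, so both formats are caught by one function.
func isPrivateKey(value string) bool {
	if strings.Contains(value, "-----BEGIN ") && strings.Contains(value, "PRIVATE KEY-----") {
		return true
	}
	return isOpenVPNStaticKey(value)
}
```

Chaining inside isPrivateKey means a rule that already uses it as its ValueDetector picks up the OpenVPN format without a second detector field.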

OPNsense XML Element Names#

OPNsense uses <opnsense> as root element. Common credential fields:

  • User passwords: <passwd> (matched by the "passwd" pattern)
  • SNMP: <community>, <rocommunity>, <rwcommunity> (matched by the community patterns)
  • VPN: <psk>, <ipsecpsk>, <preshared> (matched by the psk patterns)
  • Certificates: <prv>, which holds private keys (matched by the "prv" pattern)
  • SSH: <authorizedkeys> (matched by the ssh_authorized_keys patterns)
  • OTP: <otp_seed> (matched by "otp_seed" in the secret rule)
  • OpenVPN HMAC: <tls> under <openvpn-server> / <openvpn-client> (matched by the path-anchored "openvpn-server.tls" / "openvpn-client.tls" patterns)
  • OpenVPN MVC: <StaticKeys> under <OpenVPN> (matched by the "openvpn.statickeys" and "statickeys" patterns)

pfSense XML Element Differences#

pfSense uses <pfsense> root element with these credential field differences:

  • User passwords: <bcrypt-hash>, <sha512-hash> (NOT matched by any current pattern)
  • RADIUS: <radius_secret> (matched by the "secret" pattern)
  • Auth: <auth_pass> (matched by the "pass" pattern)
  • Certificates/keys: same <prv> as OPNsense (matched by the "prv" pattern)

⚠️ Critical Gap Identified: pfSense's <bcrypt-hash> and <sha512-hash> elements do NOT substring-match any current FieldPattern in the password rule ("password", "passwd", "pass", "pwd"). These fields would be silently missed by the sanitizer unless:

  1. A ValueDetector for password-like content exists (currently none on the password rule)
  2. The patterns are explicitly added to rules.go

Real-World Example: The pfSense Password Field Gap#

Problem Discovery#

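When adding pfSense support, the sanitizer's generic patterns covered OPNsense's <password> and <passwd> elements via the "pass" substring match, but completely missed pfSense's <bcrypt-hash> element. The name bcrypt-hash contains none of the password rule's patterns ("password", "passwd", "pass", "pwd"), and the password rule has no ValueDetector that could catch the hash by its content.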

Solution#

Add device-specific hash element patterns to BOTH files:

In internal/sanitizer/rules.go:

{
    Name: "password",
    FieldPatterns: []string{
        "password", "passwd", "pass", "pwd",
        "bcrypt-hash", "bcrypt", // pfSense user passwords
        "sha512-hash", "sha512", // pfSense alternative hash format
    },
    // ...
}

In internal/sanitizer/patterns.go:

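var passwordKeywords = []string{
    // Illustrative subset: keep the other existing entries (for example "bindpw", "statickeys") in place.
    "password", "passwd", "pass", "secret", "key", "token",
    "credential", "auth", "prv", "private",
    "bcrypt", "sha512", "hash", // pfSense hash element detection
}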

XML Element Name Discovery Workflow#

When adding support for a new device type, follow this workflow to identify credential fields requiring pattern coverage:

Step 1: Examine Device Schema DTOs#

  1. Navigate to pkg/schema/<device>/ directory
  2. Search for credential-related struct tags in *.go files:
    grep -r 'xml:.*hash\|xml:.*pass\|xml:.*secret\|xml:.*key' pkg/schema/pfsense/
    
  3. Catalog all XML element names from `xml:"element-name"` tags

Example from pfSense:

type SystemUser struct {
    Name string `xml:"name"`
    BcryptHash string `xml:"bcrypt-hash"` // ← Credential field!
    UID string `xml:"uid"`
}
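As a complement to the grep search, the element names of a single DTO can be listed programmatically with reflection. The sketch below is an assumption-laden illustration: xmlElementNames is a hypothetical helper, it only inspects top-level fields (no recursion into nested structs), and it also returns attribute names.

```go
package main

import (
	"reflect"
	"strings"
)

// SystemUser repeats the example DTO shown above.
type SystemUser struct {
	Name       string `xml:"name"`
	BcryptHash string `xml:"bcrypt-hash"`
	UID        string `xml:"uid"`
}

// xmlElementNames returns the names from the xml struct tags of v's
// top-level fields, skipping untagged fields and fields tagged "-".
func xmlElementNames(v any) []string {
	t := reflect.TypeOf(v)
	if t == nil {
		return nil
	}
	if t.Kind() == reflect.Ptr {
		t = t.Elem()
	}
	if t.Kind() != reflect.Struct {
		return nil
	}
	var names []string
	for i := 0; i < t.NumField(); i++ {
		tag := t.Field(i).Tag.Get("xml")
		name, _, _ := strings.Cut(tag, ",") // drop options such as ",omitempty"
		if name == "" || name == "-" {
			continue
		}
		names = append(names, name)
	}
	return names
}
```

Calling xmlElementNames(SystemUser{}) yields name, bcrypt-hash, and uid, which can then be fed into the coverage check in Step 3.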

Step 2: Cross-Reference Schema Documentation#

  • Check pkg/schema/<device>/README.md for structural documentation
  • pfSense schema README contains 838 lines documenting 50+ configuration sections
  • Review docs/development/xml-structure-research.md for device comparison notes

Step 3: Verify Existing Pattern Coverage#

For each discovered credential field XML element name:

  1. Convert to lowercase
  2. Check if it substring-matches any pattern in rules.go credential rules
  3. If NO match found → add to maintenance backlog
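The three steps above amount to a small coverage audit, sketched here in Go. The credentialPatterns list is an illustrative subset of the FieldPatterns cataloged earlier, and uncoveredFields is a hypothetical helper; exact-match patterns such as "key" are deliberately left out to keep the sketch short.

```go
package main

import "strings"

// credentialPatterns is an illustrative subset of the FieldPatterns in rules.go.
var credentialPatterns = []string{
	"password", "passwd", "pass", "pwd",
	"secret", "token", "psk", "community", "prv",
}

// uncoveredFields returns the credential field names that do not
// substring-match any pattern after lowercasing.
func uncoveredFields(fieldNames, patterns []string) []string {
	var missing []string
	for _, name := range fieldNames {
		lower := strings.ToLower(name)
		covered := false
		for _, p := range patterns {
			if strings.Contains(lower, p) {
				covered = true
				break
			}
		}
		if !covered {
			missing = append(missing, name)
		}
	}
	return missing
}
```

The input should be the credential fields cataloged in Step 1, not every element in the schema, since any non-credential name would also be reported as uncovered. For the pfSense fields passwd, auth_pass, radius_secret, prv, bcrypt-hash, and sha512-hash, the result is the two hash elements.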

Step 4: Pattern Addition Checklist#

For each unmatched credential field:

  • Determine which rule it belongs to (password, secret, psk, snmp_community, private_key, ssh_authorized_keys)
  • Add shortest effective substring pattern to rule's FieldPatterns array in rules.go
  • Add related keywords to corresponding slice in patterns.go (e.g., passwordKeywords)
  • Update test cases in rules_test.go and patterns_test.go
  • Verify with detection method (see next section)

Detection Method: Verification Workflow#

After updating pattern lists, verify coverage using this command-line detection method:

opndossier sanitize <config.xml> | grep -iE 'hash|secret|key|pass|community|token'

Interpretation:

  • All sensitive values redacted ([REDACTED-*] placeholders) → ✅ Complete coverage
  • Plain-text sensitive values visible → ❌ Missing pattern (field name printed in grep output indicates which pattern to add)

Example of missed field:

<bcrypt-hash>$2b$10$abcdef...</bcrypt-hash> ← Not redacted, shows "bcrypt-hash" needs pattern

Alternative verification using test fixtures:

go test -v ./internal/sanitizer -run 'TestSanitize.*PfSense'
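The grep check can also be approximated in Go, which makes it usable from a test. The sketch below is an assumption, not project code: unredactedFields and auditKeywords are hypothetical names, and the check simply flags any element whose name contains an audit keyword and whose text does not start with "[REDACTED-".

```go
package main

import (
	"encoding/xml"
	"io"
	"strings"
)

// auditKeywords mirrors the keywords in the grep expression above.
var auditKeywords = []string{"hash", "secret", "key", "pass", "community", "token"}

// unredactedFields scans sanitized XML and returns the names of elements
// that look sensitive but still carry a non-redacted text value.
func unredactedFields(r io.Reader) ([]string, error) {
	dec := xml.NewDecoder(r)
	var current string // most recently opened element, cleared on any close
	var found []string
	for {
		tok, err := dec.Token()
		if err == io.EOF {
			return found, nil
		}
		if err != nil {
			return nil, err
		}
		switch t := tok.(type) {
		case xml.StartElement:
			current = t.Name.Local
		case xml.EndElement:
			current = ""
		case xml.CharData:
			text := strings.TrimSpace(string(t))
			if current == "" || text == "" || strings.HasPrefix(text, "[REDACTED-") {
				continue
			}
			lower := strings.ToLower(current)
			for _, kw := range auditKeywords {
				if strings.Contains(lower, kw) {
					found = append(found, current)
					break
				}
			}
		}
	}
}

// unredactedFieldsInString is a convenience wrapper for string input.
func unredactedFieldsInString(doc string) ([]string, error) {
	return unredactedFields(strings.NewReader(doc))
}
```

Like the grep approach, this only finds fields whose names contain a known keyword, so it supplements the schema audit rather than replacing it.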

Categories of Credential Fields Across Device Types#

Universal Patterns (All Devices)#

  • Private keys: prv, privatekey, private_key
  • Certificates: crt, cert, certificate (handled by crypto rules)
  • Generic passwords: password, passwd, pass

OPNsense-Specific#

  • otp_seed — OTP seed values (covered by secret rule)
  • rocommunity, rwcommunity — SNMP (covered by snmp_community rule)

pfSense-Specific#

  • bcrypt-hash, sha512-hash — User password hashes (requires explicit addition)
  • auth_pass — Authentication passwords (covered by "pass" pattern)
  • radius_secret — RADIUS shared secrets (covered by "secret" pattern)

Future Device Considerations#

When adding support for devices like Cisco ASA, Juniper SRX, or Fortinet FortiGate:

  • Cisco: Look for <enable-password>, <secret>, <key-string>, <community-string>
  • Juniper: Check for <secret>, <encrypted-password>, <pre-shared-key>
  • Fortinet: Watch for <password>, <psksecret>, <private-key>

Audit each device's XML config schema for keywords: hash, password, secret, key, token, community, psk, auth, credential.

Maintenance Patterns and Best Practices#

Pattern Selection Guidelines#

  1. Shortest effective substring: Prefer "bcrypt" over "bcrypt-hash" to catch variants (bcrypt_hash, mybcryptpass)
  2. Avoid over-matching: Don't use "bc" (too generic) — balance coverage vs. false positives
  3. Consider exact match exception: If a pattern like "hash" causes false positives on compound names, add it to exactMatchPatterns

Test Coverage Requirements#

When adding patterns, update these test files:

  • internal/sanitizer/rules_test.go: Add test cases to TestShouldRedactField asserting the new pattern matches expected field names
  • internal/sanitizer/patterns_test.go: If adding to patterns.go keyword lists, add test cases for any new detector functions
  • Integration tests: Add device-specific XML fixtures to test end-to-end sanitization

Global Pattern Scanning Gotcha#

⚠️ Critical: ShouldRedactField scans ALL rules' FieldPatterns globally. Adding a FieldPattern to any rule can break "should not match" assertions for other rules.

Example conflict:

  • Adding pattern "id" to cloud_identifier rule
  • Breaks test asserting "userid" should NOT match password rule
  • Solution: Review all existing false assertions in rules_test.go before adding broad patterns
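The conflict can be reproduced with a small sketch of a global scan. The Rule type, firstMatchingRule, and exampleRules are hypothetical names for illustration, and the patterns given to cloud_identifier are assumptions; only the "id" and "userid" example comes from the text above.

```go
package main

import "strings"

// Rule is a simplified stand-in for a sanitizer rule.
type Rule struct {
	Name          string
	FieldPatterns []string
}

// firstMatchingRule scans every rule's FieldPatterns, as a global scan does,
// and returns the name of the first rule that matches, or "" if none does.
func firstMatchingRule(rules []Rule, fieldName string) string {
	lower := strings.ToLower(fieldName)
	for _, r := range rules {
		for _, p := range r.FieldPatterns {
			if strings.Contains(lower, p) {
				return r.Name
			}
		}
	}
	return ""
}

// exampleRules builds a rule set with the password rule plus a
// cloud_identifier rule carrying the given (hypothetical) patterns.
func exampleRules(cloudPatterns ...string) []Rule {
	return []Rule{
		{Name: "password", FieldPatterns: []string{"password", "passwd", "pass", "pwd"}},
		{Name: "cloud_identifier", FieldPatterns: cloudPatterns},
	}
}
```

With exampleRules(), the field userid matches nothing. With exampleRules("id"), the same field is claimed by cloud_identifier, so a test that asserts "userid is not redacted" now fails even though the password rule did not change.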

Implementation Checklist#

When adding credential field patterns for multi-device support:

  • Discover XML element names via schema DTOs (pkg/schema/<device>/*.go)
  • Verify current coverage against existing FieldPatterns in rules.go
  • Update internal/sanitizer/rules.go:
    • Add new patterns to appropriate rule's FieldPatterns array
    • Consider if new rule needed for novel credential type
  • Update internal/sanitizer/patterns.go:
    • Add related keywords to passwordKeywords or create new keyword slice
    • Implement new ValueDetector function if content-based detection needed
  • Update test suites:
    • Add positive match test cases to rules_test.go
    • Add negative match test cases (should NOT match) to rules_test.go
    • Add detector function tests to patterns_test.go if applicable
    • Review existing "should not match" assertions still pass
  • Verify with detection method:
    • Run opndossier sanitize on device fixture
    • Grep for unredacted sensitive keywords
    • Run integration test suite
  • Document in schema README (pkg/schema/<device>/README.md)

Related Topics#

  • Sanitizer Rule Engine: Core architecture for dual-phase detection
  • Multi-Device Parser Registry: Self-registration pattern in pkg/parser/registry.go
  • CommonDevice Interface: Device-agnostic abstraction layer that the sanitizer operates below
  • pfSense Schema Implementation: 838-line structural reference
  • GOTCHAS.md §11.3: OpenVPN TLS pattern selection rationale and false-positive risks

Relevant Code Files#

| File | Purpose |
|---|---|
| internal/sanitizer/rules.go | Rule definitions, FieldPatterns arrays, ShouldRedactField logic |
| internal/sanitizer/patterns.go | ValueDetector functions, passwordKeywords, LooksLikePassword |
| internal/sanitizer/sanitizer.go | XML stream processing, sanitization execution |
| internal/sanitizer/rules_test.go | Pattern matching test cases |
| internal/sanitizer/patterns_test.go | Value detector test cases |
| pkg/schema/pfsense/README.md | pfSense XML structure documentation (838 lines) |
| docs/development/xml-structure-research.md | Device schema comparison notes |