PfSense XML Listtags Schema Pattern

Lead Section#

The pfSense XML Listtags Schema Pattern is an architectural pattern for correctly parsing pfSense and OPNsense configuration XML files in Go (and other statically typed languages). In pfSense's PHP codebase, xmlparse.inc contains a listtags() function that returns an array of XML element names (more than 100 in the list reproduced below) that must always be treated as arrays during parsing, even when only a single instance appears in the XML document. The pattern exists because pfSense's XML parser treats these elements as repeating structures regardless of cardinality.

When implementing parsers in Go, mapping these elements to scalar types (Type) instead of slice types ([]Type) causes silent data loss: Go's encoding/xml package overwrites the field on each occurrence and retains only the last one. No error is returned, which makes this one of the most dangerous failure modes in configuration parsing.

The opnDossier project (which parses OPNsense configurations) provides extensive documentation of this pattern through its Schema Parser Synchronization and XML Presence Detection guides. Because OPNsense forked from pfSense, the two share substantially similar XML structures and parsing requirements, making the opnDossier implementation patterns directly applicable to pfSense parser development.

According to pfSense's xmlparse.inc, the canonical list of XML elements that MUST use array types includes:

acls, alias, aliasurl, allowedip, allowedhostname, authserver, bridged, build_port_path, ca, cacert, cert, crl, clone, config, container, columnitem, checkipservice, depends_on_package, disk, dnsserver, dnsupdate, domainoverrides, dyndns, earlyshellcmd, element, encryption-algorithm-option, field, fieldname, gateway_item, gateway_group, gif, gre, group, hash-algorithm-option, hosts, ifgroupentry, igmpentry, interface_array, item, key, lagg, laggroup, lbaction, lbpool, l7rules, lbprotocol, member, menu, tab, mobilekey, mobilegroup, monitor_type, mount, npt, ntpserver, onetoone, openvpn-server, openvpn-client, openvpn-csc, option, package, passthrumac, phase1, phase2, ppp, pppoe, priv, proxyarpnet, pool, qinqentry, queue, pages, pipe, radnsserver, roll, route, row, rrddatafile, rule, schedule, service, servernat, servers, sshkeyfile, serversdisabled, shellcmd, staticmap, subqueue, switch, swport, timerange, tunnel, user, vip, virtual_server, vlan, vlangroup, voucherdbfile, vxlan, wgpeer, winsserver, wolentry, widget, xmldatafile

Critical examples:

Using string instead of []string for the priv field (user/group privileges) silently drops all but one privilege, compromising access control
Using string instead of []string for dnsserver fields drops all but the last DNS server, breaking name resolution
Using Rule instead of []Rule for firewall rules silently discards all but the last rule, catastrophically breaking firewall policy

pfSense-Specific Examples:

The pfSense schema's Group.Priv correctly uses []string because pfSense XML contains repeating <priv>item</priv><priv>item</priv> elements
The pfSense schema's System.DNSServers uses []string for multiple <dnsserver> elements, unlike the OPNsense schema, which currently uses a single string (possibly space-separated)

Why These Elements Must Be Arrays#

The Go `encoding/xml` Behavior#

Go's encoding/xml package has a critical limitation: when unmarshaling repeated XML elements into a scalar field, the package overwrites the field on each encounter, retaining only the last value. All previous occurrences are silently discarded without error.

// INCORRECT - Silent data loss
type Config struct {
    CA CertificateAuthority `xml:"ca"` // Only last <ca> survives
}

// CORRECT - All <ca> elements captured
type Config struct {
    CAs []CertificateAuthority `xml:"ca"` // All <ca> elements captured
}
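The difference can be observed with a short program. The sketch below is illustrative only: scalarGroup, sliceGroup, and decodePrivs are names made up for this example and do not come from pfSense or opnDossier.

```go
package main

import (
	"encoding/xml"
	"fmt"
)

// scalarGroup is a hypothetical struct that wrongly maps <priv> to a scalar.
type scalarGroup struct {
	Priv string `xml:"priv"`
}

// sliceGroup is a hypothetical struct that maps <priv> to a slice.
type sliceGroup struct {
	Priv []string `xml:"priv"`
}

// decodePrivs unmarshals the same document into both structs and returns
// what each one retained.
func decodePrivs(doc string) (string, []string, error) {
	var s scalarGroup
	if err := xml.Unmarshal([]byte(doc), &s); err != nil {
		return "", nil, err
	}
	var l sliceGroup
	if err := xml.Unmarshal([]byte(doc), &l); err != nil {
		return "", nil, err
	}
	return s.Priv, l.Priv, nil
}

func main() {
	const doc = `<group><priv>page-all</priv><priv>user-shell-access</priv></group>`
	scalar, list, err := decodePrivs(doc)
	if err != nil {
		fmt.Println("decode error:", err)
		return
	}
	fmt.Printf("scalar field: %q\n", scalar) // a single value survives, and no error is reported
	fmt.Printf("slice field:  %q\n", list)   // every <priv> value is retained
}
```

Neither call to xml.Unmarshal returns an error, which is why the scalar mapping is so easy to miss in review.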

PfSense/OPNsense XML Structure Patterns#

OPNsense (and pfSense) use two distinct structural patterns for multi-valued data:

Container/Child Pattern: A parent element wraps repeated children (e.g., <vlans> contains multiple <vlan> elements)
Top-Level Repeating Elements: Elements repeat as direct children of the document's root element, without a container (e.g., multiple <ca>, <cert>, <rule> elements)

Both patterns require slice types in Go schemas, but the implementation differs.

Implementation Patterns#

Pattern A: Container + Child Slice (Bridge/VLAN Pattern)#

Used when XML has a plural container wrapping singular children:

// Container (Parent) Struct — XMLName MUST be first field
type VLANs struct {
    XMLName xml.Name `xml:"vlans"`
    VLAN []VLAN `xml:"vlan,omitempty"` // Slice of children
}

// Child Struct — XMLName MUST be first field
type VLAN struct {
    XMLName xml.Name `xml:"vlan"`
    If string `xml:"if,omitempty"`
    Tag string `xml:"tag,omitempty"`
    Descr string `xml:"descr,omitempty"`
}

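Critical Rule: opnDossier's guides require XMLName to be the first field in both container and child structs. Note that encoding/xml itself locates the XMLName field by name rather than by position, so this is best read as a project convention that keeps schemas uniform, not as a requirement enforced by the standard library.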

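Known Container Patterns (from opnDossier's OPNsense implementation): the Relevant Code Files table lists VLANs, Bridges, GIF, GRE, and LAGG in pkg/schema/opnsense/interfaces.go, and Gateways/Gateway and StaticRoutes in pkg/schema/opnsense/network.go.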

Pattern B: Top-Level Repeating Elements (Manual Append)#

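In a streaming parser that dispatches on top-level element names, as in opnDossier's parser switch, each element is decoded on its own, so the parser has to append every decoded value to the slice manually. (A single xml.Unmarshal into a struct with a slice field would collect the repeats automatically; the manual append is a consequence of decoding one section at a time.)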

// From internal/cfgparser/xml.go
case "ca":
    var ca schema.CertificateAuthority
    if err := decodeSection(dec, &ca, se); err != nil {
        return err
    }
    doc.CAs = append(doc.CAs, ca) // Manual append required
    return nil

The document-level field must be declared as a slice:

type OpnSenseDocument struct {
    CAs []CertificateAuthority `xml:"ca"`
    Certs []Certificate `xml:"cert"`
    // ... other top-level repeating elements
}

Pattern C: Comma-Separated Values (Custom UnmarshalXML)#

Some listtag elements store multiple values as comma-separated strings in a single XML element. This requires a custom type with UnmarshalXML:

// InterfaceList represents a comma-separated list that unmarshals to []string
type InterfaceList []string

func (il *InterfaceList) UnmarshalXML(d *xml.Decoder, start xml.StartElement) error {
    var content string
    if err := d.DecodeElement(&content, &start); err != nil {
        return err
    }

    if content == "" {
        *il = InterfaceList{}
        return nil
    }

    parts := strings.Split(content, ",")
    interfaces := make([]string, 0, len(parts))
    for _, part := range parts {
        trimmed := strings.TrimSpace(part)
        if trimmed != "" {
            interfaces = append(interfaces, trimmed)
        }
    }
    *il = InterfaceList(interfaces)
    return nil
}

Usage in structs:

type Rule struct {
    XMLName xml.Name `xml:"rule"`
    Interface InterfaceList `xml:"interface,omitempty"` // wan,lan,opt1
}

Exception: Interface groups use space separators, not commas:

Members: splitNonEmpty(e.Members, " ") // Note space, not comma
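The splitNonEmpty helper is referenced here but not defined in this document. The sketch below is one plausible implementation, written as an assumption about its behavior (split on a separator, trim whitespace, drop empty parts, return nil for empty input); the real helper in opnDossier may differ.

```go
package main

import (
	"fmt"
	"strings"
)

// splitNonEmpty is an assumed implementation: it splits s on sep, trims each
// part, and drops parts that are empty after trimming. Empty input yields nil.
func splitNonEmpty(s, sep string) []string {
	if strings.TrimSpace(s) == "" {
		return nil
	}
	parts := strings.Split(s, sep)
	out := make([]string, 0, len(parts))
	for _, p := range parts {
		if t := strings.TrimSpace(p); t != "" {
			out = append(out, t)
		}
	}
	return out
}

func main() {
	// Interface group members are space-separated.
	fmt.Printf("%q\n", splitNonEmpty("lan opt1  opt2", " "))
	// Bridge members are comma-separated.
	fmt.Printf("%q\n", splitNonEmpty("em0, em1,", ","))
}
```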

Consequences of Incorrect Schema Mapping#

Choosing the wrong Go type silently breaks semantics:

No error or panic — encoding/xml simply overwrites on repeat encounter
Data loss is invisible — program continues as if nothing went wrong
Downstream effects — converters, reports, and audit plugins all receive truncated data model
Security implications — dropped firewall rules, missing privileges, lost DNS servers

Real-world example: The opnDossier project documents more than 40 known schema gaps where string fields should be BoolFlag or []string, each of which is a silent data loss risk.

GOTCHAS.md Cross-Reference#

The GOTCHAS.md file documents two related XML parsing gotchas:

Section 3.2: XML Presence vs. Absence:

The encoding/xml package treats self-closing tags (e.g., <disabled/>) and missing tags identically for string fields.
Gotcha: Use *string (pointer to string) when you need to distinguish between "element present but empty" ("") and "element absent" (nil).

Section 3.3: Repeated XML Elements and string Fields:

When an XML element appears multiple times (e.g., <priv>a</priv><priv>b</priv>), a string field only captures the last occurrence; all earlier ones are silently overwritten. Use []string for elements that can repeat.

Both gotchas relate to listtags because presence detection (using *string or BoolFlag) and array detection (using []Type) are both critical for preventing silent semantic violations.
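The presence-versus-absence gotcha can be checked with a small program. This is a minimal sketch: the source struct and parseSource function are hypothetical and only loosely modeled on a rule's source block.

```go
package main

import (
	"encoding/xml"
	"fmt"
)

// source is a hypothetical, trimmed rule endpoint.
type source struct {
	Any     *string `xml:"any"`     // nil when <any> is absent, pointer to "" for <any/>
	Network string  `xml:"network"` // "" for both an absent and an empty element
}

// parseSource unmarshals a <source> fragment.
func parseSource(doc string) (source, error) {
	var s source
	err := xml.Unmarshal([]byte(doc), &s)
	return s, err
}

func main() {
	docs := []string{
		`<source><any/></source>`,
		`<source><network>lan</network></source>`,
	}
	for _, doc := range docs {
		s, err := parseSource(doc)
		if err != nil {
			fmt.Println("parse error:", err)
			continue
		}
		fmt.Printf("any present: %t, network: %q\n", s.Any != nil, s.Network)
	}
}
```

With a plain string field for <any>, both documents would produce the same empty value, and the "match anything" marker would be indistinguishable from a missing element.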

Correct vs. Incorrect Implementations#

✅ Correct: Container with Slice#

type Bridges struct {
    XMLName xml.Name `xml:"bridges"` // XMLName FIRST
    Bridge []Bridge `xml:"bridge,omitempty"`
}

type Bridge struct {
    XMLName xml.Name `xml:"bridge"` // XMLName FIRST
    Members string `xml:"members,omitempty"`
    Descr string `xml:"descr,omitempty"`
}

❌ Incorrect: Scalar Field (Silent Data Loss)#

// WRONG: Only last <bridge> child retained
type Bridges struct {
    XMLName xml.Name `xml:"bridges"`
    Bridge Bridge `xml:"bridge,omitempty"` // Missing []
}

❌ Incorrect: XMLName Not First Field#

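// Violates the project convention: XMLName is not the first field.
// encoding/xml finds XMLName by name, so the element is still emitted as
// <rule>, but the struct no longer matches the rest of the schema.
type Rule struct {
    Protocol string `xml:"protocol"`
    XMLName xml.Name `xml:"rule"` // By convention, place FIRST
}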

✅ Correct: Top-Level Repeating with Manual Append#

// In parser switch:
case "cert":
    var cert schema.Certificate
    if err := decodeSection(dec, &cert, se); err != nil {
        return err
    }
    doc.Certs = append(doc.Certs, cert)
    return nil

The Temp-Variable-Append Pattern (Converter Layer)#

Once schemas correctly capture arrays via []Type, converters transform them using temp-variable-append:

func (c *Converter) convertBridges(doc *schema.OpnSenseDocument) []common.Bridge {
    if len(doc.Bridges.Bridge) == 0 {
        return nil // Return nil, not []Bridge{}
    }
    result := make([]common.Bridge, 0, len(doc.Bridges.Bridge)) // Pre-allocate
    for _, b := range doc.Bridges.Bridge {
        result = append(result, common.Bridge{
            BridgeIf: b.Bridgeif,
            Members: splitNonEmpty(b.Members, ","),
            Description: b.Descr,
            STP: bool(b.STP),
        })
    }
    return result
}

Pattern rules:

Check for empty source → return nil (not []Type{})
Pre-allocate with make([]Type, 0, len(source))
Loop and append transformed elements
Return result

Three-Layer Architecture Synchronization#

Using slice types correctly requires coordination across all three layers:

XML Tag → Schema Field: Element names in config.xml must match struct tags in schema package
Schema Field → Parser Switch: Switch cases must use XML tag names (not Go field names)
Schema Field → Converter Logic: Converters must use temp-variable-append for slices

// Parser uses XML tag name (not Go field name)
case "nat": // XML tag name
    return decodeSection(dec, &doc.Nat, se) // Go field name

The following type patterns are distinct from the array/slice pattern but are commonly confused with it:

OPNsense Pattern	Go Type	Example	Purpose
Presence-based boolean	`BoolFlag`	`<disabled/>`, `<log/>`	Element exists = true
Value-based boolean	`string`	`<enable>1</enable>`	Content `== "1"`
Presence with value access	`*string`	`<any/>` in Source/Dest	Distinguish absent from empty
Container + children	`[]ChildType`	`VLANs` with `[]VLAN`	Repeating elements
Comma-separated values	`InterfaceList`	`wan,lan,opt1`	Custom UnmarshalXML

Critical: Using BoolFlag for value-based fields silently breaks semantics because BoolFlag.UnmarshalXML treats any present element as true regardless of content, so <enable>0</enable> incorrectly becomes true.
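The pitfall is easier to see with a concrete presence-based type. The BoolFlag below is a minimal sketch written for this example, assuming only the behavior described above (any present element becomes true); it is not the definition from pkg/schema/opnsense/common.go, and flagService, valueService, and the two helper functions are hypothetical.

```go
package main

import (
	"encoding/xml"
	"fmt"
)

// BoolFlag is an assumed, minimal presence-based flag.
type BoolFlag bool

// UnmarshalXML marks the flag true whenever the element is present and
// ignores whatever the element contains.
func (b *BoolFlag) UnmarshalXML(d *xml.Decoder, start xml.StartElement) error {
	if err := d.Skip(); err != nil { // consume the body and the end element
		return err
	}
	*b = true
	return nil
}

// flagService wrongly models a value-based element with BoolFlag.
type flagService struct {
	Enable BoolFlag `xml:"enable"`
}

// valueService models the same element as a string compared against "1".
type valueService struct {
	Enable string `xml:"enable"`
}

func enabledByFlag(doc string) (bool, error) {
	var s flagService
	err := xml.Unmarshal([]byte(doc), &s)
	return bool(s.Enable), err
}

func enabledByValue(doc string) (bool, error) {
	var s valueService
	err := xml.Unmarshal([]byte(doc), &s)
	return s.Enable == "1", err
}

func main() {
	const doc = `<service><enable>0</enable></service>`
	byFlag, _ := enabledByFlag(doc)
	byValue, _ := enabledByValue(doc)
	fmt.Println("BoolFlag says enabled:", byFlag)
	fmt.Println("string == \"1\" says enabled:", byValue)
}
```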

Decision Tree: Choosing the Right Go Type#

From XML Presence Detection documentation:

1. Does this element appear in the listtags list?
- YES → Continue to question 1a
- NO → Continue to question 2
1a. Does the XML contain repeating elements or a single element with delimiters?
- Repeating elements (pfSense pattern: <priv>a</priv><priv>b</priv>) → Use []Type
- Single element with delimiters (OPNsense pattern: space/comma-separated in one tag) → Use string with custom parsing
2. Does element presence (vs absence) convey meaning?
- YES + boolean → BoolFlag
- YES + needs value → *string
- NO → string
3. Does upstream PHP use isset() or !empty()? → BoolFlag or *string
4. Does upstream PHP use == "1" value comparison? → string

Adding New Schema Fields#

When adding fields that might be listtags:

Check upstream pfSense/OPNsense PHP source (esp. xmlparse.inc)
Verify if element is in listtags array
Add field to appropriate schema struct with correct type ([]Type)
Add XML round-trip tests
Update validator if field has constraints
Document in development notes
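The "XML round-trip tests" step in the checklist above can be as small as marshaling a value and unmarshaling it again. The following is a sketch under stated assumptions: the Group struct is a trimmed, hypothetical version of a schema type, and roundTrip is a helper invented for this example rather than part of any project test suite.

```go
package main

import (
	"encoding/xml"
	"fmt"
	"reflect"
)

// Group is a trimmed, hypothetical schema struct with a listtag field.
type Group struct {
	Name string   `xml:"name"`
	Priv []string `xml:"priv,omitempty"`
}

// roundTrip marshals in to XML and unmarshals the result into a new value.
func roundTrip(in Group) (Group, error) {
	raw, err := xml.Marshal(in)
	if err != nil {
		return Group{}, err
	}
	var out Group
	if err := xml.Unmarshal(raw, &out); err != nil {
		return Group{}, err
	}
	return out, nil
}

func main() {
	in := Group{Name: "admins", Priv: []string{"page-all", "user-shell-access"}}
	out, err := roundTrip(in)
	if err != nil {
		fmt.Println("round trip failed:", err)
		return
	}
	// If Priv were declared as string, this comparison could not hold for
	// more than one privilege.
	fmt.Println("round trip preserved all fields:", reflect.DeepEqual(in, out))
}
```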

While pfSense and OPNsense share common ancestry and similar XML structures, their handling of certain listtag elements differs significantly. The pfSense schema documentation provides detailed coverage of these platform-specific differences.

Confirmed pfSense Implementations#

The pfSense schema correctly models these critical listtag elements as slices:

Group.Priv (pfsense/system.go) — Uses []string for repeating <priv> elements
System.DNSServers (pfsense/system.go) — Uses []string for multiple <dnsserver> elements

OPNsense Schema Gaps#

The OPNsense schema currently has these fields that may need review:

Group.Priv (opnsense/system.go:107) — Currently string, OPNsense may use space-separated format instead of repeating elements
Interface.Dnsserver (opnsense/interfaces.go:188) and DhcpdInterface.Dnsserver (opnsense/dhcp.go:117) — Currently string, OPNsense may use space-separated DNS servers

Decision Guidance: When to Use `[]string` vs `string`#

When implementing or reviewing schema fields that appear in pfSense's listtags array:

Check the actual XML structure — pfSense's listtags list is authoritative for pfSense, but OPNsense may have diverged
Examine real configuration files — Compare XML from both platforms to identify structural differences
pfSense pattern: Repeating XML elements (<priv>a</priv><priv>b</priv>) → use []string
OPNsense pattern: Some fields use space/comma-separated values in a single element → use string with custom parsing
Cross-reference the platform schema documentation:
- pfSense schema README for pfSense-specific patterns
- OPNsense Configuration Format for OPNsense divergence notes
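When a parser has to accept both layouts, one possible approach is a slice type whose UnmarshalXML appends instead of assigning. The following is a minimal sketch, not an opnDossier or pfSense API: PrivList, group, and parseGroup are hypothetical names, and the code assumes privilege names never contain whitespace.

```go
package main

import (
	"encoding/xml"
	"fmt"
	"strings"
)

// PrivList is a hypothetical type that tolerates both repeating <priv>
// elements and a single whitespace-separated <priv> element.
type PrivList []string

// UnmarshalXML runs once per <priv> element, so it appends to the list
// rather than replacing it.
func (p *PrivList) UnmarshalXML(d *xml.Decoder, start xml.StartElement) error {
	var content string
	if err := d.DecodeElement(&content, &start); err != nil {
		return err
	}
	*p = append(*p, strings.Fields(content)...)
	return nil
}

// group is a trimmed, hypothetical schema struct.
type group struct {
	Name string   `xml:"name"`
	Priv PrivList `xml:"priv"`
}

// parseGroup unmarshals a <group> fragment.
func parseGroup(doc string) (group, error) {
	var g group
	err := xml.Unmarshal([]byte(doc), &g)
	return g, err
}

func main() {
	docs := []string{
		`<group><name>admin</name><priv>user-shell-access</priv><priv>system-config</priv></group>`,
		`<group><name>admin</name><priv>user-shell-access system-config</priv></group>`,
	}
	for _, doc := range docs {
		g, err := parseGroup(doc)
		if err != nil {
			fmt.Println("parse error:", err)
			continue
		}
		fmt.Printf("%s: %q\n", g.Name, []string(g.Priv))
	}
}
```

Because the method appends, a struct value should not be reused across documents without resetting the field first.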

Usage Examples#

Example 1: Parsing VLANs (Container Pattern)#

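// XML (Config maps to the root element; <pfsense> is used here for illustration):
// <pfsense>
// <vlans>
// <vlan><if>em0</if><tag>100</tag></vlan>
// <vlan><if>em0</if><tag>200</tag></vlan>
// </vlans>
// </pfsense>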

type Config struct {
    VLANs VLANs `xml:"vlans"`
}

type VLANs struct {
    XMLName xml.Name `xml:"vlans"`
    VLAN []VLAN `xml:"vlan,omitempty"`
}

type VLAN struct {
    XMLName xml.Name `xml:"vlan"`
    If string `xml:"if,omitempty"`
    Tag string `xml:"tag,omitempty"`
}

// Parsing:
var config Config
if err := xml.Unmarshal(xmlData, &config); err != nil {
    return err // do not ignore malformed input
}
// config.VLANs.VLAN now contains both VLAN entries

Example 2: Parsing CAs (Top-Level Repeating)#

// XML:
// <pfsense>
// <ca><refid>1</refid><descr>Root CA</descr></ca>
// <ca><refid>2</refid><descr>Intermediate CA</descr></ca>
// </pfsense>

type Document struct {
    CAs []CertificateAuthority `xml:"ca"`
}

// Requires manual append in parser:
case "ca":
    var ca schema.CertificateAuthority
    if err := decodeSection(dec, &ca, se); err != nil {
        return err
    }
    doc.CAs = append(doc.CAs, ca)
    return nil

Example 3: Comma-Separated Interfaces#

// XML:
// <rule>
// <interface>wan,lan,opt1</interface>
// </rule>

type Rule struct {
    XMLName xml.Name `xml:"rule"`
    Interface InterfaceList `xml:"interface,omitempty"`
}

// InterfaceList custom type handles splitting
// Result: []string{"wan", "lan", "opt1"}

Example 4: pfSense vs OPNsense Group Privileges#

// OPNsense - may use space-separated privileges in a single element
type Group struct {
    Name string `xml:"name"`
    Priv string `xml:"priv,omitempty"` // "priv1 priv2 priv3"
}

// pfSense - uses repeating <priv> elements (listtag pattern)
type Group struct {
    Name string `xml:"name"`
    Priv []string `xml:"priv,omitempty"` // Multiple <priv> elements
}

// XML comparison:
// OPNsense: <group><name>admin</name><priv>user-shell-access system-config</priv></group>
// pfSense: <group><name>admin</name><priv>user-shell-access</priv><priv>system-config</priv></group>

// pfSense System struct with repeating <dnsserver> elements
type System struct {
    Hostname string `xml:"hostname"`
    DNSServers []string `xml:"dnsserver"` // Repeating elements
}

// XML:
// <system>
// <hostname>firewall</hostname>
// <dnsserver>8.8.8.8</dnsserver>
// <dnsserver>8.8.4.4</dnsserver>
// <dnsserver>2001:4860:4860::8888</dnsserver>
// </system>

// All three DNS servers captured in DNSServers slice

Relevant Code Files#

File	Purpose	Lines
OPNsense Schema
`pkg/schema/opnsense/opnsense.go`	Root document with all top-level XML tags	9-46
`pkg/schema/opnsense/interfaces.go`	VLANs, Bridges, GIF, GRE, LAGG patterns	208-240
`pkg/schema/opnsense/network.go`	Gateways/Gateway, StaticRoutes	29-79
`pkg/schema/opnsense/security.go`	InterfaceList custom type	10-58
`pkg/schema/opnsense/system.go`	Group.Priv field (potential gap)	107
`pkg/schema/opnsense/common.go`	BoolFlag type definition	14-55
pfSense Schema
`pkg/schema/pfsense/README.md`	Complete pfSense schema documentation, listtags reference	Full doc
`pkg/schema/pfsense/document.go`	pfSense root document structure	Full file
`pkg/schema/pfsense/system.go`	Group.Priv, System.DNSServers arrays	Full file
Parser Implementation
`internal/cfgparser/xml.go`	Parser switch-cases, manual append	132-252
`pkg/parser/opnsense/converter.go`	Main converter, convertVLANs	168-278
`pkg/parser/opnsense/converter_network.go`	Temp-variable-append pattern	11-154
Documentation
`GOTCHAS.md`	Section 3.2: XML Presence vs Absence, Section 3.3: Repeated XML Elements	50-65

XML Presence Detection: Related but distinct pattern using BoolFlag and *string types
Three-Layer Architecture: Schema-Parser-Converter synchronization requirements
Custom UnmarshalXML: Implementing custom XML parsing for non-standard formats
OPNsense Configuration Parsing: Broader context for listtags pattern usage
Silent Data Loss Prevention: Type safety strategies in statically-typed parsers
