Trace Contracts and TraceConfig

Overview

The trace contracts system in insideLLMs provides deterministic execution tracing and validation for LLM agent runs. It enables reproducible, CI-enforceable traces by recording ordered events, normalizing payloads, and validating traces against configurable contracts. The system is configured via YAML, supports fine-grained control over what is recorded and validated, and integrates with agent probes for automated testing and enforcement.

Purpose

Trace contracts ensure that agent executions are reproducible and conform to expected behaviors. They detect trace drift, enforce event boundaries, validate tool payloads and order, and support CI workflows by surfacing violations deterministically. This system is especially useful for regression testing, debugging, and enforcing invariants in agent pipelines.

TraceConfig Dataclasses

The trace contracts system is configured using a hierarchy of Python dataclasses, which can be loaded from YAML. The main dataclasses are:

TraceConfig: Top-level configuration, includes version, enabled flag, storage, fingerprinting, contracts, and violation handling.
TraceStoreConfig: Controls trace storage mode, payload inclusion, and legacy redaction.
FingerprintConfig: Controls trace fingerprinting and payload normalization.
TraceContractsConfig: Enables/disables specific contract validators and their options.
OnViolationConfig: Specifies what to do when a contract is violated (e.g., record or fail the probe).

Example structure:

@dataclass
class TraceConfig:
    version: int = 1
    enabled: bool = True
    store: TraceStoreConfig = field(default_factory=TraceStoreConfig)
    fingerprint: FingerprintConfig = field(default_factory=FingerprintConfig)
    contracts: TraceContractsConfig = field(default_factory=TraceContractsConfig)
    on_violation: OnViolationConfig = field(default_factory=OnViolationConfig)

Each nested config (e.g., TraceStoreConfig, TraceContractsConfig) has its own fields for fine-grained control.
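
To make the nesting concrete, here is a minimal, self-contained sketch of how such a hierarchy behaves. It is not the library's definition: the nested field names (mode, include_payloads) are borrowed from the YAML example in this document, their defaults are assumptions, and the Sketch suffix marks every class as illustrative.

```python
from dataclasses import dataclass, field


@dataclass
class StoreConfigSketch:
    # Field names come from the YAML example; the defaults are assumed.
    mode: str = "full"
    include_payloads: bool = True


@dataclass
class OnViolationConfigSketch:
    # "record" and "fail_probe" are the two modes shown in this document.
    mode: str = "record"


@dataclass
class TraceConfigSketch:
    version: int = 1
    enabled: bool = True
    # default_factory builds a fresh nested object for every instance.
    store: StoreConfigSketch = field(default_factory=StoreConfigSketch)
    on_violation: OnViolationConfigSketch = field(default_factory=OnViolationConfigSketch)


a = TraceConfigSketch()
b = TraceConfigSketch()
a.on_violation.mode = "fail_probe"  # changes a only; b keeps its own nested object
```

Using default_factory rather than a shared default instance means that mutating one config's nested section cannot leak into another config.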

YAML Configuration and Loading

Trace contracts are typically configured in YAML. The YAML is first parsed into a dict, which is then passed to the load_trace_config function. That function builds the TraceConfig dataclass, filling in defaults, handling backward compatibility, and parsing the nested sections.

Example YAML:

version: 1
enabled: true
store:
  mode: full
  include_payloads: true
contracts:
  enabled: true
  fail_fast: false
  stream_boundaries:
    enabled: true
  tool_payloads:
    enabled: true
    tools:
      calculator:
        args_schema:
          required: [x, y]
          properties:
            x: {type: integer}
            y: {type: integer}
fingerprint:
  enabled: true
  algorithm: sha256
  normaliser:
    kind: builtin
    name: structural_v1
    config:
      drop_keys: [request_id, response_id, created, timestamp, latency_ms]
      hash_paths: [result, raw]
      hash_strings_over: 512
on_violation:
  mode: record

Loading YAML:

import yaml

from insideLLMs.trace_config import load_trace_config

# Parse the YAML file into a plain dict, then build the TraceConfig from it.
with open("trace_config.yaml") as f:
    config_dict = yaml.safe_load(f)
config = load_trace_config(config_dict)
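
For intuition about what "handles defaults and nested configuration parsing" can mean, the following sketch builds a small config from a partial dict. It is a hypothetical stand-in, not the real load_trace_config: the class names, the defaults, and the choice to silently ignore unknown keys are all assumptions made for illustration.

```python
from dataclasses import dataclass, field, fields
from typing import Optional


@dataclass
class StoreSketch:
    mode: str = "full"             # assumed default
    include_payloads: bool = True  # assumed default


@dataclass
class ConfigSketch:
    version: int = 1
    enabled: bool = True
    store: StoreSketch = field(default_factory=StoreSketch)


def build(cls, raw: Optional[dict]):
    """Instantiate cls from raw: unknown keys are dropped, missing keys keep their defaults."""
    raw = raw or {}
    known = {f.name for f in fields(cls)}
    return cls(**{k: v for k, v in raw.items() if k in known})


def load_sketch(raw: Optional[dict]) -> ConfigSketch:
    """Hypothetical loader: parse the nested section first, then the top level."""
    raw = dict(raw or {})
    store = build(StoreSketch, raw.pop("store", None))
    return build(ConfigSketch, {**raw, "store": store})


cfg = load_sketch({"enabled": False, "store": {"mode": "full"}, "unknown_key": 1})
```

Here cfg.enabled comes from the input, while cfg.store.include_payloads falls back to the sketch's default because the input omits it.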

Compiling to Contracts

Once loaded, a TraceConfig can be compiled into validator inputs with its to_contracts() method. The result holds the internal structures needed for validation: tool schemas, tool order rules, a toggle for each validator, and the fail_fast flag.

contracts = config.to_contracts()
# contracts = {
# "tool_schemas": ...,
# "tool_order_rules": ...,
# "toggles": ...,
# "fail_fast": ...
# }
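
As a rough picture of this compilation step, the sketch below flattens a contracts mapping into the four keys listed above. It is an assumption-laden illustration, not the body of to_contracts(): only stream_boundaries and tool_payloads appear as YAML keys in this document, so the tool_order, generate_boundaries, and tool_results keys, the rules sub-key, and the default of "disabled when absent" are all hypothetical.

```python
from typing import Any


def compile_contracts_sketch(contracts_cfg: dict[str, Any]) -> dict[str, Any]:
    """Flatten a `contracts:` mapping into tool_schemas, tool_order_rules, toggles, fail_fast."""
    master = bool(contracts_cfg.get("enabled", True))
    # Only the first two names are YAML keys shown in this document; the rest are assumed.
    validator_names = [
        "stream_boundaries",
        "tool_payloads",
        "tool_order",
        "generate_boundaries",
        "tool_results",
    ]
    toggles = {
        name: master and bool(contracts_cfg.get(name, {}).get("enabled", False))
        for name in validator_names
    }
    tools = contracts_cfg.get("tool_payloads", {}).get("tools", {})
    return {
        "tool_schemas": {name: spec.get("args_schema", {}) for name, spec in tools.items()},
        "tool_order_rules": contracts_cfg.get("tool_order", {}).get("rules", {}),
        "toggles": toggles,
        "fail_fast": bool(contracts_cfg.get("fail_fast", False)),
    }


compiled = compile_contracts_sketch({
    "enabled": True,
    "fail_fast": False,
    "stream_boundaries": {"enabled": True},
    "tool_payloads": {
        "enabled": True,
        "tools": {"calculator": {"args_schema": {"required": ["x", "y"]}}},
    },
})
```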

Validation and the Agent Probe

The AgentProbe class integrates tracing and contract validation for agent runs. It lives in the probes package and is also exported from the top-level package, alongside the trace helpers:

from insideLLMs import AgentProbe, TraceConfig, load_trace_config, validate_with_config

Workflow

Load the trace configuration (from YAML/dict or TraceConfig).
Run the agent using AgentProbe, which records trace events using a TraceRecorder.
Payloads are normalized using TracePayloadNormaliser if configured.
A deterministic trace fingerprint is computed.
Trace events are validated using validate_with_config, which applies all enabled contracts.
Violations are collected and, if configured, the probe can fail on violation.

Key methods:

validate_with_config(events, config): Runs all enabled validators and returns a list of violations.
TracePayloadNormaliser: Drops noisy keys, hashes large blobs, and canonicalizes payloads for stable fingerprinting.
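
The idea behind normalization and fingerprinting can be sketched in a few lines. This is not the TracePayloadNormaliser implementation: it only mirrors the drop_keys and hash_strings_over options from the YAML example, leaves out hash_paths, and the function names and the "sha256:" prefix are hypothetical.

```python
import hashlib
import json
from typing import Any

# Values copied from the YAML example in this document.
DROP_KEYS = {"request_id", "response_id", "created", "timestamp", "latency_ms"}
HASH_STRINGS_OVER = 512


def _sha256(text: str) -> str:
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def normalise_sketch(value: Any) -> Any:
    """Recursively drop noisy keys and replace long strings with their hash."""
    if isinstance(value, dict):
        return {k: normalise_sketch(v) for k, v in value.items() if k not in DROP_KEYS}
    if isinstance(value, list):
        return [normalise_sketch(v) for v in value]
    if isinstance(value, str) and len(value) > HASH_STRINGS_OVER:
        return "sha256:" + _sha256(value)
    return value


def fingerprint_sketch(events: list) -> str:
    """Hash a canonical JSON form so key order and dropped fields do not matter."""
    canonical = json.dumps(
        [normalise_sketch(e) for e in events],
        sort_keys=True,
        separators=(",", ":"),
        ensure_ascii=False,
    )
    return _sha256(canonical)


# Same logical trace, different ids, timestamps, and key order.
run_a = [{"kind": "tool_call", "args": {"x": 1, "y": 2}, "request_id": "abc", "timestamp": 1}]
run_b = [{"timestamp": 99, "request_id": "zzz", "args": {"y": 2, "x": 1}, "kind": "tool_call"}]
same = fingerprint_sketch(run_a) == fingerprint_sketch(run_b)
```

Sorting keys and fixing the separators is what makes the serialized form canonical; without that, two equal payloads could hash differently.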

Example usage:

from insideLLMs import AgentProbe, load_trace_config

config = load_trace_config({...})  # or a dict parsed from YAML
tools = {...}  # dict of tool functions
probe = AgentProbe(tools=tools, trace_config=config)
result = probe.run(model, data)  # model and data are supplied by the caller

# Trace metadata lives under metadata["custom"]["trace"].
trace_meta = result.metadata.get("custom", {}).get("trace", {})
if trace_meta.get("violations"):
    print("Violations found:", trace_meta["violations"])

Violations and trace metadata are stored in the metadata["custom"]["trace"] field of the probe result. This includes the trace fingerprint, violations, tool sequence, and (optionally) the full trace events, depending on configuration.

For more advanced usage, you can use the TraceConfig, TracePayloadNormaliser, and validate_with_config helpers directly from the main package.

Example Configuration Files

Minimal configuration (contracts off):

enabled: false

Full validation with tool schema:

version: 1
enabled: true
contracts:
  enabled: true
  tool_payloads:
    enabled: true
    tools:
      search:
        args_schema:
          required: [query]
          properties:
            query: {type: string}

Fail probe on violation:

on_violation:
  mode: fail_probe
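
The difference between the two modes can be pictured with a short sketch. The exception class and function below are hypothetical, and the library's real failure mechanism may look different; the only parts taken from this document are the mode names record and fail_probe.

```python
class ProbeFailedSketch(Exception):
    """Hypothetical exception used only in this illustration."""


def apply_on_violation_sketch(violations: list, mode: str = "record") -> list:
    """Return the violations in record mode; raise in fail_probe mode if any exist."""
    if mode not in ("record", "fail_probe"):
        raise ValueError(f"unknown on_violation mode: {mode!r}")
    if mode == "fail_probe" and violations:
        raise ProbeFailedSketch(f"{len(violations)} trace contract violation(s)")
    return list(violations)


recorded = apply_on_violation_sketch(["chunk outside stream"], mode="record")
```

In record mode the run continues and the violations are kept for later inspection; in fail_probe mode the same list stops the run.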

Usage Scenarios

CI Enforcement: Enable all contracts and set on_violation.mode: fail_probe to fail tests if any trace contract is violated. This ensures regressions or unexpected agent behaviors are caught automatically.
Trace Drift Detection: Use trace fingerprints to detect changes in agent execution traces between runs. The CLI supports --fail-on-trace-drift to enforce this.
Debugging and Development: Enable selective contracts (e.g., only tool payloads or stream boundaries) to focus on specific invariants during agent development.
Payload Redaction and Normalization: Use the normalizer to drop or hash sensitive or noisy fields, ensuring trace fingerprints are stable and privacy-preserving.
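
The drift-detection scenario boils down to comparing fingerprints from a baseline run with those from the current run. The sketch below assumes fingerprints are stored per example id in a plain dict, which is an assumption about layout rather than something this document specifies; both function names are hypothetical, and the sketch does not reproduce what the CLI flag does internally.

```python
import hashlib
import json


def fingerprint_sketch(events: list) -> str:
    """Hash a canonical JSON form of the events (normalization omitted for brevity)."""
    canonical = json.dumps(events, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def drifted_sketch(baseline: dict, current: dict) -> list:
    """Return example ids whose fingerprint changed or that exist on only one side."""
    return sorted(
        key for key in baseline.keys() | current.keys()
        if baseline.get(key) != current.get(key)
    )


baseline = {
    "ex1": fingerprint_sketch([{"kind": "tool_call", "tool": "search"}]),
    "ex2": fingerprint_sketch([{"kind": "tool_call", "tool": "calculator"}]),
}
current = {
    "ex1": fingerprint_sketch([{"kind": "tool_call", "tool": "search"}]),
    "ex2": fingerprint_sketch([{"kind": "tool_call", "tool": "search"}]),
}
changed = drifted_sketch(baseline, current)
```

A CI job could fail whenever the returned list is non-empty.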

Trace Contracts and Validators

The system provides pure, deterministic validators for:

Stream boundaries: Ensures every stream_start has a matching stream_end, chunks are sequential, and no chunks appear outside boundaries.
Tool payloads: Validates tool call arguments against configured schemas (required fields, types).
Tool order: Enforces constraints on the order of tool calls (must precede/follow, forbidden sequences).
Generate boundaries: Ensures generate_start/generate_end events are properly paired and not nested.
Tool results: Ensures every tool call has a corresponding result and results do not appear before calls.
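
As one example of how such a validator can stay pure and deterministic, here is a sketch of a tool-order check over a list of tool names. The rule shapes (pairs for "must precede", lists for forbidden sequences), the function name, and the message format are assumptions for illustration and are not taken from the library.

```python
def check_tool_order_sketch(sequence: list, must_precede: list, forbidden: list) -> list:
    """Check a tool-name sequence against ordering rules; return a list of problems."""
    problems = []
    for before, after in must_precede:
        # Every call to `after` needs at least one earlier call to `before`.
        for i, tool in enumerate(sequence):
            if tool == after and before not in sequence[:i]:
                problems.append(f"{after!r} at position {i} is not preceded by {before!r}")
    for pattern in forbidden:
        n = len(pattern)
        if n == 0:
            continue  # an empty pattern would match everywhere, so skip it
        for i in range(len(sequence) - n + 1):
            if sequence[i:i + n] == list(pattern):
                problems.append(f"forbidden sequence {list(pattern)} at position {i}")
    return problems


order_ok = check_tool_order_sketch(["search", "summarise"], [("search", "summarise")], [])
order_bad = check_tool_order_sketch(["summarise", "search"], [("search", "summarise")], [])
```

The function reads only its arguments and returns a new list, so the same input always produces the same problems in the same order.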

Violations are reported as Violation objects with code, event sequence, detail, and context.
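
A stream-boundary check that reports findings in this shape could look roughly like the sketch below. The event layout (kind, seq, and chunk_index keys), the violation codes, and the ViolationSketch class are all assumptions for illustration; only the four reported attributes follow the description above.

```python
from dataclasses import dataclass, field


@dataclass
class ViolationSketch:
    code: str
    event_seq: int
    detail: str
    context: dict = field(default_factory=dict)


def check_stream_boundaries_sketch(events: list) -> list:
    """Single pass over events; returns violations in the order they are encountered."""
    violations = []
    open_seq = None  # seq of the currently open stream_start, if any
    next_index = 0   # chunk index expected next inside the open stream
    for ev in events:
        kind, seq = ev["kind"], ev["seq"]
        if kind == "stream_start":
            if open_seq is not None:
                violations.append(ViolationSketch("STREAM_NESTED", seq, "stream_start inside an open stream"))
            open_seq, next_index = seq, 0
        elif kind == "stream_chunk":
            if open_seq is None:
                violations.append(ViolationSketch("CHUNK_OUTSIDE_STREAM", seq, "chunk without an open stream"))
                continue
            got = ev.get("chunk_index")
            if got != next_index:
                violations.append(ViolationSketch(
                    "CHUNK_OUT_OF_ORDER", seq, "chunk index is not sequential",
                    {"expected": next_index, "got": got},
                ))
            next_index += 1
        elif kind == "stream_end":
            if open_seq is None:
                violations.append(ViolationSketch("STREAM_END_WITHOUT_START", seq, "stream_end without stream_start"))
            open_seq = None
    if open_seq is not None:
        violations.append(ViolationSketch("STREAM_NOT_CLOSED", open_seq, "stream_start has no matching stream_end"))
    return violations


well_formed = [
    {"kind": "stream_start", "seq": 0},
    {"kind": "stream_chunk", "seq": 1, "chunk_index": 0},
    {"kind": "stream_chunk", "seq": 2, "chunk_index": 1},
    {"kind": "stream_end", "seq": 3},
]
no_violations = check_stream_boundaries_sketch(well_formed)
```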

CLI Integration

Trace contracts are integrated into the CLI. The --fail-on-trace-violations and --fail-on-trace-drift flags enforce trace validation and drift detection in automated workflows. All trace data is stored in ResultRecord.custom, so the result schema does not need to change.

Testing and Extensibility

The trace contracts system is covered by unit and integration tests. It is designed to be extensible: new contracts and normalizers can be added as needed.

For further details, see the implementation in trace_config.py, trace_contracts.py, and probes/agent_probe.py.