Overview
The trace contracts system in insideLLMs provides deterministic execution tracing and validation for LLM agent runs. It enables reproducible, CI-enforceable traces by recording ordered events, normalizing payloads, and validating traces against configurable contracts. The system is configured via YAML, supports fine-grained control over what is recorded and validated, and integrates with agent probes for automated testing and enforcement.
Purpose
Trace contracts ensure that agent executions are reproducible and conform to expected behaviors. They detect trace drift, enforce event boundaries, validate tool payloads and order, and support CI workflows by surfacing violations deterministically. This system is especially useful for regression testing, debugging, and enforcing invariants in agent pipelines.
TraceConfig Dataclasses
The trace contracts system is configured using a hierarchy of Python dataclasses, which can be loaded from YAML. The main dataclasses are:
- TraceConfig: Top-level configuration, includes version, enabled flag, storage, fingerprinting, contracts, and violation handling.
- TraceStoreConfig: Controls trace storage mode, payload inclusion, and legacy redaction.
- FingerprintConfig: Controls trace fingerprinting and payload normalization.
- TraceContractsConfig: Enables/disables specific contract validators and their options.
- OnViolationConfig: Specifies what to do when a contract is violated (e.g., record or fail the probe).
Example structure (source):
@dataclass
class TraceConfig:
    version: int = 1
    enabled: bool = True
    store: TraceStoreConfig = field(default_factory=TraceStoreConfig)
    fingerprint: FingerprintConfig = field(default_factory=FingerprintConfig)
    contracts: TraceContractsConfig = field(default_factory=TraceContractsConfig)
    on_violation: OnViolationConfig = field(default_factory=OnViolationConfig)
Each nested config (e.g., TraceStoreConfig, TraceContractsConfig) has its own fields for fine-grained control.
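The nesting pattern can be tried out with a reduced stand-in. The sketch below is not the library's code: the class names ending in Sketch are hypothetical, and the field names and defaults are assumptions taken from the keys used in the YAML examples. It shows why each nested section uses default_factory: every config instance gets its own nested objects.

```python
from dataclasses import dataclass, field, asdict


@dataclass
class StoreSketch:
    # Field names mirror the `store` keys in the YAML example; defaults are assumed.
    mode: str = "full"
    include_payloads: bool = True


@dataclass
class OnViolationSketch:
    # "record" and "fail_probe" are the two modes shown in this document.
    mode: str = "record"


@dataclass
class TraceConfigSketch:
    version: int = 1
    enabled: bool = True
    # default_factory builds a fresh nested config per instance,
    # so mutating one config never leaks into another.
    store: StoreSketch = field(default_factory=StoreSketch)
    on_violation: OnViolationSketch = field(default_factory=OnViolationSketch)


a = TraceConfigSketch()
b = TraceConfigSketch(on_violation=OnViolationSketch(mode="fail_probe"))
a.store.include_payloads = False  # only affects `a`

print(asdict(a))
print(asdict(b))
```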
YAML Configuration and Loading
Trace contracts are typically configured in YAML. The configuration is loaded and parsed into the TraceConfig dataclass using the load_trace_config function. This function handles defaults, backward compatibility, and nested configuration parsing (source).
Example YAML:
version: 1
enabled: true
store:
  mode: full
  include_payloads: true
contracts:
  enabled: true
  fail_fast: false
  stream_boundaries:
    enabled: true
  tool_payloads:
    enabled: true
    tools:
      calculator:
        args_schema:
          required: [x, y]
          properties:
            x: {type: integer}
            y: {type: integer}
fingerprint:
  enabled: true
  algorithm: sha256
  normaliser:
    kind: builtin
    name: structural_v1
    config:
      drop_keys: [request_id, response_id, created, timestamp, latency_ms]
      hash_paths: [result, raw]
      hash_strings_over: 512
on_violation:
  mode: record
Loading YAML:
import yaml

from insideLLMs.trace_config import load_trace_config

# Parse the YAML file into a plain dict, then build the TraceConfig from it.
with open("trace_config.yaml") as f:
    config_dict = yaml.safe_load(f)
config = load_trace_config(config_dict)
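To make the "handles defaults and nested configuration parsing" step concrete, here is a minimal sketch of a dict-to-dataclass loader. It is an illustration under stated assumptions, not the body of load_trace_config: from_dict and the Sketch classes are hypothetical names, the defaults are assumed, and backward-compatibility handling is left out.

```python
from dataclasses import dataclass, field, fields, is_dataclass


@dataclass
class FingerprintSketch:
    enabled: bool = True
    algorithm: str = "sha256"


@dataclass
class ContractsSketch:
    enabled: bool = True
    fail_fast: bool = False


@dataclass
class ConfigSketch:
    version: int = 1
    enabled: bool = True
    fingerprint: FingerprintSketch = field(default_factory=FingerprintSketch)
    contracts: ContractsSketch = field(default_factory=ContractsSketch)


def from_dict(cls, data):
    """Build `cls` from a plain dict, keeping defaults for missing keys."""
    obj = cls()  # start from the defaults
    for f in fields(cls):
        if f.name not in data:
            continue
        current, value = getattr(obj, f.name), data[f.name]
        if is_dataclass(current) and isinstance(value, dict):
            # Nested section: recurse using the type of the default object.
            value = from_dict(type(current), value)
        setattr(obj, f.name, value)
    return obj


# The kind of dict yaml.safe_load returns for a file that sets one option.
parsed = {"contracts": {"fail_fast": True}}
config = from_dict(ConfigSketch, parsed)
print(config)
```

Keys that do not match a field are silently ignored in this sketch; a stricter loader could raise on them instead.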
Compiling to Contracts
Once loaded, the TraceConfig can be compiled into validator inputs using the to_contracts() method. This produces the internal structures needed for validation: tool schemas, tool order rules, toggles for each validator, and the fail_fast flag (source).
contracts = config.to_contracts()
# contracts = {
# "tool_schemas": ...,
# "tool_order_rules": ...,
# "toggles": ...,
# "fail_fast": ...
# }
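The shape of each value is not spelled out here, so the following is only a sketch of what such a compile step could look like when applied to a plain config dict. The function name, the validator key names other than stream_boundaries and tool_payloads, the "rules" key, and the choice to treat a missing section as disabled are all assumptions.

```python
# Hypothetical validator section names; only the first two appear in the YAML example.
VALIDATORS = (
    "stream_boundaries",
    "tool_payloads",
    "tool_order",
    "generate_boundaries",
    "tool_results",
)


def compile_contracts_sketch(cfg):
    """Flatten a config dict into the four validator inputs named above."""
    contracts = cfg.get("contracts", {})
    tools = contracts.get("tool_payloads", {}).get("tools", {})
    return {
        # One argument schema per tool name.
        "tool_schemas": {name: spec.get("args_schema", {}) for name, spec in tools.items()},
        # "rules" is an assumed key; the real layout is not shown in this document.
        "tool_order_rules": contracts.get("tool_order", {}).get("rules", []),
        # A validator counts as on only if its section says enabled: true (assumption).
        "toggles": {v: bool(contracts.get(v, {}).get("enabled", False)) for v in VALIDATORS},
        "fail_fast": bool(contracts.get("fail_fast", False)),
    }


cfg = {
    "contracts": {
        "fail_fast": False,
        "stream_boundaries": {"enabled": True},
        "tool_payloads": {
            "enabled": True,
            "tools": {"calculator": {"args_schema": {"required": ["x", "y"]}}},
        },
    }
}
compiled = compile_contracts_sketch(cfg)
print(compiled["toggles"])
```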
Validation and the Agent Probe
The AgentProbe class integrates tracing and contract validation for agent runs. It lives in the probes package and is also exported from the top-level package:
from insideLLMs import AgentProbe, TraceConfig, load_trace_config, validate_with_config
Workflow
- Load the trace configuration (from YAML/dict or TraceConfig).
- Run the agent using AgentProbe, which records trace events using a TraceRecorder.
- Payloads are normalized using TracePayloadNormaliser if configured.
- A deterministic trace fingerprint is computed.
- Trace events are validated using validate_with_config, which applies all enabled contracts.
- Violations are collected and, if configured, the probe can fail on violation.
Key helpers:
- validate_with_config(events, config): Runs all enabled validators and returns a list of violations.
- TracePayloadNormaliser: Drops noisy keys, hashes large blobs, and canonicalizes payloads for stable fingerprinting.
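The normalise-then-fingerprint idea can be sketched with the standard library alone. This is not TracePayloadNormaliser itself: the function names and the event fields (kind, tool, args) are assumptions, the drop list and length threshold are copied from the example YAML, and hash_paths is not implemented.

```python
import hashlib
import json

DROP_KEYS = {"request_id", "response_id", "created", "timestamp", "latency_ms"}
HASH_STRINGS_OVER = 512


def _sha256(text):
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def normalise(value):
    """Recursively drop noisy keys and replace long strings with their hash."""
    if isinstance(value, dict):
        return {k: normalise(v) for k, v in value.items() if k not in DROP_KEYS}
    if isinstance(value, list):
        return [normalise(v) for v in value]
    if isinstance(value, str) and len(value) > HASH_STRINGS_OVER:
        return "sha256:" + _sha256(value)
    return value


def fingerprint(events):
    """Hash a canonical JSON encoding (sorted keys, fixed separators)."""
    canonical = json.dumps(normalise(events), sort_keys=True, separators=(",", ":"))
    return _sha256(canonical)


# Same logical trace, different key order and timestamps.
run_a = [{"kind": "tool_call", "tool": "calculator", "args": {"x": 1, "y": 2}, "timestamp": 1700000000}]
run_b = [{"tool": "calculator", "kind": "tool_call", "args": {"y": 2, "x": 1}, "timestamp": 1700000999}]
# A genuinely different call.
run_c = [{"kind": "tool_call", "tool": "calculator", "args": {"x": 5, "y": 2}, "timestamp": 1700000000}]

print(fingerprint(run_a) == fingerprint(run_b))
print(fingerprint(run_a) == fingerprint(run_c))
```

Sorting keys and fixing separators is what makes the encoding canonical: two payloads that differ only in key order or in dropped fields hash to the same value.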
Example usage:
from insideLLMs import AgentProbe, load_trace_config

config = load_trace_config({...})  # or from YAML
tools = {...}  # dict of tool functions
probe = AgentProbe(tools=tools, trace_config=config)
result = probe.run(model, data)
if result.metadata.get("custom", {}).get("trace", {}).get("violations"):
    print("Violations found:", result.metadata["custom"]["trace"]["violations"])
Violations and trace metadata are stored in the metadata["custom"]["trace"] field of the probe result. This includes the trace fingerprint, violations, tool sequence, and (optionally) the full trace events, depending on configuration.
For more advanced usage, you can use the TraceConfig, TracePayloadNormaliser, and validate_with_config helpers directly from the main package.
Example Configuration Files
Minimal configuration (contracts off):
enabled: false
Full validation with tool schema:
version: 1
enabled: true
contracts:
  enabled: true
  tool_payloads:
    enabled: true
    tools:
      search:
        args_schema:
          required: [query]
          properties:
            query: {type: string}
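As a rough illustration of what checking a tool call against such an args_schema involves, the sketch below validates required fields and basic types. It is a hand-rolled stand-in, not the library's validator: check_args, TYPE_MAP, and the message wording are hypothetical.

```python
# Schema for the `search` tool from the YAML above, as a plain dict.
SCHEMA = {"required": ["query"], "properties": {"query": {"type": "string"}}}

# Assumed mapping from schema type names to Python types.
TYPE_MAP = {
    "string": str,
    "integer": int,
    "number": (int, float),
    "boolean": bool,
    "object": dict,
    "array": list,
}


def check_args(args, schema):
    """Return a list of human-readable problems; an empty list means the call is valid."""
    problems = []
    for name in schema.get("required", []):
        if name not in args:
            problems.append(f"missing required argument: {name}")
    for name, spec in schema.get("properties", {}).items():
        if name not in args:
            continue
        expected = TYPE_MAP.get(spec.get("type"))
        if expected is None:
            continue  # unknown or absent type: nothing to check
        value = args[name]
        # bool is a subclass of int in Python, so reject it for numeric types.
        bool_as_number = isinstance(value, bool) and spec["type"] in ("integer", "number")
        if not isinstance(value, expected) or bool_as_number:
            problems.append(
                f"argument {name!r} should be {spec['type']}, got {type(value).__name__}"
            )
    return problems


print(check_args({"query": "trace contracts"}, SCHEMA))
print(check_args({}, SCHEMA))
print(check_args({"query": 42}, SCHEMA))
```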
Fail probe on violation:
on_violation:
  mode: fail_probe
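The difference between the two modes can be sketched as follows. This is an assumption about the behaviour, not the probe's actual code: ProbeFailed and apply_on_violation are hypothetical names, and whether the real probe raises or marks the result as failed is not stated here.

```python
class ProbeFailed(Exception):
    """Hypothetical error type used only in this sketch."""


def apply_on_violation(mode, violations, trace_meta):
    """Record violations in the trace metadata; in fail_probe mode, also fail."""
    trace_meta["violations"] = list(violations)  # recorded in both modes
    if mode == "fail_probe" and violations:
        raise ProbeFailed(f"{len(violations)} trace contract violation(s)")
    return trace_meta


# record mode: violations are kept, execution continues.
meta = apply_on_violation("record", ["chunk outside stream"], {})
print(meta)

# fail_probe mode: the same violations stop the run.
try:
    apply_on_violation("fail_probe", ["chunk outside stream"], {})
except ProbeFailed as exc:
    print("probe failed:", exc)
```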
Usage Scenarios
- CI Enforcement: Enable all contracts and set on_violation.mode: fail_probe to fail tests if any trace contract is violated. This ensures regressions or unexpected agent behaviors are caught automatically.
- Trace Drift Detection: Use trace fingerprints to detect changes in agent execution traces between runs. The CLI supports --fail-on-trace-drift to enforce this.
- Debugging and Development: Enable selective contracts (e.g., only tool payloads or stream boundaries) to focus on specific invariants during agent development.
- Payload Redaction and Normalization: Use the normalizer to drop or hash sensitive or noisy fields, ensuring trace fingerprints are stable and privacy-preserving.
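For the drift-detection scenario, the comparison itself is simple once fingerprints exist. The sketch below assumes fingerprints are kept as a mapping from an example id to a fingerprint string; find_drift, the ids, and the shortened fingerprint values are hypothetical, and how the CLI stores or compares baselines is not described here.

```python
def find_drift(baseline, current):
    """Map example id -> (baseline fingerprint, current fingerprint) for every mismatch."""
    drift = {}
    for example_id in sorted(baseline.keys() | current.keys()):
        old, new = baseline.get(example_id), current.get(example_id)
        if old != new:
            drift[example_id] = (old, new)
    return drift


baseline = {"ex-1": "ab12", "ex-2": "cd34"}
current = {"ex-1": "ab12", "ex-2": "ffff", "ex-3": "9a9a"}

drift = find_drift(baseline, current)
print(drift)
```

A CI job could exit with a non-zero status whenever the returned mapping is non-empty. Examples that appear in only one of the two runs are reported with None on the missing side.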
Trace Contracts and Validators
The system provides pure, deterministic validators for:
- Stream boundaries: Ensures every stream_start has a matching stream_end, chunks are sequential, and no chunks appear outside boundaries.
- Tool payloads: Validates tool call arguments against configured schemas (required fields, types).
- Tool order: Enforces constraints on the order of tool calls (must precede/follow, forbidden sequences).
- Generate boundaries: Ensures generate_start/generate_end events are properly paired and not nested.
- Tool results: Ensures every tool call has a corresponding result and results do not appear before calls.
Violations are reported as Violation objects with code, event sequence, detail, and context (source).
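A pure validator of this kind takes a list of events and returns a list of violations, with no side effects. The sketch below does this for the stream-boundary rules. It is illustrative only: the event layout (dicts with seq, kind, and index), the stream_chunk event name, the violation codes, and the ViolationSketch class are assumptions; only the four violation fields come from the description above.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class ViolationSketch:
    code: str
    event_seq: int
    detail: str
    context: dict = field(default_factory=dict)


def check_stream_boundaries(events):
    """Check start/end pairing, chunk order, and chunks outside a stream."""
    violations = []
    open_seq = None  # seq of the currently open stream_start, if any
    next_chunk = 0   # chunk index expected next
    for ev in events:
        kind, seq = ev["kind"], ev["seq"]
        if kind == "stream_start":
            if open_seq is not None:
                violations.append(ViolationSketch(
                    "STREAM_NESTED", seq, "stream_start while a stream is open",
                    {"open_seq": open_seq}))
            open_seq, next_chunk = seq, 0
        elif kind == "stream_chunk":
            if open_seq is None:
                violations.append(ViolationSketch(
                    "CHUNK_OUTSIDE_STREAM", seq, "chunk outside stream boundaries"))
            else:
                if ev.get("index") != next_chunk:
                    violations.append(ViolationSketch(
                        "CHUNK_OUT_OF_ORDER", seq, "unexpected chunk index",
                        {"expected": next_chunk, "got": ev.get("index")}))
                next_chunk += 1
        elif kind == "stream_end":
            if open_seq is None:
                violations.append(ViolationSketch(
                    "STREAM_END_WITHOUT_START", seq, "stream_end without stream_start"))
            open_seq = None
    if open_seq is not None:
        violations.append(ViolationSketch(
            "STREAM_NOT_CLOSED", open_seq, "stream_start has no matching stream_end"))
    return violations


good = [
    {"seq": 0, "kind": "stream_start"},
    {"seq": 1, "kind": "stream_chunk", "index": 0},
    {"seq": 2, "kind": "stream_chunk", "index": 1},
    {"seq": 3, "kind": "stream_end"},
]
bad = [
    {"seq": 0, "kind": "stream_chunk", "index": 0},
    {"seq": 1, "kind": "stream_start"},
    {"seq": 2, "kind": "stream_chunk", "index": 3},
]
print(check_stream_boundaries(good))
for violation in check_stream_boundaries(bad):
    print(violation.code, violation.event_seq)
```

Because the function depends only on its input, running it twice on the same events yields the same violations in the same order, which is what makes the result usable in CI.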
CLI Integration
Trace contracts are integrated into the CLI. Flags like --fail-on-trace-violations and --fail-on-trace-drift allow enforcing trace validation and drift detection in automated workflows. All trace data is stored in ResultRecord.custom to avoid schema changes (source).
Testing and Extensibility
The trace contracts system is covered by unit and integration tests. It is designed to be extensible: new contracts and normalizers can be added as needed.
For further details, see the implementation in trace_config.py, trace_contracts.py, and probes/agent_probe.py.