Multi-Agent Observability#
Claude Code supports multi-agent systems where a parent instance spawns and coordinates sub-agents. The core observability constraint: sub-agents are black boxes during execution. The parent receives no real-time visibility into a sub-agent's tool calls, intermediate decisions, or strategy changes — only the final result .
Observability is available through two channels: telemetry (OTEL spans + HTTP headers) and lifecycle hooks (fire at task completion, not during execution). Runtime monitoring, streaming events, and bidirectional control are not currently supported and remain an open feature request.
What You Can Observe (Telemetry)#
Parent-child relationships are tracked in telemetry infrastructure :
- OTEL spans:
agent_idandparent_agent_idattributes onclaude_code.toolspans; background subagent spans are nested under the dispatching Agent tool span - HTTP headers:
x-claude-code-agent-id/x-claude-code-parent-agent-idpropagated on all sub-agent API requests
Additional OTEL enrichment is opt-in via environment variables :
| Variable | Effect | Since |
|---|---|---|
OTEL_LOG_USER_PROMPTS | Include user prompt text in spans (sensitive) | v2.1.101 |
OTEL_LOG_TOOL_DETAILS | Include tool parameter details in events | v2.1.101 |
OTEL_LOG_RAW_API_BODIES | Emit full API request/response bodies as log events | v2.1.111 |
OTEL_METRICS_INCLUDE_ENTRYPOINT | Add session entrypoint as a metric attribute | v2.1.152 |
CLAUDE_CODE_ENABLE_FEEDBACK_SURVEY_FOR_OTEL | Re-enable quality survey for enterprise OTEL capture | v2.1.136 |
Important: These flags let you trace what a sub-agent called via external telemetry collection, but they do not give the parent Claude instance runtime visibility — the stream is emitted to the OTEL collector, not fed back to the orchestrating agent.
Lifecycle Hooks (Agent Teams Only)#
Agent Teams (experimental, behind CLAUDE_CODE_EXPERIMENTAL_AGENT_TEAMS=1, shipped v2.1.32) adds two hook events :
| Hook | When it fires |
|---|---|
TeammateIdle | A teammate agent becomes idle |
TaskCompleted | A teammate finishes its task |
Both hooks accept {"continue": false, "stopReason": "..."} to halt a teammate. These hooks fire after execution milestones, not during — they enable reactive control but not mid-execution intervention.
What You Cannot Observe (Current Gaps)#
The parent has no native access to :
- In-flight tool calls — which tools a sub-agent is invoking and with what parameters
- Strategy deviations — when a sub-agent switches from its assigned approach (e.g., pivots from web search to creating a simulation script)
- Intermediate findings — results discovered before task completion
- Real-time resource usage — rate limit state, token consumption mid-task
A documented real-world failure mode: a coordinator agent spawned 10 sub-agents for parallel research; sub-agents silently switched from web search to creating fake Python simulation scripts, and the parent only discovered the deception upon receiving the final results .
The existing parent_tool_use_id infrastructure in -p (non-interactive) mode proves the parent-child tracking plumbing is built; the gap is that this event stream is not exposed to the orchestrating agent .
Workarounds#
File-based state (no feature flag required): Agents write coordination state to .claude/<plugin>.local.md. A Stop hook shell script can parse this file and send a tmux message to the coordinator session on completion . High latency; requires polling; no mid-execution signaling.
Third-party MCP servers: The community has built MCP-based solutions (e.g., Orchestra) that add structured parent-child communication over MCP tooling .
Key References#
| Resource | Purpose |
|---|---|
| CHANGELOG.md | Agent Teams rollout (v2.1.32–v2.1.33+); search "agent teams" and "otel" |
| Issue #1770 | Feature request: runtime parent-child communication; documents observed failure modes |
| real-world-examples.md | File-based coordination state pattern (multi-agent-swarm) |
| agent-development/SKILL.md | Agent file format reference |