DaemonEye Spec – SQL-to-IPC Detection Architecture#
Status: Draft v0.2 | Owner: EvilBit Labs / DaemonEye Team | Scope: Collector–Agent contract for SQL-driven detection across stream + database layers | Applies To: procmond (and other collectors), daemoneye-agent, daemoneye-lib, daemoneye-cli
Mirror notice: This page mirrors `spec/daemon_eye_spec_sql_to_ipc_detection_architecture.md` in the DaemonEye repository. The repository copy is authoritative. Last sync: 2026-04-17.
Superseded sections: §9.1–§9.4 are superseded by ADR-0006 — Detection Query Execution (redb + DataFusion). See §9 for the full supersede note.
1) Purpose & Non-Goals#
Purpose. Define a precise model for how SQL rules are translated into collector instructions and evaluated to produce alerts. This spec standardizes the SQL → IPC → storage → Alert path so collectors and the agent evolve independently without breaking rules.
Non-Goals#
- Building a general RDBMS. The agent executes a constrained, read-only SQL dialect for detections.
- Requiring kernel hooks. Baseline collectors may be user-mode (e.g., `sysinfo`), with advanced sources added by tier (eBPF/ETW/EndpointSecurity).
2) Mental Model (Virtual Database)#
Each collector contributes a virtual schema—logical, read-only tables backed by event streams. The agent exposes a global catalog (namespaced) used by rules. Examples:
- `processes.*` (e.g., `processes.snapshots`, `processes.events.exec`)
- `network.*` (e.g., `network.connections`)
- `kernel.*` (e.g., `kernel.image_loads`)
Collectors "insert" records into these tables by streaming IPC messages to the agent. The agent persists events in a lightweight embedded store and runs validated SQL rules over the persisted store only when the entire rule cannot be pushed down to the collector. If the collector's data exactly matches the rule result, it can be passed directly to the alert system without alteration (write-through).
Audit Trail: The collector is responsible for logging matches in the write-only audit log using the collector-core framework, independent of the agent's further analysis.
Key Architectural Principles#
- Logical Views: SQL rules are treated as logical views, not materialized tables
- Pushdown Optimization: Simple predicates and projections are pushed to collectors
- Operator Pipeline: Complex operations (joins, aggregations) execute in the agent's operator pipeline
- Bounded Memory: All operations use cardinality caps and time windows to prevent unbounded growth
3) End-to-End Flow#
Two Layers#
- Stream Layer (IPC) — Low-latency filtering; collectors push only the needed records.
- Operator Pipeline Layer — SQL parsed into logical plans, mapped onto scan/query operators over redb. This enables joins, grouping, time windows, and dedupe.
4) Query Lifecycle & Pushdown#
Pushdown Rules#
Agent derives from the AST:
- Tables/aliases → which collectors to engage
- Columns → projection list for each collector
- Predicates → pre-filter on collector side when feasible
- Windows/Aggregations/Joins → kept for the agent's operator pipeline
If a collector cannot push down a predicate, it sends its minimal projection and the agent filters on ingest.
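The split above can be sketched as a pure function of the rule's predicates and the collector's advertised capability set. This is an illustrative sketch — `Predicate` and `split_predicates` are hypothetical names, not the planner's real types:

```rust
use std::collections::HashSet;

#[derive(Debug, Clone, PartialEq)]
struct Predicate {
    column: String,
    op: String, // "=", "LIKE", "REGEXP", ...
    value: String,
}

/// Partition predicates: those whose operator the collector advertises are
/// pushed down; the rest stay in the agent's ingest-time filter.
fn split_predicates(
    preds: Vec<Predicate>,
    collector_ops: &HashSet<&str>,
) -> (Vec<Predicate>, Vec<Predicate>) {
    preds.into_iter().partition(|p| collector_ops.contains(p.op.as_str()))
}

fn main() {
    let caps: HashSet<&str> = ["=", "!=", "IN", "LIKE"].into_iter().collect();
    let preds = vec![
        Predicate { column: "name".into(), op: "=".into(), value: "minidump.exe".into() },
        Predicate { column: "name".into(), op: "REGEXP".into(), value: "(?i)malware".into() },
    ];
    let (pushed, agent_side) = split_predicates(preds, &caps);
    assert_eq!(pushed.len(), 1);
    assert_eq!(agent_side[0].op, "REGEXP");
    println!("pushdown: {pushed:?}; agent-side: {agent_side:?}");
}
```

The capability set comes from the collector's descriptor (§4.1), which is the upper bound for planning.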
4.1) Collector Schema Contract (Virtual Catalog)#
Purpose#
Standardize how collectors announce their tables, columns, and pushdown capabilities so the agent can plan rules deterministically.
Descriptor (sent at startup & on change)#
```json
{
  "collector_id": "<name>@<platform-arch>",
  "version": "semver",
  "tables": [
    {
      "name": "<domain>.<table>",
      "columns": {
        "col": "type[?]",
        "collection_time": "datetime"
      },
      "keys": {
        "primary": ["collection_time", "seq"],
        "secondary": ["pid", "ppid", "name"]
      },
      "join_hints": [
        {"left": "ppid", "right_table": "processes.snapshots", "right": "pid", "relation": "parent_of"}
      ],
      "pushdown": {
        "predicates": ["=", "!=", ">", "<", ">=", "<=", "IN", "LIKE", "REGEXP"],
        "project": true,
        "order_by": ["collection_time"],
        "limits": {"max_rate": 200000}
      }
    }
  ]
}
```
Agent Behavior#
- Validates schema → registers/updates global catalog
- Unknown optional columns are ignored; missing required columns reject the table
- Capabilities become the upper bound for pushdown during planning
Evolution#
- Additive columns allowed anytime
- Breaking changes require version bump and a deprecation grace period
Health & Metrics#
Collectors MUST expose: send rate, drops, queue depth, last ACK seq_no, and schema version.
4.2) Minimum Contracts by Collector Type#
Baseline tables/columns and pushdown minimums per collector. Enterprise variants may add columns; the baseline must be stable.
Processes Collector (procmond)#
Table: processes.snapshots — Required columns: pid:u32, ppid:u32?, name:string, executable_path:string?, command_line:string?, start_time:datetime?, executable_hash:hex256?, collection_time:datetime — Secondary keys: pid, ppid, name, executable_hash — Join hints: ppid -> processes.snapshots.pid (parent_of) — Pushdown: =, !=, IN, LIKE, REGEXP, projection; optional > >= < <= on numeric/time.
Network Collector#
Table: network.connections — Required: pid:u32?, proto:enum{tcp,udp}, src_ip:ip, src_port:u16, dst_ip:ip, dst_port:u16, state:enum?, bytes_sent:u64?, bytes_recv:u64?, collection_time:datetime — Secondary keys: pid, dst_ip, dst_port — Join hints: pid -> processes.snapshots.pid — Pushdown: equality on proto, IP/port, pid; LIKE/REGEXP on IPs/ports; projection.
Image/Module Load Collector#
Table: kernel.image_loads — Required: pid:u32?, image_path:string, image_hash:hex256?, collection_time:datetime — Secondary keys: pid, image_hash — Join hints: pid -> processes.snapshots.pid — Pushdown: equality on pid, image_hash; LIKE/REGEXP on image_path; projection.
File System Collector (optional baseline)#
Table: fs.events — Required: pid:u32?, op:enum{create,write,read,delete,rename}, path:string, size:i64?, collection_time:datetime — Secondary keys: pid, op — Pushdown: equality on op, pid; LIKE/REGEXP on path; projection.
Auth/Security Events Collector#
Table: security.auth — Required: user:string, host:string?, event:enum{logon,logoff,lock,unlock,auth_fail}, status:enum{success,fail}?, pid:u32?, collection_time:datetime — Secondary keys: user, event — Join hints: pid -> processes.snapshots.pid — Pushdown: equality on user, event, status; LIKE/REGEXP on user/host; projection.
General Requirements (All Collectors)#
- Clock: monotonic per host; include `clock_skew_ms` in heartbeat
- Sequence: `seq_no` strictly increasing per `task_id`
- Types: adhere to documented scalar types; `?` denotes optional
- Backpressure: respect agent credits; cap local buffers; report drops with counters
Regex Requirements (All Collectors)#
Regex engine with (?i), configurable per-pattern latency and memory bounds, DoS prevention, cache, compile-time validation.
Defaults (configurable): per-pattern latency 10ms, compilation timeout 100ms, memory limit 1MB per pattern, LRU cache size 1000.
Rejection criteria: exceeds compilation timeout, excessive memory, exponential backtracking, nested quantifiers beyond safe limits. Prefer RE2-like engines; pre-compile at startup.
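As a sketch, the defaults above might be carried in a config struct like the following (field names are assumptions, not DaemonEye's actual configuration keys):

```rust
use std::time::Duration;

// Illustrative container for the per-pattern resource limits described above.
#[derive(Debug, Clone)]
struct RegexLimits {
    per_pattern_latency: Duration, // max match time per pattern
    compile_timeout: Duration,     // reject patterns that compile too slowly
    memory_limit_bytes: usize,     // compiled-program size cap
    cache_capacity: usize,         // LRU cache of compiled patterns
}

impl Default for RegexLimits {
    fn default() -> Self {
        Self {
            per_pattern_latency: Duration::from_millis(10),
            compile_timeout: Duration::from_millis(100),
            memory_limit_bytes: 1 << 20, // 1 MiB
            cache_capacity: 1000,
        }
    }
}

fn main() {
    let limits = RegexLimits::default();
    assert_eq!(limits.memory_limit_bytes, 1_048_576);
    println!("{limits:?}");
}
```

RE2-style engines expose analogous knobs (e.g., a compile-time program size limit), which is one reason they are preferred here.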
4.7) Complex Pattern Matching (YARA Integration)#
YARA rules don't fit the simple key-value pushdown model. Handled via a specialized collector:
```json
{
  "collector_id": "yara-engine@linux-x86_64",
  "tables": [
    {
      "name": "yara.scan_results",
      "columns": {
        "file_path": "string",
        "rule_name": "string",
        "matches": "json",
        "scan_time": "datetime",
        "file_hash": "hex256"
      },
      "pushdown": {"predicates": ["=", "!=", "IN"], "project": true}
    }
  ]
}
```
Hybrid Detection#
```sql
SELECT p.pid, p.name, f.path, y.rule_name, y.matches
FROM processes.snapshots p
JOIN fs.events f ON f.pid = p.pid
JOIN yara.scan_results y ON y.file_path = f.path
WHERE p.name = 'rundll32.exe'
  AND f.path LIKE '/tmp/%'
  AND y.rule_name = 'suspicious_behavior'
  AND f.collection_time > datetime('now', '-30 minutes');
```
YARA scanning happens per-file in the collector; only scan results stream to agent. Sampling and rate-limiting at the collector.
4.8) Supplemental Rule Data for Specialty Collectors#
Specialty collectors (YARA, eBPF, network analysis, platform-specific) receive supplemental rule data alongside standard SQL pushdown. Example eBPF payload:
```json
{
  "task_id": "t_9482",
  "rule_id": "r_network_anomaly",
  "table": "network.connections",
  "project": ["pid", "dst_ip", "dst_port", "payload_hash"],
  "where": {"and": [{"eq": ["proto", "tcp"]}, {"gt": ["dst_port", 1024]}]},
  "supplemental_rules": {
    "ebpf_program": {
      "type": "ebpf",
      "version": "1.0",
      "program": "base64_encoded_ebpf_bytecode",
      "maps": {"suspicious_ips": "BPF_MAP_TYPE_HASH"},
      "attachments": [{"hook": "xdp", "interface": "eth0", "priority": 100}]
    }
  },
  "rate_limit": {"per_sec": 50000},
  "ttl_ms": 300000
}
```
Collector-core exposes a `SupplementalRuleData` enum (`YaraRules`, `NetworkAnalysis`, `PlatformSpecific`) and a `SpecialtyRuleEngine` trait. Results stream in a unified format with `supplemental_type` and `platform` fields for SQL-side filtering.
4.9) Dynamic Reactive Pipeline (Non-DAG Architecture)#
The detection pipeline is reactive, not a static DAG:
- Initial triggers from base collectors
- JOIN-driven collection: SQL JOINs trigger collection of the right-hand side
- Cascading analysis: results trigger additional analysis
- Dynamic JOINs: agent orchestrates cross-collector correlation
- Feedback loops
Example flow: process event → rule requires JOIN with `pe.analysis_results` → agent triggers PE collector → PE metadata arrives → JOIN result triggers YARA scan → final rule evaluation → alert.
The agent uses a `ReactiveOrchestrator` with `TriggerRules`, an analysis queue, and a `DynamicJoinResolver`. Analysis results are cached with TTL; concurrent analysis is capped; timeouts are enforced.
4.10) Automatic JOIN Triggers (Implicit Correlation)#
Dialect extensions for implicit correlation:
```sql
-- AUTO JOIN: always collect related data
SELECT p.pid, p.name, n.dst_ip, n.dst_port, n.protocol
FROM processes.snapshots p
AUTO JOIN network.connections n ON n.pid = p.pid
WHERE p.name = 'suspicious.exe';

-- Conditional AUTO JOIN with WHEN
SELECT p.pid, p.name, n.dst_ip
FROM processes.snapshots p
AUTO JOIN network.connections n ON n.pid = p.pid
WHEN p.name LIKE '%suspicious%' OR p.cpu_usage > 80.0;

-- Implicit correlation (no explicit FROM for secondary tables)
SELECT p.pid, p.name, n.dst_ip, f.path, m.memory_regions
FROM processes.snapshots p
WHERE p.name = 'malware.exe'
  AND n.dst_port = 4444          -- Implicit network collection
  AND f.path LIKE '%temp%'       -- Implicit fs collection
  AND m.patterns LIKE '%shellcode%'; -- Implicit memory collection
```
AutoJoinRule structs define source/target tables, join conditions, trigger conditions, and priority. Orchestrator processes events, matches against rules, triggers collection, integrates results into virtual joined records.
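The matching step described above might look like this simplified sketch — `AutoJoinRule` here is a toy, with a string-contains check standing in for the WHEN condition, not the orchestrator's real types:

```rust
#[derive(Debug)]
struct AutoJoinRule {
    source_table: &'static str,
    target_table: &'static str,
    join_key: &'static str,                  // e.g. "pid"
    trigger_name_like: Option<&'static str>, // simplified WHEN condition
    priority: u8,
}

/// Return the target tables to collect for an event on `table` whose `name`
/// satisfies each rule's (simplified) trigger, highest priority first.
fn triggered_targets(rules: &[AutoJoinRule], table: &str, name: &str) -> Vec<&'static str> {
    let mut hits: Vec<&AutoJoinRule> = rules
        .iter()
        .filter(|r| r.source_table == table)
        .filter(|r| r.trigger_name_like.map_or(true, |needle| name.contains(needle)))
        .collect();
    hits.sort_by(|a, b| b.priority.cmp(&a.priority));
    hits.into_iter().map(|r| r.target_table).collect()
}

fn main() {
    let rules = [AutoJoinRule {
        source_table: "processes.snapshots",
        target_table: "network.connections",
        join_key: "pid",
        trigger_name_like: Some("suspicious"),
        priority: 10,
    }];
    let t = triggered_targets(&rules, "processes.snapshots", "suspicious.exe");
    assert_eq!(t, vec!["network.connections"]);
}
```

In the real pipeline the trigger is an arbitrary lowered predicate, and the result feeds collection tasks rather than returning table names.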
SQL Parser Extensibility#
sqlparser is extended via a custom DaemonEyeDialect wrapping SQLiteDialect:
```rust
pub struct DaemonEyeDialect {
    base: SQLiteDialect,
    extensions: DaemonEyeExtensions,
}

impl Dialect for DaemonEyeDialect {
    fn is_identifier_start(&self, ch: char) -> bool { self.base.is_identifier_start(ch) }
    fn is_identifier_part(&self, ch: char) -> bool { self.base.is_identifier_part(ch) || ch == '_' }
    fn supports_filter_during_aggregation(&self) -> bool { self.base.supports_filter_during_aggregation() }
}
```
AUTO JOIN, WHEN clauses, and implicit correlation are compile-time directives: they (a) emit additional protobuf collection tasks and (b) lower to standard SQL in the Phase 2 derived query.
9) Storage & Execution Model#
Superseded by ADR-0006 — Detection Query Execution (redb + DataFusion) (2026-04-17).
The "custom operator pipeline" direction in §9.2 is retained below for historical context and for its redb schema / indexing guidance (§11.6–11.7), which remains authoritative. However, Phase 2 SQL execution is no longer a hand-rolled operator pipeline — it is Apache DataFusion layered over redb via per-collector `TableProvider` implementations. See ADR-0006 for rationale, alternatives considered (sled, GlueSQL, Turso, rusqlite/duckdb, Polars), and the compile-time contract between the dialect lowering stage and DataFusion-compatible SQL.
Remains authoritative after ADR-0006:
- §11.5 Smart Joins (INLJ / SHJ / MRC strategies)
- §11.6 Write-Through & Persistence Semantics
- §11.7 redb Performance Playbook (partitioning, key encoding, indexes, writer architecture)
Superseded by ADR-0006:
- §11.1–§11.4 (Why Not a Full RDBMS, Chosen Approach: Operator Pipeline, Store Abstraction, Operator Examples)
11.1 Why Not a Full RDBMS?#
- Embedded SQL engines like SQLite are heavyweight, require unsafe code, and don't align with the zero-network, operator-focused design.
- redb was initially considered, but it is a key-value store, not a relational engine.
11.2 Chosen Approach: Operator Pipeline (SUPERSEDED)#
Originally: parse SQL via sqlparser → translate AST → internal logical plan → chain of operators (scan, filter, project, join, aggregate) executed directly against redb.
Per ADR-0006, this is replaced by: DataFusion SessionContext with a custom TableProvider per collector domain, wrapping redb tables. The compiler emits DataFusion-compatible SQL; the executor runs it. No hand-rolled operator pipeline.
11.3 Store Abstraction#
- KV API: `put(key, value)`, `scan(range)`, `iter(prefix)`
- Indexing: time-based and ID-based composite keys
- Pluggability: abstracted storage API, but redb is the canonical backend
11.4 Operator Examples (SUPERSEDED by DataFusion)#
Prior plan: Scan, Filter, Project, Join (nested loop / hash), Aggregate (hash map). Per ADR-0006, DataFusion provides these operators.
11.5 Smart Joins (No Mandatory Time Window) — AUTHORITATIVE#
We support joins without requiring time windows, keeping memory and latency bounded via data layout, indexes, and bounded state.
Executor Plan Requirements: explicit time windows for aggregations; joins use cardinality caps. Index hot keys (name, pid, ppid, exe_hash), primary key (ts, seq). Bounded hash-group and equi-joins with per-rule memory caps. Collectors support LIKE/REGEXP pushdown; complex patterns execute in operator pipeline. Strict function allow-list with pushdown matrix.
Join Scope & Keys: Only equi-joins on declared keys (pid, ppid, exe_hash, image_base). Join hints in the catalog (e.g., parent_of: processes.snapshots.ppid -> processes.snapshots.pid).
Physical Strategies:
- Index Nested-Loop (INLJ) (default, selective) — Build from smaller/filtered input; probe via secondary index. ~O(n log m).
- Bounded Symmetric Hash Join (SHJ) — Hash both sides in bounded LRU maps; cardinality-budget evictions (default 100k keys); optional spill to KV.
- Materialized Relation Cache (MRC) (parent/child fast path) — Compact map (`pid → {ppid, parent_name, start_time}`) updated on ingest.
Bounding: cardinality caps, backpressure-aware SHJ→INLJ switching on overflow (emits `JOIN_PARTIAL` metric), optional spill to KV scratch. Hard guarantee: joins are always bounded; results may be partial but never unbounded.
Selectivity Heuristics: INLJ for highly selective predicates; SHJ for similar-sized inputs; MRC when parent/child mapping exists.
Late/Missing Matches: a deferred match window retains unmatched rows briefly. LEFT JOINs emit `NULL`-extended rows immediately; late matches emit corrections with `correlation_id` (Enterprise).
Diagnostics: counters `join_plan_selected`, `join_evictions`, `join_spills`, `join_partial_results`.
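The MRC fast path above can be pictured as a small in-memory map, updated on ingest and probed at join time. A sketch with assumed field shapes (`start_time` omitted for brevity; not the agent's actual types):

```rust
use std::collections::HashMap;

#[derive(Debug, Clone)]
struct ParentInfo {
    ppid: u32,
    parent_name: String,
}

#[derive(Default)]
struct Mrc {
    by_pid: HashMap<u32, ParentInfo>,
}

impl Mrc {
    /// Updated on every process-event ingest: record the process's parent linkage.
    fn upsert(&mut self, pid: u32, ppid: u32, parent_name: &str) {
        self.by_pid.insert(pid, ParentInfo { ppid, parent_name: parent_name.to_string() });
    }

    /// Resolves `ON c.ppid = p.pid` for a child event without scanning the base table.
    fn parent_name_of(&self, pid: u32) -> Option<&str> {
        self.by_pid.get(&pid).map(|p| p.parent_name.as_str())
    }
}

fn main() {
    let mut mrc = Mrc::default();
    mrc.upsert(4242, 100, "winword.exe"); // parent name resolved at ingest time
    assert_eq!(mrc.parent_name_of(4242), Some("winword.exe"));
    assert_eq!(mrc.parent_name_of(9999), None);
}
```

Because the map is tiny and rebuilt from recent events on restart, it removes the cost of the most common parent/child join entirely.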
11.6 Write-Through & Persistence Semantics — AUTHORITATIVE#
Ingest Transaction: validate & normalize, append to redb base table + secondary indexes, ACK after commit. Idempotent via (task_id, seq_no).
No-Join Path: evaluate predicate on ingested row; dedupe; emit to sinks and append to alerts.events.
Join Path: execute INLJ/SHJ/MRC over redb + in-memory state; join results are ephemeral (not materialized as a table); satisfying tuples dedupe and emit alert.
Aggregation Path: in-memory hash aggregates; periodic snapshots to agg.state; emit on HAVING-satisfaction transitions. Requires explicit time window or configured default.
Idempotence & Dedupe: (task_id, seq_no) write id; rule-scoped dedupe key with TTL.
Crash Safety: atomic ingest commit; restart rebuilds secondary indexes; aggregation state from agg.state snapshots; MRC rebuilt from recent events.
Materialization Policy: base tables + secondary indexes persisted always; alert log with rule ID + dedupe key + source-row pointers; operator state (aggregation snapshots, MRC). Join outputs NOT persisted by default — reconstruct via stored refs.
Persisted (Narrow Cases): Enterprise debugging stores compact alerts.correlations. MRC always kept (small, removes common parent/child join cost).
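The rule-scoped dedupe described above can be sketched as a TTL map keyed by `(rule_id, dedupe_key)` — illustrative names, not the agent's actual types:

```rust
use std::collections::HashMap;
use std::time::{Duration, Instant};

struct Deduper {
    seen: HashMap<String, Instant>,
    ttl: Duration,
}

impl Deduper {
    fn new(ttl: Duration) -> Self {
        Self { seen: HashMap::new(), ttl }
    }

    /// True if the alert should be emitted (first sighting, or the TTL expired).
    fn should_emit(&mut self, rule_id: &str, dedupe_key: &str, now: Instant) -> bool {
        let key = format!("{rule_id}:{dedupe_key}");
        let suppressed = self
            .seen
            .get(&key)
            .map_or(false, |&t| now.duration_since(t) < self.ttl);
        if suppressed {
            false
        } else {
            self.seen.insert(key, now); // refresh the window
            true
        }
    }
}

fn main() {
    let mut d = Deduper::new(Duration::from_secs(300));
    let t0 = Instant::now();
    assert!(d.should_emit("r_process_name_exact", "pid=42", t0));
    assert!(!d.should_emit("r_process_name_exact", "pid=42", t0)); // duplicate suppressed
}
```

A production version would also evict expired entries to keep the map bounded, in keeping with the bounded-memory principle.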
Redb Schema:
- `processes.events` (primary: `(ts_ms, seq)` → payload)
- `processes.idx:pid` (multimap: `pid → (ts_ms, seq)`)
- `image_loads.events`, `image_loads.idx:image_base`, etc.
- `alerts.events` (append-only)
- `alerts.correlations` (optional)
- `agg.state` (aggregation snapshots)
- `mrc.parent_map` (`pid → {ppid, parent_name, start_time}`)
11.7 redb Performance Playbook — AUTHORITATIVE#
Partition by time, fixed-width keys, selective secondary indexes, set-based intersections.
Physical Layout: one base table per logical source per time bucket (processes.events@2025-09-22). Primary key (ts_ms:u64, seq:u32) (16 bytes). Compact fixed-field value (postcard/bincode with version byte). Secondary multimap indexes inside the same bucket: idx:pid, idx:ppid, idx:name (hashed), idx:exe_hash, optional idx:path_prefix.
Query Planning: time window first → partition range. Pick driving index by selectivity. Set-based posting-list intersections in memory; early LIMIT. Prefix LIKE via idx:path_prefix; contains LIKE/REGEXP are agent-only, use driving index to shrink candidate set.
Writer Architecture: single writer thread, group commit (N=2000 records or T=5–10ms). Pipeline: IPC → lock-free MPSC → pre-serialized ring buffer → write base + secondaries + commit. MVCC snapshots for readers (no writer stalls reads). ACK after commit.
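The flush decision in that group-commit loop reduces to a two-threshold check. A sketch using the defaults quoted in this section (N=2000 records, T≈7 ms); the writer loop itself is elided:

```rust
use std::time::Duration;

/// Commit the current batch when either the size or the age threshold is hit.
fn should_commit(batch_len: usize, batch_age: Duration) -> bool {
    const MAX_RECORDS: usize = 2000;              // writer_batch_records default
    const MAX_AGE: Duration = Duration::from_millis(7); // writer_batch_ms default
    batch_len >= MAX_RECORDS || (batch_len > 0 && batch_age >= MAX_AGE)
}

fn main() {
    assert!(should_commit(2000, Duration::from_millis(1))); // size threshold
    assert!(should_commit(10, Duration::from_millis(7)));   // age threshold
    assert!(!should_commit(0, Duration::from_millis(50)));  // nothing to flush
}
```

Either trigger amortizes the fsync across the batch; the empty-batch guard avoids committing no-op transactions.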
Caches: posting-list page cache (LRU 64–256 MB) for hot terms; MRC (lock-free map, rebuilt on start); plan cache (AST → plan, revalidated on stats drift).
Joins Applied: INLJ default (drive from selective side, probe via secondary index); SHJ bounded (optional spill); MRC for ON c.ppid=p.pid.
Index Maintenance: budget 4–6 secondaries per table; cold indexes marked and skipped; partition TTL (7–30 days community, 90+ enterprise); periodic redb checkpoints; backfill scans hot partitions first.
Tunables (defaults): partition_kind=hourly if >~2M events/day else daily, writer_batch_records=2000, writer_batch_ms=7, posting_cache_bytes=128MiB, mrc_window=30m, join_cardinality_cap=100k keys, rows_per_key_cap=64, idx_budget_per_table=6, contains_like_max_scan=10k rows.
Key Encoding:
```rust
#[repr(C, packed)]
struct RowKey { ts_ms: u64, seq: u32, pad: u32 }

#[repr(C, packed)]
struct NameIdxKey { name_hash128_hi: u64, name_hash128_lo: u64, ts_ms: u64, seq: u32 }
```
Never store whole name/path in secondary key — only stable hash; collisions verified against primary row.
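One subtlety worth making explicit: if these keys are stored as raw bytes, range-scan order is the lexicographic byte order, so multi-field keys must be serialized big-endian for scans over `(ts_ms, seq)` to follow numeric order. A hedged sketch of such an encoder (not the actual DaemonEye serialization):

```rust
/// Encode a (ts_ms, seq) primary key so that byte-wise lexicographic order
/// matches numeric order — a requirement for correct range scans.
fn encode_row_key(ts_ms: u64, seq: u32) -> [u8; 12] {
    let mut k = [0u8; 12];
    k[..8].copy_from_slice(&ts_ms.to_be_bytes());
    k[8..].copy_from_slice(&seq.to_be_bytes());
    k
}

fn main() {
    // Lexicographic byte order agrees with (ts_ms, seq) numeric order.
    assert!(encode_row_key(1_000, 7) < encode_row_key(1_000, 8));
    assert!(encode_row_key(1_000, u32::MAX) < encode_row_key(1_001, 0));
}
```

A native-endian `#[repr(C, packed)]` dump would break this ordering on little-endian hosts, which is why the encoding step matters.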
Performance Claim: every operation is a range scan over tiny partition or seek + short posting-list walk. Group commit amortizes fsync. Set intersections cut candidates from millions to tens. MRC eliminates the common join. Zero-copy readers; writer doesn't block them.
4.3) Rule Lifecycle & Management#
- Rule Packs: collections of related SQL rules with metadata (version, author, description)
- Hot Reloading: add/remove/update without restart
- Validation: AST parsing and safety checks before activation
- Dependency Resolution: rule → collector capability dependencies
- States: Draft → Active → Disabled → Archived, with versioning and rollback
- A/B Testing: multiple rule versions simultaneously
Rule metadata includes `rule_id`, `version`, `name`, `description`, `author`, `created`/`updated`, `tags`, `dependencies` (collectors, tables, capabilities), and `performance` estimates.
4.4) Error Handling & Recovery#
Collector failures: connection loss, schema mismatch, rate limiting, data corruption (CRC32, seq-no gaps).
Recovery: automatic reconnection (exponential backoff + jitter), schema reconciliation, replay from last good seq_no, graceful degradation.
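The reconnection delay above is the standard capped exponential backoff plus jitter. A minimal sketch — the constants are assumptions, and a real implementation would draw the jitter from a proper RNG:

```rust
use std::time::Duration;

/// Delay before reconnect attempt `attempt` (0-based): base * 2^attempt,
/// capped, plus caller-supplied jitter.
fn backoff_delay(attempt: u32, base_ms: u64, cap_ms: u64, jitter_ms: u64) -> Duration {
    let exp = base_ms.saturating_mul(1u64 << attempt.min(16)); // clamp the shift
    Duration::from_millis(exp.min(cap_ms) + jitter_ms)
}

fn main() {
    assert_eq!(backoff_delay(0, 100, 30_000, 0), Duration::from_millis(100));
    assert_eq!(backoff_delay(3, 100, 30_000, 0), Duration::from_millis(800));
    assert_eq!(backoff_delay(20, 100, 30_000, 0), Duration::from_millis(30_000)); // capped
}
```

Jitter desynchronizes collectors so a restarted agent is not hit by a reconnect stampede.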
4.5) Configuration & Discovery#
- Service discovery via registry/config, capability negotiation, health-based load balancing, failover
- Hierarchical config: System → Agent → Rule → Collector; env overrides; hot-reload; schema validation
4.6) Monitoring & Observability#
- Metrics: rule eval latency/match/errors, collector IPC rates, redb write latency, alert delivery
- Endpoints: `/health`, `/metrics` (Prometheus), `/debug`, `/catalog`
- Logging: structured JSON, consistent fields, correlation IDs, automatic redaction
10) Implementation Guide#
Two-layer architecture: Stream Layer (IPC) for low-latency filtering; Operator Pipeline Layer for joins/aggregations over redb.
Key components: SQL Parser (sqlparser), Pushdown Planner, Operator Pipeline, Storage Engine (redb + secondary indexes + partitioning), Alert System.
Note per ADR-0006: "Operator Pipeline" in this section is replaced by DataFusion. The description above is retained for context.
11) IPC Message Contracts#
DetectionTask (Agent → Collector)#
```json
{
  "task_id": "t_9482",
  "rule_id": "r_process_name_exact",
  "table": "processes.snapshots",
  "project": ["pid", "name", "executable_path", "cpu_usage", "collection_time"],
  "where": {"and": [{"eq": ["name", "minidump.exe"]}, {"gt": ["cpu_usage", 0.0]}]},
  "rate_limit": {"per_sec": 25000},
  "ttl_ms": 600000
}
```
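A collector applying a task like the one above evaluates the `where` tree against each candidate record before streaming it. A minimal illustrative evaluator — the enum shapes are sketches, and the real contract is protobuf:

```rust
use std::collections::HashMap;

#[derive(Debug, Clone, PartialEq)]
enum Value {
    Str(String),
    Num(f64),
}

/// Mirrors the JSON predicate tree: {"and": [{"eq": [...]}, {"gt": [...]}]}.
enum Pred {
    Eq(&'static str, Value),
    Gt(&'static str, f64),
    And(Vec<Pred>),
}

fn eval(p: &Pred, row: &HashMap<&str, Value>) -> bool {
    match p {
        Pred::Eq(col, v) => row.get(col) == Some(v),
        Pred::Gt(col, n) => matches!(row.get(col), Some(Value::Num(x)) if x > n),
        Pred::And(ps) => ps.iter().all(|p| eval(p, row)),
    }
}

fn main() {
    let row: HashMap<&str, Value> = [
        ("name", Value::Str("minidump.exe".into())),
        ("cpu_usage", Value::Num(1.5)),
    ]
    .into_iter()
    .collect();
    let pred = Pred::And(vec![
        Pred::Eq("name", Value::Str("minidump.exe".into())),
        Pred::Gt("cpu_usage", 0.0),
    ]);
    assert!(eval(&pred, &row)); // record passes the pushdown filter
}
```

Records failing the filter never leave the collector, which is the point of pushdown: the agent only ingests rows that can still match the rule.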
StreamRecord (Collector → Agent)#
Envelope: { seq_no, task_id, table, checksum, record }. Typed row matching advertised schema; unknown columns rejected with metrics.
Persistence & Query (Agent-Side)#
Append-only redb writes with periodic checkpoints. Read path uses prepared read-only statements with timeouts and memory quotas. Rule-scoped dedupe key avoids alert storms.
12) SQL Dialect (Constrained)#
Important: DaemonEye does NOT embed SQLite. It uses the SQLite dialect for rule authoring and parsing.
`sqlparser` produces an AST; the compiler constructs an internal logical plan that (1) pushes down projections/predicates to collectors and (2) emits derived standard SQL for Phase 2 execution via DataFusion (per ADR-0006).
Constraints:
- Allowed Statements: `SELECT` only
- Banned Functions: `load_extension`, `readfile`, `system`, `random`, `printf`, etc.
- Allowed Functions: aggregations, string ops (`substr`, `length`, `instr`, `hex`), date/time helpers
- Security: AST validation enforces constraints before execution
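A sketch of the deny-list portion of that validation. The real validator walks the `sqlparser` AST; here a flat list of called function names stands in for the walk:

```rust
// Case-insensitive deny list from the dialect constraints above.
const BANNED: &[&str] = &["load_extension", "readfile", "system", "random", "printf"];

fn validate_functions(called: &[&str]) -> Result<(), String> {
    for f in called {
        let lower = f.to_ascii_lowercase();
        if BANNED.contains(&lower.as_str()) {
            return Err(format!("banned function: {f}"));
        }
    }
    Ok(())
}

fn main() {
    assert!(validate_functions(&["substr", "length"]).is_ok());
    assert!(validate_functions(&["LOAD_EXTENSION"]).is_err());
}
```

An allow-list check (rejecting anything not explicitly permitted) is the stricter complement and is what the pushdown matrix in §11.5 implies for collector-side execution.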
13) Examples#
```sql
-- Simple name match
SELECT pid, name, executable_path, collection_time
FROM processes.snapshots
WHERE name = 'minidump.exe';

-- Regex
SELECT pid, name, executable_path, collection_time
FROM processes.snapshots
WHERE name REGEXP '(?i)malware|trojan|virus|backdoor';

-- Parent/child
SELECT c.pid, c.name, p.pid AS ppid, p.name AS parent
FROM processes.snapshots c
LEFT JOIN processes.snapshots p ON p.pid = c.ppid
WHERE c.name = 'rundll32.exe' AND p.name = 'winword.exe';

-- Aggregation
SELECT name, COUNT(*) AS launches
FROM processes.snapshots
WHERE collection_time BETWEEN :t_start AND :t_end
GROUP BY name
HAVING launches > 50;
```
14) Performance & Backpressure#
- Targets: <100ms/rule eval, >1k records/sec sustained write per agent; sub-second for 100k+ events/min (Enterprise)
- Channels: bounded MPSC with credit-based flow control
- Rate caps via `rate_limit` hints per task
- Overload: drop oldest buffered streams per-task before global backoff
15) Reliability & Security#
- Framing: length-delimited protobuf, CRC32, monotonically increasing `seq_no`
- AuthZ: per-collector identity; table-level ACLs in catalog
- Sandbox: read-only query engine, allow-listed functions, strict limits
- Tamper-evidence: audit ledger chained with cryptographic integrity
- Graceful degradation: minimal projection + agent-side filtering
16) Testing & Validation#
- AST fuzzing for parser/validator/planner
- Golden tests: SQL → DetectionTask JSON snapshots
- Soak tests: high-rate streams + backpressure assertions
- E2E: collector → agent → redb → sinks; verify dedupe and retries
17) Versioning & Compatibility#
- Contracts: `catalog@vN`, `detection_task@vN`, `stream_record@vN`
- Forward compat: tolerate unknown optional fields
- Deprecation announced in `catalog.changes` feed
18) Open Items#
- Implement DataFusion-based Phase 2 executor (per ADR-0006)
- Optimize join and aggregation strategies for high event rates
- Validate redb performance with multiple secondary indexes
- Sliding/windowed aggregations helper primitives
- Collector-side lightweight top-K sampling
- Cross-table time correlation helper
- Query plan caching and statistics collection
19) TL;DR (Operator-Facing)#
Collectors act like table providers. The agent turns SQL rules into simple tasks so collectors ship only what matters. Derived SQL is executed by DataFusion over redb (ADR-0006), emitting deduped alerts with pointers to source rows for reconstruction.
Source: spec/daemon_eye_spec_sql_to_ipc_detection_architecture.md. This Confluence page is a mirror; the repository copy is authoritative.