ADR-0003 - Polyglot Collector SDK Strategy
Type: External
Status: Published
Created: Apr 18, 2026
Updated: Apr 27, 2026
Updated by: Dosu Bot

Status

Deferred (2026-04-27) — re-evaluation trigger: when Enterprise tier kernel collectors begin development and a Go-for-eBPF or other non-Rust SDK need actually arises. Per OSS-first sequencing, no commercial-tier collectors begin development until OSS v1.0.0 ships (aspirational target 2027-03-31), so this decision will be re-opened then.

Status history

  • 2026-04-17 — Proposed (during open-core hygiene pass)
  • 2026-04-27 — Deferred. ADR-0001's reaffirmation chose Rust monolith over Go-Rust hybrid for solo-maintainer reasons; per the OSS-first sequencing decision, no third-party or non-Rust collectors are being built in the v1.0 roadmap. The architecture supports polyglot (the protobuf IPC boundary is language-neutral by design); the SDK investment is not justified without demand.

Context

DaemonEye's three-component architecture communicates via protobuf over Unix sockets / named pipes. The contracts are defined in:

  • ipc.proto: CollectionCapabilities, DetectionTask, DetectionResult, CapabilityRequest, CapabilityResponse
  • eventbus.proto: EventBusMessage with oneof payload, CollectionEventPayload, RpcRequestPayload, heartbeats, alerts, control messages
These contracts are language-neutral by design. A collector speaking protobuf over a socket is a valid participant regardless of implementation language.
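
As a concrete illustration, the capability-negotiation exchange can be sketched with plain Rust stand-ins for the generated message types. The field names and the `negotiate` helper below are illustrative assumptions, not the real ipc.proto schema (in practice these types would be generated from the .proto files):

```rust
// Hypothetical stand-ins for the prost-generated CapabilityRequest /
// CapabilityResponse messages from ipc.proto. Field names are assumptions.
#[derive(Debug, Clone, PartialEq)]
struct CapabilityRequest {
    wanted: Vec<String>, // capabilities the agent asks for, e.g. "process"
}

#[derive(Debug, Clone, PartialEq)]
struct CapabilityResponse {
    granted: Vec<String>, // subset the collector actually supports
}

/// A collector answers with the intersection of what was asked for
/// and what it can provide.
fn negotiate(request: &CapabilityRequest, supported: &[&str]) -> CapabilityResponse {
    CapabilityResponse {
        granted: request
            .wanted
            .iter()
            .filter(|c| supported.contains(&c.as_str()))
            .cloned()
            .collect(),
    }
}

fn main() {
    let req = CapabilityRequest {
        wanted: vec!["process".into(), "network".into()],
    };
    // A process-only collector grants only the process capability.
    let resp = negotiate(&req, &["process", "filesystem"]);
    assert_eq!(resp.granted, vec!["process".to_string()]);
}
```

Because only the serialized messages cross the socket, the same exchange works unchanged against a collector written in any language.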

collector-core (Rust) already exists

The Rust SDK (collector-core/) provides all shared operational infrastructure for a collector — IPC transport, capability negotiation, health monitoring, shutdown coordination, load balancing, event bus integration, config management, binary hashing, result aggregation, task distribution. A new collector implements only its platform-specific collection logic; the SDK handles everything else.

| Module | Purpose |
| --- | --- |
| source.rs | EventSource trait + SourceCaps bitflags for capability negotiation |
| event.rs | CollectionEvent enum (Process, Network, Filesystem, Performance, TriggerRequest) |
| ipc.rs | IPC transport layer |
| config.rs / config_manager.rs | Configuration loading and management |
| health_monitor.rs | Health check and heartbeat infrastructure |
| shutdown_coordinator.rs | Graceful shutdown coordination |
| load_balancer.rs | Task distribution across collector instances |
| event_bus.rs / high_performance_event_bus.rs | Event publishing and subscription |
| binary_hasher.rs | Executable hash computation |
| result_aggregator.rs | Collection result aggregation |
| task_distributor.rs | Detection task routing |
| process_manager.rs | Collector lifecycle management |
| rpc_services.rs | RPC request/response handling |
| capability_router.rs | Capability-based message routing |
| trigger.rs / triggerable.rs | Cross-collector trigger framework |

A Rust collector imports this crate, implements EventSource, and gets all operational infrastructure for free.
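
A minimal sketch of that pattern, using simplified stand-ins for the real trait (the actual `EventSource` in source.rs is async and uses `SourceCaps` bitflags; the signatures below are assumptions for illustration only):

```rust
// Simplified stand-in for collector-core's EventSource trait and
// CollectionEvent enum. Real signatures differ (async, SourceCaps bitflags).
#[derive(Debug, Clone, PartialEq)]
enum CollectionEvent {
    Process { pid: u32, name: String },
    // Network, Filesystem, Performance, TriggerRequest elided
}

/// Illustrative capability bit, standing in for SourceCaps.
const CAP_PROCESS: u32 = 1 << 0;

trait EventSource {
    /// Capability bits this source advertises during negotiation.
    fn capabilities(&self) -> u32;
    /// Produce one batch of events; the real SDK drives this from its runtime.
    fn collect(&mut self) -> Vec<CollectionEvent>;
}

/// A toy collector: the only code a new collector author writes is the
/// platform-specific collection logic behind this trait.
struct StubProcessSource;

impl EventSource for StubProcessSource {
    fn capabilities(&self) -> u32 {
        CAP_PROCESS
    }
    fn collect(&mut self) -> Vec<CollectionEvent> {
        vec![CollectionEvent::Process { pid: 1, name: "init".into() }]
    }
}

fn main() {
    let mut src = StubProcessSource;
    assert_eq!(src.capabilities(), CAP_PROCESS);
    assert_eq!(src.collect().len(), 1);
}
```

Everything else in the module table above (IPC, health, shutdown, config, event bus) is provided by the SDK, not reimplemented per collector.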

Per-platform language strengths differ

Analysis from ADR-0001 (2026-04-16 review) identified that different platforms have different optimal languages for kernel-level collection:

| Platform | Best Language | Rationale |
| --- | --- | --- |
| Linux eBPF | Go | cilium/ebpf-go powers Tetragon (CNCF, <1% CPU). Strongest production precedent. |
| Windows ETW | Go or Rust | Pure-Go ETW consumption is proven (mkwinsyscall + NewCallbackCDecl). Rust via ferrisetw also works. |
| macOS EndpointSecurity | Rust or Swift | es_new_client() requires an Objective-C block. Rust has active bindings; Go needs a CGO shim or IPC. |
| Process enumeration | Rust | sysinfo provides parallel enumeration; gopsutil requires a custom fork for comparable performance. |
| FreeBSD | Rust | Rust cross-compile is cleaner; Go CGO cross-compile to FreeBSD is poorly supported. |

Forcing all collectors into one language means accepting the weakest ecosystem for some platforms.

Decision

Offer and support two first-class collector-core SDK implementations — Rust and Go — so that new collectors can be written in whichever language best fits their platform reality. Optionally provide a C FFI wrapper around the Rust SDK to enable C/C++-based collectors.

First-class SDKs: Rust and Go

  1. collector-core (Rust) — the existing SDK. Continues as the primary implementation. Used by procmond and any collector where Rust is the natural fit (macOS ESF, FreeBSD, process enumeration).
  2. collector-core-go (new) — a Go module providing the same SDK surface. Implements the same protobuf contracts, provides the same operational infrastructure (IPC client, health heartbeats, graceful shutdown, config loading, event publishing, capability negotiation). Used by collectors where Go is the natural fit (Linux eBPF, Windows ETW).
Both SDKs are first-class citizens: documented, tested, and maintained with feature parity for the shared infrastructure layer. The platform-specific collection logic is the only part that differs.

Contract is the boundary

The protobuf definitions in ipc.proto and eventbus.proto are the source of truth. Both SDKs generate code from the same .proto files. A collector built with either SDK is interchangeable from the agent's perspective.
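
What makes the contract language-neutral is that only bytes cross the socket: both sides agree on protobuf's wire encoding, not on any language-level type. As a sketch, the snippet below hand-encodes a single varint field exactly as generated code (prost in Rust, protoc-gen-go in Go) would emit it; the `uint64 pid = 1` field is a hypothetical example, not the real schema:

```rust
/// Encode a protobuf varint (little-endian base-128, 7 bits per byte,
/// high bit set on every byte except the last).
fn encode_varint(mut value: u64, out: &mut Vec<u8>) {
    loop {
        let byte = (value & 0x7f) as u8;
        value >>= 7;
        if value == 0 {
            out.push(byte);
            break;
        }
        out.push(byte | 0x80);
    }
}

/// Encode a varint-typed field: tag = (field_number << 3) | wire_type 0,
/// followed by the varint-encoded value.
fn encode_varint_field(field_number: u32, value: u64, out: &mut Vec<u8>) {
    encode_varint(u64::from(field_number) << 3, out);
    encode_varint(value, out);
}

fn main() {
    let mut buf = Vec::new();
    // Hypothetical message with `uint64 pid = 1;` set to 42.
    encode_varint_field(1, 42, &mut buf);
    // Any conformant protobuf implementation, in any language, produces
    // and accepts exactly these bytes.
    assert_eq!(buf, vec![0x08, 0x2a]);
}
```

Generating both SDKs from the same .proto files guarantees this byte-level agreement without manual coordination.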

Optional: C FFI wrapper for collector-core (Rust)

Expose collector-core's Rust SDK as a C-compatible shared library to enable C/C++-based collectors. This would allow integration with existing C/C++ security tools or kernel modules.
C FFI approach. The standard pattern for exposing async Rust as a C SDK:

  • cbindgen or diplomat generates C/C++ headers from annotated Rust functions.
  • Tokio runtime is hidden behind the FFI boundary — C callers see synchronous or callback-based APIs.
  • EventSource trait is flattened to a struct of function pointers (C vtable pattern).
  • Protobuf messages cross the boundary as opaque byte buffers.
  • Collector lifecycle managed via opaque handle: collector_new() / collector_start() / collector_stop() / collector_free().
C FFI level of effort. Estimated 2-3 weeks for one experienced Rust developer. The patterns are well-established (rustls-ffi ~2,500 lines of C wrapper; libsignal has substantial FFI bridge layers). The main complexity is bridging async Rust to C's synchronous/callback model.

C FFI considerations. Introduces unsafe code at the FFI boundary (unavoidable for any C interop). Must be clearly isolated and audited. Does not affect the unsafe_code = "forbid" policy for non-FFI code — the FFI module would have a scoped #[allow(unsafe_code)] with justification. C callers are responsible for memory management (allocate/free discipline). Testing C collectors requires a C test harness in addition to Rust tests.

Decision on C FFI is deferred until a concrete C/C++ collector use case materializes. The architecture supports it; the investment is not justified without demand.
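
A minimal Rust sketch of the opaque-handle lifecycle named above. The runtime and collection logic are stubbed out; the function names follow the lifecycle in this ADR, but the bodies are illustrative assumptions, not the real wrapper:

```rust
// Sketch of the C FFI boundary: C callers hold an opaque pointer and drive
// the lifecycle through extern "C" functions. A real wrapper would own a
// Tokio runtime and collector-core state behind this handle.
pub struct Collector {
    running: bool,
}

#[no_mangle]
pub extern "C" fn collector_new() -> *mut Collector {
    // Ownership transfers to the C caller; it must call collector_free.
    Box::into_raw(Box::new(Collector { running: false }))
}

/// Safety: `handle` must come from `collector_new` and not yet be freed.
#[no_mangle]
pub unsafe extern "C" fn collector_start(handle: *mut Collector) -> bool {
    match handle.as_mut() {
        Some(c) => {
            c.running = true;
            true
        }
        None => false, // null handle: report failure instead of crashing
    }
}

/// Safety: same handle contract as `collector_start`.
#[no_mangle]
pub unsafe extern "C" fn collector_stop(handle: *mut Collector) {
    if let Some(c) = handle.as_mut() {
        c.running = false;
    }
}

/// Safety: after this call the handle is dangling; C must not reuse it.
#[no_mangle]
pub unsafe extern "C" fn collector_free(handle: *mut Collector) {
    if !handle.is_null() {
        drop(Box::from_raw(handle));
    }
}

fn main() {
    let h = collector_new();
    unsafe {
        assert!(collector_start(h));
        collector_stop(h);
        collector_free(h);
    }
}
```

This is exactly the scoped unsafe surface the considerations above describe: the `unsafe` lives only at the boundary, and cbindgen can emit the matching C header from these signatures.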

Implementation strategy

Phase 1 — collector-core-go (priority):

  1. Define the Go module structure mirroring collector-core's SDK surface.
  2. Generate Go protobuf code from the same ipc.proto and eventbus.proto.
  3. Implement core SDK services: IPC client, capability negotiation, health heartbeats, shutdown coordination, config loading, event publishing.
  4. Write a reference Go collector (candidate: Linux eBPF process monitor using cilium/ebpf-go).
  5. Integration test: Go collector + Rust agent communicating over protobuf IPC.
Phase 2 — validate with the Linux eBPF collector: The first production Go collector would be the kernel-level eBPF data collector (ENDI-3). This validates the polyglot SDK in a real use case where Go is genuinely stronger than Rust.

Phase 3 — C FFI wrapper (deferred): Only if a C/C++ collector use case materializes.
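
The Phase 1 integration test reduces to two endpoints exchanging frames over a socket pair. The sketch below assumes length-prefixed framing (this ADR does not pin down the framing convention) and uses a plain byte string as a stand-in for an encoded protobuf message:

```rust
use std::io::{Read, Write};
use std::os::unix::net::UnixStream;
use std::thread;

/// Write a frame as a 4-byte big-endian length followed by the payload.
/// The framing scheme here is an assumption for illustration.
fn write_frame(stream: &mut UnixStream, payload: &[u8]) -> std::io::Result<()> {
    stream.write_all(&(payload.len() as u32).to_be_bytes())?;
    stream.write_all(payload)
}

/// Read one length-prefixed frame back off the socket.
fn read_frame(stream: &mut UnixStream) -> std::io::Result<Vec<u8>> {
    let mut len = [0u8; 4];
    stream.read_exact(&mut len)?;
    let mut payload = vec![0u8; u32::from_be_bytes(len) as usize];
    stream.read_exact(&mut payload)?;
    Ok(payload)
}

fn main() -> std::io::Result<()> {
    // Socket pair stands in for the agent<->collector Unix socket.
    let (mut agent, mut collector) = UnixStream::pair()?;
    // The "collector" side sends one encoded event from a background thread.
    let sender = thread::spawn(move || write_frame(&mut collector, b"fake-protobuf-bytes"));
    let frame = read_frame(&mut agent)?;
    sender.join().unwrap()?;
    assert_eq!(frame, b"fake-protobuf-bytes".to_vec());
    Ok(())
}
```

Since only bytes are asserted on, the same harness validates a Go collector process on the other end of the socket without modification.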

Consequences

Positive

  • Best language per platform. Each collector uses the ecosystem strongest for its platform, not a compromise language.
  • Broader contributor pool. Go developers can write collectors without learning Rust; C/C++ developers (with FFI wrapper) can integrate existing security tools.
  • Proven architecture. The protobuf IPC boundary already exists and is language-neutral — no new protocol design needed.
  • Incremental adoption. Go SDK does not require rewriting any existing Rust code; it's additive.
  • eBPF advantage. The Linux eBPF collector benefits from Go's stronger ecosystem (Tetragon precedent).
  • Collector independence validated. Proves the collector-core SDK design achieves its goal of making collectors independently implementable.
  • Compliance. Both Rust and Go are CISA/NSA-recognized memory-safe languages (June 2025 CSI).

Negative

  • Two SDK codebases. Feature parity between collector-core (Rust) and collector-core-go must be maintained. Mitigation: proto-first development — all contract changes start with .proto files; code generation ensures both SDKs stay synchronized.
  • Integration testing complexity. E2E tests must cover Rust-agent-to-Go-collector and Rust-agent-to-Rust-collector paths. Mitigation: a shared language-neutral integration test suite that validates any collector implementation against the protobuf contracts.
  • Documentation burden. Both SDKs need "write your first collector" guides, API references, and examples.
  • Proto contract evolution. Changes to .proto files must be coordinated across both SDKs and both CI pipelines. Mitigation: maintain a feature-flag parity matrix across implementations.
  • C FFI unsafe surface. If pursued, the FFI boundary is inherently unsafe and requires careful auditing. Mitigation: FFI module lives in a separate crate (collector-core-ffi) with its own audit scope, not mixed into collector-core.

Neutral

  • Success metrics for the strategy:
      • collector-core-go achieves feature parity with collector-core (Rust) for the shared infrastructure layer
      • a Go collector passes the same integration test suite as a Rust collector
      • the first production Go collector (eBPF) meets DaemonEye's performance budgets (<5% CPU, <100MB RSS)
      • protobuf contract changes propagate to both SDKs within the same release cycle
      • if C FFI is pursued, a reference C collector compiles and passes integration tests
  • Review trigger: revisit when ENDI-3 (Linux eBPF collector) work begins, or when a concrete C/C++ collector use case materializes.
  • Non-FFI unsafe policy unchanged. unsafe_code = "forbid" continues to apply outside the scoped FFI crate.

Alternatives Considered

Rust-only (status quo)

Keep collector-core as the sole SDK. Every new collector must be Rust. Pros: single codebase to maintain; single test matrix; full type-safety story across all collectors. Cons: Linux eBPF is weaker in Rust (Aya-rs vs Tetragon-level maturity); some contributor populations (Go shops, C/C++ integration targets) are excluded; forcing everything through Rust accepts the ecosystem's weakest coverage for each platform. Rejected — the polyglot approach matches ecosystem strengths and lowers the adoption barrier for specific platforms.

Hybrid within one process (CGO / embedding)

Write collectors in Go but embed them in the Rust agent process (or vice versa) via CGO / FFI. Pros: no IPC boundary between collector and agent. Cons: defeats the privilege-separation architecture (collectors and agent must run with different privileges); complicates build (CGO cross-compile is notoriously poor for embedded targets); reintroduces unsafe surface that the current protobuf IPC boundary avoids. Rejected — the IPC boundary is the feature, not the bug.

Define a custom DSL or WASM-based collector interface

Introduce a domain-specific collector runtime (Lua, WASM, a custom script engine) so collectors aren't full executables. Pros: sandboxing, hot-reload, single runtime to manage. Cons: eBPF and ETW require kernel-API access that doesn't translate to a sandboxed runtime; performance overhead; massive engineering lift for an uncertain contributor pool; forecloses on using mature platform SDKs. Rejected — full-process collectors with protobuf IPC are both simpler and more capable for security use cases.

Pick a single "winner" language per platform (polyglot without Rust-or-Go SDK parity)

Write Linux eBPF in Go, Windows ETW in Go, macOS ESF in Rust, FreeBSD in Rust — but implement the operational infrastructure (IPC, health, config, event bus) independently in each language. Pros: each collector has a minimal dependency footprint. Cons: worst of both worlds — operational infrastructure (not platform logic) is where bugs hide, and forcing every collector to reimplement IPC/health/shutdown/config is exactly the cost collector-core was built to eliminate. Rejected — the SDK shared layer is the value; polyglot means two shared layers, not no shared layer.