Migration Overview: Legacy IPC to Interprocess Crate#
DaemonEye migrated its IPC layer from a legacy, custom implementation to a unified, cross-platform solution built on the interprocess crate and the Tokio async runtime. The legacy system relied on hand-rolled Unix domain sockets and Windows named pipes, with manual varint framing and CRC32 validation. That approach carried high maintenance overhead, platform-specific complexity, and limited test coverage. The migration replaced approximately 600 lines of custom code with 200 lines built on standard APIs, improving reliability, maintainability, and debuggability [ADR, PR #75].
The new IPC implementation uses the interprocess crate for local sockets (Unix domain sockets on Linux/macOS, named pipes on Windows), fully integrated with Tokio for asynchronous, non-blocking operations. All communication remains local-only, with strict permission controls and connection limits preserved [Technical Doc].
Updated IPC Protocol Design#
Frame Structure#
The IPC protocol uses a length-delimited frame with a CRC32C checksum for integrity validation. Each frame consists of:
[Length: u32][CRC32C: u32][Protobuf Message: N bytes]
- Length: Little-endian u32, size of the protobuf message in bytes.
- CRC32C: Little-endian u32, checksum of the message bytes.
- Message: Protobuf-encoded payload (using prost).
This design ensures robust error detection and compatibility across platforms [codec.rs].
CRC32C Checksum Framing#
The CRC32C checksum is computed over the message bytes and validated on receipt. If the checksum does not match, the message is rejected with a CrcMismatch error. This mechanism protects against data corruption and transmission errors [codec.rs].
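The frame layout and checksum check described above can be sketched as follows. This is a minimal, dependency-free illustration and not the contents of codec.rs: the names `FrameError`, `encode_frame`, and `decode_frame` are hypothetical, and the bitwise `crc32c` function stands in for the `crc32c` crate so the snippet compiles on its own.

```rust
/// Bitwise CRC32C (Castagnoli), reflected polynomial 0x82F63B78.
/// Slow but self-contained; a stand-in for the `crc32c` crate.
fn crc32c(data: &[u8]) -> u32 {
    let mut crc: u32 = !0;
    for &byte in data {
        crc ^= u32::from(byte);
        for _ in 0..8 {
            // All ones if the low bit is set, otherwise zero.
            let mask = (crc & 1).wrapping_neg();
            crc = (crc >> 1) ^ (0x82F6_3B78 & mask);
        }
    }
    !crc
}

/// Hypothetical error type mirroring a few of the IpcError variants.
#[derive(Debug, PartialEq)]
enum FrameError {
    Truncated,
    TooLarge { size: usize, max_size: usize },
    CrcMismatch { expected: u32, actual: u32 },
}

/// Header: 4-byte little-endian length + 4-byte little-endian CRC32C.
const HEADER_LEN: usize = 8;

/// Build `[Length: u32][CRC32C: u32][payload]` for an already-encoded message.
fn encode_frame(payload: &[u8], max_size: usize) -> Result<Vec<u8>, FrameError> {
    if payload.len() > max_size || payload.len() > u32::MAX as usize {
        return Err(FrameError::TooLarge { size: payload.len(), max_size });
    }
    let mut frame = Vec::with_capacity(HEADER_LEN + payload.len());
    frame.extend_from_slice(&(payload.len() as u32).to_le_bytes());
    frame.extend_from_slice(&crc32c(payload).to_le_bytes());
    frame.extend_from_slice(payload);
    Ok(frame)
}

/// Validate the header and checksum, returning the payload bytes on success.
fn decode_frame(frame: &[u8], max_size: usize) -> Result<&[u8], FrameError> {
    if frame.len() < HEADER_LEN {
        return Err(FrameError::Truncated);
    }
    let len = u32::from_le_bytes([frame[0], frame[1], frame[2], frame[3]]) as usize;
    let expected = u32::from_le_bytes([frame[4], frame[5], frame[6], frame[7]]);
    // Reject oversized lengths before touching the payload.
    if len > max_size {
        return Err(FrameError::TooLarge { size: len, max_size });
    }
    let payload = frame[HEADER_LEN..].get(..len).ok_or(FrameError::Truncated)?;
    let actual = crc32c(payload);
    if actual != expected {
        return Err(FrameError::CrcMismatch { expected, actual });
    }
    Ok(payload)
}
```

Checking the declared length against the maximum before reading the payload means a corrupted or hostile header cannot trigger a large allocation or read.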
Tokio-Based Transport#
IPC communication uses Tokio's async runtime for non-blocking IO. The transport layer is abstracted by the interprocess crate, which provides:
- Unix domain sockets on Linux/macOS.
- Named pipes on Windows.
All operations (accept, read, write) are asynchronous and support configurable timeouts. Connection management uses Tokio semaphores to enforce limits [interprocess_transport.rs].
Error Handling#
Error handling is comprehensive, with detailed error types for all IPC operations:
pub enum IpcError {
    Timeout,
    TooLarge { size, max_size },
    CrcMismatch { expected, actual },
    Io(io::Error),
    Decode(prost::DecodeError),
    Encode(String),
    PeerClosed,
    InvalidLength { length },
    ServerNotFound { endpoint: String },
    ConnectionRefused { endpoint: String },
    PermissionDenied { endpoint: String },
    ConnectionTimeout { endpoint: String },
    CircuitBreakerOpen,
    // ... plus rate limiting and flow control errors
}
Timeouts, CRC mismatches, oversized messages, and IO errors are all handled gracefully. The system supports automatic reconnection with exponential backoff, graceful degradation on failures, and server-side connection limiting [ipc-implementation.md, codec.rs].
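Reconnection with exponential backoff can be sketched roughly as below. The function names, the doubling factor, and the cap are assumptions chosen for illustration, since the source does not state the actual schedule. The sketch also uses a blocking `std::thread::sleep` to stay dependency-free; in the Tokio-based client the wait would presumably be an async sleep instead.

```rust
use std::time::Duration;

/// Delay before retry number `attempt` (0-based): base * 2^attempt, capped at `max`.
/// Hypothetical helper; the real schedule may differ.
fn backoff_delay(attempt: u32, base: Duration, max: Duration) -> Duration {
    // Saturate instead of overflowing for very large attempt counts.
    let factor = 2u32.checked_pow(attempt).unwrap_or(u32::MAX);
    base.checked_mul(factor).map_or(max, |delay| delay.min(max))
}

/// Call `connect` until it succeeds or `max_attempts` attempts have failed.
/// At least one attempt is always made; the last error is returned on failure.
fn connect_with_backoff<T, E>(
    max_attempts: u32,
    base: Duration,
    max: Duration,
    mut connect: impl FnMut() -> Result<T, E>,
) -> Result<T, E> {
    let mut attempt = 0;
    loop {
        match connect() {
            Ok(conn) => return Ok(conn),
            Err(err) if attempt + 1 >= max_attempts => return Err(err),
            Err(_) => {
                // Wait longer after each consecutive failure.
                std::thread::sleep(backoff_delay(attempt, base, max));
                attempt += 1;
            }
        }
    }
}
```

Capping the delay keeps a long outage from pushing retries arbitrarily far apart, and bounding the attempt count lets the caller fall back to degraded operation.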
Security and Configuration#
Security features are preserved and enhanced:
- Local-only communication (no network exposure).
- Unix socket permissions: 0700 for directories, 0600 for sockets.
- Connection limits (default: 16 concurrent).
- Per-operation timeouts (accept: 5s, read: 30s, write: 10s).
- Maximum frame size (default: 1MB).
- Input validation and resource limits to prevent DoS attacks.
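The Unix permission settings listed above (0700 for the directory, 0600 for the socket) could be applied along these lines. This is a Unix-only sketch using the standard library; the helper names are hypothetical, and the source does not say how or when DaemonEye sets these modes.

```rust
use std::fs;
use std::io;
use std::os::unix::fs::PermissionsExt;
use std::path::Path;

/// Create the socket directory if needed and restrict it to the owner (0700).
fn prepare_socket_dir(dir: &Path) -> io::Result<()> {
    fs::create_dir_all(dir)?;
    fs::set_permissions(dir, fs::Permissions::from_mode(0o700))
}

/// Restrict an existing socket file to owner read/write (0600).
fn restrict_socket(path: &Path) -> io::Result<()> {
    fs::set_permissions(path, fs::Permissions::from_mode(0o600))
}

/// Return only the permission bits of a path, for verification.
fn mode_of(path: &Path) -> io::Result<u32> {
    Ok(fs::metadata(path)?.permissions().mode() & 0o777)
}
```

Tightening the directory before the socket is created inside it avoids a window in which another user could reach the endpoint.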
Configuration is managed via the IpcConfig struct:
pub struct IpcConfig {
    pub endpoint_path: String,
    pub max_frame_bytes: usize,
    pub read_timeout_ms: u64,
    pub write_timeout_ms: u64,
    pub max_connections: usize,
    pub crc32_variant: Crc32Variant,
    // Optional: rate_limiting, flow_control, enable_sequence_numbers
}
[ipc-implementation.md, issues/86]
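As an illustration of how the documented defaults might map onto this struct, the stand-in below mirrors the fields listed above and fills them from the values in the security list. Several details are assumptions: the endpoint path is a placeholder, `Crc32Variant::Crc32c` is a hypothetical variant name, 1MB is taken to mean 1024 * 1024 bytes, and the optional fields are omitted.

```rust
use std::time::Duration;

/// Hypothetical stand-in; the real enum's variants are not shown in the source.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Crc32Variant {
    Crc32c,
}

/// Stand-in mirroring the documented IpcConfig fields (optional fields omitted).
#[derive(Debug, Clone, PartialEq)]
pub struct IpcConfig {
    pub endpoint_path: String,
    pub max_frame_bytes: usize,
    pub read_timeout_ms: u64,
    pub write_timeout_ms: u64,
    pub max_connections: usize,
    pub crc32_variant: Crc32Variant,
}

impl Default for IpcConfig {
    fn default() -> Self {
        Self {
            // Placeholder path, not the daemon's actual endpoint.
            endpoint_path: "/var/run/daemoneye/example.sock".to_string(),
            max_frame_bytes: 1024 * 1024, // 1MB maximum frame size
            read_timeout_ms: 30_000,      // read: 30s
            write_timeout_ms: 10_000,     // write: 10s
            max_connections: 16,          // 16 concurrent connections
            crc32_variant: Crc32Variant::Crc32c,
        }
    }
}

impl IpcConfig {
    /// Convenience accessor converting the millisecond field to a Duration.
    pub fn read_timeout(&self) -> Duration {
        Duration::from_millis(self.read_timeout_ms)
    }
}
```

Struct update syntax then overrides a single field while keeping the rest, for example `IpcConfig { max_connections: 4, ..Default::default() }`.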
Building the IPC Implementation#
- Ensure Rust (edition 2021+) is installed.
- Add dependencies in Cargo.toml:
  interprocess = "2.2"
  tokio = { version = "1.47", features = ["full"] }
  prost = "0.14"
  crc32c = "0.6"
- Build the project:
  cargo build --release
- Configuration is managed via YAML/TOML files or environment variables. See IpcConfig for tunable parameters.
CI/CD pipelines run cross-platform tests to ensure reliability [PR #75].
Testing IPC Communication#
Testing strategies include unit, integration, and property-based tests:
- Unit tests: Validate encoding/decoding, CRC32C calculation, error scenarios (timeouts, CRC mismatch, oversized messages).
- Integration tests: Start IPC server and client using Tokio, send requests, verify responses.
- Property-based tests: Use proptest for serialization and protocol validation.
- Performance benchmarks: Ensure minimal overhead.
Example integration test:
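use std::time::Duration;
use tempfile::TempDir; // TempDir is assumed to come from the tempfile crate

// InterprocessServer, InterprocessClient, IpcConfig, DetectionTask,
// DetectionResult, and TaskType are DaemonEye types assumed to be in scope.

#[tokio::test]
async fn test_ipc_communication() {
    // Throwaway directory so the socket path cannot collide with a running daemon.
    let temp_dir = TempDir::new().unwrap();
    let socket_path = temp_dir.path().join("test.sock");

    // Start server
    let mut server = InterprocessServer::new(IpcConfig {
        endpoint_path: socket_path.to_str().unwrap().to_string(),
        ..Default::default()
    });
    // Handler that acknowledges every task with an empty result set.
    server.set_handler(|task: DetectionTask| async move {
        Ok(DetectionResult::success(&task.task_id, vec![]))
    });
    let server_handle = tokio::spawn(async move { server.start().await });
    // Give the server a moment to bind before the client connects.
    tokio::time::sleep(Duration::from_millis(100)).await;

    // Connect client
    let mut client = InterprocessClient::new(IpcConfig {
        endpoint_path: socket_path.to_str().unwrap().to_string(),
        ..Default::default()
    });
    let request = DetectionTask::new_test_task("test-123", TaskType::EnumerateProcesses, None);
    let response = client.send_task(request).await.unwrap();
    assert!(response.success);

    // Stop the server task once the round trip has been verified.
    server_handle.abort();
}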
Debugging IPC Communication#
Common troubleshooting steps:
- Permission denied: Check socket file and directory permissions; ensure processes run as the same user.
- Connection refused: Verify the server is running; check endpoint path configuration.
- Timeout errors: Increase timeout values in configuration; check system load.
- CRC mismatch: Inspect for data corruption or protocol version mismatch.
Diagnostic commands:
# Check IPC status
daemoneye-cli health-check --verbose
# Verify endpoint accessibility (Unix)
ls -la /var/run/daemoneye/
# Test IPC connectivity
daemoneye-cli ipc test
Structured logging, health check endpoints, and Prometheus metrics provide additional observability. For advanced debugging, enable verbose logging and use integration tests to simulate error scenarios.
Migration Plan and Rollback#
The migration consolidated duplicate IPC code, migrated advanced features like rate limiting and flow control, and removed deprecated code. Rollback is supported via a feature flag (ipc-legacy) and configuration toggle, allowing hot rollback to the legacy implementation if needed [ADR, issues/86].
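One way to combine a compile-time feature flag with a runtime toggle is sketched below. Only the `ipc-legacy` feature name comes from the source; the `IpcBackend` enum, the `select_backend` function, and the idea that the toggle is a boolean are assumptions for illustration.

```rust
/// Hypothetical backend selector; not taken from the DaemonEye codebase.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum IpcBackend {
    Interprocess,
    Legacy,
}

/// Pick the legacy backend only when it was compiled in (`ipc-legacy` feature)
/// AND the configuration toggle requests it; otherwise use the new transport.
fn select_backend(legacy_requested: bool) -> IpcBackend {
    if cfg!(feature = "ipc-legacy") && legacy_requested {
        IpcBackend::Legacy
    } else {
        IpcBackend::Interprocess
    }
}
```

Requiring both conditions means a stray configuration value cannot select code that was never built into the binary.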
Architecture Diagram#
For further details, see the ADR, Technical IPC Documentation, and Testing Guide.