AI Agent Contribution Guidelines

This summary distills the comprehensive Copilot instructions for AI agents contributing to the codebase, focusing on architecture, development practices, module structure, error handling, testing strategies, and current implementation status. These guidelines ensure that AI-generated contributions align with project standards and maintain code quality.

Architecture Overview

The architecture is modular and trait-oriented, emphasizing clear separation of concerns and extensibility. In output formatting, for example, a Formatter trait defines the interface for all output modules and supports multiple formats (human-readable tables, JSON Lines, and YARA rules). Each formatter implements methods for formatting collections and single items and for writing headers and footers. Configuration structures control output options, including format selection, color support, truncation, filtering, and result limits. The core parsing engine supports multi-format binary analysis (ELF, PE, Mach-O) using the goblin crate, with the type system and error handling framework integrated throughout the pipeline. Section classification, semantic analysis, and ranking are designed as composable, testable components, and a CLI provides flexible filtering and output control.
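The formatter interface described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual API: the method names (`write_header`, `format_item`, `write_footer`, `format_all`), the `FormatError` type, the two-field `FoundString`, and `JsonLinesFormatter` are all assumptions chosen to mirror the description (collection and single-item formatting, headers and footers, Result-returning methods).

```rust
use std::fmt;

/// Hypothetical, simplified stand-in for the project's FoundString type.
#[derive(Debug, Clone, PartialEq)]
pub struct FoundString {
    pub text: String,
    pub offset: u64,
}

/// Hypothetical error type; the real project may use a richer enum.
#[derive(Debug, PartialEq)]
pub struct FormatError(pub String);

impl fmt::Display for FormatError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "format error: {}", self.0)
    }
}

impl std::error::Error for FormatError {}

/// Assumed shape of the output interface: every method returns Result.
pub trait Formatter {
    fn write_header(&self, out: &mut String) -> Result<(), FormatError>;
    fn format_item(&self, item: &FoundString, out: &mut String) -> Result<(), FormatError>;
    fn write_footer(&self, out: &mut String) -> Result<(), FormatError>;

    /// Default collection formatter: header, each item in order, footer.
    fn format_all(&self, items: &[FoundString], out: &mut String) -> Result<(), FormatError> {
        self.write_header(out)?;
        for item in items {
            self.format_item(item, out)?;
        }
        self.write_footer(out)
    }
}

/// Escapes the characters that must not appear raw inside a JSON string.
fn escape_json(s: &str) -> String {
    let mut out = String::with_capacity(s.len());
    for c in s.chars() {
        match c {
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            '\n' => out.push_str("\\n"),
            '\r' => out.push_str("\\r"),
            '\t' => out.push_str("\\t"),
            c if (c as u32) < 0x20 => out.push_str(&format!("\\u{:04x}", c as u32)),
            c => out.push(c),
        }
    }
    out
}

/// Illustrative JSON Lines formatter: one object per line, no header or footer.
pub struct JsonLinesFormatter;

impl Formatter for JsonLinesFormatter {
    fn write_header(&self, _out: &mut String) -> Result<(), FormatError> {
        Ok(())
    }

    fn format_item(&self, item: &FoundString, out: &mut String) -> Result<(), FormatError> {
        let text = escape_json(&item.text);
        out.push_str(&format!("{{\"offset\":{},\"text\":\"{}\"}}\n", item.offset, text));
        Ok(())
    }

    fn write_footer(&self, _out: &mut String) -> Result<(), FormatError> {
        Ok(())
    }
}
```

Putting the collection logic in a default method means a new format only has to supply the three small pieces, which is one plausible way to keep formatters independent of each other.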

Development Practices

Development practices are codified to ensure consistency and maintainability. Contributors, human or AI, must define clear interfaces using Rust traits, use configuration structures for extensibility, and adhere to the code quality standards enforced by automation hooks. Continuous integration (CI) checks formatting, linting, compilation, tests, security, license compliance, documentation, and coverage. Automation hooks (such as .kiro/hooks/*) enforce code quality, markdown formatting, Rust analysis, and documentation synchronization. All changes must be verified against the actual codebase, and documentation must be kept in sync with the implementation. Memory-mapped file I/O is recommended for performance, with careful handling of edge cases such as empty files, special files, and permission errors.
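The edge cases listed for memory-mapped I/O can be screened before any mapping is attempted. The sketch below uses only the standard library and stops short of the mapping itself, which would need an external crate. The `InputCheck` enum and `check_input` function are hypothetical names introduced for illustration and are not taken from the project.

```rust
use std::fs::{self, File};
use std::io;
use std::path::Path;

/// Hypothetical result of a pre-mapping check.
#[derive(Debug, PartialEq)]
pub enum InputCheck {
    /// Regular, readable, non-empty file with the given size in bytes.
    Mappable(u64),
    /// Zero-length file: there is nothing to map or scan.
    Empty,
    /// Directory, FIFO, device, or any other non-regular file.
    Special,
}

/// Screens a path before it is handed to a memory-mapping layer.
/// I/O failures such as NotFound or PermissionDenied are returned as errors.
pub fn check_input(path: &Path) -> io::Result<InputCheck> {
    let meta = fs::metadata(path)?;
    if !meta.is_file() {
        return Ok(InputCheck::Special);
    }
    if meta.len() == 0 {
        return Ok(InputCheck::Empty);
    }
    // Opening the file confirms read permission before any mapping attempt.
    File::open(path)?;
    Ok(InputCheck::Mappable(meta.len()))
}
```

A caller could treat `Empty` as an empty result set and report `Special` as a usage error, leaving only `Mappable` inputs for the mapping step.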

Module Structure

Modules are organized by responsibility. For output, the structure is:

src/output/
├── mod.rs // Public API, Formatter trait, OutputConfig
├── human.rs // HumanFormatter (interactive table view)
├── json.rs // JsonFormatter (JSON Lines)
└── yara.rs // YaraFormatter (YARA rules)

Core data types (such as FoundString, Encoding, Tag), container types (SectionType, StringSource, ContainerInfo), and parser stubs for each supported binary format are defined in dedicated modules. The documentation includes architecture diagrams, AST structures, CLI references, and compatibility matrices to clarify module boundaries and integration points.
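To make the relationship between these types concrete, here is one possible shape for them. Only the type names FoundString, Encoding, Tag, SectionType, StringSource, and ContainerInfo come from the text above. Every field, variant, and method below, along with the `SectionInfo` helper struct, is an assumption made for illustration.

```rust
/// Assumed encoding variants.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Encoding {
    Ascii,
    Utf8,
    Utf16Le,
    Utf16Be,
}

/// Assumed semantic tags attached to a string.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum Tag {
    Url,
    FilePath,
    Domain,
}

/// Assumed coarse section categories.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum SectionType {
    Code,
    ReadOnlyData,
    WritableData,
    Other,
}

/// Assumed origins of a string within a binary.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum StringSource {
    SectionData,
    ImportName,
    ExportName,
}

#[derive(Debug, Clone, PartialEq, Eq)]
pub struct FoundString {
    pub text: String,
    pub encoding: Encoding,
    pub offset: u64,
    pub source: StringSource,
    pub tags: Vec<Tag>,
}

impl FoundString {
    /// Returns true if the string carries the given tag.
    pub fn has_tag(&self, tag: &Tag) -> bool {
        self.tags.contains(tag)
    }
}

/// Hypothetical per-section record held by ContainerInfo.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct SectionInfo {
    pub name: String,
    pub kind: SectionType,
    pub offset: u64,
    pub size: u64,
}

#[derive(Debug, Clone, PartialEq, Eq)]
pub struct ContainerInfo {
    pub format_name: String,
    pub sections: Vec<SectionInfo>,
}

impl ContainerInfo {
    /// Collects references to the sections of one category.
    pub fn sections_of(&self, kind: SectionType) -> Vec<&SectionInfo> {
        self.sections.iter().filter(|s| s.kind == kind).collect()
    }
}
```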

Error Handling

Error handling uses Rust’s Result type throughout all interfaces: every formatter and parser method returns Result, so errors are propagated and handled explicitly. The error handling framework is designed to provide clear diagnostics and prevent panics, with comprehensive bounds checking in I/O operations and parser logic. For binary parsing, malformed or corrupted inputs are handled gracefully, with specific error variants for common failure modes. The I/O layer uses safe buffer helpers to enforce bounds and prevent overflows.
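A bounds-checked buffer helper of the kind described might look like the sketch below. The `ParseError` enum, its variants, and the helper names `read_slice`, `read_u32_le`, and `check_elf_magic` are hypothetical, and the project's real error type may differ. The point is that an out-of-range or overflowing request becomes an Err value instead of a panic.

```rust
use std::fmt;

/// Hypothetical error variants for common parse failures.
#[derive(Debug, PartialEq, Eq)]
pub enum ParseError {
    /// The requested range does not fit inside the buffer.
    OutOfBounds { offset: usize, len: usize, available: usize },
    /// The buffer does not start with the expected magic bytes.
    BadMagic,
}

impl fmt::Display for ParseError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            ParseError::OutOfBounds { offset, len, available } => write!(
                f,
                "read of {} bytes at offset {} exceeds buffer of {} bytes",
                len, offset, available
            ),
            ParseError::BadMagic => write!(f, "unrecognized magic bytes"),
        }
    }
}

impl std::error::Error for ParseError {}

/// Returns `len` bytes starting at `offset`, or an error; never panics.
pub fn read_slice(data: &[u8], offset: usize, len: usize) -> Result<&[u8], ParseError> {
    let err = ParseError::OutOfBounds { offset, len, available: data.len() };
    // checked_add guards against offset + len overflowing usize.
    let end = match offset.checked_add(len) {
        Some(end) => end,
        None => return Err(err),
    };
    data.get(offset..end).ok_or(err)
}

/// Reads a little-endian u32 with bounds checking.
pub fn read_u32_le(data: &[u8], offset: usize) -> Result<u32, ParseError> {
    let bytes = read_slice(data, offset, 4)?;
    let mut arr = [0u8; 4];
    arr.copy_from_slice(bytes); // lengths match: read_slice returned exactly 4 bytes
    Ok(u32::from_le_bytes(arr))
}

/// Checks for the ELF magic number (0x7f 'E' 'L' 'F') at the start of the buffer.
pub fn check_elf_magic(data: &[u8]) -> Result<(), ParseError> {
    if read_slice(data, 0, 4)? == &b"\x7fELF"[..] {
        Ok(())
    } else {
        Err(ParseError::BadMagic)
    }
}
```

A truncated input and a well-sized but unrecognized input produce different variants here, which is what lets a caller give a precise diagnostic.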

Testing Strategies

Testing is comprehensive and multi-layered. Each formatter and parser module includes unit tests covering edge cases such as empty collections, special characters, long strings, UTF-16 encoding, and missing fields. Integration tests validate end-to-end functionality using real binary analysis output. The CI pipeline enforces test coverage thresholds and runs all tests on multiple platforms (Linux, Windows, macOS). Test fixtures are maintained for all supported formats, and new features must include corresponding tests. Automation ensures that documentation and code remain synchronized.
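As an illustration of the edge cases named above, the sketch below pairs two small helpers with a unit test module covering empty input, special characters, long strings, and UTF-16 data. The helpers `decode_utf16le` and `truncate_chars` are hypothetical and exist only to give the tests something to exercise. They are not functions from the project.

```rust
/// Decodes UTF-16LE bytes; returns None for odd lengths or invalid surrogates.
pub fn decode_utf16le(bytes: &[u8]) -> Option<String> {
    if bytes.len() % 2 != 0 {
        return None;
    }
    let units: Vec<u16> = bytes
        .chunks_exact(2)
        .map(|pair| u16::from_le_bytes([pair[0], pair[1]]))
        .collect();
    String::from_utf16(&units).ok()
}

/// Truncates to at most `max_chars` characters, appending "..." when cut.
/// Counts characters rather than bytes so multi-byte text is never split.
pub fn truncate_chars(s: &str, max_chars: usize) -> String {
    if s.chars().count() <= max_chars {
        return s.to_string();
    }
    let mut out: String = s.chars().take(max_chars).collect();
    out.push_str("...");
    out
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn empty_input_is_handled() {
        assert_eq!(decode_utf16le(&[]), Some(String::new()));
        assert_eq!(truncate_chars("", 5), "");
    }

    #[test]
    fn utf16_round_trip_and_malformed_input() {
        assert_eq!(decode_utf16le(&[0x68, 0x00, 0x69, 0x00]), Some("hi".to_string()));
        // Odd byte count cannot be valid UTF-16.
        assert_eq!(decode_utf16le(&[0x68, 0x00, 0x69]), None);
        // 0xD800 is an unpaired high surrogate.
        assert_eq!(decode_utf16le(&[0x00, 0xD8]), None);
    }

    #[test]
    fn special_characters_are_not_split() {
        assert_eq!(truncate_chars("héllo wörld", 5), "héllo...");
    }

    #[test]
    fn long_strings_are_truncated() {
        let long = "a".repeat(1000);
        assert_eq!(truncate_chars(&long, 10).len(), 13);
    }
}
```

The test module runs under `cargo test` and is compiled out of normal builds by the `#[cfg(test)]` attribute.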

Current Implementation Status

Completed core components include the project structure, core data types, container types, the error handling framework, format detection, and parser stubs for ELF, PE, and Mach-O. In-progress work includes section classification, string extraction engines, semantic classification, ranking, output formatters, and the CLI implementation. Documentation is actively maintained to reflect parser progress, I/O safety guarantees, and test coverage. The implementation roadmap and milestones are tracked in dedicated documentation files and task lists, with regular updates as features are completed.

For further details, refer to the .github/copilot-instructions.md file and supporting documentation in the repository.