Code Quality Standards And Conventions#

The libmagic-rs project enforces comprehensive code quality standards and conventions to ensure memory safety, maintainability, and reproducibility. These standards are implemented through a combination of workspace-level lint configuration, development guidelines documented in AGENTS.md, and automated CI/CD quality gates.

The project adopts a zero-warnings policy: all code must pass cargo clippy -- -D warnings, and unsafe code is forbidden globally. Beyond basic correctness, the project enables clippy's pedantic, nursery, and cargo lint groups to catch issues ranging from performance problems to API design concerns.

These standards support the project's OSSF Best Practices badge compliance, requiring signed commits, comprehensive testing (>85% coverage target), and reproducible builds through committed lock files.

Zero-Warnings Policy#

Enforcement Mechanism#

The zero-warnings policy is enforced at the Rust compiler level through workspace lint configuration:

[workspace.lints.rust]
# Security: Forbid unsafe code globally
unsafe_code = "forbid"
# Zero warnings policy
warnings = "deny"

This configuration means:

All warnings are treated as compilation errors - the build will fail if any warnings are present
Applies workspace-wide - affects all crates in the monorepo
Developers must fix clippy suggestions unless they conflict with project requirements

CI/CD Integration#

The zero-warnings policy is enforced as a CI quality gate:

All code must pass clippy with -D warnings before merge
No compilation warnings or errors allowed
All tests must pass
Security audit must pass

Clippy Lint Configuration#

Lint Groups#

The project enables multiple clippy lint groups at the workspace level:

[workspace.lints.clippy]
correctness = { level = "deny", priority = -1 }
suspicious = { level = "warn", priority = -1 }
pedantic = { level = "warn", priority = -1 }
nursery = { level = "warn", priority = -1 }
cargo = { level = "warn", priority = -1 }

The priority = -1 setting ensures lint group configurations can be overridden by more specific lint settings.

Lint Group Purposes:

correctness (deny): Catches code that is likely to be incorrect or panic
suspicious: Flags code that might be written incorrectly
pedantic: Enforces stricter style and best practices (e.g., prefer trailing_zeros() over bitwise masks)
nursery: Experimental lints that may catch additional issues
cargo: Lints for Cargo.toml metadata and dependencies

Security-Focused Lints#

Security-related lints warn on potentially unsafe patterns:

as_conversions, cast_ptr_alignment: Catch risky type casts
indexing_slicing: Prevent panics from unchecked array access
arithmetic_side_effects, integer_division, modulo_arithmetic: Catch overflow/underflow
expect_used: Discourages .expect() calls, which panic when the value is an error or None
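
The lints above push code toward explicit, fallible operations instead of casts, raw indexing, and unchecked arithmetic. A minimal sketch of what that can look like follows; the function names are hypothetical and not taken from the libmagic-rs codebase.

```rust
use std::convert::{TryFrom, TryInto};

/// Reads a big-endian u16 at `offset`, returning `None` instead of
/// panicking when the buffer is too short (satisfies `indexing_slicing`).
fn read_u16_be(buf: &[u8], offset: usize) -> Option<u16> {
    // checked_add avoids the overflow that `arithmetic_side_effects` flags.
    let end = offset.checked_add(2)?;
    // Range access through `get` yields None when out of bounds.
    let bytes = buf.get(offset..end)?;
    let pair: [u8; 2] = bytes.try_into().ok()?;
    Some(u16::from_be_bytes(pair))
}

/// Converts a signed offset to an index without an `as` cast
/// (satisfies `as_conversions`); negative values yield `None`.
fn to_index(offset: i64) -> Option<usize> {
    usize::try_from(offset).ok()
}
```

Both helpers return Option so the caller decides how to report a malformed input, rather than the helper aborting the process.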

Anti-Pattern Enforcement (Deny Level)#

Two critical anti-patterns are set to deny:

panic = "deny" # DON'T PANIC IN PRODUCTION
unwrap_used = "deny" # DON'T UNWRAP IN PRODUCTION

Additional deny-level lints:

await_holding_lock = "deny": Prevents async deadlocks
undocumented_unsafe_blocks = "deny": Requires safety documentation (though unsafe is forbidden globally)
missing_panics_doc = "deny": Forces documentation of panic conditions
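
As an illustration of code that stays clear of the panic and unwrap_used lints, the sketch below propagates parse failures with the ? operator. The function parse_range is a hypothetical helper, not part of the project's API.

```rust
use std::num::ParseIntError;

/// Parses a start offset and a length from their decimal text forms.
///
/// # Errors
///
/// Returns the underlying `ParseIntError` when either input is not a
/// valid unsigned decimal number.
fn parse_range(start: &str, len: &str) -> Result<(u64, u64), ParseIntError> {
    // `start.parse().unwrap()` would trip `unwrap_used`; `?` hands the
    // error back to the caller instead of panicking.
    let start_value = start.trim().parse::<u64>()?;
    let len_value = len.trim().parse::<u64>()?;
    Ok((start_value, len_value))
}
```

Because the function cannot panic, it needs an Errors section in its rustdoc but no Panics section.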

Performance and Correctness Lints#

An extensive set of performance and correctness lints is set to warn:

String-related: str_to_string, string_add, string_slice
Memory/concurrency: clone_on_ref_ptr, mutex_atomic, rc_buffer
Pattern matching: match_same_arms, wildcard_enum_match_arm
Shadowing: shadow_reuse, shadow_same, shadow_unrelated
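
Two of these lints are easy to picture with a short sketch. The helper names below are hypothetical; the point is only the shape of code that clone_on_ref_ptr and string_add steer toward.

```rust
use std::sync::Arc;

/// Hands out another handle to the same rule list.
fn share(rules: &Arc<Vec<String>>) -> Arc<Vec<String>> {
    // `Arc::clone(rules)` makes the cheap reference-count bump explicit;
    // `rules.clone()` is what `clone_on_ref_ptr` warns about.
    Arc::clone(rules)
}

/// Builds a display label from two parts.
fn label(kind: &str, name: &str) -> String {
    // `format!` replaces `kind.to_owned() + ": " + name`, which
    // `string_add` would flag.
    format!("{}: {}", kind, name)
}
```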

Pedantic Category Lints#

Pedantic-category lints include:

Documentation: missing_errors_doc = "warn"
Cargo metadata: cargo_common_metadata, multiple_crate_versions
Cast safety: cast_possible_truncation, cast_precision_loss, cast_sign_loss
Code quality: module_name_repetitions, similar_names, too_many_lines
Async: async_yields_async, large_futures, result_large_err

Allowed Exceptions#

One lint is explicitly allowed:

missing_docs_in_private_items = "allow": Private items don't require documentation

File Organization And Size Limits#

File Size Guidelines#

The project recommends keeping source files under 500-600 lines.

Current Practice: The codebase shows flexible adherence to this guideline:

Most files stay under 800 lines
Core files like src/lib.rs (1,261 lines) exceed the threshold
Large files typically include extensive inline test coverage using #[cfg(test)] blocks

Successful Refactoring Example: src/evaluator/mod.rs previously ran to 2,639 lines, about 4.4× the 600-line upper bound of the guideline. It was split into:

mod.rs (~720 lines) - Public API surface (EvaluationContext, RuleMatch, re-exports)
engine.rs (~2,096 lines) - Core evaluation logic (evaluate_single_rule, evaluate_rules, evaluate_rules_with_config)

This demonstrates how to address oversized files while maintaining API compatibility through re-exports.
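
The re-export technique can be sketched with inline modules. This is a heavily simplified, hypothetical layout: in the real crate the two modules are separate files, and the function body here is only a stand-in.

```rust
mod evaluator {
    // Implementation detail: private submodule holding the core logic
    // (the role engine.rs plays after the split).
    mod engine {
        /// Stand-in for the real evaluation entry point; it only
        /// reports the buffer length so the sketch stays runnable.
        pub fn evaluate_rules(buffer: &[u8]) -> usize {
            buffer.len()
        }
    }

    // The re-export keeps the path `evaluator::evaluate_rules` valid,
    // so callers do not notice that the code moved.
    pub use self::engine::evaluate_rules;
}
```

Callers continue to write evaluator::evaluate_rules(...), and the engine module can be reorganized again later without another API change.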

Module Organization Pattern#

When files grow large, the codebase uses a module-per-directory pattern:

Parser Module (src/parser/):

mod.rs (576 lines) - Main parser coordination
ast.rs (801 lines) - AST data structures
grammar.rs (2,448 lines) - Grammar parsing logic
Additional submodules: preprocessing.rs, hierarchy.rs, loader.rs, format.rs

Evaluator Module (src/evaluator/):

mod.rs (~720 lines) - Public API surface and context management
engine.rs (~2,096 lines) - Core evaluation logic
types.rs (1,505 lines) - Type reading and interpretation
Focused submodules: operators.rs, offset.rs, strength.rs

File Headers#

All .rs files must have copyright and SPDX headers. The standard format is:

// Copyright (c) 2025-2026 the libmagic-rs contributors
// SPDX-License-Identifier: Apache-2.0

This format appears consistently across all source files including src/lib.rs, src/main.rs, src/parser/mod.rs, and src/error.rs.

Rustdoc Requirements#

Public API Documentation#

All public APIs require rustdoc with examples:

Include error conditions and recovery strategies
Provide usage examples for common patterns
Document performance characteristics

Enum Variant Documentation#

All public enum variants need # Examples rustdoc sections. Examples from the codebase:

OffsetSpec Enum (src/parser/ast.rs):

pub enum OffsetSpec {
    /// Absolute offset from file start
    ///
    /// # Examples
    ///
    /// ```
    /// use libmagic_rs::parser::ast::OffsetSpec;
    ///
    /// let offset = OffsetSpec::Absolute(0x10);
    /// ```
    Absolute(i64),
    // ... other variants
}

LibmagicError Enum (src/error.rs):

pub enum LibmagicError {
    /// Invalid configuration parameter.
    ///
    /// # Examples
    ///
    /// ```
    /// use libmagic_rs::LibmagicError;
    ///
    /// let error = LibmagicError::ConfigError {
    ///     reason: "invalid timeout value".to_string(),
    /// };
    /// ```
    ConfigError { reason: String },
    // ... other variants
}

Documentation Pattern#

All rustdoc examples follow a consistent pattern:

Triple-slash doc comments (///) above the item
Markdown # Examples header
Fenced code blocks with executable examples
Fully qualified imports (use libmagic_rs::...)
Demonstrations of construction/usage

Naming Conventions#

The project follows standard Rust naming conventions:

Item Type	Convention	Example
Files	snake_case	`magic_rule.rs`
Types	PascalCase	`MagicRule`, `TypeKind`
Functions	snake_case	`resolve_offset`, `evaluate_rule`
Constants	SCREAMING_SNAKE_CASE	`DEFAULT_BUFFER_SIZE`
Modules	snake_case	`evaluator`, `output`

Case-Insensitive Matching Pattern#

Current Implementation: The libmagic-rs library performs all string matching in a case-sensitive manner. A review of the codebase found:

No use of .to_lowercase(), .to_ascii_lowercase(), or similar normalization methods
String equality is explicitly case-sensitive
No normalization at API entry points like evaluate_file() or evaluate_buffer()

Documented Guidelines#

Despite the current case-sensitive implementation, AGENTS.md documents a case-insensitive pattern for future implementation:

When implementing case-insensitive string matching:

Lowercase inputs at ALL entry points (constructors, setters)
Store normalized values internally
Document the case-insensitivity in public API docs

Note: This represents a design guideline for future features, not current behavior.
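
A minimal sketch of the guideline, assuming a hypothetical TagFilter type that does not exist in libmagic-rs: every entry point lowercases its input, so the stored value is always normalized.

```rust
/// Hypothetical filter that matches tags case-insensitively.
/// Inputs are lowercased at every entry point.
#[derive(Debug)]
struct TagFilter {
    // Invariant: always stored in lowercase.
    tag: String,
}

impl TagFilter {
    /// Constructor: normalizes before storing.
    fn new(tag: &str) -> Self {
        Self { tag: tag.to_lowercase() }
    }

    /// Setter: applies the same normalization as the constructor.
    fn set_tag(&mut self, tag: &str) {
        self.tag = tag.to_lowercase();
    }

    /// Compares case-insensitively by normalizing the candidate too.
    fn matches(&self, candidate: &str) -> bool {
        self.tag == candidate.to_lowercase()
    }
}
```

Normalizing in both new and set_tag is the part that is easy to miss: if only the constructor lowercased, a later setter call could store a mixed-case value and silently break matching.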

Character Usage Policy#

The project avoids non-ASCII characters in code, comments, and documentation:

Prohibited: Emojis and other non-ASCII characters in source code, comments, or rustdoc
Exception: Code that must handle non-ASCII characters directly (e.g., em dash or en dash in text parsing)

This policy ensures compatibility across different environments and prevents encoding issues.
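
One way such a policy could be checked is sketched below. The helper is hypothetical and is not described in the source as part of the project's tooling; it simply reports which lines of a text contain non-ASCII characters.

```rust
/// Returns the 1-based numbers of the lines in `source` that contain
/// at least one non-ASCII character.
fn non_ascii_lines(source: &str) -> Vec<usize> {
    source
        .lines()
        .enumerate()
        // Keep only lines with a byte outside the ASCII range.
        .filter(|(_, line)| !line.is_ascii())
        // Convert the 0-based index to a 1-based line number.
        .map(|(index, _)| index + 1)
        .collect()
}
```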

Committed Lock Files For Reproducibility#

Cargo.lock#

The project commits Cargo.lock intentionally to ensure:

Reproducible binary builds for the rmagic command-line tool
Auditable dependencies with pinned versions
Note: Library consumers are unaffected because Cargo ignores a dependency's Cargo.lock when resolving versions for a downstream crate

Both Cargo.lock and mise.lock are committed and tracked in version control.

OSSF Best Practices Requirement#

Every release must be built reproducibly:

Pinned toolchain versions
Committed lock files (Cargo.lock, mise.lock)
Unique SemVer identifiers (vX.Y.Z tags)

Memory Safety Standards#

Memory safety is the first development principle:

Core Requirements#

No unsafe code except in vetted dependencies (memmap2, byteorder)
Bounds checking for all buffer access using .get() methods
Safe resource management with RAII patterns
Graceful error handling for malformed inputs
Safe string operations: Use strip_prefix()/strip_suffix() instead of direct slicing to avoid UTF-8 panics
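
The bounds-checking and string-slicing requirements can be sketched as follows. The helper names are hypothetical; only the use of .get() and strip_prefix() comes from the requirements above.

```rust
/// Returns the text after a leading `0x`, or `None` when the prefix is
/// absent. Slicing with `&token[2..]` could panic on a short string or
/// on an index that is not a UTF-8 character boundary.
fn hex_digits(token: &str) -> Option<&str> {
    token.strip_prefix("0x")
}

/// Checks whether `buf` holds `magic` at `offset` without indexing.
fn has_magic(buf: &[u8], offset: usize, magic: &[u8]) -> bool {
    offset
        // Overflow in the end position yields None, not a panic.
        .checked_add(magic.len())
        // Out-of-range windows yield None, not a panic.
        .and_then(|end| buf.get(offset..end))
        .map_or(false, |window| window == magic)
}
```

A truncated or malformed input therefore produces a plain false or None that the caller can turn into an error, which fits the graceful-handling requirement.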

Enforcement#

unsafe_code = "forbid" is enforced project-wide through Cargo.toml configuration.

Error Handling Standards#

Error handling must follow specific patterns:

Requirements#

Library errors should be descriptive and actionable
Use thiserror::Error for structured error types
Use Result types consistently
No panics in library code
No unwrap() or expect() in library code

Architecture Constraints#

Specific architectural constraints:

src/error.rs is shared with build.rs -- cannot reference lib-only types
FileError(String) wraps structured I/O errors as strings to work around the build.rs constraint
Use ParseError::IoError for I/O errors in parser code
Use LibmagicError::ConfigError for config validation
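
The FileError(String) workaround can be illustrated with a simplified stand-in type. DemoError and read_magic_file are hypothetical names; only the idea of flattening an I/O error into a string comes from the constraints above. The Display and Error impls are written by hand to keep the sketch dependency-free, whereas the project itself uses thiserror to derive them.

```rust
use std::fmt;
use std::fs;
use std::io;

/// Simplified stand-in for the crate's error type.
#[derive(Debug)]
enum DemoError {
    /// Carries the I/O error as text, so the type does not depend on
    /// anything unavailable to a build script.
    FileError(String),
}

impl fmt::Display for DemoError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            DemoError::FileError(msg) => write!(f, "file error: {}", msg),
        }
    }
}

impl std::error::Error for DemoError {}

/// Reads a file, converting any I/O failure into `FileError`.
fn read_magic_file(path: &str) -> Result<Vec<u8>, DemoError> {
    fs::read(path).map_err(|err: io::Error| DemoError::FileError(err.to_string()))
}
```

The trade-off is that callers lose the structured io::ErrorKind and can only inspect the message text.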

Testing Requirements#

Comprehensive testing is mandatory:

Coverage and Quality#

Target >85% test coverage with cargo llvm-cov
All code changes must include comprehensive tests
Use cargo nextest for faster, more reliable test execution
Include property tests with proptest for fuzzing
Benchmark critical path components with criterion
Verify doc examples with cargo test --doc

Code Review Checklist#

All pull requests must satisfy:

Tests: New functionality has tests, edge cases covered, property tests for complex structures
Correctness: Edge cases handled
Memory safety: No unsafe code blocks, bounds checking with .get() methods
Error handling: Proper use of Result types, no panics/unwrap/expect in library code
Performance: No unnecessary allocations in hot paths, no benchmark regressions

OSSF Best Practices Compliance#

The project maintains its OSSF Best Practices badge by meeting these requirements:

Every PR Must#

Sign off commits with git commit -s (DCO enforced)
Pass CI (clippy, fmt, tests, CodeQL, cargo audit) before merge
Include tests for new functionality (policy, not optional)
Be reviewed for correctness, safety, and style
Not introduce unsafe code, unwrap()/expect() in library code, or panics

Every Release Must#

Have human-readable release notes via git-cliff
Use unique SemVer identifiers (vX.Y.Z tags)
Be built reproducibly (pinned toolchain, committed lock files)

Security Requirements#

Vulnerabilities must be reported through private channels only
cargo audit and cargo deny run daily in CI
Medium+ severity vulnerabilities: fix within 90 days
docs/src/security-assurance.md must be updated when new attack surface is introduced

Relevant Code Files#

File	Purpose	Lines
`Cargo.toml`	Workspace lint configuration	191
`AGENTS.md`	Comprehensive development guidelines	557
`.gitignore`	Lock file commit rationale	61
`src/lib.rs`	Main library entry point with file headers	1,261
`src/error.rs`	Error types with rustdoc examples	68
`src/parser/ast.rs`	AST with enum variant documentation	801
`src/evaluator/operators.rs`	Case-sensitive string comparison	377

Memory Safety in Rust: The forbidden unsafe code policy and bounds checking requirements
Clippy Lint System: Understanding Rust's advanced static analysis capabilities
OSSF Best Practices: Open source security and supply chain standards
Reproducible Builds: Techniques for ensuring deterministic build outputs
Documentation Testing: Using cargo test --doc to verify example correctness
Property-Based Testing: Using proptest for comprehensive fuzzing