Evaluation Configuration And Resource Limits#

Overview#

EvaluationConfig is the configuration struct that controls how magic rules are evaluated in libmagic-rs. The system provides five configuration fields with safe defaults, three preset configurations for common scenarios, and automatic validation at database construction time. Resource limits prevent stack overflow, memory exhaustion, and denial-of-service attacks while maintaining flexibility for diverse use cases.

All EvaluationConfig instances are validated before database creation through the validate() method, which enforces four categories of security constraints: recursion depth bounds (1-1000), string length limits (1-1,048,576 bytes), timeout bounds (1-300,000ms when specified), and a resource combination check that rejects high recursion (>100) combined with large strings (>65,536 bytes, i.e. 64 KiB). There is no way to create a MagicDatabase with an invalid configuration.

The configuration distinguishes evaluation behavior across multiple dimensions: depth versus speed (recursion 10-50), single versus multiple matches (stop_at_first_match), string reading capacity (1024-32768 bytes), output format (description versus MIME type), and timeout protection (none to 30 seconds). Three presets optimize for general use (default), high-throughput batch processing (performance), and comprehensive forensic analysis (comprehensive).

Configuration Struct and Fields#

The EvaluationConfig struct contains five public fields that control evaluation behavior:

pub struct EvaluationConfig {
    pub max_recursion_depth: u32, // Default: 20, bounds: 1-1000
    pub max_string_length: usize, // Default: 8192, bounds: 1-1,048,576
    pub stop_at_first_match: bool, // Default: true
    pub enable_mime_types: bool, // Default: false
    pub timeout_ms: Option<u64>, // Default: None, bounds: 1-300,000
}

max_recursion_depth#

Limits nested rule traversal depth to prevent stack overflow. The evaluator uses a depth-first hierarchical algorithm in which child rules are evaluated only if their parent matches. EvaluationContext tracks recursion depth and calls increment_recursion_depth() before evaluating children, returning LibmagicError::EvaluationError with RecursionLimitExceeded if the limit would be exceeded.

Default: 20. Bounds: 1-1000. A value of 0 is rejected because evaluation cannot proceed without at least one level; values above 1000 risk stack overflow.

max_string_length#

Caps the bytes read for TypeKind::String reads (both unflagged and with the /c, /C, /w, /W, /T, or /f flags); it does NOT apply to TypeKind::PString or TypeKind::String16. This prevents memory exhaustion when a string x rule encounters an attacker-controlled NUL-free buffer, because the scan-mode read is bounded. Without this cap, the unflagged (None, _) arm could allocate up to the full buffer length (CWE-770). String reads are bounded by the max_length parameter in read_string(), which uses SIMD-accelerated memchr for null byte scanning within the limit. Bounds-checked type reading prevents buffer overruns.

Default: 8192 bytes. Bounds: 1-1,048,576 (1MB). A value of 0 is rejected because no string matching could occur; values above 1MB risk memory exhaustion.

PString returns TypeReadError::BufferOverrun rather than truncating when the length prefix exceeds the remaining buffer. String16 is capped at a hardcoded 8192-unit ceiling (16,384 bytes).
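The "error rather than truncate" behaviour described for PString can be illustrated with a small standalone sketch. It does not use the crate's types: `ReadError` and `read_pstring` are hypothetical names, and a one-byte length prefix is assumed for simplicity.

```rust
/// Hypothetical error type for this sketch (stands in for TypeReadError).
#[derive(Debug, PartialEq)]
enum ReadError {
    BufferOverrun,
}

/// Reads a Pascal-style string: the first byte is the payload length
/// (assumption: a 1-byte prefix). If the prefix claims more bytes than
/// remain in the buffer, an error is returned instead of a shortened string.
fn read_pstring(buf: &[u8]) -> Result<&[u8], ReadError> {
    // An empty buffer has no length prefix at all.
    let (&len, payload) = buf.split_first().ok_or(ReadError::BufferOverrun)?;
    // `get` returns None when `len` exceeds the remaining bytes.
    payload.get(..len as usize).ok_or(ReadError::BufferOverrun)
}

fn main() {
    println!("{:?}", read_pstring(&[3, b'a', b'b', b'c', b'd']));
    println!("{:?}", read_pstring(&[5, b'a']));
}
```

Returning an error keeps a lying length prefix from silently producing a partial value that later comparisons might treat as valid.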

stop_at_first_match#

Controls whether evaluation stops after the first matching rule or continues to collect all matches. "First match" refers to the first top-level rule that matches. Children of the first matching top-level rule are always evaluated before the stop check; the stop check applies to subsequent top-level rules. Checked via context.should_stop_at_first_match() after recording each match; when true, evaluation breaks immediately.

In other words, stop_at_first_match = true does not truncate the child subtree of the matching rule -- it only prevents later sibling top-level rules from being evaluated. A successful top-level match therefore returns one parent match plus any descendant matches its children produced.

Default: true (stop after first match for performance).
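The semantics described above (children of the matching rule are still evaluated, later top-level rules are not) can be modeled with a minimal sketch. The `Rule` struct, `evaluate`, and `sample_rules` are hypothetical and far simpler than the real evaluator; a rule's match outcome is precomputed as a boolean.

```rust
/// Hypothetical rule: `matches` stands in for the real match test.
struct Rule {
    name: &'static str,
    matches: bool,
    children: Vec<Rule>,
}

/// Records matching descendants depth-first; children run only if the parent matched.
fn collect_children(children: &[Rule], out: &mut Vec<&'static str>) {
    for child in children {
        if child.matches {
            out.push(child.name);
            collect_children(&child.children, out);
        }
    }
}

/// Evaluates top-level rules in order. The stop check happens only after
/// the matching rule's whole subtree has been evaluated.
fn evaluate(rules: &[Rule], stop_at_first_match: bool) -> Vec<&'static str> {
    let mut out = Vec::new();
    for rule in rules {
        if !rule.matches {
            continue;
        }
        out.push(rule.name);
        collect_children(&rule.children, &mut out);
        if stop_at_first_match {
            break; // later top-level siblings are skipped
        }
    }
    out
}

/// Three top-level rules: A does not match, B matches with one matching child, C matches.
fn sample_rules() -> Vec<Rule> {
    vec![
        Rule { name: "A", matches: false, children: vec![] },
        Rule {
            name: "B",
            matches: true,
            children: vec![
                Rule { name: "B1", matches: true, children: vec![] },
                Rule { name: "B2", matches: false, children: vec![] },
            ],
        },
        Rule { name: "C", matches: true, children: vec![] },
    ]
}

fn main() {
    println!("{:?}", evaluate(&sample_rules(), true));
    println!("{:?}", evaluate(&sample_rules(), false));
}
```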

enable_mime_types#

Maps file type descriptions to standard MIME types using an internal lookup table when true. When false (the default), the mime_type field is always None. This is opt-in for efficiency.

Default: false (descriptions only).

timeout_ms#

Specifies per-file evaluation timeout in milliseconds. None disables the timeout. Checked every 16 rules during evaluation using bit manipulation (rule_count.trailing_zeros() >= 4) to reduce syscall overhead. Returns LibmagicError::Timeout if exceeded.

Default: None (no timeout). Bounds: 1-300,000ms (5 minutes) when Some. A value of 0 is rejected as meaningless; values above 5 minutes risk denial-of-service.

Default Configuration#

EvaluationConfig::default() returns balanced defaults suitable for most workloads:

EvaluationConfig {
    max_recursion_depth: 20,
    max_string_length: 8192,
    stop_at_first_match: true,
    enable_mime_types: false,
    timeout_ms: None,
}

Provides moderate recursion depth for typical file hierarchies, reasonable string buffer size for most file types, early exit on first match for efficiency, description-only output, and no timeout limit.

Configuration Presets#

Performance Preset#

EvaluationConfig::performance() is optimized for high-throughput scenarios:

EvaluationConfig {
    max_recursion_depth: 10,
    max_string_length: 1024,
    stop_at_first_match: true,
    enable_mime_types: false,
    timeout_ms: Some(1000), // 1 second
}

Lower limits reduce per-file overhead. Early exit on first match speeds up batch processing. 1-second timeout prevents hanging on problematic files. Suitable for batch processing many files and untrusted input where tight bounds reduce attack surface.

Comprehensive Preset#

EvaluationConfig::comprehensive() enables deep analysis with all matches:

EvaluationConfig {
    max_recursion_depth: 50,
    max_string_length: 32768,
    stop_at_first_match: false,
    enable_mime_types: true,
    timeout_ms: Some(30000), // 30 seconds
}

Higher recursion depth enables deeper rule traversal. Larger string limit captures more context. stop_at_first_match: false collects all matches. MIME types provide standardized output. 30-second timeout allows complex forensic analysis.

Validation Rules and Security#

The validate() method enforces four categories of security constraints:

1. Recursion Depth Validation#

validate_recursion_depth() checks bounds 1-1000:

Must be greater than 0 (with a depth of 0, evaluation cannot proceed at all)
Must not exceed 1000 (prevents stack overflow)
Constant: MAX_SAFE_RECURSION_DEPTH = 1000

2. String Length Validation#

validate_string_length() checks bounds 1-1,048,576:

Must be greater than 0 (prevents inability to match strings)
Must not exceed 1,048,576 (1MB) (prevents memory exhaustion)
Constant: MAX_SAFE_STRING_LENGTH = 1_048_576

3. Timeout Validation#

validate_timeout() checks bounds when Some:

If Some, must be greater than 0 (prevents meaningless timeout)
If Some, must not exceed 300,000ms (5 minutes) (prevents DoS)
None is always valid (no timeout)
Constant: MAX_SAFE_TIMEOUT_MS = 300_000

4. Resource Combination Validation#

validate_resource_combination() prevents compound exhaustion:

Rejects recursion depth greater than 100 combined with string length greater than 65,536
Deep recursion with large string reads at every level can compound into excessive resource consumption even when each value individually falls within safe bounds
Constants: HIGH_RECURSION_THRESHOLD = 100, LARGE_STRING_THRESHOLD = 65536
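Taken together, the four checks can be sketched as a standalone function. This is an illustrative re-implementation based only on the bounds and constants listed above, not the crate's source: the struct is a local mirror of the documented fields, `documented_defaults` is a hypothetical helper, and errors are plain `String`s rather than `LibmagicError::ConfigError`.

```rust
/// Local mirror of the documented fields (not the crate's own type).
#[allow(dead_code)]
#[derive(Debug, Clone)]
struct EvaluationConfig {
    max_recursion_depth: u32,
    max_string_length: usize,
    stop_at_first_match: bool,
    enable_mime_types: bool,
    timeout_ms: Option<u64>,
}

// Limits as quoted in the validation rules above.
const MAX_SAFE_RECURSION_DEPTH: u32 = 1000;
const MAX_SAFE_STRING_LENGTH: usize = 1_048_576;
const MAX_SAFE_TIMEOUT_MS: u64 = 300_000;
const HIGH_RECURSION_THRESHOLD: u32 = 100;
const LARGE_STRING_THRESHOLD: usize = 65_536;

/// Hypothetical helper returning the documented default values.
fn documented_defaults() -> EvaluationConfig {
    EvaluationConfig {
        max_recursion_depth: 20,
        max_string_length: 8192,
        stop_at_first_match: true,
        enable_mime_types: false,
        timeout_ms: None,
    }
}

/// Applies the four documented checks in order and reports the first failure.
fn validate(cfg: &EvaluationConfig) -> Result<(), String> {
    if cfg.max_recursion_depth == 0 || cfg.max_recursion_depth > MAX_SAFE_RECURSION_DEPTH {
        return Err(format!("max_recursion_depth must be 1-{}", MAX_SAFE_RECURSION_DEPTH));
    }
    if cfg.max_string_length == 0 || cfg.max_string_length > MAX_SAFE_STRING_LENGTH {
        return Err(format!("max_string_length must be 1-{}", MAX_SAFE_STRING_LENGTH));
    }
    // None means "no timeout" and is always accepted.
    if let Some(ms) = cfg.timeout_ms {
        if ms == 0 || ms > MAX_SAFE_TIMEOUT_MS {
            return Err(format!("timeout_ms must be 1-{} when set", MAX_SAFE_TIMEOUT_MS));
        }
    }
    // Each value may be in range individually and still be rejected together.
    if cfg.max_recursion_depth > HIGH_RECURSION_THRESHOLD
        && cfg.max_string_length > LARGE_STRING_THRESHOLD
    {
        return Err("high recursion depth combined with large string length".to_string());
    }
    Ok(())
}

fn main() {
    let risky = EvaluationConfig {
        max_recursion_depth: 500,
        max_string_length: 500_000,
        ..documented_defaults()
    };
    println!("{:?}", validate(&documented_defaults()));
    println!("{:?}", validate(&risky));
}
```

Note that the combination check uses strict comparisons, so a recursion depth of exactly 100 or a string length of exactly 65,536 does not trigger it.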

Automatic Validation at Database Creation#

All MagicDatabase constructors call config.validate() internally and return an error if the configuration is invalid. There is no way to create a database with an invalid configuration.

Constructor Methods#

Built-in rules with default config:

let db = MagicDatabase::with_builtin_rules()?;

Built-in rules with custom config:

MagicDatabase::with_builtin_rules_and_config() validates config before use:

let db = MagicDatabase::with_builtin_rules_and_config(
    EvaluationConfig::performance()
)?;

Load from file or directory:

let db = MagicDatabase::load_from_file("/usr/share/misc/magic")?;

Load from file with custom config:

MagicDatabase::load_from_file_with_config() validates config during construction:

let config = EvaluationConfig::comprehensive();
let db = MagicDatabase::load_from_file_with_config(
    "/usr/share/misc/magic.d", 
    config
)?;

Custom Configuration#

Use struct update syntax to override individual fields from any preset:

let config = EvaluationConfig {
    max_recursion_depth: 30,
    enable_mime_types: true,
    timeout_ms: Some(5000),
    ..EvaluationConfig::default()
};

CLI Integration#

The rmagic CLI exposes timeout_ms via the --timeout-ms flag. All other configuration values use their defaults.

# No timeout (default)
rmagic sample.bin

# 5-second timeout per file
rmagic --timeout-ms 5000 sample.bin

If evaluation exceeds the timeout, exit code 5 is returned with an error message to stderr:

Error: Evaluation timeout
File analysis timed out after 5000ms
The file may be too large or complex to analyze within the time limit.
Try using a simpler magic file or increasing the timeout limit.

Exit code 5 is designated for timeout and resource limit errors, following Unix conventions.

Runtime Enforcement Mechanisms#

Recursion Depth Enforcement#

EvaluationContext maintains a recursion_depth field tracking nesting level. The increment_recursion_depth() method checks against config.max_recursion_depth before incrementing, returning LibmagicError::EvaluationError with RecursionLimitExceeded if the limit would be exceeded. Called before evaluating child rules.

A RecursionGuard RAII type wraps the increment/decrement pair, ensuring the depth is always decremented when the guard goes out of scope and eliminating the risk of mismatched calls that could allow stack overflow.
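The RAII idea can be shown with a small self-contained sketch. It is not the crate's RecursionGuard: `DepthContext`, `DepthGuard`, and `descend` are hypothetical names, and a `Cell` is used here so the guard only needs a shared reference, which may differ from how the real type is built.

```rust
use std::cell::Cell;

/// Hypothetical stand-in for the evaluator context; only depth tracking is modeled.
struct DepthContext {
    depth: Cell<u32>,
    max_depth: u32,
}

/// Creating the guard increments the depth; dropping it decrements again.
struct DepthGuard<'a> {
    ctx: &'a DepthContext,
}

impl<'a> DepthGuard<'a> {
    /// Fails without touching the counter if the limit would be exceeded.
    fn enter(ctx: &'a DepthContext) -> Result<Self, String> {
        let current = ctx.depth.get();
        if current >= ctx.max_depth {
            return Err(format!("recursion limit {} exceeded", ctx.max_depth));
        }
        ctx.depth.set(current + 1);
        Ok(DepthGuard { ctx })
    }
}

impl Drop for DepthGuard<'_> {
    fn drop(&mut self) {
        // Runs on every exit path, including early returns via `?`.
        self.ctx.depth.set(self.ctx.depth.get() - 1);
    }
}

/// Enters one level, then recurses `levels` more times; returns the deepest depth reached.
fn descend(ctx: &DepthContext, levels: u32) -> Result<u32, String> {
    let _guard = DepthGuard::enter(ctx)?;
    if levels == 0 {
        return Ok(ctx.depth.get());
    }
    descend(ctx, levels - 1)
}

fn main() {
    let ctx = DepthContext { depth: Cell::new(0), max_depth: 3 };
    println!("{:?}", descend(&ctx, 2));
    println!("{:?}", descend(&ctx, 3));
    println!("depth afterwards: {}", ctx.depth.get());
}
```

Because the decrement lives in `Drop`, the counter returns to its starting value even when a nested call fails partway down.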

String Length Enforcement#

The max_string_length config value is accessed via context.max_string_length() and threaded through both type-read dispatchers. The unflagged (None, _) arm of read_typed_value_with_pattern passes Some(max_string_length) to read_string(); the flagged-string arm of read_pattern_match caps the scan buffer to max_string_length when the AST max_length is None. Without these caps, string x rules against attacker-controlled NUL-free buffers could allocate up to the full buffer length (CWE-770).

The cap applies specifically to TypeKind::String (both unflagged and flagged with /c, /C, /w, /W, /T, or /f). It does NOT govern TypeKind::PString, which returns TypeReadError::BufferOverrun on oversized length prefixes, or TypeKind::String16, which has a hardcoded 8192-unit ceiling.
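A bounded NUL scan of the kind described here might look like the following sketch. It is a simplified stand-in for read_string(): `read_bounded_string` is a hypothetical name, and a plain iterator search replaces the memchr-based scan mentioned above.

```rust
/// Returns the bytes from `offset` up to (not including) the first NUL,
/// looking at no more than `max_len` bytes. Returns None if `offset` is
/// past the end of the buffer.
fn read_bounded_string(buf: &[u8], offset: usize, max_len: usize) -> Option<&[u8]> {
    // `get` performs the bounds check instead of panicking.
    let rest = buf.get(offset..)?;
    // The scan window never exceeds the configured cap.
    let window = &rest[..rest.len().min(max_len)];
    // Stop at the first NUL, or at the end of the window if there is none.
    let end = window.iter().position(|&b| b == 0).unwrap_or(window.len());
    Some(&window[..end])
}

fn main() {
    // A NUL-free buffer much larger than the cap.
    let hostile = vec![b'A'; 100_000];
    let value = read_bounded_string(&hostile, 0, 8192).unwrap();
    println!("read {} bytes from a {}-byte buffer", value.len(), hostile.len());
}
```

With the cap in place, the amount of data handed on for matching depends on the configuration rather than on the size of the input file.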

Timeout Enforcement#

The timeout is checked every 16 rules using bit manipulation (rule_count.trailing_zeros() >= 4) to reduce syscall overhead. The evaluator compares elapsed time against context.timeout_ms() using std::time::Instant. Returns LibmagicError::Timeout if the timeout is exceeded. Timeout errors propagate immediately, halting evaluation and returning partial results.
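The bit test works because a number whose four lowest bits are zero is a multiple of 16. The sketch below shows the test in isolation, inside a hypothetical loop; `is_timeout_checkpoint` and `run_with_timeout` are illustrative names, and whether the real counter starts at 0 or 1 is not stated in this document.

```rust
use std::time::{Duration, Instant};

/// True when `rule_count` is a multiple of 16. Note that 0 also qualifies,
/// since `0u32.trailing_zeros()` is 32.
fn is_timeout_checkpoint(rule_count: u32) -> bool {
    rule_count.trailing_zeros() >= 4
}

/// Hypothetical evaluation loop. The clock is read only at checkpoints,
/// so 15 out of every 16 iterations skip the `Instant` call entirely.
/// On timeout, returns Err with the rule index at which it was detected.
fn run_with_timeout(total_rules: u32, timeout: Option<Duration>) -> Result<u32, u32> {
    let start = Instant::now();
    for rule_count in 0..total_rules {
        if let Some(limit) = timeout {
            if is_timeout_checkpoint(rule_count) && start.elapsed() > limit {
                return Err(rule_count);
            }
        }
        // ... a real evaluator would evaluate one rule here ...
    }
    Ok(total_rules)
}

fn main() {
    let checkpoints: Vec<u32> = (0..=64).filter(|&n| is_timeout_checkpoint(n)).collect();
    println!("checkpoints up to 64: {:?}", checkpoints);
    println!("{:?}", run_with_timeout(1000, Some(Duration::from_secs(60))));
}
```

One consequence of sampling is that a timeout is detected up to 15 rules late, which trades a small amount of precision for fewer clock reads.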

Stop-at-First-Match Enforcement#

Checked via context.should_stop_at_first_match() after recording each match. When true, evaluation breaks immediately after the first match. Child rules are fully evaluated before the exit decision, maintaining hierarchical rule semantics.

Usage Examples#

Choosing a Preset#

Scenario	Preset	Why
General file identification	`default()`	Balanced depth and limits
Batch processing many files	`performance()`	Low limits, 1s timeout, early exit
Forensic analysis	`comprehensive()`	Deep traversal, all matches, MIME types
Untrusted input	`performance()`	Tight bounds reduce attack surface
Custom requirements	Struct update syntax	Override specific fields from any preset

Single File Evaluation#

let db = MagicDatabase::with_builtin_rules()?;
let result = db.evaluate_file("sample.bin")?;
println!("File type: {}", result.description);
println!("Confidence: {:.0}%", result.confidence * 100.0);

Batch Processing with Performance Config#

let config = EvaluationConfig::performance();
let db = MagicDatabase::with_builtin_rules_and_config(config)?;

for path in &["image.png", "archive.tar.gz", "binary.elf"] {
    match db.evaluate_file(path) {
        Ok(result) => println!("{}: {}", path, result.description),
        Err(e) => eprintln!("{}: error: {}", path, e),
    }
}

Comprehensive Analysis with MIME Types#

let config = EvaluationConfig {
    enable_mime_types: true,
    stop_at_first_match: false,
    ..EvaluationConfig::default()
};
let db = MagicDatabase::with_builtin_rules_and_config(config)?;
let result = db.evaluate_file("photo.jpg")?;

println!("Description: {}", result.description);
if let Some(mime) = &result.mime_type {
    println!("MIME type: {}", mime);
}
for m in &result.matches {
    println!(" offset={}, level={}, message={}", 
             m.offset, m.level, m.message);
}

Handling Configuration Errors#

let config = EvaluationConfig {
    max_recursion_depth: 5000,
    ..EvaluationConfig::default()
};
match config.validate() {
    Ok(()) => println!("Config is valid"),
    Err(LibmagicError::ConfigError { reason }) => 
        eprintln!("Bad config: {}", reason),
    Err(e) => eprintln!("Unexpected: {}", e),
}

Security Architecture#

The configuration system is part of libmagic-rs's multi-layered security approach:

Input Validation – Magic files are validated during loading (magic file size 16MB max, regular file type checks, bounds on rule fields)
Configuration Validation – validate() rejects out-of-range or unsafe configurations before a database is constructed
Bounds Checking – Safe buffer access via .get() methods
Resource Limits – Runtime protection via config constraints
Recursion Guards – RAII guards ensure recursion depth is always decremented on scope exit, preventing mismatched increment/decrement calls
Error Handling – Graceful degradation on errors

Threat Mitigation#

Threat	Mitigation
Stack overflow via deep nesting	`max_recursion_depth` limit enforced by `RecursionGuard` RAII wrapper
Memory exhaustion via large strings	`max_string_length` caps `TypeKind::String` reads (both unflagged and flagged with `/c`/`/C`/`/w`/`/W`/`/T`/`/f`); does NOT govern `PString` or `String16`
Memory exhaustion via oversized magic files	16MB magic file size limit enforced by parser loader
DoS via infinite evaluation	`timeout_ms` limit
Buffer overrun	Bounds-checked buffer access via `.get()` methods
Malformed input	Graceful error handling
Integer overflow	Checked arithmetic

The evaluator uses graceful degradation for non-critical errors (buffer overrun, type read errors) by skipping the rule and continuing with the next rule, while propagating critical errors (timeout, recursion limit) immediately.
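The split between skipped and propagated errors can be sketched as follows. The error enum and function names are hypothetical and only model the four error kinds named above; each rule's outcome is supplied up front rather than computed.

```rust
/// Hypothetical error kinds, modeled on the categories named in the text.
#[derive(Debug, PartialEq)]
enum EvalError {
    BufferOverrun,
    TypeRead,
    Timeout,
    RecursionLimit,
}

impl EvalError {
    /// Critical errors abort evaluation; the rest only skip the current rule.
    fn is_critical(&self) -> bool {
        matches!(self, EvalError::Timeout | EvalError::RecursionLimit)
    }
}

/// Walks per-rule outcomes in order, collecting matches, skipping
/// non-critical failures, and returning early on a critical one.
fn run_rules(
    outcomes: Vec<Result<&'static str, EvalError>>,
) -> Result<Vec<&'static str>, EvalError> {
    let mut matches = Vec::new();
    for outcome in outcomes {
        match outcome {
            Ok(description) => matches.push(description),
            Err(e) if e.is_critical() => return Err(e),
            Err(_) => continue, // skip this rule, keep going
        }
    }
    Ok(matches)
}

fn main() {
    let tolerant = run_rules(vec![Err(EvalError::BufferOverrun), Ok("PNG image")]);
    let aborted = run_rules(vec![Ok("PNG image"), Err(EvalError::Timeout), Ok("never reached")]);
    println!("{:?}", tolerant);
    println!("{:?}", aborted);
}
```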

Relevant Code Files#

File	Purpose
src/config.rs (entire file - 307 lines)	`EvaluationConfig` struct definition, defaults, presets, validation logic
src/evaluator/mod.rs (lines 40-474)	`EvaluationContext` implementation, `RecursionGuard` RAII wrapper, recursion tracking, timeout checking
src/evaluator/types.rs (lines 275-308)	String length enforcement in `read_string()`
src/parser/loader.rs	Magic file size validation (16MB max), `read_magic_file_bounded()`
src/main.rs (lines 77-366)	CLI integration for `--timeout-ms` flag, exit code 5 handling
src/error.rs (lines 27-32)	`LibmagicError::Timeout` variant definition

Related Topics#

Magic Rule Evaluation – The hierarchical evaluation engine that processes rules using configuration constraints
MagicDatabase API – Primary interface that holds parsed rules and evaluation configuration
Error Handling – Graceful degradation strategy for non-critical versus critical errors
Security Architecture – Multi-layered security approach including configuration validation, input validation, and runtime bounds checking