Comparison Operators And Compare_Values Helper Pattern#
Overview#
The Comparison Operators And Compare_Values Helper Pattern is a reusable architectural pattern for implementing comparison operators in libmagic-rs. A shared compare_values helper function handles every pair of Value types, including cross-type integer coercion, and serves as the single source of truth for type-pair matching and ordering logic. The four comparison operator implementations (<, >, <=, >=) delegate to it instead of each duplicating the same match logic.
The pattern addresses Issue #34, which requests comparison operators to improve magic file compatibility. According to the issue, comparison operators unlock approximately 40% more magic file compatibility, because magic rules rely on them for version checks, size validation, and range matching.
The implementation builds on the foundation of cross-type integer coercion using i128, which safely compares u64 and i64 values. The compare_values helper extends this pattern to return Option<Ordering> instead of bool, enabling all four comparison operators to share identical type-handling logic while implementing operator-specific ordering semantics.
Implementation Status#
The comparison operators and compare_values helper function were implemented in PR #104, which was merged. The implementation includes all four comparison operators (LessThan, GreaterThan, LessEqual, GreaterEqual) added to the Operator enum, the shared compare_values helper function returning Option<Ordering>, parser support for comparison operator syntax with proper token ordering, and comprehensive test coverage for all type pairs and edge cases. This article documents the implemented pattern to serve as a reference for the architectural decisions and design approach.
The compare_values Helper Function#
The compare_values function returns Option<Ordering> and handles all Value type pairs:
pub fn compare_values(left: &Value, right: &Value) -> Option<Ordering> {
    match (left, right) {
        (Value::Uint(a), Value::Uint(b)) => Some(a.cmp(b)),
        (Value::Int(a), Value::Int(b)) => Some(a.cmp(b)),
        (Value::Uint(a), Value::Int(b)) => Some(i128::from(*a).cmp(&i128::from(*b))),
        (Value::Int(a), Value::Uint(b)) => Some(i128::from(*a).cmp(&i128::from(*b))),
        (Value::Float(a), Value::Float(b)) => a.partial_cmp(b),
        (Value::String(a), Value::String(b)) => Some(a.cmp(b)),
        (Value::Bytes(a), Value::Bytes(b)) => Some(a.cmp(b)),
        _ => None,
    }
}
Type Pairs Handled:
- Same-type integer comparisons (Uint/Uint, Int/Int): Direct native comparison using Rust's cmp method
- Cross-type integer comparisons (Uint/Int, Int/Uint): Uses i128 coercion to handle the full range of both types without overflow
- Float comparisons (Float/Float): Uses partial_cmp, which returns Some(Ordering) for normal values and infinities, None for NaN
- String comparisons (String/String): Lexicographic comparison
- Bytes comparisons (Bytes/Bytes): Lexicographic byte-by-byte comparison
- Incompatible type pairs (including Float vs Int/Uint): Returns None (evaluates to false in operator functions)
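To make the list above concrete, the following is a self-contained sketch that can be compiled on its own. The Value enum here is a minimal stand-in that copies the variants listed in this article, and describe is a hypothetical helper added only for illustration; neither is claimed to match the crate's exact definitions or derives.

```rust
use std::cmp::Ordering;

// Minimal stand-in for the crate's Value enum (assumption: same five variants).
#[derive(Debug, Clone, PartialEq)]
pub enum Value {
    Uint(u64),
    Int(i64),
    Float(f64),
    Bytes(Vec<u8>),
    String(String),
}

// Same shape as the helper shown above.
pub fn compare_values(left: &Value, right: &Value) -> Option<Ordering> {
    match (left, right) {
        (Value::Uint(a), Value::Uint(b)) => Some(a.cmp(b)),
        (Value::Int(a), Value::Int(b)) => Some(a.cmp(b)),
        (Value::Uint(a), Value::Int(b)) => Some(i128::from(*a).cmp(&i128::from(*b))),
        (Value::Int(a), Value::Uint(b)) => Some(i128::from(*a).cmp(&i128::from(*b))),
        (Value::Float(a), Value::Float(b)) => a.partial_cmp(b),
        (Value::String(a), Value::String(b)) => Some(a.cmp(b)),
        (Value::Bytes(a), Value::Bytes(b)) => Some(a.cmp(b)),
        _ => None,
    }
}

// Hypothetical helper: turns the intermediate result into a readable label.
pub fn describe(left: &Value, right: &Value) -> &'static str {
    match compare_values(left, right) {
        Some(Ordering::Less) => "less",
        Some(Ordering::Equal) => "equal",
        Some(Ordering::Greater) => "greater",
        None => "unordered or incompatible",
    }
}

// Sample calls, one per kind of type pair:
//   describe(&Value::Uint(u64::MAX), &Value::Int(-1))                 is "greater"
//   describe(&Value::String("Z".into()), &Value::String("a".into()))  is "less" (byte order, so case matters)
//   describe(&Value::Bytes(vec![1, 2]), &Value::Bytes(vec![1, 2, 0])) is "less" (a strict prefix sorts first)
//   describe(&Value::Float(f64::NAN), &Value::Float(1.0))             is "unordered or incompatible"
//   describe(&Value::Float(1.0), &Value::Uint(1))                     is "unordered or incompatible"
```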
Comparison Operator Functions#
The implementation includes four operator functions that delegate to compare_values:
pub fn apply_less_than(left: &Value, right: &Value) -> bool {
    compare_values(left, right) == Some(Ordering::Less)
}

pub fn apply_greater_than(left: &Value, right: &Value) -> bool {
    compare_values(left, right) == Some(Ordering::Greater)
}

pub fn apply_less_equal(left: &Value, right: &Value) -> bool {
    matches!(
        compare_values(left, right),
        Some(Ordering::Less | Ordering::Equal)
    )
}

pub fn apply_greater_equal(left: &Value, right: &Value) -> bool {
    matches!(
        compare_values(left, right),
        Some(Ordering::Greater | Ordering::Equal)
    )
}
Each function transforms the Option<Ordering> result into operator-specific boolean semantics:
- apply_less_than: Returns true if the ordering is Less, false otherwise
- apply_greater_than: Returns true if the ordering is Greater, false otherwise
- apply_less_equal: Returns true if the ordering is Less or Equal, false for Greater or None
- apply_greater_equal: Returns true if the ordering is Greater or Equal, false for Less or None
Incompatible type pairs (returning None from compare_values) evaluate to false in all operators.
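One consequence of mapping None to false is that <= cannot be written as the negation of >, because both must be false for unordered operands. The sketch below illustrates this on plain f64 values; the function names is_less_equal, is_greater, and float_le_and_gt are hypothetical and exist only for this example.

```rust
use std::cmp::Ordering;

/// `<=` on the intermediate result: true only for Less or Equal.
pub fn is_less_equal(ord: Option<Ordering>) -> bool {
    matches!(ord, Some(Ordering::Less | Ordering::Equal))
}

/// `>` on the intermediate result: true only for Greater.
pub fn is_greater(ord: Option<Ordering>) -> bool {
    ord == Some(Ordering::Greater)
}

/// Returns (a <= b, a > b), both derived from a single partial_cmp call.
pub fn float_le_and_gt(a: f64, b: f64) -> (bool, bool) {
    let ord = a.partial_cmp(&b);
    (is_less_equal(ord), is_greater(ord))
}

// For ordered operands exactly one of the two is true:
//   float_le_and_gt(1.0, 2.0) is (true, false)
// For NaN neither is true, so `!is_greater(..)` would be the wrong
// implementation of `<=`:
//   float_le_and_gt(f64::NAN, 1.0) is (false, false)
```

This is why each operator function matches on the Option<Ordering> directly instead of negating its counterpart.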
Cross-Type Integer Coercion#
The core technique of the pattern is widening both operands to i128 for cross-type integer comparisons. This is necessary because neither 64-bit type can represent the other's full range:
- u64 range: 0 to 18,446,744,073,709,551,615 (0 to 2^64 - 1)
- i64 range: -9,223,372,036,854,775,808 to 9,223,372,036,854,775,807 (-2^63 to 2^63 - 1)
- i128 range: Can represent both full ranges without loss or overflow
The pattern correctly handles the edge case where Value::Uint(u64::MAX) and Value::Int(-1) have identical bit patterns (both are 0xFFFFFFFFFFFFFFFF in two's complement representation) but represent vastly different mathematical values. The i128 coercion ensures they correctly evaluate as unequal: i128::from(u64::MAX) is 18,446,744,073,709,551,615, while i128::from(-1_i64) is -1.
The existing apply_equal function already implements this i128 coercion pattern, demonstrating the approach's viability:
// Cross-type integer coercion (safe via i128 to avoid overflow)
(Value::Uint(a), Value::Int(b)) => i128::from(*a) == i128::from(*b),
(Value::Int(a), Value::Uint(b)) => i128::from(*a) == i128::from(*b),
The compare_values helper extends this pattern from equality comparisons to ordering comparisons.
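The difference between widening and a plain cast can be seen in a small sketch. Both function names below are hypothetical; cmp_by_cast is deliberately wrong and is included only to show the failure mode that the i128 approach avoids.

```rust
use std::cmp::Ordering;

/// Deliberately buggy: `as i64` reinterprets the u64 bits, so any value
/// above i64::MAX wraps to a negative number before the comparison.
pub fn cmp_by_cast(a: u64, b: i64) -> Ordering {
    (a as i64).cmp(&b)
}

/// Widening both sides to i128 is lossless, so the mathematical order
/// of the two values is preserved.
pub fn cmp_by_widening(a: u64, b: i64) -> Ordering {
    i128::from(a).cmp(&i128::from(b))
}

// cmp_by_cast(u64::MAX, -1)     is Equal   (wrong: identical bit patterns)
// cmp_by_widening(u64::MAX, -1) is Greater (correct)
// cmp_by_cast(1 << 63, 0)       is Less    (wrong: 2^63 wraps to i64::MIN)
// cmp_by_widening(1 << 63, 0)   is Greater (correct)
```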
AST and Parser Integration#
The Operator enum in src/parser/ast.rs includes four comparison operator variants:
pub enum Operator {
    Equal,
    NotEqual,
    LessThan,
    GreaterThan,
    LessEqual,
    GreaterEqual,
    BitwiseAnd,
    BitwiseAndMask(u64),
}
The parser in src/parser/grammar.rs recognizes comparison operator syntax with proper token ordering. Longer operators (<=, >=) are parsed before shorter prefixes (<, >) to prevent premature matching. This ordering requirement is critical: if < is parsed before <=, the input <= would incorrectly match as < followed by =.
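The ordering requirement can be illustrated with a small standalone sketch. It uses plain string prefix matching rather than the crate's actual parser code, and the names CmpOp and parse_cmp_op are hypothetical.

```rust
/// Hypothetical stand-in for the four comparison variants of `Operator`.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum CmpOp {
    LessThan,
    GreaterThan,
    LessEqual,
    GreaterEqual,
}

/// Tries to read a comparison operator at the start of `input`.
/// Returns the operator and the remaining input, or None if no operator matches.
pub fn parse_cmp_op(input: &str) -> Option<(CmpOp, &str)> {
    // Two-character tokens are listed first so that "<=" is never
    // consumed as "<" with a stray "=" left in the remaining input.
    const TOKENS: [(&str, CmpOp); 4] = [
        ("<=", CmpOp::LessEqual),
        (">=", CmpOp::GreaterEqual),
        ("<", CmpOp::LessThan),
        (">", CmpOp::GreaterThan),
    ];
    TOKENS
        .iter()
        .find_map(|(token, op)| input.strip_prefix(*token).map(|rest| (*op, rest)))
}

// parse_cmp_op("<=65535") is Some((CmpOp::LessEqual, "65535"))
// parse_cmp_op("<100")    is Some((CmpOp::LessThan, "100"))
// parse_cmp_op("=5")      is None
```

If the single-character entries were moved to the front of the table, the first call would instead yield LessThan with "=65535" as the remainder, which is the premature match described above.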
Operator Dispatch#
The apply_operator function in src/evaluator/operators.rs dispatches to all operator implementations including comparison operators:
pub fn apply_operator(operator: &Operator, left: &Value, right: &Value) -> bool {
    match operator {
        Operator::Equal => apply_equal(left, right),
        Operator::NotEqual => apply_not_equal(left, right),
        Operator::LessThan => apply_less_than(left, right),
        Operator::GreaterThan => apply_greater_than(left, right),
        Operator::LessEqual => apply_less_equal(left, right),
        Operator::GreaterEqual => apply_greater_equal(left, right),
        Operator::BitwiseAnd => apply_bitwise_and(left, right),
        Operator::BitwiseAndMask(mask) => { /* inline logic */ },
    }
}
The exhaustive match ensures compile-time verification that all Operator variants are handled. When new operators are added to the enum, the compiler forces updates to this dispatch function.
Value Type System#
The Value enum in src/parser/ast.rs defines five variants:
pub enum Value {
    Uint(u64),
    Int(i64),
    Float(f64),
    Bytes(Vec<u8>),
    String(String),
}
The Value enum derives PartialEq but not Eq due to IEEE 754 floating-point NaN semantics (NaN != NaN).
The pattern handles all meaningful comparison combinations. Same-type comparisons use native Rust ordering. Cross-type integer comparisons require i128 coercion to ensure mathematical correctness. Float comparisons use partial_cmp which returns Option<Ordering> to handle NaN. String and byte comparisons use lexicographic ordering. Incompatible type combinations (such as Uint vs String, or Float vs Int) return None from compare_values, which operators interpret as false.
Usage Examples#
Magic Rule Patterns#
Comparison operators enable common magic file patterns:
Version Detection:
0 long >0x00020000 PDF document, version 2.x
Size Validation:
8 long <100 Small file marker
Range Checking:
0 long >=1000 Large file header
0 long <=65535 Valid 16-bit value
These patterns are fundamental to real-world magic files. The magic database shipped with the file command makes extensive use of comparison operators for format version detection, size constraints, and range-based matching.
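As a rough illustration of how such a rule could be evaluated, the sketch below reads a 4-byte value at the rule's offset and applies a comparison to it. Everything here is an assumption made for the example: the names CmpOp, read_u32_le, and rule_matches are hypothetical, the field is treated as unsigned little-endian for simplicity, and a failed read is assumed to mean "no match". The crate's real evaluator may differ on each of these points.

```rust
use std::cmp::Ordering;

/// Hypothetical stand-in for the four comparison variants of `Operator`.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum CmpOp {
    LessThan,
    GreaterThan,
    LessEqual,
    GreaterEqual,
}

/// Reads four bytes at `offset` as an unsigned little-endian integer.
/// Returns None when the buffer is too short or the offset overflows.
pub fn read_u32_le(buf: &[u8], offset: usize) -> Option<u64> {
    let end = offset.checked_add(4)?;
    let s = buf.get(offset..end)?;
    Some(u64::from(u32::from_le_bytes([s[0], s[1], s[2], s[3]])))
}

/// Evaluates a rule of the form `<offset> long <op><operand>` against `buf`.
pub fn rule_matches(buf: &[u8], offset: usize, op: CmpOp, operand: u64) -> bool {
    let value = match read_u32_le(buf, offset) {
        Some(v) => v,
        None => return false, // assumption: an unreadable field never matches
    };
    let ord = value.cmp(&operand);
    match op {
        CmpOp::LessThan => ord == Ordering::Less,
        CmpOp::GreaterThan => ord == Ordering::Greater,
        CmpOp::LessEqual => ord != Ordering::Greater,
        CmpOp::GreaterEqual => ord != Ordering::Less,
    }
}

// With buf = [0x00, 0x00, 0x03, 0x00] the field decodes to 0x00030000, so
// rule_matches(&buf, 0, CmpOp::GreaterThan, 0x0002_0000) is true.
```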
Type Compatibility Table#
| Left Type | Right Type | Result | Behavior |
|---|---|---|---|
| Uint | Uint | Some(Ordering) | Direct native comparison |
| Int | Int | Some(Ordering) | Direct native comparison |
| Uint | Int | Some(Ordering) | i128 coercion prevents overflow |
| Int | Uint | Some(Ordering) | i128 coercion prevents overflow |
| Float | Float | Some(Ordering) or None | partial_cmp: normal values and infinities return Some, NaN returns None |
| String | String | Some(Ordering) | Lexicographic comparison |
| Bytes | Bytes | Some(Ordering) | Lexicographic byte-by-byte |
| Float | Int/Uint | None | Incompatible (no cross-type coercion) |
| Mixed incompatible | Mixed incompatible | None | Evaluates to false |
IEEE 754 Float Comparison Semantics#
Float-to-float comparisons follow IEEE 754 ordering rules via Rust's partial_cmp:
- Normal finite values: Standard numerical ordering (1.0 < 2.0)
- Infinities: -∞ < all finite values < +∞; infinities equal themselves
- NaN (Not a Number): Unordered with respect to all values (including itself). partial_cmp returns None, causing all comparison operators to return false
- Subnormal numbers: Handled correctly by partial_cmp
Float Comparison Examples:
// Normal float comparisons
compare_values(&Value::Float(1.0), &Value::Float(2.0)) // Some(Less)
compare_values(&Value::Float(3.14), &Value::Float(2.71)) // Some(Greater)
// Infinity comparisons
compare_values(&Value::Float(1.0), &Value::Float(f64::INFINITY)) // Some(Less)
compare_values(&Value::Float(f64::NEG_INFINITY), &Value::Float(1.0)) // Some(Less)
// NaN returns None, making all comparison operators return false
compare_values(&Value::Float(f64::NAN), &Value::Float(1.0)) // None
apply_less_than(&Value::Float(f64::NAN), &Value::Float(1.0)) // false
apply_greater_than(&Value::Float(f64::NAN), &Value::Float(1.0)) // false
// Float vs Int/Uint: incompatible types
compare_values(&Value::Float(1.0), &Value::Uint(1)) // None
compare_values(&Value::Int(1), &Value::Float(1.0)) // None
Note: Float equality operators use epsilon-aware comparison (|a - b| <= f64::EPSILON) rather than partial_cmp. See the Float Epsilon Equality Pattern for details on the distinction between ordering and equality semantics for floats.
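The distinction in the note can be shown with two small functions. This is a sketch that applies the formula quoted above literally; float_eq and float_ordering are hypothetical names, and the crate's equality function may treat special values differently.

```rust
use std::cmp::Ordering;

/// Epsilon-aware equality, written directly from the formula in the note.
pub fn float_eq(a: f64, b: f64) -> bool {
    (a - b).abs() <= f64::EPSILON
}

/// Ordering as used by the comparison operators.
pub fn float_ordering(a: f64, b: f64) -> Option<Ordering> {
    a.partial_cmp(&b)
}

// 1.0 + f64::EPSILON is the next representable value after 1.0, so:
//   float_eq(1.0, 1.0 + f64::EPSILON)       is true
//   float_ordering(1.0, 1.0 + f64::EPSILON) is Some(Less)
// Equality and ordering can therefore disagree for values that differ
// by at most f64::EPSILON.
```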
Architectural Pattern and Reusability#
The pattern demonstrates three architectural principles:
- Shared Helper Function: Consolidates type-checking and coercion logic in a helper returning an intermediate type (Option<Ordering>)
- Operator Delegation: Individual operators transform the helper's result into operator-specific boolean semantics
- Type System Enforcement: Exhaustive match expressions ensure compile-time completeness when new types are added
This pattern scales beyond comparison operators. Future comparison-like operators can follow the same structure: a shared helper handles type pair logic while individual operator functions implement operator-specific semantics. The pattern separates concerns cleanly: type compatibility logic lives in one function, while operator-specific logic lives in dedicated functions.
Design Rationale#
A naive implementation would duplicate the match logic across apply_less_than, apply_greater_than, apply_less_equal, and apply_greater_equal, with each function independently handling every type pair using identical coercion logic. The compare_values helper consolidates this logic into a single source of truth.
Benefits of the shared helper approach:
- Maintainability: Type pair handling exists in one place. Adding support for new Value types requires updating only compare_values.
- Consistency: All comparison operators use identical coercion and type-matching logic, eliminating subtle behavioral differences.
- Testability: The shared function can be tested independently with comprehensive type pair coverage. Operator tests focus on ordering semantics rather than type handling.
- Code size: Four short operator functions replace four complex functions with duplicated match expressions.
The existing codebase demonstrates this pattern with equality operators, where apply_not_equal delegates to apply_equal for consistency:
pub fn apply_not_equal(left: &Value, right: &Value) -> bool {
!apply_equal(left, right)
}
This delegation ensures apply_not_equal always returns the logical negation of apply_equal, preventing behavioral inconsistencies.
Implementation Requirements#
File Changes#
| File | Purpose | Required Changes |
|---|---|---|
| src/evaluator/operators.rs | Operator evaluation logic | Add compare_values helper and four apply_* functions; extend apply_operator dispatch |
| src/parser/ast.rs | AST type definitions | Add four new Operator enum variants |
| src/parser/grammar.rs | Parser implementation | Add comparison operator parsing with proper token ordering |
| src/evaluator/strength.rs | Rule strength scoring | Assign strength scores to new comparison operators |
| build.rs | Build-time rule compilation | Update operator serialization logic |
| src/build_helpers.rs | Build helper utilities | Update operator serialization logic |
| tests/property_tests.rs | Property-based testing | Extend operator generation strategy |
Implementation Steps#
The implementation followed these steps:
- Extended Operator enum in src/parser/ast.rs with LessThan, GreaterThan, LessEqual, GreaterEqual variants
- Added parsing logic in src/parser/grammar.rs with correct token ordering (<= before <, >= before >)
- Implemented compare_values helper in src/evaluator/operators.rs with all type pair handling
- Implemented four apply_* functions that delegate to compare_values
- Extended apply_operator dispatch to route new operator variants
- Added operator serialization in both build.rs and src/build_helpers.rs
- Added comprehensive tests covering all type pairs, edge cases, and integration scenarios
- Updated documentation including API docs, mdbook, and examples
The Enum Extension And Exhaustive Match Synchronization pattern required synchronized updates across all files that match on the Operator enum. The Rust compiler's exhaustive match checking enforces this synchronization at compile time.
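The sketch below shows what one such synchronized match site might look like. The serialize_operator function and its output format are hypothetical and are not taken from build.rs or src/build_helpers.rs; the point is only that a match without a wildcard arm stops compiling when a variant is added, which forces the update.

```rust
/// Copy of the Operator enum as listed in this article.
#[derive(Debug, Clone, PartialEq)]
pub enum Operator {
    Equal,
    NotEqual,
    LessThan,
    GreaterThan,
    LessEqual,
    GreaterEqual,
    BitwiseAnd,
    BitwiseAndMask(u64),
}

/// Hypothetical serializer that emits Rust source text for an operator.
/// There is intentionally no `_ =>` arm: adding a variant to `Operator`
/// makes this function fail to compile until the new case is handled.
pub fn serialize_operator(op: &Operator) -> String {
    match op {
        Operator::Equal => "Operator::Equal".to_string(),
        Operator::NotEqual => "Operator::NotEqual".to_string(),
        Operator::LessThan => "Operator::LessThan".to_string(),
        Operator::GreaterThan => "Operator::GreaterThan".to_string(),
        Operator::LessEqual => "Operator::LessEqual".to_string(),
        Operator::GreaterEqual => "Operator::GreaterEqual".to_string(),
        Operator::BitwiseAnd => "Operator::BitwiseAnd".to_string(),
        Operator::BitwiseAndMask(mask) => format!("Operator::BitwiseAndMask({})", mask),
    }
}
```

A wildcard arm would silence the compiler and let a new variant fall through unnoticed, so match sites that need to stay synchronized are better written without one.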
Testing Requirements#
The implementation includes comprehensive test coverage:
- Basic numeric tests: Same values, different values, zero, extreme values (u64::MAX, i64::MIN, i64::MAX)
- Cross-type integer tests: Uint vs Int comparisons with edge cases, especially u64::MAX vs Int(-1)
- Float tests: Normal values, infinities (f64::INFINITY, f64::NEG_INFINITY), NaN handling, epsilon edge cases
- String tests: Lexicographic ordering, case sensitivity, empty strings, Unicode handling
- Bytes tests: Lexicographic byte-by-byte comparison, empty byte sequences
- Incompatible type tests: Verify false return for mismatched types (e.g., Uint vs String, Float vs Int)
- Edge cases: Boundary values, mathematical edge cases, type coercion edge cases, NaN unordered semantics
- Operator consistency tests: Verify compare_values results match operator semantics
- Dispatch tests: Verify apply_operator correctly routes to comparison operator functions
- Integration tests: Real magic rule patterns with version detection, size validation, range checking
The tests maintain the high coverage standard established in the codebase with over 1,400 lines of tests for equality and bitwise operators. Implementation details are in src/evaluator/operators/comparison.rs.
Related Patterns and Topics#
- Cross-Type Integer Coercion Pattern: The i128-based technique for safely comparing u64 and i64 values without overflow, handling edge cases like u64::MAX vs Int(-1) that have identical bit patterns but different mathematical values.
- Enum Extension And Exhaustive Match Synchronization: The architectural requirement for synchronized updates across multiple files when extending core enums. The Rust compiler's exhaustive match checking enforces this synchronization at compile time.
- Parser Token Ordering: The critical requirement that longer operators must be parsed before shorter prefixes to prevent premature matching. For comparison operators, <= and >= must be parsed before < and >.
- Magic File Compatibility: The broader goal of achieving compatibility with system magic databases. Comparison operators unlock approximately 40% more magic file compatibility.
- Operator Evaluation Pipeline: The three-stage evaluation process in libmagic-rs consisting of offset resolution, type interpretation, and operator application.
References#
- Issue #34: Parser: implement comparison operators (<, >, <=, >=) - Feature request created February 14, 2026
- PR #104: feat(parser): implement comparison operators - Implementation pull request (merged)
- PR #105: docs: updates for PR #104 - Documentation updates (merged March 1, 2026)
- Issue #62: refactor: pre-create evaluator submodules for v0.2.0 features - References comparison.rs module for this feature
- Issue #53: Epic #24 for Operator Completeness - Tracks implementation of missing magic file operators