Type: Topic
Status: Published
Created: Mar 1, 2026
Updated: Mar 7, 2026
Created by: Dosu Bot
Updated by: Dosu Bot

Comparison Operators And Compare_Values Helper Pattern#

Overview#

The Comparison Operators And Compare_Values Helper Pattern is a reusable architectural pattern for implementing comparison operators in libmagic-rs. A shared compare_values helper function handles every Value type pair, including cross-type coercion. The four comparison operator implementations (<, >, <=, >=) delegate to this single source of truth for type-pair matching and ordering logic instead of each duplicating it.

The pattern addresses Issue #34, which requests comparison operators to improve magic file compatibility. According to the issue, comparison operators unlock approximately 40% more magic file compatibility and are critical for the version checks, size validation, and range matching that magic rules frequently perform.

The implementation builds on the foundation of cross-type integer coercion using i128, which safely compares u64 and i64 values. The compare_values helper extends this pattern to return Option<Ordering> instead of bool, enabling all four comparison operators to share identical type-handling logic while implementing operator-specific ordering semantics.

Implementation Status#

The comparison operators and the compare_values helper function were implemented in PR #104, which has been merged. The implementation includes:

  • All four comparison operators (LessThan, GreaterThan, LessEqual, GreaterEqual) added to the Operator enum
  • The shared compare_values helper function returning Option<Ordering>
  • Parser support for comparison operator syntax with proper token ordering
  • Comprehensive test coverage for all type pairs and edge cases

This article documents the implemented pattern as a reference for the architectural decisions and design approach.

The compare_values Helper Function#

The compare_values function returns Option<Ordering> and handles all Value type pairs:

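use std::cmp::Ordering;

pub fn compare_values(left: &Value, right: &Value) -> Option<Ordering> {
    match (left, right) {
        // Same-type integers: native ordering
        (Value::Uint(a), Value::Uint(b)) => Some(a.cmp(b)),
        (Value::Int(a), Value::Int(b)) => Some(a.cmp(b)),
        // Cross-type integers: widen both sides to i128 so neither range overflows
        (Value::Uint(a), Value::Int(b)) => Some(i128::from(*a).cmp(&i128::from(*b))),
        (Value::Int(a), Value::Uint(b)) => Some(i128::from(*a).cmp(&i128::from(*b))),
        // Floats: partial_cmp yields None when either side is NaN
        (Value::Float(a), Value::Float(b)) => a.partial_cmp(b),
        // Strings and bytes: lexicographic ordering
        (Value::String(a), Value::String(b)) => Some(a.cmp(b)),
        (Value::Bytes(a), Value::Bytes(b)) => Some(a.cmp(b)),
        // Every other pairing (including Float vs Int/Uint) is incomparable
        _ => None,
    }
}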

Type Pairs Handled:

  • Same-type integer comparisons (Uint/Uint, Int/Int): Direct native comparison using Rust's cmp method
  • Cross-type integer comparisons (Uint/Int, Int/Uint): Uses i128 coercion to handle the full range of both types without overflow
  • Float comparisons (Float/Float): Uses partial_cmp which returns Some(Ordering) for normal values and infinities, None for NaN
  • String comparisons (String/String): Lexicographic comparison
  • Bytes comparisons (Bytes/Bytes): Lexicographic byte-by-byte comparison
  • Incompatible type pairs (including Float vs Int/Uint): Returns None (evaluates to false in operator functions)
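
The type-pair rules listed above can be exercised with a small standalone sketch. The Value enum and compare_values below are local copies of the definitions shown in this article so that the snippet compiles on its own; the derives on Value beyond PartialEq and the type_pair_examples function are assumptions made for illustration, not part of libmagic-rs.

```rust
use std::cmp::Ordering;

// Local copy of the article's Value enum (Debug and Clone are assumed derives).
#[derive(Debug, Clone, PartialEq)]
pub enum Value {
    Uint(u64),
    Int(i64),
    Float(f64),
    Bytes(Vec<u8>),
    String(String),
}

// Local copy of the helper as shown in this article.
pub fn compare_values(left: &Value, right: &Value) -> Option<Ordering> {
    match (left, right) {
        (Value::Uint(a), Value::Uint(b)) => Some(a.cmp(b)),
        (Value::Int(a), Value::Int(b)) => Some(a.cmp(b)),
        (Value::Uint(a), Value::Int(b)) => Some(i128::from(*a).cmp(&i128::from(*b))),
        (Value::Int(a), Value::Uint(b)) => Some(i128::from(*a).cmp(&i128::from(*b))),
        (Value::Float(a), Value::Float(b)) => a.partial_cmp(b),
        (Value::String(a), Value::String(b)) => Some(a.cmp(b)),
        (Value::Bytes(a), Value::Bytes(b)) => Some(a.cmp(b)),
        _ => None,
    }
}

/// Hypothetical demo: one representative call per kind of type pair.
pub fn type_pair_examples() -> Vec<Option<Ordering>> {
    vec![
        // Same-type integers
        compare_values(&Value::Uint(1), &Value::Uint(2)),
        // Cross-type integers: 0 is greater than -1 once both are widened
        compare_values(&Value::Uint(0), &Value::Int(-1)),
        // NaN is unordered
        compare_values(&Value::Float(f64::NAN), &Value::Float(0.0)),
        // Lexicographic strings
        compare_values(&Value::String("abc".into()), &Value::String("abd".into())),
        // Lexicographic bytes: a longer sequence with an equal prefix sorts later
        compare_values(&Value::Bytes(vec![1, 2]), &Value::Bytes(vec![1])),
        // Incompatible pair
        compare_values(&Value::Uint(1), &Value::String("1".into())),
    ]
}
```

Returning Option<Ordering> rather than bool keeps "unordered" and "incompatible" distinguishable from a real ordering until an operator function decides how to interpret them.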

Comparison Operator Functions#

The implementation includes four operator functions that delegate to compare_values:

pub fn apply_less_than(left: &Value, right: &Value) -> bool {
    compare_values(left, right) == Some(Ordering::Less)
}

pub fn apply_greater_than(left: &Value, right: &Value) -> bool {
    compare_values(left, right) == Some(Ordering::Greater)
}

pub fn apply_less_equal(left: &Value, right: &Value) -> bool {
    matches!(
        compare_values(left, right),
        Some(Ordering::Less | Ordering::Equal)
    )
}

pub fn apply_greater_equal(left: &Value, right: &Value) -> bool {
    matches!(
        compare_values(left, right),
        Some(Ordering::Greater | Ordering::Equal)
    )
}

Each function transforms the Option<Ordering> result into operator-specific boolean semantics:

  • apply_less_than: Returns true if ordering is Less, false otherwise
  • apply_greater_than: Returns true if ordering is Greater, false otherwise
  • apply_less_equal: Returns true if ordering is Less or Equal, false otherwise (including None)
  • apply_greater_equal: Returns true if ordering is Greater or Equal, false otherwise (including None)

Incompatible type pairs (returning None from compare_values) evaluate to false in all operators.
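
One consequence of mapping None to false everywhere is that the operators are not simple negations of each other for unordered inputs. The sketch below illustrates this with hypothetical helpers (is_less, is_greater, is_less_equal, is_greater_equal, evaluate_all) that mirror the four function bodies above but take an already-computed Option<Ordering>, so no Value type is needed; these names are illustrative only.

```rust
use std::cmp::Ordering;

// Hypothetical helpers mirroring the four operator bodies shown above.
pub fn is_less(ord: Option<Ordering>) -> bool {
    ord == Some(Ordering::Less)
}

pub fn is_greater(ord: Option<Ordering>) -> bool {
    ord == Some(Ordering::Greater)
}

pub fn is_less_equal(ord: Option<Ordering>) -> bool {
    matches!(ord, Some(Ordering::Less | Ordering::Equal))
}

pub fn is_greater_equal(ord: Option<Ordering>) -> bool {
    matches!(ord, Some(Ordering::Greater | Ordering::Equal))
}

/// Results in the order: <, >, <=, >=.
pub fn evaluate_all(ord: Option<Ordering>) -> [bool; 4] {
    [
        is_less(ord),
        is_greater(ord),
        is_less_equal(ord),
        is_greater_equal(ord),
    ]
}
```

For an unordered pair such as NaN against 1.0, evaluate_all yields false in all four positions, so "<=" is not equivalent to "not >" in this scheme.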

Cross-Type Integer Coercion#

The key technique in the pattern is using i128 for cross-type integer comparisons. This approach is necessary because:

  • u64 range: 0 to 18,446,744,073,709,551,615 (0 to 2^64 - 1)
  • i64 range: -9,223,372,036,854,775,808 to 9,223,372,036,854,775,807 (-2^63 to 2^63 - 1)
  • i128 range: Can represent both full ranges without loss or overflow

The pattern correctly handles the edge case where Value::Uint(u64::MAX) and Value::Int(-1) have identical bit patterns (both are 0xFFFFFFFFFFFFFFFF in two's complement representation) but represent vastly different mathematical values. The i128 coercion ensures they correctly evaluate as unequal: i128::from(u64::MAX) is 18,446,744,073,709,551,615, while i128::from(-1_i64) is -1.

The existing apply_equal function already implements this i128 coercion pattern, demonstrating the approach's viability:

// Cross-type integer coercion (safe via i128 to avoid overflow)
(Value::Uint(a), Value::Int(b)) => i128::from(*a) == i128::from(*b),
(Value::Int(a), Value::Uint(b)) => i128::from(*a) == i128::from(*b),

The compare_values helper extends this pattern from equality comparisons to ordering comparisons.
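
To make the u64::MAX versus -1 edge case concrete, the following sketch contrasts the i128 widening with a deliberately incorrect variant that reinterprets the signed value with an `as` cast. Both function names (cmp_u64_i64 and cmp_u64_i64_lossy) are hypothetical and exist only for this comparison.

```rust
use std::cmp::Ordering;

/// Cross-type comparison via i128 widening, as described above.
/// Every u64 and every i64 fits in i128, so no value is altered.
pub fn cmp_u64_i64(a: u64, b: i64) -> Ordering {
    i128::from(a).cmp(&i128::from(b))
}

/// Deliberately wrong variant, for contrast only: `b as u64` reinterprets
/// the bits, so -1 becomes u64::MAX and the two values look equal.
pub fn cmp_u64_i64_lossy(a: u64, b: i64) -> Ordering {
    a.cmp(&(b as u64))
}
```

With the widening version, u64::MAX compares as Greater than -1; with the cast-based version, the same inputs compare as Equal, which is the bug the pattern avoids.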

AST and Parser Integration#

The Operator enum in src/parser/ast.rs includes four comparison operator variants:

pub enum Operator {
    Equal,
    NotEqual,
    LessThan,
    GreaterThan,
    LessEqual,
    GreaterEqual,
    BitwiseAnd,
    BitwiseAndMask(u64),
}

The parser in src/parser/grammar.rs recognizes comparison operator syntax with proper token ordering. Longer operators (<=, >=) are parsed before shorter prefixes (<, >) to prevent premature matching. This ordering requirement is critical: if < is parsed before <=, the input <= would incorrectly match as < followed by =.
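
The ordering requirement can be illustrated with a minimal hand-written recognizer. This is not the code in src/parser/grammar.rs; CmpOp and parse_cmp_op are hypothetical names, and the sketch only assumes that candidates are tried in a fixed order with the first match winning.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum CmpOp {
    LessThan,
    GreaterThan,
    LessEqual,
    GreaterEqual,
}

/// Hypothetical recognizer: returns the operator and the remaining input.
/// Two-character tokens are listed first so "<=" is never consumed as "<"
/// with a stray "=" left behind.
pub fn parse_cmp_op(input: &str) -> Option<(CmpOp, &str)> {
    const CANDIDATES: [(&str, CmpOp); 4] = [
        ("<=", CmpOp::LessEqual),
        (">=", CmpOp::GreaterEqual),
        ("<", CmpOp::LessThan),
        (">", CmpOp::GreaterThan),
    ];
    CANDIDATES
        .iter()
        .find_map(|(token, op)| input.strip_prefix(*token).map(|rest| (*op, rest)))
}
```

If the one-character tokens were listed first, the input "<=100" would yield LessThan with "=100" as the remainder, which is exactly the premature match described above.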

Operator Dispatch#

The apply_operator function in src/evaluator/operators.rs dispatches to all operator implementations including comparison operators:

pub fn apply_operator(operator: &Operator, left: &Value, right: &Value) -> bool {
    match operator {
        Operator::Equal => apply_equal(left, right),
        Operator::NotEqual => apply_not_equal(left, right),
        Operator::LessThan => apply_less_than(left, right),
        Operator::GreaterThan => apply_greater_than(left, right),
        Operator::LessEqual => apply_less_equal(left, right),
        Operator::GreaterEqual => apply_greater_equal(left, right),
        Operator::BitwiseAnd => apply_bitwise_and(left, right),
        Operator::BitwiseAndMask(mask) => { /* inline logic */ },
    }
}

The exhaustive match ensures compile-time verification that all Operator variants are handled. When new operators are added to the enum, the compiler forces updates to this dispatch function.

Value Type System#

The Value enum in src/parser/ast.rs defines five variants:

pub enum Value {
    Uint(u64),
    Int(i64),
    Float(f64),
    Bytes(Vec<u8>),
    String(String),
}

The Value enum derives PartialEq but not Eq due to IEEE 754 floating-point NaN semantics (NaN != NaN).

The pattern handles all meaningful comparison combinations. Same-type comparisons use native Rust ordering. Cross-type integer comparisons require i128 coercion to ensure mathematical correctness. Float comparisons use partial_cmp which returns Option<Ordering> to handle NaN. String and byte comparisons use lexicographic ordering. Incompatible type combinations (such as Uint vs String, or Float vs Int) return None from compare_values, which operators interpret as false.
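
The PartialEq-without-Eq point can be checked with a short sketch. The enum below is a local copy of the Value definition above; Debug and Clone are assumed derives, and nan_equals_itself is a hypothetical helper used only to show the behavior.

```rust
// Local copy of the article's Value enum. Adding Eq to this derive list
// would fail to compile because f64 does not implement Eq.
#[derive(Debug, Clone, PartialEq)]
pub enum Value {
    Uint(u64),
    Int(i64),
    Float(f64),
    Bytes(Vec<u8>),
    String(String),
}

/// Hypothetical helper: compares a NaN-holding Value with a clone of itself.
/// The derived PartialEq forwards to f64's ==, under which NaN != NaN.
pub fn nan_equals_itself() -> bool {
    let a = Value::Float(f64::NAN);
    let b = a.clone();
    a == b
}
```

Because equality is not reflexive for Float(NaN), the type cannot honestly claim Eq, which is also why float ordering goes through partial_cmp rather than cmp.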

Usage Examples#

Magic Rule Patterns#

Comparison operators enable common magic file patterns:

Version Detection:

0 long >0x00020000 PDF document, version 2.x

Size Validation:

8 long <100 Small file marker

Range Checking:

0 long >=1000 Large file header
0 long <=65535 Valid 16-bit value

These patterns are fundamental to real-world magic files. The GNU file command's magic database extensively uses comparison operators for format version detection, size constraints, and range-based matching.
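
As a rough illustration of how such a rule might be evaluated, the sketch below reads a 4-byte value at an offset and applies a comparison. It is a simplification under stated assumptions: the value is read as little-endian and unsigned (the byte order and signedness that "long" has in libmagic-rs are not covered here), and LongRule, CmpOp, read_u32_le, and rule_matches are hypothetical names rather than libmagic-rs APIs.

```rust
use std::cmp::Ordering;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum CmpOp {
    LessThan,
    GreaterThan,
    LessEqual,
    GreaterEqual,
}

/// Hypothetical, simplified rule: offset, operator, operand.
pub struct LongRule {
    pub offset: usize,
    pub op: CmpOp,
    pub operand: u64,
}

/// Reads 4 bytes at `offset` as a little-endian unsigned value (assumption).
/// Returns None if the buffer is too short.
pub fn read_u32_le(buf: &[u8], offset: usize) -> Option<u64> {
    let end = offset.checked_add(4)?;
    let s = buf.get(offset..end)?;
    Some(u64::from(u32::from_le_bytes([s[0], s[1], s[2], s[3]])))
}

/// A rule matches only if the value can be read and the comparison holds.
pub fn rule_matches(rule: &LongRule, buf: &[u8]) -> bool {
    match read_u32_le(buf, rule.offset) {
        Some(value) => {
            let ord = value.cmp(&rule.operand);
            match rule.op {
                CmpOp::LessThan => ord == Ordering::Less,
                CmpOp::GreaterThan => ord == Ordering::Greater,
                CmpOp::LessEqual => ord != Ordering::Greater,
                CmpOp::GreaterEqual => ord != Ordering::Less,
            }
        }
        None => false,
    }
}
```

For instance, a rule equivalent to "0 long >0x00020000" matches a buffer whose first four bytes decode to 0x00030000 and does not match one that decodes to exactly 0x00020000.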

Type Compatibility Table#

| Left Type | Right Type | Result | Behavior |
|---|---|---|---|
| Uint | Uint | Some(Ordering) | Direct native comparison |
| Int | Int | Some(Ordering) | Direct native comparison |
| Uint | Int | Some(Ordering) | i128 coercion prevents overflow |
| Int | Uint | Some(Ordering) | i128 coercion prevents overflow |
| Float | Float | Some(Ordering) or None | partial_cmp: normal values and infinities return Some, NaN returns None |
| String | String | Some(Ordering) | Lexicographic comparison |
| Bytes | Bytes | Some(Ordering) | Lexicographic byte-by-byte |
| Float | Int/Uint | None | Incompatible (no cross-type coercion) |
| Mixed incompatible | Mixed incompatible | None | Evaluates to false |

IEEE 754 Float Comparison Semantics#

Float-to-float comparisons follow IEEE 754 ordering rules via Rust's partial_cmp:

  • Normal finite values: Standard numerical ordering (1.0 < 2.0)
  • Infinities: -∞ < all finite values < +∞; infinities equal themselves
  • NaN (Not a Number): Unordered with respect to all values (including itself). partial_cmp returns None, causing all comparison operators to return false
  • Subnormal numbers: Handled correctly by partial_cmp

Float Comparison Examples:

// Normal float comparisons
compare_values(&Value::Float(1.0), &Value::Float(2.0)) // Some(Less)
compare_values(&Value::Float(3.14), &Value::Float(2.71)) // Some(Greater)

// Infinity comparisons
compare_values(&Value::Float(1.0), &Value::Float(f64::INFINITY)) // Some(Less)
compare_values(&Value::Float(f64::NEG_INFINITY), &Value::Float(1.0)) // Some(Less)

// NaN returns None, making all comparison operators return false
compare_values(&Value::Float(f64::NAN), &Value::Float(1.0)) // None
apply_less_than(&Value::Float(f64::NAN), &Value::Float(1.0)) // false
apply_greater_than(&Value::Float(f64::NAN), &Value::Float(1.0)) // false

// Float vs Int/Uint: incompatible types
compare_values(&Value::Float(1.0), &Value::Uint(1)) // None
compare_values(&Value::Int(1), &Value::Float(1.0)) // None

Note: Float equality operators use epsilon-aware comparison (|a - b| <= f64::EPSILON) rather than partial_cmp. See the Float Epsilon Equality Pattern for details on the distinction between ordering and equality semantics for floats.
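
The difference between the two semantics shows up with ordinary rounding error. The sketch below assumes equality is computed exactly as the formula quoted in the note and ordering exactly as partial_cmp; float_eq and float_cmp are hypothetical stand-ins, not the libmagic-rs functions.

```rust
use std::cmp::Ordering;

/// Epsilon-aware equality, following the formula quoted in the note above.
pub fn float_eq(a: f64, b: f64) -> bool {
    (a - b).abs() <= f64::EPSILON
}

/// Ordering as compare_values performs it for Float/Float pairs.
pub fn float_cmp(a: f64, b: f64) -> Option<Ordering> {
    a.partial_cmp(&b)
}

/// Hypothetical demo: 0.1 + 0.2 is slightly above 0.3 in f64 arithmetic.
/// Returns (equal under epsilon, ordering under partial_cmp).
pub fn rounding_demo() -> (bool, Option<Ordering>) {
    let sum = 0.1_f64 + 0.2_f64;
    (float_eq(sum, 0.3), float_cmp(sum, 0.3))
}
```

Under these two definitions the same pair can be reported as equal and as strictly greater, so rules that mix equality and ordering tests on floats should not assume the results are mutually exclusive.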

Architectural Pattern and Reusability#

The pattern demonstrates three architectural principles:

  1. Shared Helper Function: Consolidates type-checking and coercion logic in a helper returning an intermediate type (Option<Ordering>)
  2. Operator Delegation: Individual operators transform the helper's result into operator-specific boolean semantics
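  3. Type System Enforcement: The exhaustive match in apply_operator ensures compile-time completeness when new Operator variants are added. Note that compare_values itself ends in a wildcard arm, so a new Value type falls through to None until an explicit arm is added for it.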

This pattern scales beyond comparison operators. Future comparison-like operators can follow the same structure: a shared helper handles type pair logic while individual operator functions implement operator-specific semantics. The pattern separates concerns cleanly: type compatibility logic lives in one function, while operator-specific logic lives in dedicated functions.

Design Rationale#

A straightforward implementation would duplicate the match logic across apply_less_than, apply_greater_than, apply_less_equal, and apply_greater_equal, with each function independently handling every type pair using identical coercion logic. The compare_values helper consolidates this logic into a single source of truth.

Benefits of the shared helper approach:

  • Maintainability: Type pair handling exists in one place. Adding support for new Value types requires updating only compare_values.
  • Consistency: All comparison operators use identical coercion and type-matching logic, eliminating subtle behavioral differences.
  • Testability: The shared function can be tested independently with comprehensive type pair coverage. Operator tests focus on ordering semantics rather than type handling.
  • Code size: Four short operator functions replace four complex functions with duplicated match expressions.

The existing codebase demonstrates this pattern with equality operators, where apply_not_equal delegates to apply_equal for consistency:

pub fn apply_not_equal(left: &Value, right: &Value) -> bool {
    !apply_equal(left, right)
}

This delegation ensures apply_not_equal always returns the logical negation of apply_equal, preventing behavioral inconsistencies.

Implementation Requirements#

File Changes#

| File | Purpose | Required Changes |
|---|---|---|
| src/evaluator/operators.rs | Operator evaluation logic | Add compare_values helper and four apply_* functions; extend apply_operator dispatch |
| src/parser/ast.rs | AST type definitions | Add four new Operator enum variants |
| src/parser/grammar.rs | Parser implementation | Add comparison operator parsing with proper token ordering |
| src/evaluator/strength.rs | Rule strength scoring | Assign strength scores to new comparison operators |
| build.rs | Build-time rule compilation | Update operator serialization logic |
| src/build_helpers.rs | Build helper utilities | Update operator serialization logic |
| tests/property_tests.rs | Property-based testing | Extend operator generation strategy |

Implementation Steps#

The implementation followed these steps:

  1. Extended Operator enum in src/parser/ast.rs with LessThan, GreaterThan, LessEqual, GreaterEqual variants
  2. Added parsing logic in src/parser/grammar.rs with correct token ordering (<= before <, >= before >)
  3. Implemented compare_values helper in src/evaluator/operators.rs with all type pair handling
  4. Implemented four apply_* functions that delegate to compare_values
  5. Extended apply_operator dispatch to route new operator variants
  6. Added operator serialization in both build.rs and src/build_helpers.rs
  7. Added comprehensive tests covering all type pairs, edge cases, and integration scenarios
  8. Updated documentation including API docs, mdbook, and examples

The Enum Extension And Exhaustive Match Synchronization pattern required synchronized updates across all files that match on the Operator enum. The Rust compiler's exhaustive match checking enforces this synchronization at compile time.
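
The synchronization effect can be sketched with a second match site over the same enum. The Operator enum below is a local copy of the one shown earlier (with assumed derives), and operator_symbol is a hypothetical function, not one of the match sites in libmagic-rs; the symbols it returns are illustrative choices.

```rust
// Local copy of the article's Operator enum (derives are assumed).
#[derive(Debug, Clone, PartialEq)]
pub enum Operator {
    Equal,
    NotEqual,
    LessThan,
    GreaterThan,
    LessEqual,
    GreaterEqual,
    BitwiseAnd,
    BitwiseAndMask(u64),
}

/// Hypothetical match site. There is no `_` arm, so adding a variant to
/// Operator makes this function fail to compile until the variant is handled.
pub fn operator_symbol(op: &Operator) -> String {
    match op {
        Operator::Equal => "=".to_string(),
        Operator::NotEqual => "!=".to_string(),
        Operator::LessThan => "<".to_string(),
        Operator::GreaterThan => ">".to_string(),
        Operator::LessEqual => "<=".to_string(),
        Operator::GreaterEqual => ">=".to_string(),
        Operator::BitwiseAnd => "&".to_string(),
        Operator::BitwiseAndMask(mask) => format!("&0x{:x}", mask),
    }
}
```

Every function written this way becomes a checklist item the compiler enforces, which is how the four new variants propagated to each file listed in the table above.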

Testing Requirements#

The implementation includes comprehensive test coverage:

  • Basic numeric tests: Same values, different values, zero, extreme values (u64::MAX, i64::MIN, i64::MAX)
  • Cross-type integer tests: Uint vs Int comparisons with edge cases, especially u64::MAX vs Int(-1)
  • Float tests: Normal values, infinities (f64::INFINITY, f64::NEG_INFINITY), NaN handling, epsilon edge cases
  • String tests: Lexicographic ordering, case sensitivity, empty strings, Unicode handling
  • Bytes tests: Lexicographic byte-by-byte comparison, empty byte sequences
  • Incompatible type tests: Verify false return for mismatched types (e.g., Uint vs String, Float vs Int)
  • Edge cases: Boundary values, mathematical edge cases, type coercion edge cases, NaN unordered semantics
  • Operator consistency tests: Verify compare_values results match operator semantics
  • Dispatch tests: Verify apply_operator correctly routes to comparison operator functions
  • Integration tests: Real magic rule patterns with version detection, size validation, range checking

The tests maintain the high coverage standard established in the codebase with over 1,400 lines of tests for equality and bitwise operators. Implementation details are in src/evaluator/operators/comparison.rs.
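
An operator consistency check of the kind listed above might look like the following sketch. It uses a trimmed local copy of Value (three variants only) together with copies of the helper and operator functions from this article; operators_consistent and edge_cases are hypothetical names, and the actual tests in the repository may be organized differently.

```rust
use std::cmp::Ordering;

// Trimmed local copy of Value, enough for numeric edge cases.
#[derive(Debug, Clone, PartialEq)]
pub enum Value {
    Uint(u64),
    Int(i64),
    Float(f64),
}

pub fn compare_values(left: &Value, right: &Value) -> Option<Ordering> {
    match (left, right) {
        (Value::Uint(a), Value::Uint(b)) => Some(a.cmp(b)),
        (Value::Int(a), Value::Int(b)) => Some(a.cmp(b)),
        (Value::Uint(a), Value::Int(b)) => Some(i128::from(*a).cmp(&i128::from(*b))),
        (Value::Int(a), Value::Uint(b)) => Some(i128::from(*a).cmp(&i128::from(*b))),
        (Value::Float(a), Value::Float(b)) => a.partial_cmp(b),
        _ => None,
    }
}

pub fn apply_less_than(l: &Value, r: &Value) -> bool {
    compare_values(l, r) == Some(Ordering::Less)
}
pub fn apply_greater_than(l: &Value, r: &Value) -> bool {
    compare_values(l, r) == Some(Ordering::Greater)
}
pub fn apply_less_equal(l: &Value, r: &Value) -> bool {
    matches!(compare_values(l, r), Some(Ordering::Less | Ordering::Equal))
}
pub fn apply_greater_equal(l: &Value, r: &Value) -> bool {
    matches!(compare_values(l, r), Some(Ordering::Greater | Ordering::Equal))
}

/// Hypothetical check: the four operator results must agree with compare_values.
pub fn operators_consistent(l: &Value, r: &Value) -> bool {
    let lt = apply_less_than(l, r);
    let gt = apply_greater_than(l, r);
    let le = apply_less_equal(l, r);
    let ge = apply_greater_equal(l, r);
    match compare_values(l, r) {
        Some(Ordering::Less) => lt && le && !gt && !ge,
        Some(Ordering::Equal) => !lt && le && !gt && ge,
        Some(Ordering::Greater) => !lt && !le && gt && ge,
        None => !lt && !le && !gt && !ge,
    }
}

/// Hypothetical edge-case table drawn from the categories listed above.
pub fn edge_cases() -> Vec<(Value, Value)> {
    vec![
        (Value::Uint(u64::MAX), Value::Int(-1)),
        (Value::Int(i64::MIN), Value::Uint(0)),
        (Value::Int(i64::MAX), Value::Uint(u64::MAX)),
        (Value::Uint(7), Value::Int(7)),
        (Value::Float(f64::NAN), Value::Float(1.0)),
        (Value::Float(f64::NEG_INFINITY), Value::Float(f64::INFINITY)),
        (Value::Float(1.0), Value::Uint(1)),
    ]
}
```

In a real crate this check would typically sit in a #[cfg(test)] module and iterate over edge_cases(), asserting operators_consistent for each pair.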

Related Topics#

  • Cross-Type Integer Coercion Pattern: The i128-based technique for safely comparing u64 and i64 values without overflow, handling edge cases like u64::MAX vs Int(-1) that have identical bit patterns but different mathematical values.

  • Enum Extension And Exhaustive Match Synchronization: The architectural requirement for synchronized updates across multiple files when extending core enums. The Rust compiler's exhaustive match checking enforces this synchronization at compile time.

  • Parser Token Ordering: The critical requirement that longer operators must be parsed before shorter prefixes to prevent premature matching. For comparison operators, <= and >= must be parsed before < and >.

  • Magic File Compatibility: The broader goal of achieving compatibility with system magic databases. Comparison operators unlock approximately 40% more magic file compatibility.

  • Operator Evaluation Pipeline: The three-stage evaluation process in libmagic-rs consisting of offset resolution, type interpretation, and operator application.
