Cross-Type Integer Coercion Pattern#

The Cross-Type Integer Coercion Pattern is a specific implementation technique in libmagic-rs for safely comparing unsigned 64-bit integers (u64) with signed 64-bit integers (i64) by converting both operands to a 128-bit signed intermediate type (i128). This pattern ensures mathematically correct comparisons across the full range of both types without overflow or information loss. The pattern is implemented in src/evaluator/operators.rs and is used in all comparison operators (apply_equal, apply_not_equal, apply_less_than, apply_greater_than, apply_less_equal, and apply_greater_equal), while bitwise operations use a different casting strategy.

The pattern exists to handle magic rule evaluations where file data can be interpreted as either unsigned or signed integers depending on the rule specification, and comparisons must work correctly regardless of type combinations. The implementation leverages Rust's type system to provide compile-time safety guarantees while using only safe code -- no unsafe blocks are required.

The most critical edge case this pattern handles is comparing Value::Uint(u64::MAX) (18,446,744,073,709,551,615) with Value::Int(-1), which have identical bit representations in two's complement but represent vastly different mathematical values. Tests verify that these values correctly evaluate as unequal.
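
The difference between comparing bit patterns and comparing mathematical values can be seen in a few lines. The sketch below is illustrative only: `naive_equal` and `widened_equal` are hypothetical helper names that do not appear in libmagic-rs, and they operate on raw integers rather than on `Value`.

```rust
// Hypothetical helper: compares by reinterpreting the signed operand's bits.
fn naive_equal(u: u64, i: i64) -> bool {
    // `as u64` keeps the 64 bits unchanged, so -1 becomes u64::MAX.
    u == i as u64
}

// Hypothetical helper: compares mathematical values via a lossless widening.
fn widened_equal(u: u64, i: i64) -> bool {
    i128::from(u) == i128::from(i)
}

fn main() {
    // Same 64-bit pattern, different mathematical values.
    assert_eq!(u64::MAX.to_ne_bytes(), (-1_i64).to_ne_bytes());

    assert!(naive_equal(u64::MAX, -1)); // mathematically wrong
    assert!(!widened_equal(u64::MAX, -1)); // mathematically correct
    assert!(widened_equal(42, 42));
}
```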

Technical Implementation#

Core Pattern#

The pattern is implemented in the apply_equal function (lines 48-69) using i128::from() for cross-type conversion:

pub fn apply_equal(left: &Value, right: &Value) -> bool {
    match (left, right) {
        (Value::Uint(a), Value::Uint(b)) => a == b,
        (Value::Int(a), Value::Int(b)) => a == b,
        // Cross-type integer coercion (safe via i128 to avoid overflow)
        (Value::Uint(a), Value::Int(b)) => i128::from(*a) == i128::from(*b),
        (Value::Int(a), Value::Uint(b)) => i128::from(*a) == i128::from(*b),
        _ => false,
    }
}

The implementation uses Rust's From trait, which provides infallible, lossless conversions from both u64 and i64 to i128. An inline comment at line 62 documents the rationale: "Cross-type integer coercion (safe via i128 to avoid overflow)".

Type System Foundation#

The pattern operates on the Value enum defined in src/parser/ast.rs:

#[derive(Debug, Clone, PartialEq, Eq, Serialize, Deserialize)]
pub enum Value {
    Uint(u64),
    Int(i64),
    Bytes(Vec<u8>),
    String(String),
}

This enum represents the possible value types produced during magic rule evaluation, where file contents can be interpreted as unsigned integers (Uint), signed integers (Int), byte sequences, or strings.

Application to Other Operators#

The apply_not_equal function (lines 108-110) inherits the pattern by delegating to apply_equal:

pub fn apply_not_equal(left: &Value, right: &Value) -> bool {
    !apply_equal(left, right)
}

The pattern extends to all comparison operators through the compare_values function, which implements ordering comparisons using the same i128 coercion strategy. The apply_less_than, apply_greater_than, apply_less_equal, and apply_greater_equal functions all delegate to compare_values, ensuring consistent cross-type integer handling across all comparison operations:

pub fn compare_values(left: &Value, right: &Value) -> Option<Ordering> {
    match (left, right) {
        (Value::Uint(a), Value::Uint(b)) => Some(a.cmp(b)),
        (Value::Int(a), Value::Int(b)) => Some(a.cmp(b)),
        (Value::Uint(a), Value::Int(b)) => Some(i128::from(*a).cmp(&i128::from(*b))),
        (Value::Int(a), Value::Uint(b)) => Some(i128::from(*a).cmp(&i128::from(*b))),
        (Value::String(a), Value::String(b)) => Some(a.cmp(b)),
        (Value::Bytes(a), Value::Bytes(b)) => Some(a.cmp(b)),
        _ => None,
    }
}
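
The bodies of the delegating operators are not reproduced here. As a rough sketch of how such delegation could look, the block below repeats compare_values on a reduced `Value` (without the serde derives) and adds two wrappers. The wrapper bodies are an assumption made for illustration; the actual implementations in src/evaluator/operators.rs may differ.

```rust
use std::cmp::Ordering;

#[derive(Debug, Clone, PartialEq, Eq)]
pub enum Value {
    Uint(u64),
    Int(i64),
    Bytes(Vec<u8>),
    String(String),
}

pub fn compare_values(left: &Value, right: &Value) -> Option<Ordering> {
    match (left, right) {
        (Value::Uint(a), Value::Uint(b)) => Some(a.cmp(b)),
        (Value::Int(a), Value::Int(b)) => Some(a.cmp(b)),
        (Value::Uint(a), Value::Int(b)) => Some(i128::from(*a).cmp(&i128::from(*b))),
        (Value::Int(a), Value::Uint(b)) => Some(i128::from(*a).cmp(&i128::from(*b))),
        (Value::String(a), Value::String(b)) => Some(a.cmp(b)),
        (Value::Bytes(a), Value::Bytes(b)) => Some(a.cmp(b)),
        _ => None,
    }
}

// Assumed wrapper body: true only when an ordering exists and it is Less.
pub fn apply_less_than(left: &Value, right: &Value) -> bool {
    compare_values(left, right) == Some(Ordering::Less)
}

// Assumed wrapper body: true when the ordering is Greater or Equal.
pub fn apply_greater_equal(left: &Value, right: &Value) -> bool {
    matches!(
        compare_values(left, right),
        Some(Ordering::Greater | Ordering::Equal)
    )
}

fn main() {
    assert!(apply_less_than(&Value::Int(-1), &Value::Uint(u64::MAX)));
    assert!(apply_greater_equal(&Value::Uint(0), &Value::Int(0)));
    // Incomparable variants yield None, so the ordering operator is false.
    assert!(!apply_less_than(&Value::Uint(1), &Value::String("1".to_string())));
}
```

With this shape, an incomparable pair such as an integer and a string makes every ordering operator return false, because compare_values returns None.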

Why i128 Is Required#

Range Requirements#

The i128 type is the smallest Rust integer type that can represent the full ranges of both u64 and i64:

u64 range: 0 to 18,446,744,073,709,551,615 (0 to 2^64 - 1)
i64 range: -9,223,372,036,854,775,808 to 9,223,372,036,854,775,807 (-2^63 to 2^63 - 1)
i128 range: -170,141,183,460,469,231,731,687,303,715,884,105,728 to 170,141,183,460,469,231,731,687,303,715,884,105,727 (-2^127 to 2^127 - 1)

Since u64::MAX (2^64 - 1) is slightly more than twice i64::MAX (2^63 - 1), a signed type needs at least 65 bits to hold every u64 value alongside the negative i64 values. i128 is the next primitive signed type after i64, so no smaller signed type is sufficient.
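
The range arithmetic can be checked directly. In this small sketch, `fits_in_u64_and_i64` is a hypothetical helper introduced only to show that some values fit in one 64-bit type but not the other, while all of them fit in i128.

```rust
use std::convert::TryFrom;

// Hypothetical helper: reports whether a mathematical value fits in u64
// and in i64, in that order.
fn fits_in_u64_and_i64(value: i128) -> (bool, bool) {
    (u64::try_from(value).is_ok(), i64::try_from(value).is_ok())
}

fn main() {
    // u64::MAX is 2^64 - 1, which is exactly 2 * i64::MAX + 1.
    assert_eq!(i128::from(u64::MAX), (1_i128 << 64) - 1);
    assert_eq!(i128::from(u64::MAX), 2 * i128::from(i64::MAX) + 1);
    // i64::MIN is -2^63.
    assert_eq!(i128::from(i64::MIN), -(1_i128 << 63));

    // Each 64-bit type misses part of the other's range.
    assert_eq!(fits_in_u64_and_i64(i128::from(u64::MAX)), (true, false));
    assert_eq!(fits_in_u64_and_i64(-1), (false, true));
    assert_eq!(fits_in_u64_and_i64(42), (true, true));
}
```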

Alternative Approaches Rejected#

Other potential approaches were not used:

Casting to u64: Would lose sign information for negative i64 values, causing -1 to be interpreted as u64::MAX
Casting to i64: Would wrap u64 values greater than i64::MAX into negative numbers, so u64::MAX would be read as -1
Casting to f64: Would lose precision for integers near the 64-bit boundaries (f64 has only 53 bits of mantissa)
Conditional logic: Would require complex branching based on value ranges, reducing code clarity

The i128 approach provides mathematical correctness, code simplicity, and leverages Rust's type system for safety guarantees.
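
Each rejected alternative can be exercised on a concrete input. The functions below are hypothetical stand-ins written for this comparison; none of them exist in libmagic-rs. The branching variant is correct, and is included to show the extra logic it requires compared with the i128 version.

```rust
use std::convert::TryFrom;

// Alternative 1: reinterpret the signed operand as unsigned.
fn eq_via_u64(u: u64, i: i64) -> bool {
    u == i as u64
}

// Alternative 2: reinterpret the unsigned operand as signed (wraps above i64::MAX).
fn eq_via_i64(u: u64, i: i64) -> bool {
    u as i64 == i
}

// Alternative 3: convert both to f64 (rounds integers above 2^53).
fn eq_via_f64(u: u64, i: i64) -> bool {
    u as f64 == i as f64
}

// Alternative 4: branch on whether the signed operand is representable as u64.
fn eq_via_branch(u: u64, i: i64) -> bool {
    match u64::try_from(i) {
        Ok(v) => u == v,
        Err(_) => false, // negative values can never equal an unsigned value
    }
}

// The approach described in this document.
fn eq_via_i128(u: u64, i: i64) -> bool {
    i128::from(u) == i128::from(i)
}

fn main() {
    // Both integer casts wrongly report u64::MAX == -1.
    assert!(eq_via_u64(u64::MAX, -1));
    assert!(eq_via_i64(u64::MAX, -1));
    // 2^53 + 1 and 2^53 round to the same f64.
    assert!(eq_via_f64((1_u64 << 53) + 1, 1_i64 << 53));
    // The branching and i128 versions agree on the correct answers.
    assert!(!eq_via_branch(u64::MAX, -1));
    assert!(!eq_via_i128(u64::MAX, -1));
    assert!(!eq_via_i128((1_u64 << 53) + 1, 1_i64 << 53));
}
```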

Edge Cases Handled#

The i128 coercion correctly handles:

u64::MAX (18,446,744,073,709,551,615) vs negative i64: u64::MAX remains a large positive value in i128, while negative i64 values remain negative
i64::MIN (-9,223,372,036,854,775,808): Converts safely to i128 without signed overflow
Values within overlapping range: u64 values from 0 to i64::MAX correctly equal their i64 counterparts
Zero comparison: Uint(0) correctly equals Int(0) despite different enum variants
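
The edge cases listed above can be expressed as ordering checks. `cmp_mixed` is a hypothetical helper that mirrors the cross-type arm of compare_values on raw integers; it is a sketch, not code from the project.

```rust
use std::cmp::Ordering;

// Hypothetical helper: orders an unsigned and a signed value mathematically.
fn cmp_mixed(u: u64, i: i64) -> Ordering {
    i128::from(u).cmp(&i128::from(i))
}

fn main() {
    // u64::MAX stays large and positive; -1 stays negative.
    assert_eq!(cmp_mixed(u64::MAX, -1), Ordering::Greater);
    // i64::MIN widens without overflow and sorts below zero.
    assert_eq!(cmp_mixed(0, i64::MIN), Ordering::Greater);
    // Values in the overlapping range compare equal across types.
    assert_eq!(cmp_mixed(i64::MAX as u64, i64::MAX), Ordering::Equal);
    assert_eq!(cmp_mixed(0, 0), Ordering::Equal);
    assert_eq!(cmp_mixed(0, 1), Ordering::Less);
}
```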

Contrast with Bitwise Operations#

Different Semantics Require Different Casting#

Bitwise operations use as u64 casting instead of i128 coercion. The apply_bitwise_and function demonstrates this alternative approach:

pub fn apply_bitwise_and(left: &Value, right: &Value) -> bool {
    match (left, right) {
        (Value::Uint(a), Value::Uint(b)) => (a & b) != 0,
        #[allow(clippy::cast_sign_loss)]
        (Value::Int(a), Value::Int(b)) => ((*a as u64) & (*b as u64)) != 0,
        #[allow(clippy::cast_sign_loss)]
        (Value::Uint(a), Value::Int(b)) => (a & (*b as u64)) != 0,
        #[allow(clippy::cast_sign_loss)]
        (Value::Int(a), Value::Uint(b)) => ((*a as u64) & b) != 0,
        _ => false,
    }
}

Key Differences#

Aspect	Comparison Operators	Bitwise Operators
Casting Method	`i128::from()`	`as u64`
Target Type	`i128` (signed, 128-bit)	`u64` (unsigned, 64-bit)
Semantic Goal	Preserve mathematical value	Preserve bit pattern
Sign Handling	Negative numbers remain negative	Negative numbers interpreted as large positive values (two's complement)
Code Comment	"safe via i128 to avoid overflow" (line 62)	"cast to unsigned for bitwise operations" (line 153)
Clippy Attributes	None needed (infallible conversion)	`#[allow(clippy::cast_sign_loss)]` required

Why Bitwise Operations Differ#

Bitwise operations need to work on the underlying bit representation rather than mathematical values. For example, bitwise AND checking if specific bits are set in a file format header must treat the sign bit as just another bit, not as a mathematical sign indicator. Tests verify this behavior with cases like Int(-1) & Int(1) evaluating to true because -1 has all bits set in two's complement representation.
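
A short sketch of the bit-pattern view, assuming a hypothetical helper `any_bits_set` and a hypothetical `SIGN_BIT` constant (neither is taken from libmagic-rs):

```rust
// Hypothetical mask selecting the most significant bit of a 64-bit value.
const SIGN_BIT: u64 = 1 << 63;

// Hypothetical helper: true when any bit of `mask` is set in `value`.
fn any_bits_set(value: i64, mask: u64) -> bool {
    // `as u64` keeps all 64 bits unchanged, including the sign bit.
    ((value as u64) & mask) != 0
}

fn main() {
    // -1 has every bit set in two's complement.
    assert_eq!((-1_i64) as u64, u64::MAX);
    assert!(any_bits_set(-1, 1));

    // The sign bit is treated as an ordinary bit.
    assert!(any_bits_set(i64::MIN, SIGN_BIT));
    assert!(!any_bits_set(i64::MAX, SIGN_BIT));

    // By contrast, widening for comparison keeps the mathematical value.
    assert_eq!(i128::from(-1_i64), -1_i128);
}
```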

Safety Guarantees#

No Unsafe Code#

The pattern uses only safe Rust:

i128::from() is a safe, infallible conversion implemented by the standard library
No pointer casts, transmutes, or unsafe blocks are required
The compiler type-checks every conversion, and the conversions themselves involve no runtime checks that could fail

Type System Enforcement#

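Rust's match expressions must be exhaustive, and the compiler rejects any match that leaves a combination of Value variants unhandled:

Every pair of Value variants is covered, either by an explicit arm or by the wildcard arm
Because the functions shown above end in a wildcard arm (_ => false or _ => None), a newly added Value variant compiles without changes and is treated as non-matching until explicit arms are written for it
The pattern cannot produce undefined behavior; unsupported combinations yield a defined false or None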
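
For contrast, a match written without a wildcard arm does turn a new variant into a compile error. The following is a minimal sketch under that assumption; `as_wide_int` and `integers_equal` are hypothetical helpers, not functions from libmagic-rs, and they cover integer equality only.

```rust
#[derive(Debug, Clone, PartialEq, Eq)]
enum Value {
    Uint(u64),
    Int(i64),
    Bytes(Vec<u8>),
    String(String),
}

// Hypothetical helper: every variant is named, so adding a variant to
// `Value` makes this match fail to compile until it is updated.
fn as_wide_int(value: &Value) -> Option<i128> {
    match value {
        Value::Uint(u) => Some(i128::from(*u)),
        Value::Int(i) => Some(i128::from(*i)),
        Value::Bytes(_) | Value::String(_) => None,
    }
}

// Hypothetical helper: integer equality built on the widening above.
fn integers_equal(left: &Value, right: &Value) -> bool {
    match (as_wide_int(left), as_wide_int(right)) {
        (Some(a), Some(b)) => a == b,
        // At least one operand is not an integer.
        (None, _) | (_, None) => false,
    }
}

fn main() {
    assert!(integers_equal(&Value::Uint(42), &Value::Int(42)));
    assert!(!integers_equal(&Value::Uint(u64::MAX), &Value::Int(-1)));
    assert!(!integers_equal(&Value::Uint(1), &Value::Bytes(vec![1])));
    assert!(!integers_equal(&Value::String("1".to_string()), &Value::Int(1)));
}
```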

Infallible Conversion#

The standard library provides impl From<u64> for i128 and impl From<i64> for i128, so both conversions are infallible: they cannot panic, fail, or lose information.
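
The contrast with a fallible conversion shows up in the return types. In the sketch below, `widen` and `narrow_to_i64` are hypothetical helpers: widening returns plain i128 values, while narrowing a u64 into an i64 goes through TryFrom and can fail.

```rust
use std::convert::TryFrom;

// Hypothetical helper: From yields the target type directly, with no Result.
fn widen(u: u64, i: i64) -> (i128, i128) {
    (i128::from(u), i128::from(i))
}

// Hypothetical helper: TryFrom yields a Result because some inputs do not fit.
fn narrow_to_i64(u: u64) -> Option<i64> {
    i64::try_from(u).ok()
}

fn main() {
    assert_eq!(
        widen(u64::MAX, i64::MIN),
        (18_446_744_073_709_551_615_i128, -9_223_372_036_854_775_808_i128)
    );
    assert_eq!(narrow_to_i64(42), Some(42));
    assert_eq!(narrow_to_i64(u64::MAX), None);
}
```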

Test Coverage#

Edge Case Tests#

The test suite includes comprehensive coverage for cross-type integer coercion:

test_apply_equal_uint_vs_int (lines 403-422)#

#[test]
fn test_apply_equal_uint_vs_int() {
    // Same numeric value across types should match
    let left = Value::Uint(42);
    let right = Value::Int(42);
    assert!(apply_equal(&left, &right));

    // Negative Int cannot equal Uint
    let left = Value::Uint(42);
    let right = Value::Int(-42);
    assert!(!apply_equal(&left, &right));

    // Large Uint that doesn't fit in i64 cannot equal Int
    let left = Value::Uint(u64::MAX);
    let right = Value::Int(-1);
    assert!(!apply_equal(&left, &right));
}

This test explicitly verifies the critical edge case where Uint(u64::MAX) and Int(-1) have identical bit patterns but must evaluate as unequal.

test_apply_equal_edge_cases (lines 544-567)#

This test covers extreme values including u64::MAX and i64::MIN:

#[test]
fn test_apply_equal_edge_cases() {
    let max_unsigned = Value::Uint(u64::MAX);
    let max_signed = Value::Int(i64::MAX);
    let min_int = Value::Int(i64::MIN);

    // Cross-type edge cases
    assert!(!apply_equal(&max_unsigned, &Value::Int(-1)));
    assert!(apply_equal(&Value::Uint(i64::MAX as u64), &max_signed));
}

Exhaustive Operator Testing#

The test_apply_operator_all_combinations test (lines 1568-1619) verifies that the apply_operator dispatch function remains synchronized with individual operator implementations:

#[test]
fn test_apply_operator_all_combinations() {
    let operators = [
        Operator::Equal,
        Operator::NotEqual,
        Operator::BitwiseAnd,
        Operator::BitwiseAndMask(0xFF),
    ];
    let values = [
        Value::Uint(42),
        Value::Int(-42),
        Value::Bytes(vec![42]),
        Value::String("42".to_string()),
    ];

    for operator in &operators {
        for left in &values {
            for right in &values {
                let result = apply_operator(operator, left, right);
                let expected = match operator {
                    Operator::Equal => apply_equal(left, right),
                    // ... other operators
                };
                assert_eq!(result, expected);
            }
        }
    }
}

This test provides 64 total combinations (4 operators × 4 left values × 4 right values) ensuring no panics and consistent behavior.

Additional Test Coverage#

test_apply_equal_int_extreme_values: Verifies i64::MIN and i64::MAX handling
test_apply_not_equal_int_extreme_values: Confirms inequality operations with extreme values
test_apply_equal_all_cross_type_combinations: Systematically tests all cross-type pairs
test_apply_bitwise_and_int_negative: Demonstrates the difference between comparison and bitwise semantics
test_compare_values_ordering: Tests the compare_values function with same-type and cross-type integer comparisons, including extreme values like Uint(u64::MAX) vs Int(-1)
test_comparison_operators_consistency: Verifies all comparison operators agree with compare_values ordering results
test_comparison_operators_in_magic_rules: Integration test demonstrating comparison operators in real magic rules

Architectural Context#

Role in the Evaluation Pipeline#

The Cross-Type Integer Coercion Pattern exists within libmagic-rs's three-stage evaluation pipeline:

Parsing stage: Magic rule files are parsed into an abstract syntax tree (AST)
Reading stage: File data is read at specified offsets and interpreted as typed values (producing Value::Uint or Value::Int)
Comparison stage: The pattern enables safe comparison of these values against expected values in the rule

The evaluator's three-stage pipeline is documented in src/evaluator/mod.rs, showing how values flow from file reading to operator application.
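
To make the reading stage concrete, here is a rough sketch of how a single byte might become either a signed or an unsigned value. `read_byte` and its `signed` flag are hypothetical and greatly simplified; the actual reader in libmagic-rs is not shown in this document and may be structured differently.

```rust
#[derive(Debug, PartialEq, Eq)]
enum Value {
    Uint(u64),
    Int(i64),
}

// Hypothetical reader: interprets one byte at `offset` as signed or unsigned.
fn read_byte(buffer: &[u8], offset: usize, signed: bool) -> Option<Value> {
    // Out-of-range offsets produce None instead of panicking.
    let raw = *buffer.get(offset)?;
    Some(if signed {
        // Reinterpret the byte as i8, then sign-extend losslessly to i64.
        Value::Int(i64::from(raw as i8))
    } else {
        Value::Uint(u64::from(raw))
    })
}

fn main() {
    let data = [0xFF_u8];
    // The same byte yields different values depending on the type specifier.
    assert_eq!(read_byte(&data, 0, true), Some(Value::Int(-1)));
    assert_eq!(read_byte(&data, 0, false), Some(Value::Uint(255)));
    assert_eq!(read_byte(&data, 1, false), None);
}
```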

Why Cross-Type Comparison Is Necessary#

Magic rules can specify expected values as either signed or unsigned integers, while file data interpretation depends on the type specifier in the rule (e.g., byte vs ubyte, long vs ulong). The pattern allows rules like:

0 byte =42 "file contains signed 42"
0 ubyte =42 "file contains unsigned 42"
0 ubyte >128 "file contains large unsigned value"

These rules correctly match regardless of whether the comparison is Uint == Int, Int == Uint, Uint == Uint, or Int == Int, and ordering comparisons (>, <, >=, <=) work correctly across type boundaries.
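
As an illustration of the rules above, the sketch below evaluates them against a single byte using a reduced, hypothetical model. The `widen`, `equals`, and `greater_than` helpers are not from libmagic-rs, and treating the rule's literal as `Value::Int` is an assumption made for this example.

```rust
#[derive(Debug, PartialEq, Eq)]
enum Value {
    Uint(u64),
    Int(i64),
}

// Hypothetical helper: lossless widening of either integer variant.
fn widen(value: &Value) -> i128 {
    match value {
        Value::Uint(u) => i128::from(*u),
        Value::Int(i) => i128::from(*i),
    }
}

fn equals(left: &Value, right: &Value) -> bool {
    widen(left) == widen(right)
}

fn greater_than(left: &Value, right: &Value) -> bool {
    widen(left) > widen(right)
}

fn main() {
    let byte = 0xC8_u8; // 200 when unsigned, -56 when signed

    // `0 ubyte >128`: 200 > 128 holds.
    let unsigned = Value::Uint(u64::from(byte));
    assert!(greater_than(&unsigned, &Value::Int(128)));

    // The same byte read with `byte` is -56, which is not greater than 128.
    let signed = Value::Int(i64::from(byte as i8));
    assert!(!greater_than(&signed, &Value::Int(128)));

    // `=42` matches whether the data was read as signed or unsigned.
    assert!(equals(&Value::Uint(42), &Value::Int(42)));
    assert!(equals(&Value::Int(42), &Value::Int(42)));
}
```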

Design Philosophy#

The pattern embodies libmagic-rs's commitment to:

Explicit behavior: The conversion is visible in source code, not hidden in implicit casting rules
Safety first: Uses safe Rust with compile-time guarantees rather than unsafe code or runtime checks
Correctness over performance: Prioritizes mathematically correct results even if i128 operations are slightly slower than u64 operations
Future extensibility: The same pattern scales to additional comparison operators as they are implemented

Module-level documentation (lines 4-8) states the goal: "This module provides functions for applying comparison and bitwise operators to values during magic rule evaluation. It handles type-safe comparisons between different Value variants."

Relevant Code Files#

File	Purpose	Key Lines
`src/evaluator/operators.rs`	Core implementation of cross-type integer coercion pattern	Lines 48-69 (apply_equal), Lines 108-110 (apply_not_equal)
`src/evaluator/operators.rs`	Contrasting bitwise operation implementation using u64 casting	Lines 148-166 (apply_bitwise_and)
`src/evaluator/operators.rs`	Test coverage for edge cases	Lines 403-422 (test_apply_equal_uint_vs_int), Lines 544-567 (test_apply_equal_edge_cases)
`src/evaluator/operators.rs`	Exhaustive operator consistency testing	Lines 1568-1619 (test_apply_operator_all_combinations)
`src/parser/ast.rs`	Value enum definition	Lines 121-130
`tests/property_tests.rs`	Property-based testing with arbitrary value generation	Lines 1-221

Related Topics#

Type Coercion: General patterns for safely converting between numeric types in programming languages
Two's Complement: Binary representation system used for signed integers, where u64::MAX and the i64 value -1 have identical bit patterns but different mathematical meanings
Operator Overloading: Language features that allow custom operator behavior for user-defined types
Magic Number Detection: File type identification technique used by the file command and libmagic
Rust's From Trait: Standard library trait providing infallible type conversions
Rust's Ord and Ordering: Standard library traits for types that form a total order, used by compare_values to implement ordering comparisons