Cross-Type Integer Coercion Pattern
Type: Topic
Status: Published
Created: Mar 1, 2026
Updated: Mar 1, 2026
Created by: Dosu Bot
Updated by: Dosu Bot

Cross-Type Integer Coercion Pattern#

The Cross-Type Integer Coercion Pattern is a specific implementation technique in libmagic-rs for safely comparing unsigned 64-bit integers (u64) with signed 64-bit integers (i64) by converting both operands to a 128-bit signed intermediate type (i128). This pattern ensures mathematically correct comparisons across the full range of both types without overflow or information loss. The pattern is implemented in src/evaluator/operators.rs and is used in all comparison operators (apply_equal, apply_not_equal, apply_less_than, apply_greater_than, apply_less_equal, and apply_greater_equal), while bitwise operations use a different casting strategy.

The pattern exists to handle magic rule evaluations where file data can be interpreted as either unsigned or signed integers depending on the rule specification, and comparisons must work correctly regardless of type combinations. The implementation leverages Rust's type system to provide compile-time safety guarantees while using only safe code -- no unsafe blocks are required.

The most critical edge case this pattern handles is comparing Value::Uint(u64::MAX) (18,446,744,073,709,551,615) with Value::Int(-1), which have identical bit representations in two's complement but represent vastly different mathematical values. Tests verify that these values correctly evaluate as unequal.

Technical Implementation#

Core Pattern#

The pattern is implemented in the apply_equal function (lines 48-69) using i128::from() for cross-type conversion:

pub fn apply_equal(left: &Value, right: &Value) -> bool {
    match (left, right) {
        (Value::Uint(a), Value::Uint(b)) => a == b,
        (Value::Int(a), Value::Int(b)) => a == b,
        // Cross-type integer coercion (safe via i128 to avoid overflow)
        (Value::Uint(a), Value::Int(b)) => i128::from(*a) == i128::from(*b),
        (Value::Int(a), Value::Uint(b)) => i128::from(*a) == i128::from(*b),
        _ => false,
    }
}

The implementation uses Rust's From trait, which provides infallible, lossless conversions from both u64 and i64 to i128. An inline comment at line 62 documents the rationale: "Cross-type integer coercion (safe via i128 to avoid overflow)".

Type System Foundation#

The pattern operates on the Value enum defined in src/parser/ast.rs:

#[derive(Debug, Clone, PartialEq, Eq, Serialize, Deserialize)]
pub enum Value {
    Uint(u64),
    Int(i64),
    Bytes(Vec<u8>),
    String(String),
}

This enum represents the possible value types produced during magic rule evaluation, where file contents can be interpreted as unsigned integers (Uint), signed integers (Int), byte sequences, or strings.

Application to Other Operators#

The apply_not_equal function (lines 108-110) inherits the pattern by delegating to apply_equal:

pub fn apply_not_equal(left: &Value, right: &Value) -> bool {
    !apply_equal(left, right)
}

The pattern extends to all comparison operators through the compare_values function, which implements ordering comparisons using the same i128 coercion strategy. The apply_less_than, apply_greater_than, apply_less_equal, and apply_greater_equal functions all delegate to compare_values, ensuring consistent cross-type integer handling across all comparison operations:

pub fn compare_values(left: &Value, right: &Value) -> Option<Ordering> {
    match (left, right) {
        (Value::Uint(a), Value::Uint(b)) => Some(a.cmp(b)),
        (Value::Int(a), Value::Int(b)) => Some(a.cmp(b)),
        (Value::Uint(a), Value::Int(b)) => Some(i128::from(*a).cmp(&i128::from(*b))),
        (Value::Int(a), Value::Uint(b)) => Some(i128::from(*a).cmp(&i128::from(*b))),
        (Value::String(a), Value::String(b)) => Some(a.cmp(b)),
        (Value::Bytes(a), Value::Bytes(b)) => Some(a.cmp(b)),
        _ => None,
    }
}
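
How the four ordering operators might delegate to compare_values can be sketched as follows. The wrapper bodies are assumptions for illustration (the text states only that they delegate to compare_values), including the choice that incomparable pairs evaluate to false. The Value enum is reproduced without its serde derives so the block stands alone.

```rust
use std::cmp::Ordering;

// Trimmed copy of the Value enum (serde derives omitted for self-containment).
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum Value {
    Uint(u64),
    Int(i64),
    Bytes(Vec<u8>),
    String(String),
}

// Same logic as the compare_values function shown above.
pub fn compare_values(left: &Value, right: &Value) -> Option<Ordering> {
    match (left, right) {
        (Value::Uint(a), Value::Uint(b)) => Some(a.cmp(b)),
        (Value::Int(a), Value::Int(b)) => Some(a.cmp(b)),
        (Value::Uint(a), Value::Int(b)) => Some(i128::from(*a).cmp(&i128::from(*b))),
        (Value::Int(a), Value::Uint(b)) => Some(i128::from(*a).cmp(&i128::from(*b))),
        (Value::String(a), Value::String(b)) => Some(a.cmp(b)),
        (Value::Bytes(a), Value::Bytes(b)) => Some(a.cmp(b)),
        _ => None,
    }
}

// Assumed wrapper bodies: an incomparable pair (None) makes every test false.
pub fn apply_less_than(left: &Value, right: &Value) -> bool {
    compare_values(left, right) == Some(Ordering::Less)
}

pub fn apply_greater_than(left: &Value, right: &Value) -> bool {
    compare_values(left, right) == Some(Ordering::Greater)
}

pub fn apply_less_equal(left: &Value, right: &Value) -> bool {
    matches!(compare_values(left, right), Some(Ordering::Less | Ordering::Equal))
}

pub fn apply_greater_equal(left: &Value, right: &Value) -> bool {
    matches!(compare_values(left, right), Some(Ordering::Greater | Ordering::Equal))
}

fn main() {
    // -1 sorts below every unsigned value, including 0.
    assert!(apply_less_than(&Value::Int(-1), &Value::Uint(0)));
    // u64::MAX sorts above -1 even though the two share a bit pattern.
    assert!(apply_greater_than(&Value::Uint(u64::MAX), &Value::Int(-1)));
    // A string and an integer are incomparable, so both tests are false.
    assert!(!apply_less_equal(&Value::String("a".into()), &Value::Uint(1)));
    assert!(!apply_greater_equal(&Value::String("a".into()), &Value::Uint(1)));
}
```

Returning Option<Ordering> from a single function keeps the cross-type logic in one place; each wrapper only decides which orderings count as a match.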

Why i128 Is Required#

Range Requirements#

The i128 type is the smallest built-in Rust integer type that can represent the full ranges of both u64 and i64:

  • u64 range: 0 to 18,446,744,073,709,551,615 (0 to 2^64 - 1)
  • i64 range: -9,223,372,036,854,775,808 to 9,223,372,036,854,775,807 (-2^63 to 2^63 - 1)
  • i128 range: -170,141,183,460,469,231,731,687,303,715,884,105,728 to 170,141,183,460,469,231,731,687,303,715,884,105,727 (-2^127 to 2^127 - 1)

Since u64::MAX equals 2 × i64::MAX + 1, representing every u64 value as a non-negative number while also accommodating negative i64 values takes a signed type of at least 65 bits. Rust has no built-in integer type between 64 and 128 bits, so i128 is the narrowest fit.
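
The arithmetic behind the range requirement can be checked directly. The sketch below uses a hypothetical helper, signed_bits_needed, which is not part of libmagic-rs; it searches for the narrowest two's-complement width whose range contains a given interval.

```rust
/// Hypothetical helper: smallest two's-complement width (in bits) whose
/// range [-2^(bits-1), 2^(bits-1) - 1] contains both `lo` and `hi`.
fn signed_bits_needed(lo: i128, hi: i128) -> u32 {
    (1..128u32)
        .find(|&bits| {
            let half = 1_i128 << (bits - 1); // 2^(bits-1)
            -half <= lo && hi <= half - 1
        })
        .unwrap_or(128)
}

fn main() {
    let u_max = i128::from(u64::MAX);
    let i_max = i128::from(i64::MAX);
    let i_min = i128::from(i64::MIN);

    // u64::MAX is exactly one more than twice i64::MAX.
    assert_eq!(u_max, 2 * i_max + 1);

    // The i64 range alone fits in 64 bits...
    assert_eq!(signed_bits_needed(i_min, i_max), 64);
    // ...but covering i64::MIN through u64::MAX takes 65 bits.
    assert_eq!(signed_bits_needed(i_min, u_max), 65);
}
```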

Alternative Approaches Rejected#

Other potential approaches were not used:

  1. Casting to u64: Would lose sign information for negative i64 values, causing -1 to be interpreted as u64::MAX
  2. Casting to i64: Would wrap for u64 values greater than i64::MAX (u64::MAX becomes -1), losing information
  3. Casting to f64: Would lose precision for integers near the 64-bit boundaries (f64 has only 53 bits of mantissa)
  4. Conditional logic: Would require complex branching based on value ranges, reducing code clarity

The i128 approach provides mathematical correctness, code simplicity, and leverages Rust's type system for safety guarantees.
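
The failure modes of the first three alternatives can be reproduced in a few lines. The function names below are illustrative only and do not appear in libmagic-rs; each one implements equality with a different cast so the false positives can be compared against the i128 version.

```rust
// Rejected: reinterpret the signed operand as u64 (compares bit patterns).
fn eq_via_u64(u: u64, i: i64) -> bool {
    u == i as u64
}

// Rejected: reinterpret the unsigned operand as i64 (wraps above i64::MAX).
fn eq_via_i64(u: u64, i: i64) -> bool {
    u as i64 == i
}

// Rejected: convert both to f64 (53-bit significand rounds large values).
fn eq_via_f64(u: u64, i: i64) -> bool {
    u as f64 == i as f64
}

// Adopted: widen both operands losslessly to i128.
fn eq_via_i128(u: u64, i: i64) -> bool {
    i128::from(u) == i128::from(i)
}

fn main() {
    let two_pow_63 = 1_u64 << 63; // i64::MAX + 1

    // False positives: u64::MAX and -1 share a bit pattern.
    assert!(eq_via_u64(u64::MAX, -1));
    assert!(eq_via_i64(u64::MAX, -1));
    // False positive: i64::MAX rounds up to 2^63 when converted to f64.
    assert!(eq_via_f64(two_pow_63, i64::MAX));

    // The i128 comparison gets all three right.
    assert!(!eq_via_i128(u64::MAX, -1));
    assert!(!eq_via_i128(two_pow_63, i64::MAX));
    assert!(eq_via_i128(42, 42));
}
```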

Edge Cases Handled#

The i128 coercion correctly handles:

  1. u64::MAX (18,446,744,073,709,551,615) vs negative i64: u64::MAX remains a large positive value in i128, while negative i64 values remain negative
  2. i64::MIN (-9,223,372,036,854,775,808): Converts safely to i128 without signed overflow
  3. Values within overlapping range: u64 values from 0 to i64::MAX correctly equal their i64 counterparts
  4. Zero comparison: Uint(0) correctly equals Int(0) despite different enum variants

Contrast with Bitwise Operations#

Different Semantics Require Different Casting#

Bitwise operations use as u64 casting instead of i128 coercion. The apply_bitwise_and function demonstrates this alternative approach:

pub fn apply_bitwise_and(left: &Value, right: &Value) -> bool {
    match (left, right) {
        (Value::Uint(a), Value::Uint(b)) => (a & b) != 0,
        #[allow(clippy::cast_sign_loss)]
        (Value::Int(a), Value::Int(b)) => ((*a as u64) & (*b as u64)) != 0,
        #[allow(clippy::cast_sign_loss)]
        (Value::Uint(a), Value::Int(b)) => (a & (*b as u64)) != 0,
        #[allow(clippy::cast_sign_loss)]
        (Value::Int(a), Value::Uint(b)) => ((*a as u64) & b) != 0,
        _ => false,
    }
}

Key Differences#

Aspect | Comparison Operators | Bitwise Operators
Casting Method | i128::from() | as u64
Target Type | i128 (signed, 128-bit) | u64 (unsigned, 64-bit)
Semantic Goal | Preserve mathematical value | Preserve bit pattern
Sign Handling | Negative numbers remain negative | Negative numbers interpreted as large positive values (two's complement)
Code Comment | "safe via i128 to avoid overflow" (line 62) | "cast to unsigned for bitwise operations" (line 153)
Clippy Attributes | None needed (infallible conversion) | #[allow(clippy::cast_sign_loss)] required

Why Bitwise Operations Differ#

Bitwise operations need to work on the underlying bit representation rather than mathematical values. For example, bitwise AND checking if specific bits are set in a file format header must treat the sign bit as just another bit, not as a mathematical sign indicator. Tests verify this behavior with cases like Int(-1) & Int(1) evaluating to true because -1 has all bits set in two's complement representation.
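
A minimal sketch of the bit-pattern view, assuming a hypothetical helper named shares_bits (not a libmagic-rs function) that mirrors the as u64 cast used by apply_bitwise_and:

```rust
/// Hypothetical helper: true when the signed value and the mask share at
/// least one set bit, treating the sign bit as an ordinary bit.
#[allow(clippy::cast_sign_loss)]
fn shares_bits(value: i64, mask: u64) -> bool {
    ((value as u64) & mask) != 0
}

fn main() {
    const SIGN_BIT: u64 = 1 << 63;

    // -1 has every bit set, so it overlaps with any non-zero mask.
    assert_eq!((-1_i64) as u64, u64::MAX);
    assert!(shares_bits(-1, 1));

    // i64::MIN has only the top bit set.
    assert!(shares_bits(i64::MIN, SIGN_BIT));
    assert!(!shares_bits(i64::MIN, 1));

    // i64::MAX has every bit set except the top one.
    assert!(!shares_bits(i64::MAX, SIGN_BIT));
}
```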

Safety Guarantees#

No Unsafe Code#

The pattern uses only safe Rust:

  • i128::from() is a safe, infallible conversion implemented by the standard library
  • No pointer casts, transmutes, or unsafe blocks are required
  • The compiler type-checks every conversion: a From implementation that does not exist (for example, u128 to i128) is rejected at compile time

Type System Enforcement#

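Rust's match expressions must be exhaustive, which shapes how the operators handle type combinations:

  • Every (left, right) combination of Value variants must be covered by some arm
  • In the operators shown above, the trailing wildcard arm (_ => false or _ => None) covers all remaining combinations, so a newly added Value variant would fall through to that arm rather than trigger a compilation error
  • The pattern cannot produce undefined behavior, since it uses no unsafe code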
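
For comparison, a match written without a wildcard arm is what makes the compiler flag a newly added variant. The sketch below is an illustration only: integers_equal is a hypothetical function, not part of libmagic-rs, and the Value enum is reproduced without its serde derives.

```rust
// Trimmed copy of the Value enum (serde derives omitted for self-containment).
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum Value {
    Uint(u64),
    Int(i64),
    Bytes(Vec<u8>),
    String(String),
}

/// Hypothetical integer-only equality. Returns None when either operand is
/// not an integer. There is no `_` arm, so adding a variant to Value makes
/// this match non-exhaustive and the build fails until it is updated.
pub fn integers_equal(left: &Value, right: &Value) -> Option<bool> {
    match (left, right) {
        (Value::Uint(a), Value::Uint(b)) => Some(a == b),
        (Value::Int(a), Value::Int(b)) => Some(a == b),
        (Value::Uint(a), Value::Int(b)) => Some(i128::from(*a) == i128::from(*b)),
        (Value::Int(a), Value::Uint(b)) => Some(i128::from(*a) == i128::from(*b)),
        // Non-integer operands are named explicitly instead of using `_`.
        (Value::Bytes(_) | Value::String(_), _)
        | (_, Value::Bytes(_) | Value::String(_)) => None,
    }
}

fn main() {
    assert_eq!(integers_equal(&Value::Uint(42), &Value::Int(42)), Some(true));
    assert_eq!(integers_equal(&Value::Uint(u64::MAX), &Value::Int(-1)), Some(false));
    assert_eq!(integers_equal(&Value::Bytes(vec![42]), &Value::Uint(42)), None);
}
```

The trade-off is verbosity: naming every non-integer variant costs an extra arm, while a wildcard is shorter but silently absorbs future variants.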

Infallible Conversion#

The standard library provides impl From<u64> for i128 and impl From<i64> for i128, so both conversions are infallible: they cannot panic or fail.
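
The difference between an infallible widening and a fallible narrowing shows up in which trait the standard library offers. In this sketch, widen is a hypothetical helper introduced for illustration; From, Into, and TryFrom are the standard traits.

```rust
use std::convert::TryFrom;

/// Hypothetical helper: accepts any type with a lossless conversion to i128.
fn widen<T: Into<i128>>(x: T) -> i128 {
    x.into()
}

fn main() {
    // Widening is available through From/Into and always succeeds.
    assert_eq!(widen(u64::MAX), 18_446_744_073_709_551_615);
    assert_eq!(widen(i64::MIN), -9_223_372_036_854_775_808);

    // widen(u128::MAX) would not compile: i128 does not implement From<u128>.

    // Narrowing between u64 and i64 is only offered through TryFrom,
    // which reports out-of-range values as errors.
    assert!(i64::try_from(u64::MAX).is_err());
    assert!(u64::try_from(-1_i64).is_err());
    assert_eq!(i64::try_from(42_u64), Ok(42));
}
```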

Test Coverage#

Edge Case Tests#

The test suite includes comprehensive coverage for cross-type integer coercion:

test_apply_equal_uint_vs_int (lines 403-422)#

#[test]
fn test_apply_equal_uint_vs_int() {
    // Same numeric value across types should match
    let left = Value::Uint(42);
    let right = Value::Int(42);
    assert!(apply_equal(&left, &right));

    // Negative Int cannot equal Uint
    let left = Value::Uint(42);
    let right = Value::Int(-42);
    assert!(!apply_equal(&left, &right));

    // Large Uint that doesn't fit in i64 cannot equal Int
    let left = Value::Uint(u64::MAX);
    let right = Value::Int(-1);
    assert!(!apply_equal(&left, &right));
}

This test explicitly verifies the critical edge case where Uint(u64::MAX) and Int(-1) have identical bit patterns but must evaluate as unequal.

test_apply_equal_edge_cases (lines 544-567)#

This test covers extreme values including u64::MAX and i64::MIN:

#[test]
fn test_apply_equal_edge_cases() {
    let max_unsigned = Value::Uint(u64::MAX);
    let max_signed = Value::Int(i64::MAX);
    let min_int = Value::Int(i64::MIN);

    // Cross-type edge cases
    assert!(!apply_equal(&max_unsigned, &Value::Int(-1)));
    assert!(apply_equal(&Value::Uint(i64::MAX as u64), &max_signed));
}

Exhaustive Operator Testing#

The test_apply_operator_all_combinations test (lines 1568-1619) verifies that the apply_operator dispatch function remains synchronized with individual operator implementations:

#[test]
fn test_apply_operator_all_combinations() {
    let operators = [
        Operator::Equal,
        Operator::NotEqual,
        Operator::BitwiseAnd,
        Operator::BitwiseAndMask(0xFF),
    ];
    let values = [
        Value::Uint(42),
        Value::Int(-42),
        Value::Bytes(vec![42]),
        Value::String("42".to_string()),
    ];

    for operator in &operators {
        for left in &values {
            for right in &values {
                let result = apply_operator(operator, left, right);
                let expected = match operator {
                    Operator::Equal => apply_equal(left, right),
                    // ... other operators
                };
                assert_eq!(result, expected);
            }
        }
    }
}

This test exercises 64 combinations (4 operators × 4 left values × 4 right values), checking that no combination panics and that apply_operator agrees with the individual operator functions.

Additional Test Coverage#

Architectural Context#

Role in the Evaluation Pipeline#

The Cross-Type Integer Coercion Pattern exists within libmagic-rs's three-stage evaluation pipeline:

  1. Parsing stage: Magic rule files are parsed into an abstract syntax tree (AST)
  2. Reading stage: File data is read at specified offsets and interpreted as typed values (producing Value::Uint or Value::Int)
  3. Comparison stage: The pattern enables safe comparison of these values against expected values in the rule

The evaluator's three-stage pipeline is documented in src/evaluator/mod.rs, showing how values flow from file reading to operator application.

Why Cross-Type Comparison Is Necessary#

Magic rules can specify expected values as either signed or unsigned integers, while file data interpretation depends on the type specifier in the rule (e.g., byte vs ubyte, long vs ulong). The pattern allows rules like:

0 byte =42 "file contains signed 42"
0 ubyte =42 "file contains unsigned 42"
0 ubyte >128 "file contains large unsigned value"

These rules correctly match regardless of whether the comparison is Uint == Int, Int == Uint, Uint == Uint, or Int == Int, and ordering comparisons (>, <, >=, <=) work correctly across type boundaries.
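
A rough sketch of how the same byte can produce different results under byte and ubyte follows. Everything here is a simplification under stated assumptions: read_byte and compare are hypothetical stand-ins for the reading and comparison stages, the Value enum is cut down to its integer variants, and the rule literal 128 is assumed to parse as Value::Uint.

```rust
use std::cmp::Ordering;

// Integer-only subset of the Value enum, for illustration.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Value {
    Uint(u64),
    Int(i64),
}

// Hypothetical reader: interpret one byte as `ubyte` (unsigned) or `byte` (signed).
fn read_byte(data: &[u8], offset: usize, signed: bool) -> Option<Value> {
    let raw = *data.get(offset)?;
    Some(if signed {
        Value::Int(i64::from(i8::from_ne_bytes([raw])))
    } else {
        Value::Uint(u64::from(raw))
    })
}

// Hypothetical comparison: widen both operands to i128 before ordering them.
fn compare(left: Value, right: Value) -> Ordering {
    let widen = |v: Value| match v {
        Value::Uint(u) => i128::from(u),
        Value::Int(i) => i128::from(i),
    };
    widen(left).cmp(&widen(right))
}

fn main() {
    let data = [0xFF_u8];
    let threshold = Value::Uint(128); // assumed form of the literal in `>128`

    // `0 ubyte >128`: 0xFF reads as 255, which is greater than 128.
    let unsigned = read_byte(&data, 0, false).unwrap();
    assert_eq!(unsigned, Value::Uint(255));
    assert_eq!(compare(unsigned, threshold), Ordering::Greater);

    // `0 byte >128`: 0xFF reads as -1, which is less than 128.
    let signed = read_byte(&data, 0, true).unwrap();
    assert_eq!(signed, Value::Int(-1));
    assert_eq!(compare(signed, threshold), Ordering::Less);
}
```

With an as u64 cast in place of the widening, the signed read of -1 would compare as u64::MAX and the second rule would match incorrectly.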

Design Philosophy#

The pattern embodies libmagic-rs's commitment to:

  1. Explicit behavior: The conversion is visible in source code, not hidden in implicit casting rules
  2. Safety first: Uses safe Rust with compile-time guarantees rather than unsafe code or runtime checks
  3. Correctness over performance: Prioritizes mathematically correct results even if i128 operations are slightly slower than u64 operations
  4. Future extensibility: The same pattern scales to additional comparison operators as they are implemented

Module-level documentation (lines 4-8) states the goal: "This module provides functions for applying comparison and bitwise operators to values during magic rule evaluation. It handles type-safe comparisons between different Value variants."

Relevant Code Files#

File | Purpose | Key Lines
src/evaluator/operators.rs | Core implementation of cross-type integer coercion pattern | Lines 48-69 (apply_equal), Lines 108-110 (apply_not_equal)
src/evaluator/operators.rs | Contrasting bitwise operation implementation using u64 casting | Lines 148-166 (apply_bitwise_and)
src/evaluator/operators.rs | Test coverage for edge cases | Lines 403-422 (test_apply_equal_uint_vs_int), Lines 544-567 (test_apply_equal_edge_cases)
src/evaluator/operators.rs | Exhaustive operator consistency testing | Lines 1568-1619 (test_apply_operator_all_combinations)
src/parser/ast.rs | Value enum definition | Lines 121-130
tests/property_tests.rs | Property-based testing with arbitrary value generation | Lines 1-221

  • Type Coercion: General patterns for safely converting between numeric types in programming languages
  • Two's Complement: Binary representation system used for signed integers, where u64::MAX and the i64 value -1 have identical bit patterns but different mathematical meanings
  • Operator Overloading: Language features that allow custom operator behavior for user-defined types
  • Magic Number Detection: File type identification technique used by the file command and libmagic
  • Rust's From Trait: Standard library trait providing infallible type conversions
  • Rust's Ord and Ordering: The standard library trait for types that form a total order (Ord) and the enum its comparisons return (Ordering), used by compare_values to implement ordering comparisons

See Also#
