Cross-Type Integer Coercion Pattern#
The Cross-Type Integer Coercion Pattern is a specific implementation technique in libmagic-rs for safely comparing unsigned 64-bit integers (u64) with signed 64-bit integers (i64) by converting both operands to a 128-bit signed intermediate type (i128). This pattern ensures mathematically correct comparisons across the full range of both types without overflow or information loss. The pattern is implemented in src/evaluator/operators.rs and is used in all comparison operators (apply_equal, apply_not_equal, apply_less_than, apply_greater_than, apply_less_equal, and apply_greater_equal), while bitwise operations use a different casting strategy.
The pattern exists to handle magic rule evaluations where file data can be interpreted as either unsigned or signed integers depending on the rule specification, and comparisons must work correctly regardless of type combinations. The implementation leverages Rust's type system to provide compile-time safety guarantees while using only safe code -- no unsafe blocks are required.
The most critical edge case this pattern handles is comparing Value::Uint(u64::MAX) (18,446,744,073,709,551,615) with Value::Int(-1), which have identical bit representations in two's complement but represent vastly different mathematical values. Tests verify that these values correctly evaluate as unequal.
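The edge case can be reproduced with plain integers, independent of libmagic-rs. In the minimal sketch below, the helper names `same_bits` and `same_value` are hypothetical and exist only for illustration; they are not part of the library.

```rust
/// Returns true when the two integers share the same 64-bit pattern.
/// (Hypothetical helper for illustration; not part of libmagic-rs.)
fn same_bits(u: u64, i: i64) -> bool {
    u.to_ne_bytes() == i.to_ne_bytes()
}

/// Returns true when the two integers denote the same mathematical value.
/// Both operands are widened losslessly to i128 before comparing.
/// (Hypothetical helper for illustration; not part of libmagic-rs.)
fn same_value(u: u64, i: i64) -> bool {
    i128::from(u) == i128::from(i)
}
```

With these helpers, `same_bits(u64::MAX, -1)` is true while `same_value(u64::MAX, -1)` is false, which is the distinction the pattern is meant to preserve.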
Technical Implementation#
Core Pattern#
The pattern is implemented in the apply_equal function (lines 48-69) using i128::from() for cross-type conversion:
```rust
pub fn apply_equal(left: &Value, right: &Value) -> bool {
    match (left, right) {
        (Value::Uint(a), Value::Uint(b)) => a == b,
        (Value::Int(a), Value::Int(b)) => a == b,
        // Cross-type integer coercion (safe via i128 to avoid overflow)
        (Value::Uint(a), Value::Int(b)) => i128::from(*a) == i128::from(*b),
        (Value::Int(a), Value::Uint(b)) => i128::from(*a) == i128::from(*b),
        _ => false,
    }
}
```
The implementation uses Rust's From trait, which provides infallible, lossless conversions from both u64 and i64 to i128. An inline comment at line 62 documents the rationale: "Cross-type integer coercion (safe via i128 to avoid overflow)".
Type System Foundation#
The pattern operates on the Value enum defined in src/parser/ast.rs:
```rust
#[derive(Debug, Clone, PartialEq, Eq, Serialize, Deserialize)]
pub enum Value {
    Uint(u64),
    Int(i64),
    Bytes(Vec<u8>),
    String(String),
}
```
This enum represents the possible value types produced during magic rule evaluation, where file contents can be interpreted as unsigned integers (Uint), signed integers (Int), byte sequences, or strings.
Application to Other Operators#
The apply_not_equal function (lines 108-110) inherits the pattern by delegating to apply_equal:
```rust
pub fn apply_not_equal(left: &Value, right: &Value) -> bool {
    !apply_equal(left, right)
}
```
The pattern extends to all comparison operators through the compare_values function, which implements ordering comparisons using the same i128 coercion strategy. The apply_less_than, apply_greater_than, apply_less_equal, and apply_greater_equal functions all delegate to compare_values, ensuring consistent cross-type integer handling across all comparison operations:
```rust
pub fn compare_values(left: &Value, right: &Value) -> Option<Ordering> {
    match (left, right) {
        (Value::Uint(a), Value::Uint(b)) => Some(a.cmp(b)),
        (Value::Int(a), Value::Int(b)) => Some(a.cmp(b)),
        (Value::Uint(a), Value::Int(b)) => Some(i128::from(*a).cmp(&i128::from(*b))),
        (Value::Int(a), Value::Uint(b)) => Some(i128::from(*a).cmp(&i128::from(*b))),
        (Value::String(a), Value::String(b)) => Some(a.cmp(b)),
        (Value::Bytes(a), Value::Bytes(b)) => Some(a.cmp(b)),
        _ => None,
    }
}
```
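The delegation described above might look roughly like the following sketch. The bodies of the four ordering operators are an assumption, not copied from the source: in particular, treating incomparable pairs (where compare_values returns None) as false is a guess at the behavior. The serde derives on Value are omitted so the example stays self-contained.

```rust
use std::cmp::Ordering;

#[derive(Debug, Clone, PartialEq, Eq)]
pub enum Value {
    Uint(u64),
    Int(i64),
    Bytes(Vec<u8>),
    String(String),
}

pub fn compare_values(left: &Value, right: &Value) -> Option<Ordering> {
    match (left, right) {
        (Value::Uint(a), Value::Uint(b)) => Some(a.cmp(b)),
        (Value::Int(a), Value::Int(b)) => Some(a.cmp(b)),
        // Cross-type pairs are widened to i128 before ordering.
        (Value::Uint(a), Value::Int(b)) => Some(i128::from(*a).cmp(&i128::from(*b))),
        (Value::Int(a), Value::Uint(b)) => Some(i128::from(*a).cmp(&i128::from(*b))),
        (Value::String(a), Value::String(b)) => Some(a.cmp(b)),
        (Value::Bytes(a), Value::Bytes(b)) => Some(a.cmp(b)),
        _ => None,
    }
}

// ASSUMPTION: the operator bodies below are illustrative. Incomparable
// pairs (None) are treated as "does not match" in every operator.
pub fn apply_less_than(left: &Value, right: &Value) -> bool {
    compare_values(left, right) == Some(Ordering::Less)
}

pub fn apply_greater_than(left: &Value, right: &Value) -> bool {
    compare_values(left, right) == Some(Ordering::Greater)
}

pub fn apply_less_equal(left: &Value, right: &Value) -> bool {
    matches!(
        compare_values(left, right),
        Some(Ordering::Less | Ordering::Equal)
    )
}

pub fn apply_greater_equal(left: &Value, right: &Value) -> bool {
    matches!(
        compare_values(left, right),
        Some(Ordering::Greater | Ordering::Equal)
    )
}
```

Routing every ordering operator through one function means the i128 coercion is written once, so the operators cannot disagree about how a Uint and an Int are ordered.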
Why i128 Is Required#
Range Requirements#
The i128 type is the smallest Rust integer type that can represent the full ranges of both u64 and i64:
- u64 range: 0 to 18,446,744,073,709,551,615 (0 to 2^64 - 1)
- i64 range: -9,223,372,036,854,775,808 to 9,223,372,036,854,775,807 (-2^63 to 2^63 - 1)
- i128 range: -170,141,183,460,469,231,731,687,303,715,884,105,728 to 170,141,183,460,469,231,731,687,303,715,884,105,727 (-2^127 to 2^127 - 1)
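Since u64::MAX is roughly twice i64::MAX (exactly 2 × i64::MAX + 1), a signed type needs at least 65 bits to represent every u64 value as a positive number while also accommodating negative i64 values. Rust has no signed integer type between i64 and i128, so i128 is the smallest type that qualifies.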
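The bit-width argument can be checked with a few lines of arithmetic. This is an illustrative sketch only; the function names are hypothetical and do not appear in libmagic-rs.

```rust
/// Number of bits a signed integer type needs in order to hold every u64
/// value as a non-negative number.
/// (Hypothetical helper for illustration.)
fn required_signed_bits() -> u32 {
    // u64::MAX occupies all 64 magnitude bits...
    let magnitude_bits = u64::BITS - u64::MAX.leading_zeros();
    // ...and a signed type needs one additional bit for the sign.
    magnitude_bits + 1
}

/// Checks the relationship between the two maxima using i128 arithmetic,
/// where neither operand can overflow.
/// (Hypothetical helper for illustration.)
fn u64_max_is_twice_i64_max_plus_one() -> bool {
    i128::from(u64::MAX) == 2 * i128::from(i64::MAX) + 1
}
```

`required_signed_bits()` evaluates to 65, one more than i64 offers, which is why the next available signed type, i128, is used.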
Alternative Approaches Rejected#
Other potential approaches were not used:
- Casting to u64: Would lose sign information for negative i64 values, causing -1 to be interpreted as u64::MAX
- Casting to i64: Would overflow for u64 values greater than i64::MAX, losing information
- Casting to f64: Would lose precision for integers near the 64-bit boundaries (f64 has only 53 bits of mantissa)
- Conditional logic: Would require complex branching based on value ranges, reducing code clarity
The i128 approach provides mathematical correctness, code simplicity, and leverages Rust's type system for safety guarantees.
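To make the trade-offs concrete, the sketch below implements each rejected strategy as a small equality function next to the i128 version. All function names are hypothetical and written only for this comparison; none of them come from libmagic-rs.

```rust
/// Rejected: compare as u64. The cast reinterprets -1 as u64::MAX,
/// so (u64::MAX, -1) wrongly compares equal.
#[allow(clippy::cast_sign_loss)]
fn eq_via_u64(u: u64, i: i64) -> bool {
    u == i as u64
}

/// Rejected: compare as i64. The cast wraps values above i64::MAX to
/// negative numbers, so (u64::MAX, -1) wrongly compares equal.
#[allow(clippy::cast_possible_wrap)]
fn eq_via_i64(u: u64, i: i64) -> bool {
    u as i64 == i
}

/// Rejected: compare as f64. Integers above 2^53 are rounded, so
/// distinct neighbours can collapse to the same float.
#[allow(clippy::cast_precision_loss)]
fn eq_via_f64(u: u64, i: i64) -> bool {
    u as f64 == i as f64
}

/// Rejected: explicit branching. Correct, but every operator would need
/// its own range checks.
#[allow(clippy::cast_sign_loss)]
fn eq_via_branching(u: u64, i: i64) -> bool {
    if i < 0 {
        false
    } else {
        // Safe here: i is non-negative, so the cast preserves the value.
        u == i as u64
    }
}

/// Chosen: widen both operands losslessly to i128.
fn eq_via_i128(u: u64, i: i64) -> bool {
    i128::from(u) == i128::from(i)
}
```

For the pair (u64::MAX, -1), the u64 and i64 variants return true while the i128 and branching variants return false. For the pair (2^53 + 1, 2^53), the f64 variant returns true because both values round to the same float.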
Edge Cases Handled#
The i128 coercion correctly handles:
- u64::MAX (18,446,744,073,709,551,615) vs negative i64: u64::MAX remains a large positive value in i128, while negative i64 values remain negative
- i64::MIN (-9,223,372,036,854,775,808): Converts safely to i128 without signed overflow
- Values within overlapping range: u64 values from 0 to i64::MAX correctly equal their i64 counterparts
- Zero comparison: Uint(0) correctly equals Int(0) despite different enum variants
Contrast with Bitwise Operations#
Different Semantics Require Different Casting#
Bitwise operations use as u64 casting instead of i128 coercion. The apply_bitwise_and function demonstrates this alternative approach:
```rust
pub fn apply_bitwise_and(left: &Value, right: &Value) -> bool {
    match (left, right) {
        (Value::Uint(a), Value::Uint(b)) => (a & b) != 0,
        #[allow(clippy::cast_sign_loss)]
        (Value::Int(a), Value::Int(b)) => ((*a as u64) & (*b as u64)) != 0,
        #[allow(clippy::cast_sign_loss)]
        (Value::Uint(a), Value::Int(b)) => (a & (*b as u64)) != 0,
        #[allow(clippy::cast_sign_loss)]
        (Value::Int(a), Value::Uint(b)) => ((*a as u64) & b) != 0,
        _ => false,
    }
}
```
Key Differences#
| Aspect | Comparison Operators | Bitwise Operators |
|---|---|---|
| Casting Method | i128::from() | as u64 |
| Target Type | i128 (signed, 128-bit) | u64 (unsigned, 64-bit) |
| Semantic Goal | Preserve mathematical value | Preserve bit pattern |
| Sign Handling | Negative numbers remain negative | Negative numbers interpreted as large positive values (two's complement) |
| Code Comment | "safe via i128 to avoid overflow" (line 62) | "cast to unsigned for bitwise operations" (line 153) |
| Clippy Attributes | None needed (infallible conversion) | #[allow(clippy::cast_sign_loss)] required |
Why Bitwise Operations Differ#
Bitwise operations need to work on the underlying bit representation rather than mathematical values. For example, bitwise AND checking if specific bits are set in a file format header must treat the sign bit as just another bit, not as a mathematical sign indicator. Tests verify this behavior with cases like Int(-1) & Int(1) evaluating to true because -1 has all bits set in two's complement representation.
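The difference can be seen with a single signed value and a mask. The helper below is a hypothetical illustration of the bit-pattern view, not a function from libmagic-rs.

```rust
/// Tests whether any bit selected by `mask` is set in a signed value.
/// The cast reinterprets the two's-complement bits as unsigned, so the
/// sign bit is treated like any other bit.
/// (Hypothetical helper for illustration.)
#[allow(clippy::cast_sign_loss)]
fn any_bit_set(value: i64, mask: u64) -> bool {
    ((value as u64) & mask) != 0
}

/// Numeric view of the same operands, for contrast: widen to i128 and
/// compare mathematical values.
/// (Hypothetical helper for illustration.)
fn numerically_less(value: i64, other: u64) -> bool {
    i128::from(value) < i128::from(other)
}
```

Here `any_bit_set(-1, 1)` is true because -1 has every bit set, even though `numerically_less(-1, 1)` is also true: the same operands answer a bit-level question one way and a numeric question another.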
Safety Guarantees#
No Unsafe Code#
The pattern uses only safe Rust:
- i128::from() is a safe, infallible conversion implemented by the standard library
- No pointer casts, transmutes, or unsafe blocks are required
- The Rust compiler verifies all conversions at compile time
Type System Enforcement#
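Rust's match expressions must be exhaustive:
- Every combination of Value variants must be covered by some arm, or the code does not compile
- The operator functions shown here satisfy this requirement with a wildcard arm (_), so a newly added Value variant would fall through to false (or None in compare_values) rather than cause a compilation error
- The pattern cannot produce undefined behavior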
Infallible Conversion#
The standard library provides impl From<u64> for i128 and impl From<i64> for i128, so both conversions are infallible (they cannot panic or fail).
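One way to see what "infallible" means is to contrast widening with narrowing. In the sketch below, `widen` and `narrow_to_i64` are hypothetical helpers: widening goes through From/Into and returns a plain value, while narrowing must go through TryFrom and can fail.

```rust
use std::convert::TryFrom; // already in the prelude on the 2021 edition

/// Widening: accepts any type with a lossless conversion into i128.
/// The signature has no Result or Option because it cannot fail.
/// (Hypothetical helper for illustration.)
fn widen<T: Into<i128>>(x: T) -> i128 {
    x.into()
}

/// Narrowing: u64 -> i64 is only defined for part of the input range,
/// so the standard library exposes it through TryFrom instead of From.
/// (Hypothetical helper for illustration.)
fn narrow_to_i64(u: u64) -> Option<i64> {
    i64::try_from(u).ok()
}
```

`widen` accepts both u64 and i64 arguments without any error handling, whereas `narrow_to_i64(u64::MAX)` returns None.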
Test Coverage#
Edge Case Tests#
The test suite includes comprehensive coverage for cross-type integer coercion:
test_apply_equal_uint_vs_int (lines 403-422)#
```rust
#[test]
fn test_apply_equal_uint_vs_int() {
    // Same numeric value across types should match
    let left = Value::Uint(42);
    let right = Value::Int(42);
    assert!(apply_equal(&left, &right));

    // Negative Int cannot equal Uint
    let left = Value::Uint(42);
    let right = Value::Int(-42);
    assert!(!apply_equal(&left, &right));

    // Large Uint that doesn't fit in i64 cannot equal Int
    let left = Value::Uint(u64::MAX);
    let right = Value::Int(-1);
    assert!(!apply_equal(&left, &right));
}
```
This test explicitly verifies the critical edge case where Uint(u64::MAX) and Int(-1) have identical bit patterns but must evaluate as unequal.
test_apply_equal_edge_cases (lines 544-567)#
This test covers extreme values including u64::MAX and i64::MIN:
```rust
#[test]
fn test_apply_equal_edge_cases() {
    let max_unsigned = Value::Uint(u64::MAX);
    let max_signed = Value::Int(i64::MAX);
    let min_int = Value::Int(i64::MIN);

    // Cross-type edge cases
    assert!(!apply_equal(&max_unsigned, &Value::Int(-1)));
    assert!(apply_equal(&Value::Uint(i64::MAX as u64), &max_signed));
}
```
Exhaustive Operator Testing#
The test_apply_operator_all_combinations test (lines 1568-1619) verifies that the apply_operator dispatch function remains synchronized with individual operator implementations:
```rust
#[test]
fn test_apply_operator_all_combinations() {
    let operators = [
        Operator::Equal,
        Operator::NotEqual,
        Operator::BitwiseAnd,
        Operator::BitwiseAndMask(0xFF),
    ];
    let values = [
        Value::Uint(42),
        Value::Int(-42),
        Value::Bytes(vec![42]),
        Value::String("42".to_string()),
    ];
    for operator in &operators {
        for left in &values {
            for right in &values {
                let result = apply_operator(operator, left, right);
                let expected = match operator {
                    Operator::Equal => apply_equal(left, right),
                    // ... other operators
                };
                assert_eq!(result, expected);
            }
        }
    }
}
```
This test provides 64 total combinations (4 operators × 4 left values × 4 right values) ensuring no panics and consistent behavior.
Additional Test Coverage#
- test_apply_equal_int_extreme_values: Verifies i64::MIN and i64::MAX handling
- test_apply_not_equal_int_extreme_values: Confirms inequality operations with extreme values
- test_apply_equal_all_cross_type_combinations: Systematically tests all cross-type pairs
- test_apply_bitwise_and_int_negative: Demonstrates the difference between comparison and bitwise semantics
- test_compare_values_ordering: Tests the compare_values function with same-type and cross-type integer comparisons, including extreme values like u64::MAX vs Int(-1)
- test_comparison_operators_consistency: Verifies all comparison operators agree with compare_values ordering results
- test_comparison_operators_in_magic_rules: Integration test demonstrating comparison operators in real magic rules
Architectural Context#
Role in the Evaluation Pipeline#
The Cross-Type Integer Coercion Pattern exists within libmagic-rs's three-stage evaluation pipeline:
- Parsing stage: Magic rule files are parsed into an abstract syntax tree (AST)
- Reading stage: File data is read at specified offsets and interpreted as typed values (producing Value::Uint or Value::Int)
- Comparison stage: The pattern enables safe comparison of these values against expected values in the rule
The evaluator's three-stage pipeline is documented in src/evaluator/mod.rs, showing how values flow from file reading to operator application.
Why Cross-Type Comparison Is Necessary#
Magic rules can specify expected values as either signed or unsigned integers, while file data interpretation depends on the type specifier in the rule (e.g., byte vs ubyte, long vs ulong). The pattern allows rules like:
```
0 byte =42 "file contains signed 42"
0 ubyte =42 "file contains unsigned 42"
0 ubyte >128 "file contains large unsigned value"
```
These rules correctly match regardless of whether the comparison is Uint == Int, Int == Uint, Uint == Uint, or Int == Int, and ordering comparisons (>, <, >=, <=) work correctly across type boundaries.
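As a rough illustration of how such a rule might play out, the sketch below reads one byte as either a signed or an unsigned value and compares it with a rule literal. Everything here is an assumption made for the example: the readers `read_ubyte` and `read_byte`, the trimmed two-variant Value enum, the `greater_than` helper, and the idea that the literal 128 is parsed as Int(128) are all hypothetical and not taken from the libmagic-rs source.

```rust
/// Trimmed copy of the Value enum with only the integer variants.
#[derive(Debug, Clone, PartialEq, Eq)]
enum Value {
    Uint(u64),
    Int(i64),
}

/// Hypothetical reader for the `ubyte` type: the byte is unsigned.
fn read_ubyte(data: &[u8], offset: usize) -> Option<Value> {
    data.get(offset).map(|&b| Value::Uint(u64::from(b)))
}

/// Hypothetical reader for the `byte` type: the byte is signed.
fn read_byte(data: &[u8], offset: usize) -> Option<Value> {
    data.get(offset)
        .map(|&b| Value::Int(i64::from(i8::from_ne_bytes([b]))))
}

/// Cross-type greater-than using the i128 coercion described above.
fn greater_than(left: &Value, right: &Value) -> bool {
    let widen = |v: &Value| match v {
        Value::Uint(u) => i128::from(*u),
        Value::Int(i) => i128::from(*i),
    };
    widen(left) > widen(right)
}
```

For a file whose first byte is 0xC8, `read_ubyte` yields Uint(200) and `read_byte` yields Int(-56). Compared against an assumed literal Int(128), the unsigned reading satisfies `>128` and the signed reading does not, with no special casing for the mixed Uint/Int pair.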
Design Philosophy#
The pattern embodies libmagic-rs's commitment to:
- Explicit behavior: The conversion is visible in source code, not hidden in implicit casting rules
- Safety first: Uses safe Rust with compile-time guarantees rather than unsafe code or runtime checks
- Correctness over performance: Prioritizes mathematically correct results even if i128 operations are slightly slower than u64 operations
- Future extensibility: The same pattern scales to additional comparison operators as they are implemented
Module-level documentation (lines 4-8) states the goal: "This module provides functions for applying comparison and bitwise operators to values during magic rule evaluation. It handles type-safe comparisons between different Value variants."
Relevant Code Files#
| File | Purpose | Key Lines |
|---|---|---|
| src/evaluator/operators.rs | Core implementation of cross-type integer coercion pattern | Lines 48-69 (apply_equal), Lines 108-110 (apply_not_equal) |
| src/evaluator/operators.rs | Contrasting bitwise operation implementation using u64 casting | Lines 148-166 (apply_bitwise_and) |
| src/evaluator/operators.rs | Test coverage for edge cases | Lines 403-422 (test_apply_equal_uint_vs_int), Lines 544-567 (test_apply_equal_edge_cases) |
| src/evaluator/operators.rs | Exhaustive operator consistency testing | Lines 1568-1619 (test_apply_operator_all_combinations) |
| src/parser/ast.rs | Value enum definition | Lines 121-130 |
| tests/property_tests.rs | Property-based testing with arbitrary value generation | Lines 1-221 |
Related Topics#
- Type Coercion: General patterns for safely converting between numeric types in programming languages
- Two's Complement: Binary representation system used for signed integers, where u64::MAX and the i64 value -1 have identical bit patterns but different mathematical meanings
- Operator Overloading: Language features that allow custom operator behavior for user-defined types
- Magic Number Detection: File type identification technique used by the file command and libmagic
- Rust's From Trait: Standard library trait providing infallible type conversions
- Rust's Ord and Ordering: The standard library trait for types that form a total order and the enum describing a comparison result, used by compare_values to implement ordering comparisons