Indirect Offset Resolution Pipeline

OffsetSpec::Indirect implements pointer-dereference offsets from the magic(5) format. For example, the syntax (0x3c.l) reads a 32-bit little-endian value at file offset 0x3C and uses the result as the actual test offset. This mechanism makes it possible to detect PE executables and similar formats whose headers contain pointer tables. The implementation was introduced in PR #42 / Issue #37 and lives in src/evaluator/offset/indirect.rs.
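To make the (0x3c.l) example concrete, the following is a minimal hand-written sketch of the dereference, not the crate's code. The names deref_le32 and looks_like_pe are hypothetical and exist only for illustration; the real implementation goes through the four-step pipeline described in this document.

```rust
use std::convert::TryFrom;

/// Hypothetical helper (not the crate's API): resolve `(0x3c.l)` by hand.
/// Reads a 32-bit little-endian pointer at `base` and returns the offset it names.
fn deref_le32(buffer: &[u8], base: usize) -> Option<usize> {
    let end = base.checked_add(4)?;
    // `get` returns None instead of panicking when the read would overrun.
    let b = buffer.get(base..end)?;
    let target = usize::try_from(u32::from_le_bytes([b[0], b[1], b[2], b[3]])).ok()?;
    // The pointed-to offset must itself lie inside the buffer.
    if target < buffer.len() {
        Some(target)
    } else {
        None
    }
}

/// Hypothetical PE check: follow the pointer at 0x3C, then compare the signature.
fn looks_like_pe(buffer: &[u8]) -> bool {
    match deref_le32(buffer, 0x3c) {
        Some(offset) => buffer[offset..].starts_with(b"PE\0\0"),
        None => false,
    }
}
```

With a buffer whose bytes at 0x3C encode 0x80 and whose bytes at 0x80 are PE\0\0, looks_like_pe returns true; a buffer too short to hold the pointer yields false rather than a panic.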

The dispatcher in src/evaluator/offset/mod.rs routes OffsetSpec::Indirect { .. } to indirect::resolve_indirect_offset_with_anchor(spec, buffer, Some(last_match_end)). The #[cfg(test)]-only wrapper resolve_indirect_offset calls the same function with anchor = None.

The 4-Step Pipeline

resolve_indirect_offset_with_anchor performs four sequential steps. Each step can independently produce an error; no step is reached if a prior step fails.

Step 1 — Resolve base offset to absolute position
Calls resolve_absolute_offset, which interprets negative values as positions counted back from the end of the buffer. When base_relative is set (the (&N.X) magic syntax), the base is first computed as anchor + base_offset with checked_add and only then passed to resolve_absolute_offset.
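A minimal sketch of this step, under stated assumptions: resolve_base is a hypothetical stand-in, the real resolve_absolute_offset may have a different signature and error type, and Option is used here in place of the crate's errors.

```rust
use std::convert::TryFrom;

/// Hypothetical stand-in for Step 1; the crate's `resolve_absolute_offset` may differ.
/// Negative offsets count back from the end of the buffer.
fn resolve_base(
    buffer_len: usize,
    base_offset: i64,
    base_relative: bool,
    anchor: usize,
) -> Option<usize> {
    // For `(&N.X)`, add the anchor first, refusing to wrap on overflow.
    let offset = if base_relative {
        i64::try_from(anchor).ok()?.checked_add(base_offset)?
    } else {
        base_offset
    };
    if offset >= 0 {
        usize::try_from(offset).ok()
    } else {
        // unsigned_abs is well defined even for i64::MIN.
        let back = usize::try_from(offset.unsigned_abs()).ok()?;
        // A from-end distance larger than the buffer has no valid position.
        buffer_len.checked_sub(back)
    }
}
```

In this sketch, an offset of -4 in a 100-byte buffer resolves to position 96, and an anchor of 16 with a relative base of 8 resolves to 24.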

Step 2 — Read pointer value
Calls read_pointer(buffer, abs_base, pointer_type, endian). This function dispatches to the existing read_byte / read_short / read_long / read_quad readers from evaluator::types, then immediately calls extract_raw_unsigned to convert the result to a u64. Non-numeric types (String, Float, Double) return EvaluationError::UnsupportedType.
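The type and endianness dispatch can be sketched as follows. This is an illustration only: PointerWidth, Endian, and read_pointer_sketch are hypothetical names, the sketch handles unsigned reads only, and it folds bytes directly instead of delegating to the evaluator::types readers as the real read_pointer does.

```rust
/// Hypothetical, simplified pointer descriptors; the crate's types differ.
#[derive(Clone, Copy)]
enum PointerWidth {
    Byte,
    Short,
    Long,
    Quad,
}

#[derive(Clone, Copy)]
enum Endian {
    Little,
    Big,
}

/// Read an unsigned pointer of the given width and widen it to u64.
/// Returns None if the read would run past the end of the buffer.
fn read_pointer_sketch(
    buffer: &[u8],
    at: usize,
    width: PointerWidth,
    endian: Endian,
) -> Option<u64> {
    let size = match width {
        PointerWidth::Byte => 1,
        PointerWidth::Short => 2,
        PointerWidth::Long => 4,
        PointerWidth::Quad => 8,
    };
    let bytes = buffer.get(at..at.checked_add(size)?)?;
    // Fold the bytes into a u64, most significant byte first.
    let fold = |acc: u64, b: &u8| (acc << 8) | u64::from(*b);
    Some(match endian {
        Endian::Big => bytes.iter().fold(0, fold),
        // Little-endian stores the least significant byte first, so reverse.
        Endian::Little => bytes.iter().rev().fold(0, fold),
    })
}
```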

Step 3 — Apply adjustment
Calls apply_adjustment(pointer_value, adjustment, adjustment_op). The supported IndirectAdjustmentOp variants are Add, Mul, Div, Mod, And, Or, Xor. When result_relative is set (&(N.X) syntax), the computed offset is further incremented by the anchor using checked_add.

Step 4 — Validate final offset against buffer length
If final_offset >= buffer.len(), returns EvaluationError::BufferOverrun. This is the gate that rejects enormous values produced by signed pointer reinterpretation (see below).

Signed Pointer Reinterpretation

When a pointer is read as a signed integer, its bit pattern is preserved rather than its mathematical sign. extract_raw_unsigned casts Value::Int(v) to u64 via *v as u64, so an i32 value of -1 read from [0xFF, 0xFF, 0xFF, 0xFF] becomes u64::MAX instead of triggering an underflow. This matches libmagic's apprentice.c::do_offset behavior.
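The cast can be illustrated in isolation. The sketch below assumes the signed reader sign-extends the value to i64 before the cast (that is, that Value::Int carries an i64); raw_unsigned and read_signed_le32 are hypothetical names, not functions from the crate.

```rust
/// Hypothetical mirror of the cast described above: keep the bit pattern, drop the sign.
fn raw_unsigned(v: i64) -> u64 {
    v as u64
}

/// Read a signed 32-bit little-endian pointer and sign-extend it to i64,
/// as a signed `long` reader is assumed to do before the cast.
fn read_signed_le32(bytes: [u8; 4]) -> i64 {
    i64::from(i32::from_le_bytes(bytes))
}
```

Under that assumption, raw_unsigned(read_signed_le32([0xFF; 4])) evaluates to u64::MAX, while a small non-negative pointer such as 0x80 passes through unchanged.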

The bounds check at step 4 catches these enormous values on 64-bit platforms. On 32-bit platforms, usize::try_from(u64) fails first with EvaluationError::InvalidOffset whenever the u64 pointer value exceeds 32-bit capacity. Two tests branch on usize::BITS == 64 to assert the platform-specific error variant.
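The platform-dependent outcome can be sketched as below. OffsetError is a local, hypothetical enum standing in for the crate's EvaluationError, and to_final_offset is an illustrative name rather than a function from the source.

```rust
use std::convert::TryFrom;

/// Hypothetical stand-in for the two relevant EvaluationError variants.
#[derive(Debug, PartialEq)]
enum OffsetError {
    InvalidOffset,
    BufferOverrun,
}

/// Hypothetical sketch of the final conversion and bounds check.
fn to_final_offset(value: u64, buffer_len: usize) -> Result<usize, OffsetError> {
    // Fails on targets where usize is narrower than the value (e.g. 32-bit).
    let offset = usize::try_from(value).map_err(|_| OffsetError::InvalidOffset)?;
    if offset >= buffer_len {
        return Err(OffsetError::BufferOverrun);
    }
    Ok(offset)
}

/// The variant expected for u64::MAX depends on the pointer width of the target.
fn expected_error_for_u64_max() -> OffsetError {
    if usize::BITS == 64 {
        OffsetError::BufferOverrun
    } else {
        OffsetError::InvalidOffset
    }
}
```

A test written against this sketch would compare to_final_offset(u64::MAX, len) with expected_error_for_u64_max(), which is the same shape of branching the two tests mentioned above use.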

Separated Concerns

Three functions each own exactly one responsibility:

| Function | Responsibility |
| --- | --- |
| `read_pointer` | Type dispatch and endianness; delegates to `evaluator::types` readers |
| `extract_raw_unsigned` | Signed-to-unsigned bit-cast via `*v as u64` |
| `apply_adjustment` | Arithmetic with overflow protection via `checked_*`; all ops on `u64` |

apply_adjustment uses signed semantics only for Add, so that a parser-encoded (N.X-1), represented as Add(-1), performs a subtraction. All other ops (Mul, Div, Mod, And, Or, Xor) reinterpret the i64 adjustment as a u64 bit pattern, matching libmagic's raw machine-word behavior.

i64::MIN handling: apply_adjustment uses adjustment.unsigned_abs() for the Add path. unsigned_abs returns 2^63 as a u64 without overflow, so no special case is needed. An explicit -i64::MIN negation would panic in debug builds; unsigned_abs avoids this entirely.

Div and Mod with a zero operand: checked_div(0) and checked_rem(0) both return None, which apply_adjustment maps to EvaluationError::InvalidOffset rather than panicking.
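The rules above (signed Add, raw-bit reinterpretation for the other ops, unsigned_abs for i64::MIN, and None on a zero divisor) can be sketched together. Op and apply_adjustment_sketch are hypothetical names, and None stands in for the EvaluationError the real function returns.

```rust
/// Hypothetical mirror of IndirectAdjustmentOp for illustration.
#[derive(Clone, Copy)]
enum Op {
    Add,
    Mul,
    Div,
    Mod,
    And,
    Or,
    Xor,
}

/// Hypothetical sketch; None stands in for the crate's error value.
fn apply_adjustment_sketch(pointer: u64, adjustment: i64, op: Op) -> Option<u64> {
    // Non-Add operations treat the adjustment as a raw 64-bit pattern.
    let raw = adjustment as u64;
    match op {
        // Add is the only signed operation: a negative adjustment subtracts.
        Op::Add if adjustment >= 0 => pointer.checked_add(raw),
        // unsigned_abs is well defined even for i64::MIN (it yields 2^63).
        Op::Add => pointer.checked_sub(adjustment.unsigned_abs()),
        Op::Mul => pointer.checked_mul(raw),
        // checked_div / checked_rem return None for a zero divisor.
        Op::Div => pointer.checked_div(raw),
        Op::Mod => pointer.checked_rem(raw),
        Op::And => Some(pointer & raw),
        Op::Or => Some(pointer | raw),
        Op::Xor => Some(pointer ^ raw),
    }
}
```

For instance, a pointer of 0x80 with Add(-1) yields 0x7F, while a pointer of 0 with Add(-1) yields None instead of wrapping around.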

Test Coverage

The inline #[cfg(test)] module contains 15 named test functions covering:

All pointer types (Byte, Short, Long, Quad) × both endiannesses
Signed and unsigned pointer values including negative
Positive and negative adjustments
From-end base offsets
Pointer-read buffer overruns and final-offset buffer overruns
Arithmetic overflow, underflow, and all IndirectAdjustmentOp variants
Unsupported pointer types: String, Float, Double
Real-world PE-header scenario: pointer at 0x3C pointing to 0x80, verified to contain PE\0\0
32-bit platform branching (usize::BITS == 64 guard)

The external integration test file tests/indirect_offset_integration.rs provides additional end-to-end coverage.

Key Pitfalls

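Never use usize::from(u32) in the indirect pipeline. The standard library provides no From<u32> implementation for usize, because usize can be narrower than 32 bits on some targets, so the call does not compile. Use usize::try_from(u32)? instead.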
Signed pointer values are raw bit patterns, not mathematical negatives. Do not apply signed arithmetic to them before the bounds check.
Always use checked_add / checked_sub / checked_mul for offset arithmetic. Malicious files can contain values crafted specifically to trigger overflow.
Follow the same 4-step pattern when adding new offset types: resolve base → read value → apply adjustment → validate bounds.
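The four-step pattern from the last point can be sketched as a generic skeleton. Everything here is hypothetical: ResolveError stands in for the crate's error type, the base is assumed to be already absolute, and the read and adjust closures are placeholders for whatever a new offset type needs in steps 2 and 3.

```rust
use std::convert::TryFrom;

/// Hypothetical stand-in for the crate's error type.
#[derive(Debug, PartialEq)]
enum ResolveError {
    InvalidOffset,
    BufferOverrun,
}

/// Hypothetical skeleton for a new offset type. `read` and `adjust` are
/// caller-supplied closures standing in for steps 2 and 3.
fn resolve_in_four_steps<R, A>(
    buffer: &[u8],
    base: usize,
    read: R,
    adjust: A,
) -> Result<usize, ResolveError>
where
    R: Fn(&[u8], usize) -> Option<u64>,
    A: Fn(u64) -> Option<u64>,
{
    // Step 1: the (already absolute) base must be inside the buffer.
    if base >= buffer.len() {
        return Err(ResolveError::BufferOverrun);
    }
    // Step 2: read the value stored at the base.
    let value = read(buffer, base).ok_or(ResolveError::BufferOverrun)?;
    // Step 3: adjust with checked arithmetic only.
    let adjusted = adjust(value).ok_or(ResolveError::InvalidOffset)?;
    // Step 4: convert and validate before handing the offset to a test.
    let offset = usize::try_from(adjusted).map_err(|_| ResolveError::InvalidOffset)?;
    if offset >= buffer.len() {
        return Err(ResolveError::BufferOverrun);
    }
    Ok(offset)
}
```

Because each step returns early with `?`, a later step is never reached after a failure, which mirrors the pipeline's behavior described earlier.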