Indirect Offset Advanced Syntax And Anchor-Relative Variants
Type: Topic
Status: Published
Created: Apr 25, 2026
Updated: Apr 25, 2026
Created by: Dosu Bot
Updated by: Dosu Bot

Indirect Offset Advanced Syntax And Anchor-Relative Variants#

The full indirect offset syntax in libmagic-rs extends well beyond the basic (base.type) form. OffsetSpec::Indirect carries three fields that control the full surface: adjustment_op: IndirectAdjustmentOp, base_relative: bool, and result_relative: bool. These fields were added in v0.6.0 and are marked #[serde(default)], so older serialized AST snapshots that lack them still deserialize cleanly.

This article covers the syntax surface and operator semantics. For the 4-step evaluation pipeline, see Indirect Offset Resolution Pipeline. For parser-evaluator reachability, see Indirect Offset Parser-Evaluator Sync. For spec-derived test expectations, see Indirect Offset GNU File Semantic Correctness.


Adjustment Placement: Two Mutually Exclusive Forms#

The adjustment operand can appear in exactly one of two positions per rule:

Form 1 — Inside the parens (canonical magic(5)):
(base.type+N), (base.type-N), (base.type*N), (base.type/N), (base.type%N), (base.type&N), (base.type|N), (base.type^N).
The full operator set — +, -, *, /, %, &, |, ^ — is valid here.

Form 2 — After the closing paren (legacy/alternate):
(base.type)+N, (base.type)-N.
Only + and - are accepted in this form. Use Form 1 for arithmetic beyond add/subtract.

Combining both forms in one rule (e.g., (19.b-1)+2) is not permitted. The parser enforces this: parse_inside_adjustment runs first, and if it succeeds, parse_outside_adjustment is skipped. The chosen operator is stored in the adjustment_op field of OffsetSpec::Indirect; subtraction is folded into IndirectAdjustmentOp::Add with a negative operand.
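
The placement rule above can be sketched as a tiny standalone parser. This is an illustration under stated assumptions, not the crate's grammar: Op, parse_adjustment, and split_adjustment are hypothetical names, the base must be a plain decimal number, and a rule that uses both placements is simply rejected with None.

```rust
/// Simplified stand-in for IndirectAdjustmentOp (illustrative only).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Op { Add, Mul, Div, Mod, And, Or, Xor }

/// Parses an optional `<op><digits>` suffix.
/// Outer None = malformed; inner None = no adjustment present.
fn parse_adjustment(s: &str, allow_all_ops: bool) -> Option<Option<(Op, i64)>> {
    let mut chars = s.chars();
    let Some(op_char) = chars.next() else {
        return Some(None);
    };
    let digits = chars.as_str();
    if digits.is_empty() || !digits.bytes().all(|b| b.is_ascii_digit()) {
        return None;
    }
    let operand: i64 = digits.parse().ok()?;
    let parsed = match op_char {
        '+' => (Op::Add, operand),
        // Subtraction is folded into Add with a negative operand.
        '-' => (Op::Add, -operand),
        '*' if allow_all_ops => (Op::Mul, operand),
        '/' if allow_all_ops => (Op::Div, operand),
        '%' if allow_all_ops => (Op::Mod, operand),
        '&' if allow_all_ops => (Op::And, operand),
        '|' if allow_all_ops => (Op::Or, operand),
        '^' if allow_all_ops => (Op::Xor, operand),
        _ => return None,
    };
    Some(Some(parsed))
}

/// Hypothetical helper: extracts the adjustment from `(base.type...)...`.
fn split_adjustment(text: &str) -> Option<(Op, i64)> {
    let rest = text.strip_prefix('(')?;
    let (inner, outside) = rest.split_once(')')?;
    let (base, spec_and_adj) = inner.split_once('.')?;
    base.parse::<u64>().ok()?; // plain numeric base only, in this sketch
    let mut chars = spec_and_adj.chars();
    chars.next()?; // one-character pointer specifier, e.g. 'b' or 'L'

    // Form 1 is tried first and accepts the full operator set.
    match parse_adjustment(chars.as_str(), true)? {
        Some(adj) if outside.is_empty() => Some(adj),
        Some(_) => None, // both placements used in one rule
        // Form 2 only accepts + and -; no adjustment defaults to Add(0).
        None => Some(parse_adjustment(outside, false)?.unwrap_or((Op::Add, 0))),
    }
}
```

With this sketch, split_adjustment("(19.b-1)") yields (Op::Add, -1), while "(19.b)*4" and "(19.b-1)+2" are both rejected.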


IndirectAdjustmentOp Enum#

Defined in src/parser/ast.rs with #[derive(Default)], where Add is the #[default] variant:

| Variant | Magic syntax | Operand interpretation in apply_adjustment |
| --- | --- | --- |
| Add | +N / -N | signed i64; subtraction via negative operand |
| Mul | *N | reinterpreted as u64 bit pattern |
| Div | /N | reinterpreted as u64; zero operand → EvaluationError::InvalidOffset |
| Mod | %N | reinterpreted as u64; zero operand → EvaluationError::InvalidOffset |
| And | &N | reinterpreted as u64 bit pattern |
| Or | \|N | reinterpreted as u64 bit pattern |
| Xor | ^N | reinterpreted as u64 bit pattern |

Add uses signed semantics so that (N.X-1), which the parser encodes as Add(-1), performs subtraction correctly. All other ops reinterpret the i64 adjustment as a u64 bit pattern to match libmagic's apprentice.c::do_offset raw-machine-word behavior.

The Add path uses i64::unsigned_abs() so that an operand of i64::MIN is handled without an overflow panic in debug builds. Mul rejects integer overflow via checked_mul, returning EvaluationError::InvalidOffset.
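
The operator semantics described above can be sketched in a few lines. This is not the crate's apply_adjustment: the function name apply_adjustment_sketch and the InvalidOffset unit struct (standing in for EvaluationError::InvalidOffset) are hypothetical, and treating Add overflow or underflow as an error is an assumption the text does not confirm.

```rust
/// Mirrors the variant list above; Add is the default variant.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Default)]
enum IndirectAdjustmentOp {
    #[default]
    Add,
    Mul,
    Div,
    Mod,
    And,
    Or,
    Xor,
}

/// Hypothetical stand-in for EvaluationError::InvalidOffset.
#[derive(Debug, PartialEq, Eq)]
struct InvalidOffset;

fn apply_adjustment_sketch(
    value: u64,
    op: IndirectAdjustmentOp,
    adjustment: i64,
) -> Result<u64, InvalidOffset> {
    // Every op except Add sees the operand as a raw u64 bit pattern.
    let raw = adjustment as u64;
    let result = match op {
        IndirectAdjustmentOp::Add => {
            // unsigned_abs() is defined for i64::MIN, unlike abs().
            let magnitude = adjustment.unsigned_abs();
            if adjustment >= 0 {
                value.checked_add(magnitude) // assumption: overflow is an error
            } else {
                value.checked_sub(magnitude) // assumption: underflow is an error
            }
        }
        IndirectAdjustmentOp::Mul => value.checked_mul(raw),
        // checked_div / checked_rem return None for a zero operand.
        IndirectAdjustmentOp::Div => value.checked_div(raw),
        IndirectAdjustmentOp::Mod => value.checked_rem(raw),
        IndirectAdjustmentOp::And => Some(value & raw),
        IndirectAdjustmentOp::Or => Some(value | raw),
        IndirectAdjustmentOp::Xor => Some(value ^ raw),
    };
    result.ok_or(InvalidOffset)
}
```

For example, a pointer value of 10 with Add and operand -1 gives Ok(9), while Div or Mod with a zero operand gives Err(InvalidOffset).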


Anchor-Relative Wrapper Variants#

Two boolean flags on OffsetSpec::Indirect encode GNU file's anchor-relative forms, where the anchor is EvaluationContext::last_match_end():

base_relative: true, (&N.X) syntax
The pointer-read address is anchor + base_offset: the base shifts to the anchor before the pointer is read.

result_relative: true, &(N.X) syntax
The pointer is read at base_offset (absolute). The value read is then added to the anchor to produce the final offset.

Composition: Both flags can be set simultaneously. With both true, the pointer is read at anchor + base_offset, and then the result is added to the anchor. The grammar therefore covers all combinations:

| Magic syntax | base_relative | result_relative |
| --- | --- | --- |
| (N.X) | false | false |
| (&N.X) | true | false |
| &(N.X) | false | true |
| &(&N.X) | true | true |

Adjustment forms compose with anchor variants: (&N.X+adj), &(N.X)+adj, and similar combinations are all valid.

The parser sets base_relative when it detects a leading & inside the parens. result_relative is set in parse_offset, which detects a &( prefix before calling parse_indirect_offset.
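
A minimal sketch of how the two flags combine, under simplifying assumptions: the pointer is a single unsigned byte, no adjustment is applied, and anchor_flags and resolve_sketch are hypothetical helpers rather than functions from the crate. The anchor argument plays the role of EvaluationContext::last_match_end().

```rust
/// Hypothetical helper: derives (base_relative, result_relative)
/// from the textual prefix of an indirect offset.
fn anchor_flags(text: &str) -> (bool, bool) {
    let (result_relative, rest) = match text.strip_prefix("&(") {
        Some(rest) => (true, rest),
        None => (false, text.strip_prefix('(').unwrap_or(text)),
    };
    // A leading & inside the parens marks the base as anchor-relative.
    (rest.starts_with('&'), result_relative)
}

/// Hypothetical resolver for a one-byte pointer with no adjustment.
/// Returns None when the read falls outside the buffer.
fn resolve_sketch(
    buffer: &[u8],
    anchor: usize,
    base_offset: usize,
    base_relative: bool,
    result_relative: bool,
) -> Option<usize> {
    // base_relative shifts where the pointer is read.
    let read_at = if base_relative {
        anchor.checked_add(base_offset)?
    } else {
        base_offset
    };
    let pointer = usize::from(*buffer.get(read_at)?);
    // result_relative shifts what the pointer value means.
    if result_relative {
        anchor.checked_add(pointer)
    } else {
        Some(pointer)
    }
}
```

With the buffer [0, 0, 3, 0, 0, 0, 1, 0], an anchor of 4, and a base offset of 2, the four flag combinations resolve to 3, 1, 7, and 5 in the order of the table above.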


Pointer Specifier Table#

pointer_specifier_to_type() maps the single-character specifier after . to a (TypeKind, Endianness) pair. All types are signed by default per GNU file semantics.

| Specifier | Width | Endianness | Signed |
| --- | --- | --- | --- |
| .b | 1 byte | Little | Yes |
| .B | 1 byte | Big | Yes |
| .s | 2 bytes | Little | Yes |
| .S | 2 bytes | Big | Yes |
| .l | 4 bytes | Little | Yes |
| .L | 4 bytes | Big | Yes |
| .q | 8 bytes | Little | Yes |
| .Q | 8 bytes | Big | Yes |

A signed pointer value that reads as negative (e.g., [0xFF, 0xFF, 0xFF, 0xFF] as .l, which is -1) is reinterpreted as a raw unsigned u64 via the *v as u64 cast in extract_raw_unsigned, yielding u64::MAX. The bounds check (step 4 of the pipeline) then catches the enormous value.
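
The specifier table and the negative-value case can be sketched as follows. This is an illustration, not the crate's code: specifier_layout and read_signed_pointer are hypothetical names, the local Endianness enum only mirrors the table's column, and the sketch returns a byte width where the real pointer_specifier_to_type() returns a TypeKind.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Endianness {
    Little,
    Big,
}

/// Hypothetical helper: (width in bytes, endianness) for a specifier char.
fn specifier_layout(spec: char) -> Option<(usize, Endianness)> {
    let width = match spec.to_ascii_lowercase() {
        'b' => 1,
        's' => 2,
        'l' => 4,
        'q' => 8,
        _ => return None,
    };
    // Uppercase selects big-endian, lowercase little-endian, per the table.
    let endian = if spec.is_ascii_uppercase() {
        Endianness::Big
    } else {
        Endianness::Little
    };
    Some((width, endian))
}

/// Reads a signed pointer of the given specifier, sign-extended to i64.
fn read_signed_pointer(buffer: &[u8], offset: usize, spec: char) -> Option<i64> {
    let (width, endian) = specifier_layout(spec)?;
    let bytes = buffer.get(offset..offset.checked_add(width)?)?;
    let mut raw: u64 = 0;
    match endian {
        Endianness::Big => {
            for &b in bytes {
                raw = (raw << 8) | u64::from(b);
            }
        }
        Endianness::Little => {
            for &b in bytes.iter().rev() {
                raw = (raw << 8) | u64::from(b);
            }
        }
    }
    // Sign-extend from width*8 bits: shift up, then arithmetic shift down.
    let shift = 64 - 8 * width as u32;
    Some(((raw << shift) as i64) >> shift)
}
```

Reading [0xFF, 0xFF, 0xFF, 0xFF] as .l gives -1, and casting that i64 to u64 gives u64::MAX, which is the value the bounds check then has to reject.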


Test Discipline for Cross-Platform Byte Buffers#

Prefer big-endian specifiers (.L, .S, .Q) over native-byte-order specifiers (.l, .s, .q) when constructing test byte buffers. Big-endian layouts are deterministic across architectures; native-endian .l produces x86-specific byte sequences that break on big-endian targets.

Never use to_ne_bytes() in test fixtures for indirect offset buffers; use to_be_bytes() or explicit byte arrays.
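
A small sketch of a fixture built this way, intended for a rule such as (0.L). The helper name fixture_with_be_pointer, the 16-byte buffer size, and the layout are illustrative assumptions, not code from tests/indirect_offset_integration.rs.

```rust
/// Hypothetical fixture builder: a 16-byte buffer whose first four bytes
/// hold `pointer` in big-endian order, with `target` stored at the
/// pointed-to position. Panics unless 4 <= pointer < 16.
fn fixture_with_be_pointer(pointer: u32, target: u8) -> Vec<u8> {
    assert!((4..16).contains(&pointer), "pointer must land after the header");
    let mut buf = vec![0u8; 16];
    // to_be_bytes() yields the same byte sequence on every architecture,
    // unlike to_ne_bytes().
    buf[0..4].copy_from_slice(&pointer.to_be_bytes());
    buf[pointer as usize] = target;
    buf
}
```

Calling fixture_with_be_pointer(8, 0x7F) produces a buffer that starts with the bytes 00 00 00 08 on any host, so the same expectations hold on little-endian and big-endian CI targets.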


Key Source Files#

| File | Purpose |
| --- | --- |
| src/parser/ast.rs | IndirectAdjustmentOp enum (lines 118–193); OffsetSpec::Indirect fields (lines 240–275) |
| src/parser/grammar/mod.rs | parse_indirect_offset(), pointer_specifier_to_type() |
| src/evaluator/offset/indirect.rs | apply_adjustment(), resolve_indirect_offset_with_anchor() |
| GOTCHAS.md | S3.7 (adjustment forms, specifier mapping), S6.3 (signed-by-default) |
| AGENTS.md | Current Limitations / Offset Specifications |
| tests/indirect_offset_integration.rs | End-to-end tests covering all specifiers and adjustment forms |