Continuation-Sibling Anchor Reset At Recursion Depth
Type: Topic
Status: Published
Created: Apr 25, 2026
Updated: Apr 25, 2026
Created by: Dosu Bot
Updated by: Dosu Bot

Continuation-Sibling Anchor Reset At Recursion Depth

evaluate_rules in src/evaluator/engine/mod.rs implements two distinct anchor semantics for OffsetSpec::Relative resolution, selected by recursion_depth. The asymmetry is intentional and must not be unified.

The Two Modes

| Depth | Behavior | &N resolves against |
| --- | --- | --- |
| recursion_depth == 0 (top-level) | Siblings chain anchor-to-anchor | Wherever the deepest descendant of the previous sibling left the anchor |
| recursion_depth > 0 (continuation) | Anchor reset to entry_anchor before each sibling | The parent-level anchor at the point this sibling list was entered |

Top-level chaining (depth 0): After each successful match, EvaluationContext::last_match_end() advances. The next sibling resolves &N against that advanced position. This is the "advance through the file as you match" mode for top-level classification rules, documented in GOTCHAS S3.8.

Continuation-sibling reset (depth > 0): Inside a child scope (that is, when evaluating the children list of a matched rule), the anchor is reset to the saved entry_anchor before each sibling iteration. Two consecutive child rules such as ">>&0 ubyte ..." and ">>&0 offset ..." therefore both resolve &0 against the same parent position, not against each other's match end. This matches libmagic's ms->c.li[cont_level] per-level anchor model.
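The difference between the two modes can be sketched with a toy model. This is not the evaluator's code: anchors_seen_by_siblings and its consumed parameter are hypothetical names introduced for illustration, and the model assumes every sibling uses &0 and always matches.

```rust
/// Toy model (hypothetical, not the real evaluator): returns the anchor that
/// each sibling's `&0` offset resolves against.
///
/// `consumed[i]` is the assumed number of bytes by which sibling `i`
/// (including its descendants) advances the anchor when it matches.
fn anchors_seen_by_siblings(
    entry_anchor: usize,
    consumed: &[usize],
    reset_per_sibling: bool,
) -> Vec<usize> {
    let mut last_match_end = entry_anchor;
    let mut seen = Vec::with_capacity(consumed.len());
    for &bytes in consumed {
        if reset_per_sibling {
            // Continuation mode: every sibling starts from the entry anchor.
            last_match_end = entry_anchor;
        }
        seen.push(last_match_end);
        // A successful match advances the anchor past the matched bytes.
        last_match_end += bytes;
    }
    seen
}
```

With an entry anchor of 11 and two siblings consuming 1 and 8 bytes, the chaining mode (reset_per_sibling = false) yields anchors 11 and 12, while the reset mode yields 11 and 11.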

Implementation

The gate is established once per evaluate_rules call at lines 666–668:

let entry_anchor = context.last_match_end();
let is_indirect_reentry = context.take_indirect_reentry();
let is_child_sibling_list = context.recursion_depth() > 0 && !is_indirect_reentry;

Inside the sibling loop, the reset fires at lines 687–689:

if is_child_sibling_list {
    context.set_last_match_end(entry_anchor);
}

The is_indirect_reentry exception carves out MetaType::Indirect dispatch. Indirect re-entry increments recursion_depth (to bound cycles via RecursionGuard), but semantically it re-evaluates the root rule list at a new file offset, so it uses top-level sibling chaining. take_indirect_reentry() is a one-shot flag: it is consumed at entry, so children of matched rules inside the re-entry fall back to continuation semantics as usual.
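A minimal sketch of the one-shot flag pattern, under stated assumptions: ToyContext and its fields are hypothetical stand-ins, and the real EvaluationContext may store and clear the flag differently. The sketch only illustrates why a read-and-clear accessor limits the exception to the re-entered rule list itself.

```rust
/// Hypothetical stand-in for the evaluator's context; only the two pieces of
/// state relevant to the gate are modeled.
#[derive(Default)]
struct ToyContext {
    recursion_depth: u32,
    indirect_reentry: bool,
}

impl ToyContext {
    /// Reads the flag and clears it in one step, so it is observed at most once.
    fn take_indirect_reentry(&mut self) -> bool {
        std::mem::take(&mut self.indirect_reentry)
    }
}

/// Mirrors the shape of the gate: the per-sibling reset applies below the top
/// level unless this particular call is an indirect re-entry.
fn is_child_sibling_list(ctx: &mut ToyContext) -> bool {
    let is_indirect_reentry = ctx.take_indirect_reentry();
    ctx.recursion_depth > 0 && !is_indirect_reentry
}
```

In this model, a context at depth 1 with the flag set returns false on the first call (top-level chaining for the re-entered root list) and leaves the flag cleared, so a later call at depth 2 for a child list returns true.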

Canonical Regression Fixture

third_party/tests/searchbug.magic is the load-bearing corpus fixture. The relevant part2 subroutine:

0 name part2
>0 search/12 ABC found_ABC
>>&0 ubyte x followed_by 0x%02x
>>&0 offset x at_offset %lld

Lines 10 and 11 of the fixture (the ubyte and offset rules) are continuation siblings at the same >>&0 depth. Both resolve &0 against the position where search/12 ABC matched, so they must produce followed_by <byte> and at_offset 11 (not at_offset 12). Under depth-0-style chaining, the ubyte rule at line 10 would advance the anchor by 1 byte, causing line 11 to report at_offset 12, which is wrong. The reset is what produces the correct result.
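The arithmetic can be worked through on a concrete buffer. The sketch below is an illustration only: the input bytes are made up (the fixture's real test input is not shown here), and search_end and eval_children are hypothetical helpers, not functions from the evaluator. The buffer is chosen so that ABC ends at offset 11, matching the expected at_offset 11.

```rust
/// Hypothetical helper: looks for `needle` starting at any position in
/// `start..start + range` and returns the offset just past the match.
fn search_end(buf: &[u8], start: usize, range: usize, needle: &[u8]) -> Option<usize> {
    (start..start.saturating_add(range))
        .find(|&pos| buf.get(pos..pos + needle.len()) == Some(needle))
        .map(|pos| pos + needle.len())
}

/// Hypothetical helper: evaluates the two `>>&0` children and returns
/// (byte read by `ubyte`, offset reported by `offset`).
fn eval_children(buf: &[u8], entry_anchor: usize, reset_per_sibling: bool) -> Option<(u8, usize)> {
    // The reset before the first sibling is a no-op, so it is omitted here.
    let mut last_match_end = entry_anchor;

    // >>&0 ubyte x: read one byte at anchor + 0, then advance past it.
    let ubyte_at = last_match_end;
    let byte = *buf.get(ubyte_at)?;
    last_match_end = ubyte_at + 1;

    if reset_per_sibling {
        // Continuation-sibling reset: discard the advance made by `ubyte`.
        last_match_end = entry_anchor;
    }

    // >>&0 offset x: report the resolved offset, anchor + 0.
    let reported = last_match_end;
    Some((byte, reported))
}
```

For the made-up input b"01234567ABCxyz", search_end(buf, 0, 12, b"ABC") returns Some(11). With the reset enabled, eval_children reports the byte 'x' and offset 11; with chaining, it reports the same byte but offset 12.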

Tests

Two tests directly exercise the two modes:

Invariant for new tests: Tests asserting continuation-sibling behavior must exercise rules through a parent rule's children list (ensuring recursion_depth > 0). Tests asserting top-level chaining must stay at recursion_depth == 0. A test that mixes the two modes will exercise the wrong semantic branch.

Authoritative References

| Source | Section |
| --- | --- |
| GOTCHAS.md S3.8 | Full anchor model and the continuation-sibling exception |
| AGENTS.md line 223 | Feature summary and cross-reference to GOTCHAS S3.8 |
| src/evaluator/engine/mod.rs lines 630–689 | Implementation with full rationale comments |
| third_party/tests/searchbug.magic | Canonical regression fixture |
| PR #230 | The PR that introduced this behavior (resolves GitHub issue #42) |

  • RAII Scope Guards: covers AnchorScope and SubroutineScope save/restore patterns; distinct from the is_child_sibling_list reset (those scopes save/restore across meta-type dispatch boundaries, not across sibling iterations).
  • Meta-Type Subroutine Dispatch Architecture: covers Use/Indirect/Default/Clear control-flow; the Indirect re-entry exception to the sibling reset is documented there.
  • Attacker-Controlled Length Prefix Anchor Poisoning: a separate failure mode of the same last_match_end field via Pascal-string length prefix manipulation.