Continuation-Sibling Anchor Reset At Recursion Depth
evaluate_rules in src/evaluator/engine/mod.rs implements two distinct anchor semantics for OffsetSpec::Relative resolution, selected by recursion_depth. The asymmetry is intentional and must not be unified.
The Two Modes
| Depth | Behavior | &N resolves against |
|---|---|---|
| recursion_depth == 0 (top-level) | Siblings chain anchor-to-anchor | Wherever the deepest descendant of the previous sibling left the anchor |
| recursion_depth > 0 (continuation) | Anchor reset to entry_anchor before each sibling | The parent-level anchor at the point this sibling list was entered |
Top-level chaining (depth 0): After each successful match, EvaluationContext::last_match_end() advances. The next sibling resolves &N against that advanced position. This is the "advance through the file as you match" mode for top-level classification rules, documented in GOTCHAS S3.8.
Continuation-sibling reset (depth > 0): Inside a child scope (that is, when evaluating the children list of a matched rule), the anchor is reset to the saved entry_anchor before each sibling iteration. Two consecutive child rules like >>&0 ubyte ...; >>&0 offset ... both resolve &0 against the same parent position, not against each other. This matches libmagic's ms->c.li[cont_level] per-level anchor model.
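The difference between the two modes can be sketched with a toy model. This is not the evaluator's code: Mode, resolve_siblings, and the (relative offset, bytes consumed) tuple are hypothetical names introduced only for illustration, and the sketch assumes every sibling matches and has no children of its own.

```rust
/// Hypothetical stand-in for the depth check; not a type from the real evaluator.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Mode {
    /// recursion_depth == 0: siblings chain anchor-to-anchor.
    TopLevel,
    /// recursion_depth > 0: anchor is reset to the entry anchor before each sibling.
    Continuation,
}

/// Each sibling is modelled as (relative offset N in `&N`, bytes consumed on match).
/// Returns the absolute offset each sibling resolved to.
fn resolve_siblings(mode: Mode, entry_anchor: usize, siblings: &[(usize, usize)]) -> Vec<usize> {
    let mut anchor = entry_anchor;
    let mut resolved = Vec::with_capacity(siblings.len());
    for &(relative, consumed) in siblings {
        if mode == Mode::Continuation {
            // Continuation siblings always start from the parent-level anchor.
            anchor = entry_anchor;
        }
        let absolute = anchor + relative;
        resolved.push(absolute);
        // A successful match advances the anchor past the bytes it read.
        anchor = absolute + consumed;
    }
    resolved
}
```

With an entry anchor of 11 and two siblings `[(0, 1), (0, 0)]`, the toy model resolves to offsets 11 and 12 in TopLevel mode, and to 11 and 11 in Continuation mode.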
Implementation
The gate is established once per evaluate_rules call at lines 666–668:
let entry_anchor = context.last_match_end();
let is_indirect_reentry = context.take_indirect_reentry();
let is_child_sibling_list = context.recursion_depth() > 0 && !is_indirect_reentry;
Inside the sibling loop, the reset fires at lines 687–689:
if is_child_sibling_list {
context.set_last_match_end(entry_anchor);
}
The is_indirect_reentry exception carves out MetaType::Indirect dispatch: indirect re-entry increments recursion_depth (to bound cycles via RecursionGuard) but semantically re-evaluates the root rule list at a new file offset, so it uses top-level sibling chaining. The one-shot take_indirect_reentry() flag consumes the exception at entry, so children of matched rules inside the re-entry fall back to continuation semantics as usual.
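The one-shot behavior of the flag can be illustrated in isolation. The sketch below is a minimal model, assuming the flag is a plain bool that is cleared when read; ToyContext and its fields are hypothetical and do not reflect how EvaluationContext actually stores this state.

```rust
/// Hypothetical miniature of the context; only the two fields the gate reads.
#[derive(Default)]
struct ToyContext {
    recursion_depth: u32,
    indirect_reentry: bool,
}

impl ToyContext {
    /// Returns the flag and clears it, so only the first caller observes `true`.
    fn take_indirect_reentry(&mut self) -> bool {
        std::mem::take(&mut self.indirect_reentry)
    }
}

/// Mirrors the shape of the gate: reset applies at depth > 0 unless this
/// call is the entry point of an indirect re-entry.
fn is_child_sibling_list(ctx: &mut ToyContext) -> bool {
    let is_indirect_reentry = ctx.take_indirect_reentry();
    ctx.recursion_depth > 0 && !is_indirect_reentry
}
```

In this model, a context at depth 1 with the flag set yields `false` on the first call (root-list chaining for the re-entry itself) and `true` on the next call at the same depth (continuation reset for nested children), because the first call consumed the flag.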
Canonical Regression Fixture
third_party/tests/searchbug.magic is the load-bearing corpus fixture. The relevant part2 subroutine:
0 name part2
>0 search/12 ABC found_ABC
>>&0 ubyte x followed_by 0x%02x
>>&0 offset x at_offset %lld
Lines 10 and 11 of the fixture (the ubyte and offset rules) are continuation siblings at the same >>&0 depth. Both resolve &0 against the position where search/12 ABC matched, so they must produce followed_by <byte> and at_offset 11 (not at_offset 12). Under depth-0-style chaining, the ubyte rule at line 10 would advance the anchor by 1 byte, causing line 11 to report at_offset 12, which is wrong. The reset is what produces the correct result.
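The arithmetic can be traced by hand with a small simulation. Everything here is an assumption made for illustration: the input buffer is invented (it is not the project's test input), search_end and part2_children are hypothetical helpers, and the sketch assumes the anchor after a search match is the end of the matched string and that search/12 considers match starts within the first 12 bytes.

```rust
/// Returns the offset just past the first `needle` whose start lies within
/// the first `range` bytes of `buf` (assumed reading of `search/12`).
fn search_end(buf: &[u8], needle: &[u8], range: usize) -> Option<usize> {
    buf.windows(needle.len())
        .take(range)
        .position(|window| window == needle)
        .map(|start| start + needle.len())
}

/// Simulates the two `>>&0` children of `part2`.
/// Returns (byte printed by `ubyte x`, offset printed by `offset x`).
fn part2_children(buf: &[u8], reset_between_siblings: bool) -> Option<(u8, usize)> {
    let entry_anchor = search_end(buf, b"ABC", 12)?;
    let mut anchor = entry_anchor;

    // >>&0 ubyte x : reads one byte at the anchor and advances past it.
    if reset_between_siblings {
        anchor = entry_anchor;
    }
    let byte = *buf.get(anchor)?;
    anchor += 1;

    // >>&0 offset x : reports the offset that &0 resolves to.
    if reset_between_siblings {
        anchor = entry_anchor;
    }
    Some((byte, anchor))
}
```

For the invented input `b"01234567ABCx"`, ABC starts at offset 8 and ends at 11. With the reset enabled, the sketch yields byte 0x78 and offset 11; with the reset disabled, the ubyte read advances the anchor and the reported offset becomes 12, reproducing the off-by-one described above.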
Tests
Two tests directly exercise the two modes:
- test_offset_does_not_advance_anchor_for_continuation_siblings: verifies that two consecutive child siblings both resolve &N against the same parent anchor (depth > 0 path).
- relative_anchor_can_decrease_when_later_sibling_matches_at_lower_position: verifies top-level sibling chaining at depth 0, including the case where the anchor can move backward.

Invariant for new tests: Tests asserting continuation-sibling behavior must exercise rules through a parent rule's children list (ensuring recursion_depth > 0). Tests asserting top-level chaining must stay at recursion_depth == 0. Mixing the two test modes will exercise the wrong semantic branch.
Authoritative References
| Source | Section |
|---|---|
| GOTCHAS.md S3.8 | Full anchor model and the continuation-sibling exception |
| AGENTS.md line 223 | Feature summary and cross-reference to GOTCHAS S3.8 |
| src/evaluator/engine/mod.rs lines 630–689 | Implementation with full rationale comments |
| third_party/tests/searchbug.magic | Canonical regression fixture |
| PR #230 | The PR that introduced this behavior (resolves GitHub issue #42) |
Related Topics
- RAII Scope Guards: covers AnchorScope and SubroutineScope save/restore patterns; distinct from the is_child_sibling_list reset (those scopes save/restore across meta-type dispatch boundaries, not across sibling iterations).
- Meta-Type Subroutine Dispatch Architecture: covers Use/Indirect/Default/Clear control-flow; the Indirect re-entry exception to the sibling reset is documented there.
- Attacker-Controlled Length Prefix Anchor Poisoning: a separate failure mode of the same last_match_end field via Pascal-string length prefix manipulation.