Grep the workspace before authoring substrate

Rule: Before locking any substrate that asserts a fact about the workspace (file lists, naming patterns applied to all repos, cross-ref scopes, sweep targets, “files to check” enumerations), run grep -rlE against the workspace to discover the empirical state. Closed-set lists authored from memory are an anti-pattern.

Why: Three failures in 24h (2026-05-24 → 2026-05-25) from the same root cause:

  1. PE-47 v1.13 (locked overnight 2026-05-25T02:35) — claimed feature-dir naming <role>-phase-1/ was a substrate-correcting “deviation” from canonical-shared inherit-v2-phase-1/. Didn’t check Spec-Kit’s actual convention (NNN-semantic-slug/). Reversed in v1.14 + v1.15 + v1.16 after SOTA research surfaced the right pattern.

  2. PE-47 v1.15 (2026-05-25T09:00) — locked feature-dir mapping at 5 names: standard / inheritkit / ias / www / test-suite without checking the rule applied uniformly. Rich spotted inconsistency: 3 different naming patterns mixed (semantic / repo-name-minus-prefix / last-segment). Re-locked as pure-semantic-role after the diagnostic. Cost: 2 unnecessary lock-revisions + 5 extra commits + substrate-churn cascade.

  3. Workspace-rename dispatch v1.0 (2026-05-25T09:30) — authored Step 6 file list (7 docs) from memory. /review-plan audit caught: ≥61 actual grep hits across the workspace + 3 HIGH-severity missed scopes (live worktrees, package.json JSON metadata, .specify/** subtree). Marked needs-revision before dispatch.

Common thread: Claude pattern-matches what the substrate “should” look like, locks confidently, and skips the grep -rlE discovery step. The damage compounds when the substrate then directs OTHER Claude sessions to act on the false closed-set: the executing session inherits the blind spots.

How to apply:

When you are about to author ANY of these artefact-shapes:

  • A “sweep targets” list (files to update across a workspace)
  • A mapping table claiming to cover N cases (repos / modules / files / etc.)
  • A naming-rule lock applied to multiple entities
  • Cross-reference scope for a refactor
  • A “files to check” enumeration in a dispatch doc
  • A list of CI workflows / configs / scripts to update
  • An assertion that “all instances of X are at Y”

Do FIRST (before committing the substrate):

# Discover the empirical scope across markdown + JSON + YAML + TOML + scripts
# (prints the hit-count; drop "| wc -l" to see the file list itself)
grep -rlE '<pattern>' ~/testatetech/ \
  --include='*.md' --include='*.json' --include='*.yml' --include='*.yaml' --include='*.toml' \
  --include='*.sh' --include='*.py' --include='*.ts' --include='*.js' \
  --exclude-dir=.git --exclude-dir=node_modules \
  2>/dev/null | wc -l

# For directory-state checks (especially worktrees + tmp dirs)
for r in code-inherit-v2 code-inheritkit code-ias code-inheritv2-www code-inheritv2-test-suite; do
  git -C "$HOME/testatetech/$r" worktree list 2>/dev/null
done

# For naming-rule locks: explicit enumeration of ALL cases the rule covers
# (not just the first 1-2; test the rule against EVERY case)
# apply_rule is a placeholder: define it as a shell function implementing the rule under test
for case in case1 case2 case3 case4 case5; do
  echo "$case -> rule produces: $(apply_rule "$case")"
done
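
A minimal sketch of the every-case test, assuming a hypothetical rule (strip the code- prefix). apply_rule and check_rule are illustrative names, not existing tooling; each argument pairs a repo name with the name a mapping table proposes for it.

```shell
# Hypothetical rule under test: strip the "code-" prefix from the repo name.
apply_rule() {
  printf '%s\n' "${1#code-}"
}

# check_rule REPO=PROPOSED [REPO=PROPOSED ...]
# Applies the rule to EVERY case and compares it with the proposed name.
# Prints one line per case; returns non-zero if any case disagrees.
check_rule() {
  rc=0
  for pair in "$@"; do
    repo=${pair%%=*}       # text before the first "="
    proposed=${pair#*=}    # text after the first "="
    actual=$(apply_rule "$repo")
    if [ "$actual" = "$proposed" ]; then
      echo "OK        $repo -> $actual"
    else
      echo "MISMATCH  $repo: rule gives '$actual', table says '$proposed'"
      rc=1
    fi
  done
  return "$rc"
}
```

For example, if the five v1.15 names pair with the five repos in the order both lists appear in this memory (an assumption), then check_rule code-inherit-v2=standard code-inheritkit=inheritkit code-ias=ias code-inheritv2-www=www code-inheritv2-test-suite=test-suite prints two OK lines and three MISMATCH lines and returns non-zero. That is the mixed-pattern signal a 1-2 case mental test misses.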

Then:

  • Replace closed-set lists with grep-driven discovery in the substrate itself (e.g., dispatch instruction reads: “Step N: run grep -rlE to produce file list, then iterate” — not “the file list is X, Y, Z”)
  • Append the actual hit-counts to the substrate’s “clean-state” section so future-Claude knows the true scope at execution time
  • Test the naming rule against every case explicitly (not just the first 1-2)
  • For dispatch docs specifically: the executing session should ALSO grep-discover, not trust the authored list. Trust the empirical, not the inherited
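
The discovery, hit-count and iterate steps above can be sketched as three small functions. The function names are hypothetical; the grep flags are the ones already used in this memory, plus -I to skip binary files. No --include filter is applied, on the assumption that an unexpected file type is exactly what a from-memory list misses; narrow it if the scope is known.

```shell
# discover_targets PATTERN ROOT
# Lists files under ROOT whose content matches the extended regex PATTERN,
# skipping .git and node_modules; sorted so repeated runs diff cleanly.
discover_targets() {
  grep -rlEI --exclude-dir=.git --exclude-dir=node_modules -e "$1" "$2" 2>/dev/null | sort
}

# count_targets PATTERN ROOT
# Hit-count to append to the substrate's clean-state section.
count_targets() {
  discover_targets "$1" "$2" | wc -l | tr -d ' '
}

# for_each_target PATTERN ROOT COMMAND [ARGS...]
# Runs COMMAND once per discovered file, so a dispatch step iterates over
# grep output instead of a typed list. Assumes no newlines in file names.
for_each_target() {
  pattern=$1
  root=$2
  shift 2
  discover_targets "$pattern" "$root" | while IFS= read -r f; do
    "$@" "$f" </dev/null   # keep COMMAND from consuming the file list on stdin
  done
}
```

A dispatch step can then read "run discover_targets '<pattern>' <workspace>, then iterate", and the executing session re-runs the same command rather than trusting a count recorded at authoring time.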

Anti-pattern signals (STOP if you catch yourself doing these):

  • Typing a list of file paths without having just run grep -rlE to produce them
  • Asserting “these are all the places X appears” without wc -l evidence
  • Authoring a mapping table for N cases where you only mentally tested 1-2
  • Locking a “canonical” name without enumerating all N entities the canonical applies to
  • Treating an LLM-recalled file list as the same epistemic class as an actual grep result

Verification check before locking: ask yourself “did the file list / mapping / pattern come from a grep run in THIS session, or from my memory of what should be there?” If memory: STOP, grep, then revise.

5-item pre-authoring checklist (elevated 2026-05-25T12:15 BST after 4th occurrence in ~30h — frontmatter standards-stack design spec v1.0 was authored claiming “v1.3 → v1.4” when the target was already at v1.4, plus 5 other empirical-fact errors caught by /review-plan audit):

Before writing ANY substrate body that asserts facts about workspace state, complete ALL 5 items:

  1. Read target file’s current version + status: run head -10 <target-file> BEFORE proposing a version bump. Multiple authoring sessions may have advanced the target since memory was last read. The substrate-version-collision pattern (proposing vN → vN+1 when target is already at vN+1) is a sentinel for this failure.

  2. Count actual LoC of any code being replaced: run wc -l <script> BEFORE asserting “this replaces ~X LoC”. Estimates from memory drift; actual counts are 1-command-away.

  3. Enumerate ALL entities the substrate covers: run grep -rl '<pattern>' <scope> BEFORE asserting “N files use this”. Do not estimate; count. If the substrate claims to handle multiple cases (file types, status values, naming patterns), explicitly enumerate each case + verify each is covered.

  4. Run the validation tools the substrate proposes on a sample — BEFORE asserting “JSON Schema validates as expected” or “yamllint catches this” or “owl:propertyX has these semantics”, actually run the tool on a representative file. If proposing a vocabulary or schema change, sanity-check with the canonical W3C/ISO/IETF spec text (not memory of what it might mean).

  5. Confirm cross-refs in the proposed substrate resolve to existing files: run ls or gh api BEFORE listing path/IRI references. The PE-47 cascade + spec v1.0 cross-ref claims that didn’t resolve are the empirical evidence this matters.

Apply this checklist EXPLICITLY by writing the checklist results into the substrate’s honest_substrate_read_declaration: frontmatter field OR §0 of the substrate body. Visible-discipline-application is a forcing function; memory-only application is not. Empirically validated: in all 4 occurrences in 30h the relevant memories were cited, but the checklist was not visibly applied.
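
A minimal sketch of checklist items 1 and 5 as commands whose output can be pasted into the declaration. Both function names are hypothetical, and the version check assumes only that the current version string appears somewhere in the target's first 10 lines; it does not assume any particular frontmatter layout.

```shell
# check_version_bump TARGET ASSUMED_CURRENT
# Item 1: confirm the version you believe is current (e.g. v1.3) actually
# appears as a whole word in the first 10 lines of TARGET before proposing
# a bump. On failure, prints the header so the real state is visible.
check_version_bump() {
  if head -n 10 "$1" | grep -qwF -e "$2"; then
    echo "ok: $1 header mentions $2"
  else
    echo "STOP: $2 not found in the first 10 lines of $1; re-read before bumping"
    head -n 10 "$1"
    return 1
  fi
}

# check_xrefs PATH [PATH ...]
# Item 5: confirm every local path reference resolves to an existing file
# or directory. Remote references (gh api) are out of scope for this sketch.
check_xrefs() {
  missing=0
  for ref in "$@"; do
    if [ -e "$ref" ]; then
      echo "resolves: $ref"
    else
      echo "MISSING:  $ref"
      missing=$((missing + 1))
    fi
  done
  [ "$missing" -eq 0 ]
}
```

With a target already at v1.4, check_version_bump <target-file> v1.3 fails and prints the header, which surfaces the version-collision sentinel from item 1 before the substrate is written.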

Related memories:

Discipline lineage: This memory generalises three sister memories that each cover one slice of the problem. The unifying claim: trust empirical-grep output, never memory-recall, for any substrate that asserts facts about workspace state.

Sources of the 2026-05-25 evidence:

  • SCF v1.13 → v1.14 → v1.15 → v1.16 supersession chain (PE-47 lock-revisions; in docs-strategy/docs/superpowers/specs/2026-04-29-multi-phase-audit/batch-imp-24-phase-d-double-prime-path-e-pivot-research-findings-v1.0.md)
  • Workspace-rename dispatch v1.0 §10.5 review findings (in docs-strategy/docs/superpowers/specs/2026-04-29-multi-phase-audit/workspace-rename-dispatch-v1.0.md; commit 1e90a71)
  • /review-plan audit transcript (in this session’s claude-mem corpus 2026-05-25T09:45 BST)