Grep the workspace before authoring substrate

Rule: Before locking any substrate that asserts a fact about the workspace (file lists, naming patterns applied to all repos, cross-ref scopes, sweep targets, “files to check” enumerations), run grep -rlE against the workspace to discover the empirical state. Closed-set lists authored from memory are an anti-pattern.

Why: Three failures in 24h (2026-05-24 → 2026-05-25) from the same root cause:

PE-47 v1.13 (locked overnight 2026-05-25T02:35) — claimed feature-dir naming <role>-phase-1/ was a substrate-correcting “deviation” from canonical-shared inherit-v2-phase-1/. Didn’t check Spec-Kit’s actual convention (NNN-semantic-slug/). Reversed in v1.14 + v1.15 + v1.16 after SOTA research surfaced the right pattern.
PE-47 v1.15 (2026-05-25T09:00) — locked feature-dir mapping at 5 names: standard / inheritkit / ias / www / test-suite without checking the rule applied uniformly. Rich spotted inconsistency: 3 different naming patterns mixed (semantic / repo-name-minus-prefix / last-segment). Re-locked as pure-semantic-role after the diagnostic. Cost: 2 unnecessary lock-revisions + 5 extra commits + substrate-churn cascade.
Workspace-rename dispatch v1.0 (2026-05-25T09:30) — authored Step 6 file list (7 docs) from memory. /review-plan audit caught: ≥61 actual grep hits across the workspace + 3 HIGH-severity missed scopes (live worktrees, package.json JSON metadata, .specify/** subtree). Marked needs-revision before dispatch.

Common thread: Claude pattern-matches what the substrate “should” look like, locks confidently, and skips the grep -rlE discovery step. The damage compounds when the substrate then directs OTHER Claude sessions to act on the false closed set: the executing session inherits the blind spots.

How to apply:

When you are about to author ANY of these artefact-shapes:

A “sweep targets” list (files to update across a workspace)
A mapping table claiming to cover N cases (repos / modules / files / etc.)
A naming-rule lock applied to multiple entities
Cross-reference scope for a refactor
A “files to check” enumeration in a dispatch doc
A list of CI workflows / configs / scripts to update
An assertion that “all instances of X are at Y”

Do FIRST (before committing the substrate):

# Discover the empirical scope across markdown + JSON + YAML + TOML + scripts.
# -r recurse, -l print matching file names only, -E extended regex.
grep -rlE \
  --include='*.md' --include='*.json' --include='*.yml' --include='*.yaml' --include='*.toml' \
  --include='*.sh' --include='*.py' --include='*.ts' --include='*.js' \
  --exclude-dir=.git --exclude-dir=node_modules \
  -e '<pattern>' ~/testatetech/ \
  2>/dev/null | wc -l   # counts matching files, not individual matches

# For directory-state checks (especially worktrees + tmp dirs)
for r in code-inherit-v2 code-inheritkit code-ias code-inheritv2-www code-inheritv2-test-suite; do
  git -C ~/testatetech/"$r" worktree list 2>/dev/null
done

# For naming-rule locks: explicit enumeration of ALL cases the rule covers
# (not just the first 1-2; test the rule against EVERY case).
# apply_rule is a placeholder for whatever implements the rule under test.
for entity in case1 case2 case3 case4 case5; do
  echo "Rule produces: $(apply_rule "$entity")"
done
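
The enumeration loop above leaves apply_rule undefined. As a minimal bash sketch, assume a hypothetical rule that strips the code- prefix from each repo name (this rule is for illustration only and is not the rule that was locked). Running it over all five repos shows how full enumeration exposes a mismatch that testing one or two cases would hide.

```shell
# Hypothetical rule, for illustration only: strip the "code-" prefix.
apply_rule() {
  local repo="$1"
  printf '%s\n' "${repo#code-}"
}

# Print the rule's output for every entity, one per line, so that
# inconsistent results are visible side by side.
enumerate_rule() {
  local repo
  for repo in "$@"; do
    printf '%-28s -> %s\n' "$repo" "$(apply_rule "$repo")"
  done
}

enumerate_rule code-inherit-v2 code-inheritkit code-ias \
  code-inheritv2-www code-inheritv2-test-suite
```

Under this assumed rule, code-inheritv2-www yields inheritv2-www rather than www, so a mapping that lists www cannot have come from prefix stripping alone. That is the kind of mixed-pattern inconsistency the v1.15 lock contained.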

Then:

Replace closed-set lists with grep-driven discovery in the substrate itself (e.g., dispatch instruction reads: “Step N: run grep -rlE to produce file list, then iterate” — not “the file list is X, Y, Z”)
Append the actual hit-counts to the substrate’s “clean-state” section so future-Claude knows the true scope at execution time
Test the naming rule against every case explicitly (not just the first 1-2)
For dispatch docs specifically: the executing session should ALSO grep-discover rather than trust the authored list. Trust the empirical result, not the inherited one.
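
The first two items above (grep-driven discovery inside the substrate, plus recording the hit count) could be sketched as follows. This is a minimal bash sketch: the function names discover_files and record_scope are hypothetical, the output line format is an assumption, and the --include list is trimmed for brevity.

```shell
# discover_files PATTERN ROOT
# Print the files under ROOT whose content matches PATTERN, sorted.
discover_files() {
  local pattern="$1" root="$2"
  grep -rlE \
    --include='*.md' --include='*.json' --include='*.yml' --include='*.yaml' \
    --exclude-dir=.git --exclude-dir=node_modules \
    -e "$pattern" "$root" 2>/dev/null | sort
}

# record_scope PATTERN ROOT
# Emit one line with the measured file count, suitable for pasting into
# a "clean-state" section (the line format is an assumption).
record_scope() {
  local pattern="$1" root="$2" count
  count=$(discover_files "$pattern" "$root" | wc -l | tr -d ' ')
  printf 'pattern=%s root=%s files=%s\n' "$pattern" "$root" "$count"
}

# Usage in a dispatch step (iterate over discovered files, not a typed list):
#   discover_files '<pattern>' ~/testatetech | while IFS= read -r f; do
#     echo "would update: $f"
#   done
```

Because the dispatch step calls discover_files at execution time, the executing session sees the workspace as it is then, not as the author remembered it.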

Anti-pattern signals (STOP if you catch yourself doing these):

Typing a list of file paths without having just run grep -rlE to produce them
Asserting “these are all the places X appears” without wc -l evidence
Authoring a mapping table for N cases where you only mentally tested 1-2
Locking a “canonical” name without enumerating all N entities the canonical applies to
Treating an LLM-recalled file list as the same epistemic class as an actual grep result

Verification check before locking: ask yourself “did the file list / mapping / pattern come from a grep run in THIS session, or from my memory of what should be there?” If memory: STOP, grep, then revise.
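
One way to make this check mechanical is to diff an authored list against what grep actually finds. The bash sketch below assumes the authored list is a text file with one path per line, written with the same root prefix that grep prints; missed_files is a hypothetical helper name.

```shell
# missed_files AUTHORED_LIST PATTERN ROOT
# Print paths that grep finds under ROOT but AUTHORED_LIST omits.
missed_files() {
  local authored="$1" pattern="$2" root="$3" a b
  a=$(mktemp) || return 1
  b=$(mktemp) || return 1
  sort -u "$authored" > "$a"
  grep -rlE --exclude-dir=.git --exclude-dir=node_modules \
    -e "$pattern" "$root" 2>/dev/null | sort -u > "$b"
  # comm -13 keeps only lines unique to the second file (the grep result).
  comm -13 "$a" "$b"
  rm -f "$a" "$b"
}
```

Any output means the authored list is not a closed set and should be revised before locking.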

5-item pre-authoring checklist (elevated 2026-05-25T12:15 BST after 4th occurrence in ~30h — frontmatter standards-stack design spec v1.0 was authored claiming “v1.3 → v1.4” when the target was already at v1.4, plus 5 other empirical-fact errors caught by /review-plan audit):

Before writing ANY substrate body that asserts facts about workspace state, complete ALL 5 items:

Read target file’s current version + status — head -10 <target-file> BEFORE proposing a version bump. Multiple authoring sessions may have advanced the target since memory was last read. The substrate-version-collision pattern (proposing vN → vN+1 when target is already at vN+1) is a sentinel for this failure.
Count actual LoC of any code being replaced — wc -l <script> BEFORE asserting “this replaces ~X LoC”. Estimates from memory drift; actual counts are 1-command-away.
Enumerate ALL entities the substrate covers — grep -rl '<pattern>' <scope> BEFORE asserting “N files use this”. Do not estimate; count. If the substrate claims to handle multiple cases (file types, status values, naming patterns), explicitly enumerate each case + verify each is covered.
Run the validation tools the substrate proposes on a sample — BEFORE asserting “JSON Schema validates as expected” or “yamllint catches this” or “owl:propertyX has these semantics”, actually run the tool on a representative file. If proposing a vocabulary or schema change, sanity-check with the canonical W3C/ISO/IETF spec text (not memory of what it might mean).
Confirm cross-refs in the proposed substrate resolve to existing files — ls or gh api BEFORE listing path/IRI references. The PE-47 cascade + spec v1.0 cross-ref claims that didn’t resolve are the empirical evidence this matters.
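
As an illustration of the last item (cross-refs must resolve), a minimal bash sketch follows. check_refs is a hypothetical helper, and it assumes the references are local paths relative to a single root; IRI or gh api lookups are out of scope here.

```shell
# check_refs ROOT PATH...
# Report each cross-referenced path as ok or MISSING.
# Returns non-zero if any path does not exist under ROOT.
check_refs() {
  local root="$1" ref status=0
  shift
  for ref in "$@"; do
    if [ -e "$root/$ref" ]; then
      printf 'ok       %s\n' "$ref"
    else
      printf 'MISSING  %s\n' "$ref"
      status=1
    fi
  done
  return "$status"
}
```

The non-zero exit status lets a pre-lock script refuse to proceed while any reference is unresolved.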

Apply this checklist EXPLICITLY by writing the checklist results into the substrate’s honest_substrate_read_declaration: frontmatter field OR §0 of the substrate body. Visible application of the discipline is a forcing function; memory-only application is not. Empirical evidence: in all 4 occurrences in 30h the relevant memories were cited, but the checklist was not visibly applied.
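
A minimal bash sketch of generating that declaration from measured values rather than typing it from memory. Only the top-level key honest_substrate_read_declaration comes from this note; the sub-keys and the emit_declaration name are assumptions for illustration, and only two of the five checklist items are covered.

```shell
# emit_declaration TARGET PATTERN SCOPE
# Print measured checklist evidence as a YAML-style block.
# Sub-key names are hypothetical; adapt them to the real frontmatter schema.
emit_declaration() {
  local target="$1" pattern="$2" scope="$3" loc hits
  loc=$(wc -l < "$target" | tr -d ' ')
  hits=$(grep -rlE -e "$pattern" "$scope" 2>/dev/null | wc -l | tr -d ' ')
  printf 'honest_substrate_read_declaration:\n'
  printf '  target: "%s"\n' "$target"
  printf '  target_loc: %s\n' "$loc"
  printf '  pattern: "%s"\n' "$pattern"
  printf '  files_matching: %s\n' "$hits"
}
```

Pasting the output into the substrate makes each number traceable to a command run in the authoring session.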

Related memories:

banner_sweep_grep_all_occurrences (sister discipline; banner-rewrites must grep all hits)
concurrent_burst_race_condition_count_24h (live-worktree state matters; concurrent sessions race)
verify_before_author (verifying-before-author skill; 6-method toolkit)
verify_after_author_via_directory_ls (post-author verification; method-7 candidate)
git_mv_with_unstaged_edits_loses_modifications (worktree-edge-case where unverified state causes silent data loss)
research_artefact_forward_traceability (research findings need bidirectional cite-back; same discipline applied to research)
architecture_state_file_discipline (state files exist BECAUSE memory-based assumptions drift)

Discipline lineage: This memory generalises three sister memories that each cover one slice of the problem. The unifying claim: trust empirical-grep output, never memory-recall, for any substrate that asserts facts about workspace state.

Sources of the 2026-05-25 evidence:

SCF v1.13 → v1.14 → v1.15 → v1.16 supersession chain (PE-47 lock-revisions; in docs-strategy/docs/superpowers/specs/2026-04-29-multi-phase-audit/batch-imp-24-phase-d-double-prime-path-e-pivot-research-findings-v1.0.md)
Workspace-rename dispatch v1.0 §10.5 review findings (in docs-strategy/docs/superpowers/specs/2026-04-29-multi-phase-audit/workspace-rename-dispatch-v1.0.md; commit 1e90a71)
/review-plan audit transcript (in this session’s claude-mem corpus 2026-05-25T09:45 BST)
