Methodology hot-reload at decision points (locked 2026-05-20 Q-044 trigger)
Rule. A Q-NNN session that has been running ≥30 minutes MUST re-grep MEMORY.md for entries added since session-open before each of three checkpoints: [E] scorecard authoring, [F] closing-question lock, and [G] cascade execution start. If new entries surface that affect the active recommendation OR are new feedback memories on disciplines relevant to the current question, the session PAUSES, re-evaluates the recommendation under the new disciplines, and continues only once the recommendation is robust to them (re-authoring it first if it is not).
Mechanical check — at session-open [A], capture:
SESSION_OPEN_MEMORY_COUNT=$(grep -c '^- ' ~/.claude/projects/-home-richardd-testatetech/memory/MEMORY.md)
SESSION_OPEN_MEMORY_HASH=$(sha256sum ~/.claude/projects/-home-richardd-testatetech/memory/MEMORY.md | cut -d' ' -f1)
At [E] / [F] / [G] checkpoints, re-capture + compare. If count_now > count_open OR hash_now ≠ hash_open, run full diff:
diff <(git -C ~/.claude/projects/-home-richardd-testatetech/memory show HEAD~1:MEMORY.md 2>/dev/null) ~/.claude/projects/-home-richardd-testatetech/memory/MEMORY.md
# Or simpler if memory is not git-tracked:
grep '^- ' ~/.claude/projects/-home-richardd-testatetech/memory/MEMORY.md | head -20
# Compare top-N entries (newest-first) vs session-open snapshot kept in memory
If new entries appear in the top-N, the session SHALL:
- Read each new memory file end-to-end (≤5 min per file)
- Evaluate whether the new discipline affects the active scorecard / recommendation
- If yes: pause; re-author the affected scorecard cells / recommendation reasoning under the new discipline; surface the re-evaluation in the [F] closing-question reasoning (“Hot-reload at [E]: new memory feedback_X locked since session-open; re-evaluated recommendation under X; [outcome]”)
- If no: note the hot-reload check ran + nothing affected (“Hot-reload at [E]: 1 new memory locked since session-open; not relevant to current question scope”)
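The capture-and-compare steps above can be sketched as two shell functions. This is a minimal sketch, not the canonical tooling: the `MEMORY_FILE` variable and the function names `hot_reload_capture` / `hot_reload_changed` are illustrative assumptions, and only the `grep -c` and `sha256sum` calls come from the mechanical check itself.

```shell
# Sketch only: MEMORY_FILE and both function names are illustrative assumptions.
MEMORY_FILE="${MEMORY_FILE:-$HOME/.claude/projects/-home-richardd-testatetech/memory/MEMORY.md}"

# Call once at session-open [A]: record the entry count and the content hash.
hot_reload_capture() {
  SESSION_OPEN_MEMORY_COUNT=$(grep -c '^- ' "$MEMORY_FILE" || true)
  SESSION_OPEN_MEMORY_HASH=$(sha256sum "$MEMORY_FILE" | cut -d' ' -f1)
}

# Call at [E] / [F] / [G]: exit status 0 if MEMORY.md changed since the
# capture (more entries, or any byte-level edit), 1 if it is unchanged.
hot_reload_changed() {
  local count_now hash_now
  count_now=$(grep -c '^- ' "$MEMORY_FILE" || true)
  hash_now=$(sha256sum "$MEMORY_FILE" | cut -d' ' -f1)
  [ "$count_now" -gt "$SESSION_OPEN_MEMORY_COUNT" ] ||
    [ "$hash_now" != "$SESSION_OPEN_MEMORY_HASH" ]
}
```

A checkpoint would then read `if hot_reload_changed; then` followed by the diff-and-read steps, `else` the one-line "nothing changed" note.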
Why
Long-running Q-NNN sessions (Harvard-depth typical: 1.5-3h) load MEMORY.md once at [A] context-priming and treat it as static for the duration. Concurrent sessions (orchestration; sibling Q-NNN; saturation declarations; methodology-refinement batches like ω′ itself) can lock new feedback memories during the in-flight Q’s window. Without hot-reload, the in-flight session operates on stale methodology.
The cost compounds when the new memory is on a discipline directly relevant to the active question’s recommendation. The Q-NNN session emits a recommendation under v3.17 disciplines; the new memory codifies v3.17 + delta; the Q-NNN recommendation is incorrect-by-v3.17-plus-delta even though it was correct-by-stale-v3.17.
Hot-reload at [E] / [F] / [G] is the cheapest fix: one count-and-hash check (a grep plus a sha256sum) per checkpoint, three checkpoints per Q. Total overhead ~30s per Q. Cost of missing a relevant new memory: full re-author or refactor task in Phase-1.5+.
Worked example — Q-044 missed-hot-reload (lock-trigger; 2026-05-20)
Timeline:
- 2026-05-20T07:00 BST [A] — Q-044 session starts. Loads MEMORY.md snapshot: ~80 entries.
- 2026-05-20T07:30..08:00 BST — Orchestration-session locks feedback_spikes_inline_not_tasked.md (extends do-now-over-task-list to spikes specifically; ≤30 min spike → do inline; defer-cost compounds via aspirational substrate downstream). New MEMORY.md entry added to top.
- 2026-05-20T08:30 BST — Sibling session locks feedback_pre_stage_for_other_session.md (cross-session pre-staging). New MEMORY.md entry added.
- 2026-05-20T09:00..11:00 BST [E] — Q-044 session authors α-vs-β scorecard under stale MEMORY.md state (still on the [A] snapshot).
- 2026-05-20T11:00 BST [F] — Q-044 session emits α recommendation. Stale.
- 2026-05-20T11:25 BST — Orchestration session reviews Q-044 output, identifies both root causes (defer-cost arithmetic missing + methodology hot-reload missing), locks ω′.
What hot-reload would have changed: At [E] (around 09:00 BST), Q-044 session would have hot-reloaded MEMORY.md, found 2 new feedback entries (feedback_spikes_inline_not_tasked + feedback_pre_stage_for_other_session), and re-evaluated α-vs-β under the spike-inline discipline. Combined with the defer-cost arithmetic (ω′.1), this would have favoured β.
The miss isn’t Q-044’s fault — the session was operating correctly under v3.17 (which doesn’t mandate hot-reload). The fix is v3.18 + this memory.
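The detection step that Q-044 lacked can be sketched by keeping a snapshot of the entry lines at [A] and listing whatever is not in it at [E]. This is a hedged sketch: the function names and the snapshot file are hypothetical, and it assumes each memory is indexed as one `- ` bullet line in MEMORY.md (the same assumption the `grep -c '^- '` count makes).

```shell
# Sketch only: function names and the snapshot file are hypothetical.

# At [A]: save the current entry lines (one '- ' bullet per memory).
snapshot_entries() {   # usage: snapshot_entries MEMORY.md SNAPSHOT_FILE
  grep '^- ' "$1" > "$2" || true
}

# At [E] / [F] / [G]: print entry lines that are present now but were not
# in the snapshot. An entry edited in place also shows up, because the
# comparison is on whole lines (-x) as fixed strings (-F).
new_entries_since() {  # usage: new_entries_since MEMORY.md SNAPSHOT_FILE
  grep '^- ' "$1" | grep -Fxv -f "$2" || true
}
```

Applied to the timeline above, a snapshot taken at 07:00 and a call to `new_entries_since` at 09:00 would have printed the two entries locked in between, which is the cue to read them before authoring the scorecard.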
How to apply
At session-open [A]:
- Capture SESSION_OPEN_MEMORY_COUNT + SESSION_OPEN_MEMORY_HASH (one bash line each)
- Note the capture in cascade-Q §B or session-state file: “Hot-reload baseline: N entries, hash X8f3…”
At [E] scorecard authoring, [F] closing-question lock, [G] cascade execution start:
- Re-capture count + hash
- If unchanged: note “Hot-reload check at [E/F/G]: no MEMORY.md changes since session-open” (1 line)
- If changed: diff top-N entries; read each new feedback memory; evaluate relevance to active recommendation; if relevant, pause + re-author; surface re-evaluation in [F] reasoning
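The one-line notes described above could be emitted by small helpers like the following. This is a sketch under assumptions: both function names are hypothetical, and the wording of the "changed" message is invented for illustration (only the baseline and "no changes" wordings come from the steps above).

```shell
# Sketch only: helper names and the "changed" wording are assumptions.

# At [A]: print the baseline note with a 4-character hash prefix.
hot_reload_baseline_note() {  # usage: hot_reload_baseline_note FILE
  local count hash short
  count=$(grep -c '^- ' "$1" || true)
  hash=$(sha256sum "$1" | cut -d' ' -f1)
  short=$(printf '%s' "$hash" | cut -c1-4)
  echo "Hot-reload baseline: $count entries, hash $short…"
}

# At [E] / [F] / [G]: print the checkpoint note. Exit status is 1 when
# MEMORY.md changed, so the caller knows to diff and read new memories.
hot_reload_note() {  # usage: hot_reload_note CHECKPOINT FILE OPEN_COUNT OPEN_HASH
  local checkpoint file open_count open_hash count_now hash_now
  checkpoint=$1; file=$2; open_count=$3; open_hash=$4
  count_now=$(grep -c '^- ' "$file" || true)
  hash_now=$(sha256sum "$file" | cut -d' ' -f1)
  if [ "$count_now" = "$open_count" ] && [ "$hash_now" = "$open_hash" ]; then
    echo "Hot-reload check at [$checkpoint]: no MEMORY.md changes since session-open"
  else
    echo "Hot-reload check at [$checkpoint]: MEMORY.md changed since session-open ($open_count -> $count_now entries)"
    return 1
  fi
}
```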
For session durations <30 min, hot-reload is optional (low probability of intervening memory locks). For ≥30 min, mandatory.
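The 30-minute threshold is simple to check mechanically if the session-open time is recorded as epoch seconds at [A]. A minimal sketch, assuming a hypothetical helper name and that the caller stored `date +%s` at session-open:

```shell
# Sketch only: hot_reload_mandatory is a hypothetical helper.
# Exit status 0 once the session has run >= 30 minutes (1800 s), else 1.
# NOW_EPOCH is optional and defaults to the current time.
hot_reload_mandatory() {  # usage: hot_reload_mandatory OPEN_EPOCH [NOW_EPOCH]
  local open_epoch now_epoch
  open_epoch=$1
  now_epoch=${2:-$(date +%s)}
  [ $(( now_epoch - open_epoch )) -ge 1800 ]
}
```

For example, `SESSION_OPEN_EPOCH=$(date +%s)` at [A], then `hot_reload_mandatory "$SESSION_OPEN_EPOCH"` at each checkpoint.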
For orchestration sessions spawning child Q-NNN sessions: tell the child to apply hot-reload discipline; alternatively pre-stage the relevant new memories in the child’s launch-prompt under “Read these FIRST” so the child loads them at [A].
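For the pre-staging alternative, the orchestration session could generate the launch-prompt section from a list of memory file names. This is only a sketch: `prestage_block` is a hypothetical helper, and the bullet layout under the “Read these FIRST” heading is an assumption, since the launch-prompt format is not specified here.

```shell
# Sketch only: prestage_block is hypothetical; the layout beyond the
# "Read these FIRST" heading is an assumption.
prestage_block() {  # usage: prestage_block MEMORY_NAME...
  local name
  [ "$#" -gt 0 ] || return 0   # nothing new to pre-stage: print nothing
  echo "Read these FIRST:"
  for name in "$@"; do
    printf -- '- %s\n' "$name"
  done
}
```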
Anti-patterns
- Skipping hot-reload because “the session is almost done” — [F] closing-question lock is exactly when a stale recommendation crystallizes; hot-reload is cheap, the lock is expensive to reverse
- Hot-reload check that only counts entries — count can stay the same if an entry was edited / refined; the hash catches edits too
- Reading new memories but not re-evaluating the active recommendation under them — the discipline is pause-evaluate-resume, not pause-read-resume
- Treating non-feedback memory adds (project / reference) as triggers — only feedback memories codify methodology; project memories record events; reference memories point at external systems. Only feedback memories require recommendation re-evaluation
- Hot-reload at the wrong checkpoints — [A] doesn’t need it (just loaded); arbitrary mid-flight checks add overhead without benefit; [E] / [F] / [G] are the load-bearing decision-points
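The count-only anti-pattern can be made concrete with a small classifier: when an existing entry is refined in place, the bullet count is identical but the hash differs. A minimal sketch, in which the function name and its three labels are illustrative assumptions:

```shell
# Sketch only: change_kind and its three labels are illustrative names.
# Classifies MEMORY.md against the session-open count and hash.
change_kind() {  # usage: change_kind FILE OPEN_COUNT OPEN_HASH
  local file open_count open_hash count_now hash_now
  file=$1; open_count=$2; open_hash=$3
  count_now=$(grep -c '^- ' "$file" || true)
  hash_now=$(sha256sum "$file" | cut -d' ' -f1)
  if [ "$hash_now" = "$open_hash" ]; then
    echo "unchanged"
  elif [ "$count_now" != "$open_count" ]; then
    echo "count-changed"
  else
    # Same number of entries, different bytes: a count-only check
    # would report "nothing new" here.
    echo "edited-same-count"
  fi
}
```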
When this rule fires
- At [E] scorecard authoring in refined-prompt v3.17 L4 closing-question discipline
- At [F] closing-question lock in the same
- At [G] cascade execution start (cascade-Q file authoring; spike-reference materialisation; SSSOM mapping_set authoring)
- At any orchestration decision-point where the in-flight session has been running ≥30 min + MEMORY.md may have updated
- At any /review-plan invocation where the agent must re-evaluate against current discipline state
Companion memories
- [[feedback_spikes_inline_not_tasked]] (newer sibling — the memory whose missed hot-reload triggered this lock)
- [[feedback_pre_stage_for_other_session]] (newer sibling — cross-session pre-staging; pairs with hot-reload as the proactive complement to the reactive re-grep)
- [[feedback_defer_cost_arithmetic_in_recommendations]] (sibling — ω′.1; addresses the OTHER half of the Q-044 root cause)
- [[feedback_research_artefact_forward_traceability]] (cross-link — same theme of preventing silent drift across artefact boundaries; hot-reload is the in-session analog of forward-traceability)
- [[feedback_check_t_files_first_for_any_inherit_v2_work]] (cross-link — read-time discipline parallel; hot-reload is the in-session refresh of the same load-bearing read)
Empirical precedent ledger
| # | When | Q / Batch | Outcome |
|---|---|---|---|
| 1 | 2026-05-20 | Q-044 Q-A3 CryptocurrencyOrDigitalAsset | LOCK-TRIGGER. Q-044 missed hot-reload on feedback_spikes_inline_not_tasked (locked ~30 min into Q-044 execution). At [E], hot-reload would have surfaced the new memory + re-evaluated α-vs-β under spike-inline + defer-cost arithmetic. Codified as ω′.2 lock-trigger. |
(Future Q-NNN sessions that invoke this rule should append their outcome here.)
Cross-link to refined-prompt v3.18 L4 amendment
Codified at ~/testatetech/docs-strategy/docs/superpowers/specs/2026-04-29-multi-phase-audit/refined-end-of-turn-directive.md v3.18 L4 closing-question sub-rule (b) “Methodology hot-reload at [E]/[F]/[G] checkpoints” — this memory is the durable rule; the refined-prompt amendment is the in-flight invocation surface.
Source-attribution
- Trigger: Q-044 session 2026-05-20T11:25 fork-analysis (orchestration-session methodology review)
- Rich-decision: 2026-05-20T11:25 lock #2 — “re-grep MEMORY.md at [E]/[F]/[G] checkpoints for entries added since session-open; pause + re-evaluate if new entries surface”
- Locked-as: ω′.2 launch-prompt + this memory file
- Sibling locks: ω′.1 defer-cost arithmetic (paired); refined-prompt v3.17 → v3.18 L4 amendment