Methodology hot-reload at decision points (locked 2026-05-20 Q-044 trigger)

Rule. A Q-NNN session that has been running ≥30 minutes MUST re-grep MEMORY.md for entries added since session-open before each of three checkpoints: [E] scorecard authoring, [F] closing-question lock, and [G] cascade execution start. If new entries surface that affect the active recommendation, or that are new feedback memories on disciplines relevant to the current question, the session PAUSES, re-evaluates the recommendation under the new disciplines (re-authoring it where necessary), and continues only once the recommendation is robust to them.

Mechanical check — at session-open [A], capture:

SESSION_OPEN_MEMORY_COUNT=$(grep -c '^- ' ~/.claude/projects/-home-richardd-testatetech/memory/MEMORY.md)
SESSION_OPEN_MEMORY_HASH=$(sha256sum ~/.claude/projects/-home-richardd-testatetech/memory/MEMORY.md | cut -d' ' -f1)

At [E] / [F] / [G] checkpoints, re-capture + compare. If count_now > count_open OR hash_now ≠ hash_open, run full diff:

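# If the memory directory is git-tracked: diff the previous commit's MEMORY.md against the working copy.
# Caveat: HEAD~1 equals the session-open state only if exactly one commit has landed since [A].
diff <(git -C ~/.claude/projects/-home-richardd-testatetech/memory show HEAD~1:MEMORY.md 2>/dev/null) ~/.claude/projects/-home-richardd-testatetech/memory/MEMORY.md
# Or, simpler, if memory is not git-tracked: list the top 20 entries (newest-first)
grep '^- ' ~/.claude/projects/-home-richardd-testatetech/memory/MEMORY.md | head -20
# and compare them against the session-open snapshot kept in memory.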
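
Where the memory directory is not git-tracked, one way to make the "session-open snapshot" concrete is to save the index lines to a file at [A] and filter against it at each checkpoint. The following is a minimal sketch under that assumption; MEMORY_FILE, SNAPSHOT_FILE and both function names are hypothetical and not part of the locked rule.

```shell
# Sketch only: MEMORY_FILE, SNAPSHOT_FILE and both function names are illustrative.
MEMORY_FILE="${MEMORY_FILE:-$HOME/.claude/projects/-home-richardd-testatetech/memory/MEMORY.md}"
SNAPSHOT_FILE="${SNAPSHOT_FILE:-${TMPDIR:-/tmp}/memory_index_at_open.$$}"

# At [A]: save every index entry line as the session-open snapshot.
snapshot_memory_index() {
  grep '^- ' "$MEMORY_FILE" > "$SNAPSHOT_FILE" || true
}

# At [E]/[F]/[G]: print index lines present now but absent from the snapshot.
# An edited entry is printed too, because its text no longer matches exactly.
new_memory_entries() {
  grep '^- ' "$MEMORY_FILE" | grep -Fxv -f "$SNAPSHOT_FILE" || true
}
```

Calling snapshot_memory_index once at [A] and new_memory_entries at each checkpoint prints nothing while the index is unchanged, and otherwise prints exactly the lines that need reading.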

If new entries appear in the top-N, the session SHALL:

  1. Read each new memory file end-to-end (≤5 min per file)
  2. Evaluate whether the new discipline affects the active scorecard / recommendation
  3. If yes: pause; re-author the affected scorecard cells / recommendation reasoning under the new discipline; surface the re-evaluation in the [F] closing-question reasoning (“Hot-reload at [E]: new memory feedback_X locked since session-open; re-evaluated recommendation under X; [outcome]”)
  4. If no: note the hot-reload check ran + nothing affected (“Hot-reload at [E]: 1 new memory locked since session-open; not relevant to current question scope”)
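
The two note formats in steps 3 and 4 can be produced by a small helper so the wording stays consistent across checkpoints. This is only a sketch: hot_reload_note and its argument order are hypothetical, and the outcome text is whatever the session concludes after re-evaluation.

```shell
# Sketch only: hot_reload_note and its argument order are illustrative.
# $1 = checkpoint letter (E, F or G)
# $2 = name of the new memory, e.g. feedback_X
# $3 = outcome of the re-evaluation; leave empty when the memory is not relevant
hot_reload_note() {
  if [ -n "${3:-}" ]; then
    printf 'Hot-reload at [%s]: new memory %s locked since session-open; re-evaluated recommendation under %s; %s\n' "$1" "$2" "$2" "$3"
  else
    printf 'Hot-reload at [%s]: new memory %s locked since session-open; not relevant to current question scope\n' "$1" "$2"
  fi
}
```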

Why

Long-running Q-NNN sessions (Harvard-depth typical: 1.5-3h) load MEMORY.md once at [A] context-priming and treat it as static for the duration. Concurrent sessions (orchestration; sibling Q-NNN; saturation declarations; methodology-refinement batches like ω′ itself) can lock new feedback memories during the in-flight Q’s window. Without hot-reload, the in-flight session operates on stale methodology.

The cost compounds when the new memory is on a discipline directly relevant to the active question’s recommendation. The Q-NNN session emits a recommendation under v3.17 disciplines; the new memory codifies v3.17 + delta; the Q-NNN recommendation is incorrect-by-v3.17-plus-delta even though it was correct-by-stale-v3.17.

Hot-reload at [E] / [F] / [G] is the cheapest fix: one grep and one sha256sum per checkpoint, three checkpoints per Q, for a total overhead of ~30s per Q. The cost of missing a relevant new memory is a full re-author or refactor task in Phase-1.5+.

Worked example — Q-044 missed-hot-reload (lock-trigger; 2026-05-20)

Timeline:

  • 2026-05-20T07:00 BST [A] — Q-044 session starts. Loads MEMORY.md snapshot: ~80 entries.
  • 2026-05-20T07:30..08:00 BST — Orchestration-session locks feedback_spikes_inline_not_tasked.md (extends do-now-over-task-list to spikes specifically; ≤30 min spike → do inline; defer-cost compounds via aspirational substrate downstream). New MEMORY.md entry added to top.
  • 2026-05-20T08:30 BST — Sibling session locks feedback_pre_stage_for_other_session.md (cross-session pre-staging). New MEMORY.md entry added.
  • 2026-05-20T09:00..11:00 BST [E] — Q-044 session authors α-vs-β scorecard under stale MEMORY.md state (still on the [A] snapshot).
  • 2026-05-20T11:00 BST [F] — Q-044 session emits α recommendation. Stale.
  • 2026-05-20T11:25 BST — Orchestration session reviews Q-044 output, identifies both root causes (defer-cost arithmetic missing + methodology hot-reload missing), locks ω′.

What hot-reload would have changed: At [E] (around 09:00 BST), Q-044 session would have hot-reloaded MEMORY.md, found 2 new feedback entries (feedback_spikes_inline_not_tasked + feedback_pre_stage_for_other_session), and re-evaluated α-vs-β under the spike-inline discipline. Combined with the defer-cost arithmetic (ω′.1), this would have favoured β.
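
The timeline can be replayed against a throwaway file to see what the [E] check would have reported. This is an illustrative sketch: the 80 placeholder entries and the index-line wording are invented, and only the two memory file names come from the timeline above.

```shell
# Replay on a temporary file. Placeholder entries and line format are made up.
demo_dir=$(mktemp -d)
demo_file="$demo_dir/MEMORY.md"

# 07:00 [A]: build an 80-entry index and capture the baseline.
i=1
while [ "$i" -le 80 ]; do
  printf '%s\n' "- placeholder entry $i"
  i=$((i + 1))
done > "$demo_file"
count_open=$(grep -c '^- ' "$demo_file")
hash_open=$(sha256sum "$demo_file" | cut -d' ' -f1)

# 07:30..08:30: two other sessions add entries at the top of the index.
{
  printf '%s\n' "- feedback_pre_stage_for_other_session.md"
  printf '%s\n' "- feedback_spikes_inline_not_tasked.md"
  cat "$demo_file"
} > "$demo_file.new" && mv "$demo_file.new" "$demo_file"

# 09:00 [E]: re-capture and compare against the baseline.
count_now=$(grep -c '^- ' "$demo_file")
hash_now=$(sha256sum "$demo_file" | cut -d' ' -f1)
new_entries=$((count_now - count_open))
if [ "$hash_now" != "$hash_open" ]; then
  echo "MEMORY.md changed since [A]: $new_entries new entries; read them before authoring the scorecard"
fi
rm -rf "$demo_dir"
```

In this replay the check reports 2 new entries, which is the signal Q-044 never received.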

The miss isn’t Q-044’s fault — the session was operating correctly under v3.17 (which doesn’t mandate hot-reload). The fix is v3.18 + this memory.

How to apply

At session-open [A]:

  1. Capture SESSION_OPEN_MEMORY_COUNT + SESSION_OPEN_MEMORY_HASH (one bash line each)
  2. Note the capture in cascade-Q §B or session-state file: “Hot-reload baseline: N entries, hash X8f3…”

At [E] scorecard authoring, [F] closing-question lock, [G] cascade execution start:

  1. Re-capture count + hash
  2. If unchanged: note “Hot-reload check at [E/F/G]: no MEMORY.md changes since session-open” (1 line)
  3. If changed: diff top-N entries; read each new feedback memory; evaluate relevance to active recommendation; if relevant, pause + re-author; surface re-evaluation in [F] reasoning
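
The capture and re-capture steps above can be wrapped in two small functions. This is a sketch, not the codified implementation: MEMORY_FILE, memory_baseline and hot_reload_check are hypothetical names, while the two SESSION_OPEN_* variables and the note wording follow the rule text.

```shell
# Sketch only: MEMORY_FILE, memory_baseline and hot_reload_check are illustrative names.
MEMORY_FILE="${MEMORY_FILE:-$HOME/.claude/projects/-home-richardd-testatetech/memory/MEMORY.md}"

# At [A]: record count and hash in the two variables the rule already names.
memory_baseline() {
  SESSION_OPEN_MEMORY_COUNT=$(grep -c '^- ' "$MEMORY_FILE")
  SESSION_OPEN_MEMORY_HASH=$(sha256sum "$MEMORY_FILE" | cut -d' ' -f1)
  echo "Hot-reload baseline: $SESSION_OPEN_MEMORY_COUNT entries, hash $SESSION_OPEN_MEMORY_HASH"
}

# At a checkpoint: $1 is E, F or G.
# Returns 0 if MEMORY.md is unchanged, 1 if a diff and read-through is needed.
hot_reload_check() {
  count_now=$(grep -c '^- ' "$MEMORY_FILE")
  hash_now=$(sha256sum "$MEMORY_FILE" | cut -d' ' -f1)
  if [ "$count_now" -gt "$SESSION_OPEN_MEMORY_COUNT" ] || [ "$hash_now" != "$SESSION_OPEN_MEMORY_HASH" ]; then
    echo "Hot-reload check at [$1]: MEMORY.md changed since session-open; diff and read new entries"
    return 1
  fi
  echo "Hot-reload check at [$1]: no MEMORY.md changes since session-open"
  return 0
}
```

A session would call memory_baseline once at [A], then hot_reload_check E, hot_reload_check F and hot_reload_check G at the respective checkpoints, pausing whenever the return status is non-zero.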

For session durations <30 min, hot-reload is optional (low probability of intervening memory locks). For ≥30 min, mandatory.
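
The 30-minute threshold can be checked mechanically if the session records its start time at [A]. A minimal sketch, assuming the session can hold a shell variable for its duration; SESSION_OPEN_EPOCH and hot_reload_required are hypothetical names.

```shell
# Sketch only: SESSION_OPEN_EPOCH and hot_reload_required are illustrative names.
# At [A]: remember when the session opened (seconds since the Unix epoch).
SESSION_OPEN_EPOCH=$(date +%s)

# $1 (optional) = current time in epoch seconds; defaults to now.
# Succeeds (exit 0) once the session has run for 30 minutes (1800 s) or more.
hot_reload_required() {
  now=${1:-$(date +%s)}
  [ $((now - SESSION_OPEN_EPOCH)) -ge 1800 ]
}
```

A checkpoint can then guard the check with "if hot_reload_required; then ...; fi", which skips the mandatory path for short sessions while still allowing an optional run.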

For orchestration sessions spawning child Q-NNN sessions: tell the child to apply hot-reload discipline; alternatively pre-stage the relevant new memories in the child’s launch-prompt under “Read these FIRST” so the child loads them at [A].

Anti-patterns

  1. Skipping hot-reload because “the session is almost done” — [F] closing-question lock is exactly when a stale recommendation crystallizes; hot-reload is cheap, the lock is expensive to reverse
  2. Hot-reload check that only counts entries — count can stay the same if an entry was edited / refined; the hash catches edits too
  3. Reading new memories but not re-evaluating the active recommendation under them — the discipline is paused-evaluate-resume, not paused-read-resume
  4. Treating non-feedback memory adds (project / reference) as triggers — only feedback memories codify methodology; project memories record events; reference memories point at external systems. Only feedback memories require recommendation re-evaluation
  5. Hot-reload at the wrong checkpoints — [A] doesn’t need it (just loaded); arbitrary mid-flight checks add overhead without benefit; [E] / [F] / [G] are the load-bearing decision-points
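
Anti-patterns 2 and 4 can be demonstrated on a throwaway file. The sketch below uses invented entry text, and the feedback filter assumes that index lines contain the memory file name (for example a "feedback_" prefix), which may not match the real MEMORY.md layout.

```shell
# Demonstration on a temporary file; the entry text is made up.
ap_dir=$(mktemp -d)
ap_file="$ap_dir/MEMORY.md"
printf '%s\n' '- feedback_example: original wording' '- project_example: event log' > "$ap_file"
count_before=$(grep -c '^- ' "$ap_file")
hash_before=$(sha256sum "$ap_file" | cut -d' ' -f1)

# Another session refines an existing entry in place: no line is added.
printf '%s\n' '- feedback_example: refined wording' '- project_example: event log' > "$ap_file"
count_after=$(grep -c '^- ' "$ap_file")
hash_after=$(sha256sum "$ap_file" | cut -d' ' -f1)

# Anti-pattern 2: the count is blind to the edit, the hash is not.
[ "$count_before" -eq "$count_after" ] && echo "count-only check: no change detected"
[ "$hash_before" != "$hash_after" ] && echo "hash check: change detected"

# Anti-pattern 4: keep only entries that mention a feedback memory.
# Assumption: index lines carry the memory file name, e.g. "feedback_...".
feedback_entries=$(grep '^- ' "$ap_file" | grep 'feedback_')
echo "$feedback_entries"
rm -rf "$ap_dir"
```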

When this rule fires

  • At [E] scorecard authoring in refined-prompt v3.18 L4 closing-question discipline
  • At [F] closing-question lock in the same
  • At [G] cascade execution start (cascade-Q file authoring; spike-reference materialisation; SSSOM mapping_set authoring)
  • At any orchestration decision-point where the in-flight session has been running ≥30 min + MEMORY.md may have updated
  • At any /review-plan invocation where the agent must re-evaluate against current discipline state

Companion memories

  • [[feedback_spikes_inline_not_tasked]] (newer sibling — the memory whose missed hot-reload triggered this lock)
  • [[feedback_pre_stage_for_other_session]] (newer sibling — cross-session pre-staging; pairs with hot-reload as the proactive complement to the reactive re-grep)
  • [[feedback_defer_cost_arithmetic_in_recommendations]] (sibling — ω′.1; addresses the OTHER half of the Q-044 root cause)
  • [[feedback_research_artefact_forward_traceability]] (cross-link — same theme of preventing silent drift across artefact boundaries; hot-reload is the in-session analog of forward-traceability)
  • [[feedback_check_t_files_first_for_any_inherit_v2_work]] (cross-link — read-time discipline parallel; hot-reload is the in-session refresh of the same load-bearing read)

Empirical precedent ledger

# | When       | Q / Batch                               | Outcome
1 | 2026-05-20 | Q-044 Q-A3 CryptocurrencyOrDigitalAsset | LOCK-TRIGGER. Q-044 missed hot-reload on feedback_spikes_inline_not_tasked (locked ~30 min into Q-044 execution). At [E], hot-reload would have surfaced the new memory + re-evaluated α-vs-β under spike-inline + defer-cost arithmetic. Codified as ω′.2 lock-trigger.

(Future Q-NNN sessions that invoke this rule should append their outcome here.)

Codified at ~/testatetech/docs-strategy/docs/superpowers/specs/2026-04-29-multi-phase-audit/refined-end-of-turn-directive.md v3.18 L4 closing-question sub-rule (b) “Methodology hot-reload at [E]/[F]/[G] checkpoints” — this memory is the durable rule; the refined-prompt amendment is the in-flight invocation surface.

Source-attribution

  • Trigger: Q-044 session 2026-05-20T11:25 fork-analysis (orchestration-session methodology review)
  • Rich-decision: 2026-05-20T11:25 lock #2 — “re-grep MEMORY.md at [E]/[F]/[G] checkpoints for entries added since session-open; pause + re-evaluate if new entries surface”
  • Locked-as: ω′.2 launch-prompt + this memory file
  • Sibling locks: ω′.1 defer-cost arithmetic (paired); refined-prompt v3.17 → v3.18 L4 amendment