Rule: This memory consolidates eight refined-prompt v3.6 → v3.7 candidates that surfaced across the ε.ι 10-spike derisking suite (S1+S2+S2.5+S2.6+S3+S4+S5; S6+S7+S8 in flight or pending) plus 3 confrontation spikes (S2.10+S2.9+S2.9b) on 2026-05-02 BST. Each candidate is INDEXED here so that a future refined-prompt v3.6 → v3.7 authoring session has a single grep-able reference rather than reconstructing from 12+ scattered changelog rows in arch-state §11/§12 v3.16-v3.25.

Why this consolidation memory exists: Per Rich-directive 2026-05-02T~16:25 BST: “please save all the information we have learnt from these spikes. i do not want to have to do them again.” The per-spike outcomes are durable in arch-state §11/§12 + Q-003 §10 + T-files + per-spike memory files. The CROSS-CUTTING methodological learnings — particularly the maturity-vocabulary expansions and architectural-layer distinctions — are scattered across changelog rows and at risk of being missed when refined-prompt v3.7 is next authored. This memory closes that risk.


Candidate 1 — Full maturity vocabulary including 2 new sub-modes

Refined-prompt v3.6 has (current vocabulary): outcome-VALIDATED / outcome-MITIGATED / outcome-KILL-CONDITION-MET.

Refined-prompt v3.7 should have (expanded vocabulary; 5 outcomes in total, 2 of which are TRANSIENT sub-modes of VALIDATED):

| Outcome | Meaning | Lifecycle | Source spike | Example |
|---|---|---|---|---|
| outcome-VALIDATED | All kill clauses NOT MET on strict reading. No architectural alternative needed. Clean pass. | Permanent | S1, S2.5, S2.10, S5, S8 | S5 FIBO SSSOM 4-step toolchain pipeline runs cleanly in 5.77s; 3-of-3 mappings preserved |
| outcome-VALIDATED-WITH-METHODOLOGICAL-SUBSTITUTION | Kill clauses NOT MET, but the measurement instrument was substituted for an alternative because the literal one was unavailable. Conservative-upper-bound substitution acceptable; same-direction-with-literal expected. | TRANSIENT: should retire when literal measurement instrument becomes available. Schedule a same-session-or-next-session re-test. | S4 | S4 used Claude Opus 4.7 main-loop subagents as LLM substitute for OntoGPT 1.0.16 + Haiku 4.5/GPT-4o because ANTHROPIC_API_KEY + OPENAI_API_KEY were not set in shell |
| outcome-VALIDATED-WITH-PROVISIONING-NOTE | Kill clauses NOT MET, but the substrate was substituted for an architectural alternative because the literal substrate’s PROVISIONING (not its substrate-correctness) was blocked. | TRANSIENT: should retire when provisioning unblocks. Schedule a same-session follow-on. Has retired in S2.9b within ~30 min of provisioning unblock. | S2.9 → S2.9b | S2.9 used numpy in-memory store as architectural alternative when Postgres provisioning blocked by sudo password unavailable; S2.9b retested on pgvector substrate after Rich provisioned + retired the sub-mode |
| outcome-MITIGATED | Kill clauses STRICTLY met but architectural alternative validates the load-bearing theory’s SPIRIT. Lock-time decision uses the alternative path. | Permanent | S2 → S2.5, S2.6, S3 | S3: gen-typescript + gen-json-schema drop annotations entirely (clause 2 strictly met); YAML-as-canonical via gen-pydantic --meta full + Catala-via-shim validates spirit |
| outcome-KILL-CONDITION-MET | Kill clauses strictly met AND architectural alternatives also fail. Genuinely killed; fallback path engaged. | Permanent | (none in this suite; all “killed” spikes had alternatives that validated; pattern is rare) | |

Sub-mode lifecycle rule: TRANSIENT sub-modes (METHODOLOGICAL-SUBSTITUTION + PROVISIONING-NOTE) are not lock-time-final. They flag a follow-up obligation. When the obligation is discharged (substrate provisioned / measurement instrument available), the sub-mode retires and the outcome graduates to plain VALIDATED. The retirement is a logging-contract event — same-session if possible (S2.9b precedent) or next-session with explicit scheduling.
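
The lifecycle rule above can be sketched as a small state model. This is an illustration only, not part of refined-prompt v3.7: the names Outcome, TRANSIENT, and retire are hypothetical, and the only behaviour encoded is what the rule states (TRANSIENT sub-modes graduate to plain VALIDATED once the follow-up obligation is discharged; permanent outcomes never change).

```python
from enum import Enum


class Outcome(Enum):
    """Maturity vocabulary from the table above (enum names are illustrative)."""
    VALIDATED = "outcome-VALIDATED"
    VALIDATED_WITH_METHODOLOGICAL_SUBSTITUTION = (
        "outcome-VALIDATED-WITH-METHODOLOGICAL-SUBSTITUTION"
    )
    VALIDATED_WITH_PROVISIONING_NOTE = "outcome-VALIDATED-WITH-PROVISIONING-NOTE"
    MITIGATED = "outcome-MITIGATED"
    KILL_CONDITION_MET = "outcome-KILL-CONDITION-MET"


# The two sub-modes that are not lock-time-final.
TRANSIENT = frozenset({
    Outcome.VALIDATED_WITH_METHODOLOGICAL_SUBSTITUTION,
    Outcome.VALIDATED_WITH_PROVISIONING_NOTE,
})


def retire(outcome: Outcome, obligation_discharged: bool) -> Outcome:
    """Graduate a TRANSIENT sub-mode to plain VALIDATED once its follow-up
    obligation is discharged; every other case is returned unchanged."""
    if outcome in TRANSIENT and obligation_discharged:
        return Outcome.VALIDATED
    return outcome
```

Under this sketch, the S2.9 → S2.9b precedent corresponds to retire(Outcome.VALIDATED_WITH_PROVISIONING_NOTE, True), which yields Outcome.VALIDATED; a MITIGATED outcome stays MITIGATED regardless of the flag.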

Forward-action under refined-prompt v3.7: when authoring a spike subagent prompt, surface the 5 outcomes as the maturity vocabulary; instruct the spike-runner to use the sub-mode if the precondition-failure pattern matches; flag that sub-modes are TRANSIENT and require follow-up scheduling.


Candidate 2 — Architectural-layer-of-failure distinction in kill clauses

Pattern surfaced at: S2.9 graph-RAG retrieval pilot (clause 4 “Postgres+pgvector setup fails” was a PROVISIONING-layer failure, not a substrate-layer failure; numpy in-memory was substrate-agnostic).

Refined-prompt v3.7 should require kill clauses to specify which architectural layer the failure occupies, so strict-vs-spirit reading can apply with appropriate latitude:

| Layer | What fails | Strict-vs-spirit reading | Example |
|---|---|---|---|
| Substrate | The substrate itself genuinely cannot represent / handle the load-bearing property | Substrate failure is usually KILL or MITIGATED-via-alternative-substrate | S2: schema-automator+funowl middleman is substrate-broken on production OWL |
| Tooling | A specific tool / flag / version of an otherwise-good substrate fails | Tooling failure is usually MITIGATED via tool-swap, version-bump, or different invocation | S3: gen-typescript drops annotations (tooling layer of LinkML 1.10) |
| Provisioning | The substrate is correct but PROVISIONING (install / config / credentials / network) failed | Provisioning failure → architectural-alternative-substitute → VALIDATED-WITH-PROVISIONING-NOTE → schedule re-test | S2.9: Postgres+pgvector is correct but sudo password unavailable; numpy substrate-agnostic substitute |
| Measurement | The MEASUREMENT INSTRUMENT (LLM, reasoner, formal-verifier) is unavailable or substituted | Measurement-instrument substitution → VALIDATED-WITH-METHODOLOGICAL-SUBSTITUTION → schedule literal-tooling re-test if load-bearing | S4: ANTHROPIC_API_KEY missing → Claude Opus 4.7 subagent as conservative-upper-bound LLM-substitute |

Forward-action: refined-prompt v3.7 step that authors kill clauses must tag each clause with layer: substrate|tooling|provisioning|measurement so the spike-runner can apply the correct strict-vs-spirit-reading discipline.
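
One way the layer tag could be carried on a kill clause is sketched below. The KillClause class and the DEFAULT_READING mapping are hypothetical names introduced for illustration; the four layer values and the readings are paraphrased from the table above, and nothing here is an existing refined-prompt schema.

```python
from dataclasses import dataclass

# The four layer tags named in the forward-action above.
LAYERS = ("substrate", "tooling", "provisioning", "measurement")

# Default strict-vs-spirit reading per layer, paraphrased from the table.
DEFAULT_READING = {
    "substrate": "KILL or MITIGATED via alternative substrate",
    "tooling": "MITIGATED via tool-swap, version-bump, or different invocation",
    "provisioning": "VALIDATED-WITH-PROVISIONING-NOTE, then schedule re-test",
    "measurement": (
        "VALIDATED-WITH-METHODOLOGICAL-SUBSTITUTION, "
        "then schedule literal-tooling re-test if load-bearing"
    ),
}


@dataclass(frozen=True)
class KillClause:
    """A kill clause that refuses to exist without a valid layer tag."""
    text: str
    layer: str

    def __post_init__(self) -> None:
        if self.layer not in LAYERS:
            raise ValueError(
                f"layer must be one of {'|'.join(LAYERS)}, got {self.layer!r}"
            )

    def default_reading(self) -> str:
        """Return the reading a spike-runner would start from for this layer."""
        return DEFAULT_READING[self.layer]
```

For example, KillClause("Postgres+pgvector setup fails", layer="provisioning").default_reading() points the spike-runner at the PROVISIONING-NOTE path rather than a kill, which is the S2.9 reading.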


Candidate 3 — Strict-vs-spirit reading as PRE-FLIGHT check, not just post-hoc

Pattern surfaced at: S5 FIBO SSSOM (methodological observation #2 in the v3.25 changelog row): “first spike to apply strict-vs-spirit reading discipline IN ADVANCE rather than reactively (post-S3 + S2.6 codification matured). Pre-S5 checked all 4 kill clauses for framing artefacts; none found; clauses well-framed against this spike’s substrate; outcome cleanly VALIDATED with no MITIGATED ambiguity.”

Refined-prompt v3.7 should specify: at spike-prompt authoring time, the orchestrator runs strict-vs-spirit reading on each kill clause AS A PRE-FLIGHT CHECK, identifying:

  • Which clause is substrate / tooling / provisioning / measurement layer (per Candidate 2)
  • Whether the clause’s threshold has structural ceilings the prompt didn’t account for (per Candidate 4)
  • Whether the spirit clause’s intent is captured in the strict reading or whether they could diverge

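The pre-flight check turns the discipline from post-hoc rationalisation into a prompt-quality gate. From S5 onward the discipline operates as a pre-flight check; S5’s clean VALIDATED outcome, with no MITIGATED ambiguity, is the first evidence that the post-S3 + S2.6 codification has matured.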
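
The three pre-flight questions can be sketched as a lint over a draft clause. This is a minimal sketch under stated assumptions: ClauseDraft and preflight are hypothetical names, and the strict-vs-spirit comparison remains a human judgement, so the code only checks that the author recorded one.

```python
from dataclasses import dataclass
from typing import List, Optional

LAYERS = {"substrate", "tooling", "provisioning", "measurement"}


@dataclass
class ClauseDraft:
    """A kill clause as drafted at spike-prompt authoring time (hypothetical shape)."""
    text: str
    layer: Optional[str] = None                   # per Candidate 2
    threshold: Optional[float] = None             # e.g. 0.90 for ">=90%"
    structural_ceiling: Optional[float] = None    # per Candidate 4
    spirit_matches_strict: Optional[bool] = None  # author's recorded judgement


def preflight(clause: ClauseDraft) -> List[str]:
    """Return framing findings; an empty list means the clause is well-framed."""
    findings = []
    if clause.layer not in LAYERS:
        findings.append("layer not tagged")
    if clause.threshold is not None:
        if clause.structural_ceiling is None:
            findings.append("structural ceiling not computed")
        elif clause.threshold > clause.structural_ceiling:
            findings.append("threshold exceeds structural ceiling")
    if clause.spirit_matches_strict is None:
        findings.append("strict-vs-spirit divergence not assessed")
    elif not clause.spirit_matches_strict:
        findings.append("strict and spirit readings could diverge")
    return findings
```

A clause that returns an empty list corresponds to the S5 situation ("none found; clauses well-framed"); any finding is a reason to reword the clause before the spike is dispatched.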

Source memories:

  • feedback_kill_condition_strict_vs_spirit_reading_via_outcome_MITIGATED.md (post-S3 codification)
  • This memory (post-S5 generalisation: pre-flight, not just post-hoc)

Candidate 4 — Sample-set + threshold-relative-to-structural-ceiling specification

Pattern surfaced at: S2.6 owlready2 production-scale spike. The 90%-IC kill clause was strictly met (44% on full-CCO random sampling) but only because BFO 2020’s Continuant+Occurrent top-level split makes ≥90% IC mathematically unreachable for full-CCO sampling. The clause was a FRAMING ARTEFACT, not a substantive failure.

Refined-prompt v3.7 rule: when authoring a kill clause that involves a numeric threshold over a sample set, the clause must specify:

  • (a) Sample set explicitly (e.g., “random 50 from full CCO” vs “random 50 from MaterialEntity subtree”)
  • (b) Threshold relative to that sample-set’s structural ceiling (e.g., “≥90% IC reach where the structural ceiling is the proportion of the sample set descended from IC; if the structural ceiling is itself <90%, the clause is unreachable by construction”)

Without (a) + (b), the spike-runner gets a kill clause that LOOKS strict but is actually unreachable due to corpus topology. Outcome gets stuck in MITIGATED-via-strict-vs-spirit-rescue when the clause should never have been authored that way.

Forward-action: refined-prompt v3.7 has a Step that requires kill-clause authors to compute the structural ceiling FIRST (e.g., what fraction of the sample-set could possibly meet the threshold under best-case topology?) and only then set the threshold relative to that ceiling.
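
The ceiling-first ordering can be illustrated with a small calculation. The sample below is a toy, hypothetical set (it is NOT S2.6 data), and structural_ceiling and threshold_is_reachable are names introduced for this sketch only.

```python
from typing import Callable, Sequence, TypeVar

T = TypeVar("T")


def structural_ceiling(sample: Sequence[T], could_qualify: Callable[[T], bool]) -> float:
    """Best-case fraction of the sample set that could ever meet the property."""
    if not sample:
        raise ValueError("sample set must be non-empty")
    return sum(1 for item in sample if could_qualify(item)) / len(sample)


def threshold_is_reachable(threshold: float, ceiling: float) -> bool:
    """A threshold above the ceiling is unreachable by construction."""
    return threshold <= ceiling


# Toy sample (hypothetical): each class is tagged with the branch it descends from.
sample = ["IC"] * 4 + ["other"] * 6
ceiling = structural_ceiling(sample, lambda branch: branch == "IC")
print(f"ceiling={ceiling:.2f}, 0.90 reachable: {threshold_is_reachable(0.90, ceiling)}")
# prints: ceiling=0.40, 0.90 reachable: False
```

With the ceiling known first, the author can either narrow the sample set (part (a)) or restate the threshold relative to the ceiling (part (b)) instead of shipping a clause that only looks strict.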


Candidate 5 — Same-session follow-on against original substrate when alternatives-first substitute validates

Pattern surfaced at: S2.9b pgvector re-test (~30 min wall-clock; outcome upgraded VALIDATED-WITH-PROVISIONING-NOTE → VALIDATED).

Refined-prompt v3.7 rule: when an alternatives-first substitute validates the load-bearing theory, schedule a same-session (or same-day) follow-on against the original substrate as soon as provisioning unblocks. The comparison data has high decision-quality value at marginal cost (~30 min wall-clock in the S2.9b case). Three concrete benefits demonstrated:

  1. Sub-mode retires (cleaner maturity vocabulary at lock-time)
  2. Production-store recommendation flips with empirical justification (S2.9b: numpy “deferred to Phase-1 implementation” → pgvector “wins” with 2-4× latency advantage and identical recall)
  3. Phase-1.5 stress-test thresholds tighten based on empirical floor (S2.9b: p95 <100ms → <50ms)

Forward-action: refined-prompt v3.7 has a Step at spike-suite-closure time that asks “for each TRANSIENT sub-mode outcome, has the follow-on been scheduled or executed?” — driving same-session retirement where feasible.


Candidate 6 — Plan-files accumulate “untested-text” defects when suites reorganise

Pattern surfaced at: 4 plan-defects caught in 24h on 2026-05-02 BST:

  1. S6 AM-CDM URL kebab-case buffalo-mfg-works → 404 (correct: CamelCase BuffaloMfgWorks); plan v1.4 → v1.5 fix
  2. S5 FIBO IRI module-path FBC/FunctionalEntities/FinancialInstruments → 404 (correct: FBC/ProductsAndServices/FinancialProductsAndServices); plan v1.5 → v1.6 fix
  3. S4 T-file slug T-spike-eps-iota-S4-uk-w-pipeline-pilot-... doesn’t match actual file ...uk-w-pipeline-... (extra -pilot word); plan v1.8 → v1.9 fix
  4. S-number swap typos (Tasks 6+7 in §4 should reference S8+S9 but plan body says S6+S7; Tasks 8+9 in §5 should reference S6+S7 but plan body says S8+S9) — plan v1.8 → v1.9 fix

Sub-rule under feedback_test_theories_immediately_when_tabled: when reorganising plan section ordering OR copying URLs/IRIs/filenames from prior research into a plan, run a verification pass:

  • For URLs / IRIs: grep -nE 'https?://' plan.md then probe each with curl -sILo /dev/null -w '%{http_code}'; flag any non-2xx
  • For filenames: grep -nE 'T-spike-eps-iota-S[0-9]+(\.[0-9]+)?-' plan.md and verify each S-number matches its surrounding Task heading
  • For command-flags: pin the tool version + flag-name to a docs version-pin (e.g., LinkML 1.10.0 --meta full not --include-annotations=True)

Forward-action: pin-drift hook could grow a sibling that flags any https?:// URL in a plan/spec file that doesn’t return 2xx — analogous to how check-frontmatter-pins.py catches stale companion-file pins. A second sibling could enforce the S-number-matches-Task-heading rule via regex.
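
A possible shape for the two sibling checks is sketched below. It is not the existing pin-drift hook: the function names are hypothetical, the "Task N" heading pattern is an assumption about plan layout, and because Task numbers and S-numbers differ in this suite the expected Task → S-number mapping must be supplied by the caller. The default probe uses a HEAD request that follows redirects, in the spirit of the curl -sIL invocation above.

```python
import re
import urllib.error
import urllib.request
from typing import Callable, Dict, List, Tuple

URL_RE = re.compile(r"https?://[^\s<>)\]]+")
# Same slug pattern as the grep above, with the S-number captured.
SLUG_RE = re.compile(r"T-spike-eps-iota-(S[0-9]+(?:\.[0-9]+)?)-")
TASK_RE = re.compile(r"^\s*Task\s+(\d+)\b")  # ASSUMED heading shape


def http_status(url: str, timeout: float = 10.0) -> int:
    """HEAD the URL (redirects followed); return the status, or 0 if unreachable."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        return err.code
    except OSError:  # covers URLError, timeouts, connection errors
        return 0


def find_bad_urls(plan_text: str,
                  probe: Callable[[str], int] = http_status) -> List[Tuple[int, str, int]]:
    """Return (line number, url, status) for every URL that is not 2xx."""
    bad = []
    for lineno, line in enumerate(plan_text.splitlines(), start=1):
        for url in URL_RE.findall(line):
            url = url.rstrip(".,;:")
            status = probe(url)
            if not 200 <= status < 300:
                bad.append((lineno, url, status))
    return bad


def find_s_number_mismatches(plan_text: str,
                             expected: Dict[int, str]) -> List[Tuple[int, int, str, str]]:
    """Return (line, task, found, expected) where a slug's S-number disagrees
    with the S-number expected for the enclosing Task heading."""
    mismatches = []
    task = None
    for lineno, line in enumerate(plan_text.splitlines(), start=1):
        heading = TASK_RE.match(line)
        if heading:
            task = int(heading.group(1))
        for found in SLUG_RE.findall(line):
            if task in expected and found != expected[task]:
                mismatches.append((lineno, task, found, expected[task]))
    return mismatches
```

Passing the probe as a parameter keeps the check testable offline; a hook would call find_bad_urls(plan_text) with the default probe and fail the commit if the returned list is non-empty.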


Candidate 7 — Parallel-session Edit-tool stale-Read pattern

Pattern surfaced at: S5 FIBO closure (~80 min wall-clock dominated by Edit-tool stale-Read contention with parallel S2.9b + S8 sessions writing to arch-state in close succession).

Refined-prompt v3.7 rule: in concurrent-spike sessions, recommend re-Read-before-Edit on every cross-link cascade step. The Edit tool’s stale-Read detection serialises low-frequency contention without explicit locking, but adds wall-clock cost when sessions write in close succession.

Three concrete patterns observed:

  1. Edit aborts with “modified since read” → re-Read shows parallel-session updates landed cleanly → Edit applies without conflict
  2. Pin-drift hook on commit catches stale companion-file version_pinned values when one window’s commit landed before another’s; the --fix mode auto-resolves
  3. arch-state version numbers can advance unpredictably during a closure cascade (S5 saw v3.22 → v3.23 → v3.24 → v3.25 in ~20 min as parallel windows landed)

Forward-action: refined-prompt v3.7 documents this as expected behaviour, not a defect. When wall-clock budget is tight, prefer dispatching dependent spikes serially rather than concurrently to avoid contention overhead. When dispatching concurrently, expect ~30-90 min of contention overhead per cross-link cascade.
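
The re-Read-before-Edit discipline behaves like optimistic concurrency, which the sketch below imitates on a plain file. It is an analogy only: it does not reproduce the Edit tool’s actual mechanism, edit_with_reread is a hypothetical helper, and the check-then-write step is not atomic, so it suits low-frequency contention rather than heavy parallel writing.

```python
from pathlib import Path
from typing import Callable


def edit_with_reread(path: Path,
                     transform: Callable[[str], str],
                     max_attempts: int = 5) -> int:
    """Apply transform to the file, re-reading and retrying whenever the file
    changed between the read and the write. Returns the attempts used."""
    for attempt in range(1, max_attempts + 1):
        snapshot = path.read_text(encoding="utf-8")
        updated = transform(snapshot)
        # Stale-read check: another session may have written in the meantime.
        if path.read_text(encoding="utf-8") == snapshot:
            path.write_text(updated, encoding="utf-8")
            return attempt
        # Stale: loop round, re-read, and recompute against the fresh content.
    raise RuntimeError(f"{path} kept changing after {max_attempts} attempts")
```

Each retry costs one extra read and one recomputation of the edit, which is the wall-clock overhead described above when parallel windows write in close succession.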


Candidate 8 — Logging-contract closure as same-session-or-defer-with-trigger

Pattern surfaced at: 9 of 10 spikes in this suite closed logging-contract within same session as T-file authoring; only S1 had the historical 4.5h lag (caught by Rich asking + codified as feedback_logging_contract_closure_within_same_session).

Pattern matured to (post-S2.9b same-session follow-on): logging-contract closure now extends to same-session FOLLOW-ON amendments when a TRANSIENT sub-mode retires. S2.9b’s amendment of S2.9’s T-file v1.0 → v1.1 (in-place, with §11 numpy-vs-pgvector comparison + CHANGELOG v1.1 entry) is the precedent.

Refined-prompt v3.7 should specify: logging-contract closure includes:

  1. Initial T-file authoring (within same session as spike completion)
  2. arch-state §11/§12 row + Changelog row (within same session)
  3. Q-003 (or relevant cascade-Q) §10 row + CHANGELOG (within same session)
  4. New memory file (within same session)
  5. MEMORY.md +1 entry (within same session)
  6. active-work-log update (within same session)
  7. TRANSIENT sub-mode follow-on (within same session if substrate / measurement-instrument unblocks; otherwise defer with explicit reconsideration trigger)

Step 7 is the v3.7 addition.
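
The seven-step closure can be sketched as a checklist function. The step labels abbreviate the list above, and open_closure_items with its parameters is a hypothetical helper, not an existing tool; step 7 is only demanded when the spike closed on a TRANSIENT sub-mode, and it is satisfied either by running the follow-on or by recording an explicit reconsideration trigger.

```python
from typing import List, Optional, Set

# Steps 1-6: always due within the same session.
SAME_SESSION_STEPS = (
    "T-file",
    "arch-state row + Changelog row",
    "cascade-Q row + CHANGELOG",
    "memory file",
    "MEMORY.md entry",
    "active-work-log update",
)

FOLLOW_ON = "TRANSIENT sub-mode follow-on (run it, or defer with explicit trigger)"


def open_closure_items(done: Set[str],
                       transient_sub_mode: bool,
                       follow_on_done: bool = False,
                       reconsideration_trigger: Optional[str] = None) -> List[str]:
    """Return the closure steps still outstanding for one spike."""
    missing = [step for step in SAME_SESSION_STEPS if step not in done]
    # Step 7 applies only to TRANSIENT outcomes.
    if transient_sub_mode and not follow_on_done and not reconsideration_trigger:
        missing.append(FOLLOW_ON)
    return missing
```

An empty return value means the logging contract is closed; for a permanent outcome step 7 never appears, so v3.6-style closures are unaffected by the addition.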


How to apply (summary)

When authoring refined-prompt v3.6 → v3.7:

  1. Add Steps 13-20 to absorb Candidates 1-8 above.
  2. Update the maturity-vocabulary table to 5 outcomes in total (3 permanent outcomes + 2 TRANSIENT sub-modes).
  3. Add architectural-layer tagging to kill-clause authoring guidance.
  4. Add pre-flight strict-vs-spirit check.
  5. Add structural-ceiling specification rule for numeric thresholds.
  6. Add same-session follow-on rule for TRANSIENT sub-modes.
  7. Add plan-file URL/IRI/filename verification pass when reorganising sections.
  8. Document parallel-session Edit-tool stale-Read pattern.
  9. Extend logging-contract closure to include TRANSIENT sub-mode follow-on.

Each candidate is supported by a specific spike’s empirical evidence (cited inline above) — the v3.7 author can grep this memory for Source spike / Pattern surfaced at to find the original arch-state row containing the full evidence.


Boundary tests (when this consolidation memory FIRES strongly)

  • ✓ Authoring refined-prompt v3.6 → v3.7
  • ✓ Authoring a new spike-suite plan (e.g., for ε.something-else or ω.+)
  • ✓ Reviewing a kill clause for framing artefacts
  • ✓ Deciding whether a TRANSIENT sub-mode has retired
  • ✓ Authoring a feedback memory that touches kill-condition vocabulary

Boundary tests (when this consolidation memory does NOT apply)

  • ✗ Authoring a per-spike T-file (use the canonical T-file frontmatter pattern instead)
  • ✗ Updating arch-state for a non-spike amendment (use the A-N amendment pattern instead)
  • ✗ Year-2+ horizon decisions (defer with explicit reconsideration trigger; this memory is Phase-1-bound)

Codification trigger

Rich-directive 2026-05-02T~16:25 BST: “please save all the information we have learnt from these spikes. i do not want to have to do them again.” Acted on by: (a) plan v1.8 → v1.9 with 6 typo fixes; (b) THIS memory consolidating 8 v3.7 candidates; (c) feedback_paste_safety_for_terminal_handoffs.md codifying the 3 paste-mangling failures; (d) spike-suite-eps-iota-outcome-index.md spec doc as the single-page suite-state reference for Phase E Task 13 lock-decision.


Related memories

  • feedback_kill_condition_strict_vs_spirit_reading_via_outcome_MITIGATED.md: post-S3 narrower codification of strict-vs-spirit; this memory generalises with sub-modes + pre-flight + architectural-layer
  • feedback_surface_alternatives_before_collapsing_synthesis_to_baseline.md: post-S2.5 alternatives-first; this memory extends with same-session follow-on for TRANSIENT sub-modes
  • feedback_test_theories_immediately_when_tabled.md: post-S2.9 generalisation; this memory adds a sub-rule for plan-file URL/IRI/filename verification
  • feedback_logging_contract_closure_within_same_session.md: post-S1 codification; this memory extends with same-session follow-on for TRANSIENT sub-mode retirement
  • feedback_paste_safety_for_terminal_handoffs.md: sibling consolidation memory codifying the 3 paste-mangling failures
  • spike-suite-eps-iota-outcome-index.md: companion spec doc indexing all spike outcomes for Phase E Task 13 lock-decision