The rule: A tool / library / extension / framework / hook / plugin only lands as a Wave execution deliverable AFTER it has been empirically pilot-tested against TT’s real substrate. README-validated spike outcomes are INSUFFICIENT for COMMIT-level decisions — they must trigger a follow-on pilot spike with TT-real-data BEFORE the item can move from spike DEFER to wave COMMIT.
Why: Rich-frustration 2026-05-26 after Wave C Del 4 memorylint pilot returned NOT-FIT during execution. Root cause chain:
- S2 spike validated memorylint via README only — “T-spike-s2 capability validated by README — empirical end-to-end test not run”
- Suite-1 §4 correctly said “pilot-first” — but didn’t make pilot a separate spike
- P4 substrate-archaeology audit (master plan v1.5) found memorylint as G-02 gap and absorbed it into Wave C as Del 4
- P4 hybrid-C lock (master plan v1.6) committed it to execution despite README-only validation
- Wave C executing session 2026-05-26T~06:35 BST ran the pilot AS Del 4 — discovered NOT-FIT at empirical contact
Pilots run during execution waste wave time on deliverables that don’t deliver. The proper sequence: (1) README-validated → (2) DEFER pending empirical pilot → (3) pilot as standalone spike → (4) IF FIT then Wave deliverable. We incorrectly compressed steps 3 and 4 into a single wave deliverable, so the pilot ran as Del 4 instead of as its own spike.
How to apply:
- Spike phase (pre-Wave): Every spike T-file MUST classify empirical status with one of three values:
  - VALIDATED-EMPIRICAL: tool was run end-to-end against TT’s real substrate; outcomes measured
  - VALIDATED-README-ONLY: tool capability confirmed via documentation; pilot REQUIRED before COMMIT
  - INCONCLUSIVE: neither empirical nor README-clear; needs deeper investigation
- Suite findings doc: If any spike returns VALIDATED-README-ONLY, the suite §4 disposition CANNOT be COMMIT; it MUST be either DEFER-pending-pilot OR a separate pilot-spike scheduled BEFORE wave commit.
- Substrate-archaeology audit (P4-style): When sweeping gaps, README-only validation does NOT make a gap absorbable into a wave. Such gaps land as standalone pilot-spike items, not wave deliverables.
- Wave LP authoring: Before any deliverable lands in a Wave LP, verify the source spike was VALIDATED-EMPIRICAL. If VALIDATED-README-ONLY, the deliverable must include the pre-validation spike as its first step (see Wave D §3 OPA pin-drift pattern: “A separate validation spike is required before adoption (~2-3h)”; that’s the CORRECT discipline).
- Coordinator audit (R8-style): Pre-dispatch LP audits MUST include “empirical-fit-status verification” as a check column alongside T-file citation accuracy + LoC claims accuracy.
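The classification and the COMMIT gate described in the list above can be sketched in Python. This is a minimal illustration under stated assumptions, not TT's actual tooling: `EmpiricalStatus`, `commit_allowed`, and `blockers` are hypothetical names, and rejecting an empty set of spikes is an assumption the rule itself does not cover.

```python
from enum import Enum


class EmpiricalStatus(Enum):
    """The three values a spike T-file must declare (values taken from this rule)."""
    VALIDATED_EMPIRICAL = "VALIDATED-EMPIRICAL"
    VALIDATED_README_ONLY = "VALIDATED-README-ONLY"
    INCONCLUSIVE = "INCONCLUSIVE"


def commit_allowed(spike_statuses):
    """Return True only when every source spike is VALIDATED-EMPIRICAL.

    spike_statuses maps a spike ID to its EmpiricalStatus.
    An empty mapping is rejected (assumption: no spike means no evidence).
    """
    if not spike_statuses:
        return False
    return all(
        status is EmpiricalStatus.VALIDATED_EMPIRICAL
        for status in spike_statuses.values()
    )


def blockers(spike_statuses):
    """Return the sorted spike IDs that still need a pilot or investigation."""
    return sorted(
        spike_id
        for spike_id, status in spike_statuses.items()
        if status is not EmpiricalStatus.VALIDATED_EMPIRICAL
    )


# Spike IDs as used in this note: S2 = memorylint, S3 = Conftest+Rego.
statuses = {
    "S2": EmpiricalStatus.VALIDATED_README_ONLY,
    "S3": EmpiricalStatus.VALIDATED_EMPIRICAL,
}
print(commit_allowed(statuses))  # False
print(blockers(statuses))        # ['S2']
```

A single README-only spike is enough to block COMMIT for the whole suite; the blocker list names which spikes need a standalone pilot first.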
Pattern recurrence — items at risk:
- ✓ Wave C Del 1 EARS hook — T2 empirical (13.5% FN / 6.2% FP on 5 TT spec files). Correctly classified.
- ✗ Wave C Del 4 memorylint — S2 README-only. Wrong classification → wasted execution. NOT-FIT discovered 2026-05-26.
- ✓ Wave C Del 5 Conftest+Rego — S3 empirical (~120 Python LoC → 53 Rego on 50-fixture set with 0/50 FN+FP). Correctly classified.
- ✓ Wave C Del 6 MADR-5 — S12 empirical on 5 sample Q-NNN files (VALIDATED-WITH-NOTE). Correctly classified.
- ✓ Wave D Del 1 mkdocs — T1 empirical 4-way comparison. Correctly classified.
- ✓ Wave D Del 3 OPA pin-drift — Wave D §3 LP correctly notes “S3 was OPA+Conftest for frontmatter generally, NOT pin-drift specifically. A separate validation spike is required”. GOLD STANDARD pattern.
- ⚠️ Phase-1.5+ Wave G mcp_agent_mail — S4 status uncertain; master plan §15 G-06 flags “mis-classified DEFER (should be REJECT per Anthropic Rider blocker)”. Same risk pattern as memorylint. Pre-validation spike MANDATORY.
- ⚠️ Phase-1.5+ Wave G cross-AI memory bridge — README-only or empirical? Verify before Wave G dispatch.
- ⚠️ Phase-1.5+ Wave G Codex TaskList tools — README-only or empirical? Verify before Wave G dispatch.
- ⚠️ Phase-1.5+ Wave H annotation-linter OPA — S7 VALIDATED-WITH-NOTE; empirical status against TT’s 8 annotation rules unclear. Treat as README-only-until-confirmed; require pre-validation spike.
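The sweep above is effectively the “empirical-fit-status verification” audit column applied by hand. A rough sketch of that check, assuming a hypothetical `Deliverable` record and hypothetical result labels (`PASS`, `PASS-WITH-SPIKE`, `BLOCK`); the statuses given to the sample rows are one reading of the list above, with unrecorded status treated as README-only until confirmed:

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Deliverable:
    name: str
    status: Optional[str]                  # None = status unclear or not recorded
    has_prevalidation_spike: bool = False  # LP lists a validation spike as step one


def fit_status_check(deliverable):
    """Return the value for the empirical-fit-status audit column."""
    # Unclear status is treated as README-only until confirmed.
    status = deliverable.status or "VALIDATED-README-ONLY"
    if status == "VALIDATED-EMPIRICAL":
        return "PASS"
    # README-only is acceptable only when the deliverable opens with its own spike.
    if status == "VALIDATED-README-ONLY" and deliverable.has_prevalidation_spike:
        return "PASS-WITH-SPIKE"
    return "BLOCK"


lp_items = [
    Deliverable("Wave C Del 1 EARS hook", "VALIDATED-EMPIRICAL"),
    Deliverable("Wave C Del 4 memorylint", "VALIDATED-README-ONLY"),
    Deliverable("Wave D Del 3 OPA pin-drift", None, has_prevalidation_spike=True),
    Deliverable("Wave H annotation-linter OPA", None),
]

for item in lp_items:
    print(f"{item.name}: {fit_status_check(item)}")
```

Under these assumptions, memorylint would have been blocked at pre-dispatch audit, while the OPA pin-drift deliverable passes because its LP already carries the validation spike.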
VALIDATED mid-Wave-C-execution 2026-05-26: The discipline was locked by sibling session 2026-05-26 ~07:00 BST DURING Wave C Del 4 memorylint pilot. The pilot returned NOT-FIT at 32abe08 (model mismatch + PE-13 lock conflict) ~30 minutes later. Both findings independently confirmed by Wave C executor + coordinator-handoff session §3.2. The discipline is now load-bearing, not theoretical — Phase-1.5+ Wave G + Wave H launch-prompts MUST honour pre-validation spike block before any README-only-validated deliverable can be COMMIT-classified.
Companion locks: feedback-lp-t-file-substrate-audit-pre-dispatch (catches T-file citation accuracy pre-dispatch); feedback-verify-framework-extension-maintenance-before-lock (related discipline — maintenance status); verifying-before-author skill (the 6-method toolkit for substrate verification); feedback-conftest-rego-dogfood-self-validation-pattern (sister lesson — schema-validates-its-own-author paradigm for spike-paired validators).
Anti-pattern this rule prevents: Wave executions that include “pilot to determine fit” as a deliverable. Pilots are spike outcomes; wave deliverables are COMMIT-level adoptions of fit-confirmed substrate. Mixing the two wastes wave time, dilutes wave-velocity calibration, and discovers fitness in production where the cost of NOT-FIT is much higher.
Sister rule (related but distinct): feedback-defer-cost-arithmetic-in-recommendations — defer-cost calculation should factor pilot-cost into the deferral, not the COMMIT decision.