Defer-cost arithmetic in recommendations (locked 2026-05-20 Q-044 trigger)

Rule. Whenever a closing-question option spawns a Phase-1.5+ richard-task as the cost-side of a thin-MVP-first recommendation, the scorecard MUST include explicit defer-cost arithmetic. Formally:

now-cost (X)  +  [ Phase-1.5+ refactor-cost (Y)  ×  refactor-probability (P) ]  =  expected total (Z)

Compare Z to the cost of the substantive-now option. When Z > substantive-now-cost AND refactor-probability is HIGH, recommend the substantive-now option regardless of which option carries the thin-MVP-first framing.
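A minimal sketch of the rule as arithmetic, assuming each cost is a single hour figure and that the HIGH/LOW label has already been mapped to a numeric weight. The function names and the numeric figures are illustrative assumptions, not part of the scorecard format.

```python
def expected_total(now_cost: float, refactor_cost: float, probability: float) -> float:
    """Z = X + (Y * P): now-cost plus probability-weighted refactor cost, in hours."""
    return now_cost + refactor_cost * probability


def recommend_substantive_now(z_thin: float, substantive_cost: float, probability_label: str) -> bool:
    """True when Z exceeds the substantive-now cost AND refactor-probability is HIGH."""
    return z_thin > substantive_cost and probability_label == "HIGH"


# Hypothetical figures, for illustration only.
z = expected_total(now_cost=2.0, refactor_cost=4.0, probability=1.0)
print(z, recommend_substantive_now(z, substantive_cost=3.0, probability_label="HIGH"))
```

Both conditions must hold for the override: a larger Z with a LOW label leaves the thin-MVP-first framing in place.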

Refactor-probability is HIGH when the deferred work is one of:

  • structural — schema topology, BFO inheritance shape, class-leaf split, primitive boundary
  • ontological — BFO honesty, A-class amendments, cross-module-primitive placement, alignment-axiom strength
  • discipline-driven — pre-commit hook gating, cascade-compliance requirement, MQ-N enforcement layer
  • annotation surface — LinkML annotation discipline (per MQ-016) where the deferred surface drives downstream consumer correctness

Refactor-probability is LOW when the deferred work is ergonomic / cosmetic / opt-in (RICS date-of-report split; IFRS 13 input hierarchy; cross-border-recognition uplift; etc. — see Q-CM-2 task-83 + task-84 for canonical low-prob examples).
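The classification above can be sketched as a lookup. The bucket strings and the function name are hypothetical. Treating any unrecognised bucket as HIGH is a conservative assumption of this sketch.

```python
HIGH_BUCKETS = {"structural", "ontological", "discipline-driven", "annotation-surface"}
LOW_BUCKETS = {"ergonomic", "cosmetic", "opt-in"}


def classify_refactor_probability(bucket: str) -> str:
    """Map a deferred-work bucket to HIGH or LOW."""
    normalised = bucket.strip().lower().replace(" ", "-")
    if normalised in HIGH_BUCKETS:
        return "HIGH"
    if normalised in LOW_BUCKETS:
        return "LOW"
    # Assumption: an unrecognised or vague bucket is treated as HIGH.
    return "HIGH"


print(classify_refactor_probability("annotation surface"))
print(classify_refactor_probability("opt-in"))
```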

Why

Quantitative heuristics (“1.5-2h MVP shipped this sprint”) win against qualitative cost-tags (“BFO honesty COMPROMISED” / “refactor required later”) by default at scorecard-authoring time. The recommendation-logic isn’t broken — it’s working as designed under refined-prompt v3.17 L4 closing-question rules, which weight time-to-ship over future-refactor-cost when both are surfaced as adjacent scorecard rows.

The asymmetry has a cost. When refactor-probability is HIGH and structural, the deferred work compounds — every downstream Q-N that touches the deferred surface either inherits the wrong topology or has to wait for the refactor. The arithmetic fixes this by forcing the deferred cost into the same scorecard cell as the now-cost, making the trade-off legible at recommendation time.

Worked example — Q-044 α-vs-β class-topology (lock-trigger; 2026-05-20)

Q-044 (Q-A3 Assets.CryptocurrencyOrDigitalAsset) surfaced two viable topologies:

  • α: single-class CryptocurrencyOrDigitalAsset + TypeScheme — one class with a 5-value SKOS classifier (cryptocurrency / utility-token / security-token / NFT / stablecoin). ~1.5-2h now. Refactor cost when the BFO honesty problem surfaces: ~3-4h to split into two leaves (FungibleDigitalAsset + NonFungibleDigitalAsset). Refactor-probability HIGH (BFO honesty is structural + ontological — fungibility is the load-bearing axis for the 5 cascade-Q consumers downstream).
  • β: two-leaves split BFO-honest — FungibleDigitalAsset + NonFungibleDigitalAsset as siblings under DigitalAsset. ~2-3h now. Refactor cost for the same use cases: 0h (already split).

Arithmetic:

  • α: 1.5-2h + (3-4h × HIGH ≈ 1.0) = 4.5-6h expected total
  • β: 2-3h + (0h × HIGH ≈ 1.0) = 2-3h expected total

Math favours β by ~2.5-3h. The Q-044 session recommended α (thin-MVP-first framing) even though this math was derivable from the scorecard’s own cost-tag rows. The recommendation-logic applied thin-MVP-first as the priority; defer-cost arithmetic supersedes that priority when the math and a HIGH refactor-probability both favour substantive-now.
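The Q-044 figures can be rechecked with range arithmetic. This is a sketch that adds the low and high ends of each estimate separately and treats HIGH as 1.0, as the worked example does. The `Hours` type and the function name are hypothetical. Summing the endpoints of the quoted ranges gives 4.5h at the low end for α and 6h at the high end.

```python
from typing import Tuple

Hours = Tuple[float, float]  # (low, high) estimate in hours


def expected_range(now: Hours, refactor: Hours, probability: float) -> Hours:
    """Apply Z = X + (Y * P) to the low and high ends of each estimate."""
    return (now[0] + refactor[0] * probability,
            now[1] + refactor[1] * probability)


HIGH = 1.0  # the worked example treats HIGH as approximately 1.0

alpha = expected_range(now=(1.5, 2.0), refactor=(3.0, 4.0), probability=HIGH)
beta = expected_range(now=(2.0, 3.0), refactor=(0.0, 0.0), probability=HIGH)
gap = (alpha[0] - beta[0], alpha[1] - beta[1])

print("alpha:", alpha)  # (4.5, 6.0)
print("beta:", beta)    # (2.0, 3.0)
print("gap:", gap)      # (2.5, 3.0)
```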

How to apply

At [E] scorecard authoring in refined-prompt v3.17 L4 closing-question discipline:

  1. For every option that spawns a Phase-1.5+ richard-task as the cost-side of its thin-MVP framing, author a defer_cost_arithmetic: block in the scorecard. Format:

    defer_cost_arithmetic:
      now_cost: "1.5-2h"
      refactor_cost: "3-4h"
      refactor_probability: HIGH  # structural / ontological / discipline-driven / annotation-surface
      refactor_probability_basis: "BFO honesty is the load-bearing axis for N downstream consumers"
      expected_total: "4.5-6h"
    
  2. Classify refactor-probability using the 4-bucket vocabulary above. State the basis (one sentence — what makes this HIGH or LOW).

  3. Compare expected totals across all options that spawn Phase-1.5+ tasks AND any substantive-now option. When Z_thin-mvp > Z_substantive-now AND P_thin-mvp = HIGH, the recommendation MUST be the substantive-now option, regardless of thin-MVP-first framing.

  4. When defer-cost arithmetic overrides thin-MVP-first, EXPLICITLY note this in the [F] closing-question reasoning: “Defer-cost arithmetic supersedes thin-MVP-first because refactor-probability HIGH; expected total favours substantive-now by N hours.”
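The steps above can be sketched as two helpers: one assembles the defer_cost_arithmetic fields, the other produces the [F] note when the override applies. All names here are hypothetical. The numeric weight is passed in by the caller because this document only gives a value for HIGH (≈ 1.0), and the figures in the usage lines are invented for illustration.

```python
def format_hours(low: float, high: float) -> str:
    """Render an hour range in the scorecard's "1.5-2h" style."""
    return f"{low:g}h" if low == high else f"{low:g}-{high:g}h"


def build_block(now, refactor, label, weight, basis):
    """Assemble the defer_cost_arithmetic fields as a plain dict."""
    total = (now[0] + refactor[0] * weight, now[1] + refactor[1] * weight)
    return {
        "now_cost": format_hours(*now),
        "refactor_cost": format_hours(*refactor),
        "refactor_probability": label,
        "refactor_probability_basis": basis,
        "expected_total": format_hours(*total),
    }


def override_note(label, saving_hours):
    """Return the [F] sentence when the override applies, else None."""
    if label != "HIGH" or saving_hours <= 0:
        return None
    return ("Defer-cost arithmetic supersedes thin-MVP-first because "
            "refactor-probability HIGH; expected total favours "
            f"substantive-now by {saving_hours:g} hours.")


# Hypothetical option: 1-2h now, 2-3h refactor, HIGH weighted as 1.0.
block = build_block((1.0, 2.0), (2.0, 3.0), "HIGH", 1.0, "schema topology is deferred")
print(block["expected_total"])  # 3-5h
print(override_note("HIGH", 2))
```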

Anti-patterns

  1. Citing only now-cost in the scorecard — the deferred cost is invisible; thin-MVP-first wins by default
  2. Tagging refactor-probability as “MEDIUM” to soften the override — be honest. Structural / ontological / discipline-driven = HIGH. Hand-wave categories (“might need refactor later”) = HIGH for safety; specific opt-in features = LOW
  3. Skipping the refactor_probability_basis line — the qualitative reasoning is the audit-trail; without it future-Claude can’t sanity-check the classification
  4. Treating low-prob deferred work the same as high-prob — Q-CM-2 task-83 (RICS date-of-report split) is LOW-probability opt-in; Q-044 α refactor is HIGH-probability structural. Don’t conflate.
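The first three anti-patterns are mechanical enough to check automatically. The following is a hedged sketch of such a check over a defer_cost_arithmetic mapping that has already been parsed into a dict. The function name and message wording are assumptions, not an existing tool.

```python
REQUIRED_FIELDS = ("now_cost", "refactor_cost", "refactor_probability",
                   "refactor_probability_basis", "expected_total")


def lint_block(block: dict) -> list:
    """Return a list of problems found in a defer_cost_arithmetic mapping."""
    problems = []
    for field in REQUIRED_FIELDS:
        value = block.get(field)
        if value is None or not str(value).strip():
            problems.append(f"missing {field}")
    label = block.get("refactor_probability")
    # Only HIGH and LOW are accepted; "MEDIUM" is flagged rather than coerced.
    if label is not None and label not in ("HIGH", "LOW"):
        problems.append(f"refactor_probability must be HIGH or LOW, got {label!r}")
    return problems


print(lint_block({"now_cost": "1.5-2h", "refactor_probability": "MEDIUM"}))
```

The fourth anti-pattern (conflating low-probability and high-probability work) is a judgment call and is not covered by this check.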

When this rule fires

  • At [E] scorecard authoring whenever an option lists “spawns Phase-1.5+ task N” as a cost-side artefact
  • At any Recommendation step (cascade-Q file authoring; cascade execution; sub-clarification adjudication) where the thin-MVP-first framing is load-bearing for the recommendation
  • Retroactively at [F] closing-question lock if Rich queries the recommendation reasoning — defer-cost math becomes the audit-record

Companion memories

  • [[feedback_spikes_inline_not_tasked]] (parent — this rule extends spike-inline to the recommendation-logic layer for non-spike work)
  • [[feedback_do_now_over_task_list_addition]] (grandparent — the originating DO-NOW discipline)
  • [[feedback_pre_stage_for_other_session]] (sibling 2026-05-20 — cross-session pre-staging pattern)
  • [[feedback_methodology_hot_reload_at_decision_points]] (sibling — ω′.2; addresses the OTHER half of the Q-044 root cause)
  • [[feedback_batch_compression_lowers_defer_threshold]] (cross-link — compression evidence that lowers the defer-vs-do-now threshold; this rule operates within that lower threshold)

Empirical precedent ledger

| # | When | Q / Batch | Outcome |
|---|------|-----------|---------|
| 1 | 2026-05-20 | Q-044 Q-A3 CryptocurrencyOrDigitalAsset α-vs-β | LOCK-TRIGGER. α (1.5-2h now + 3-4h refactor × HIGH) = 4.5-6h expected total; β (2-3h + 0h refactor) = 2-3h total. Recommendation-logic should have favoured β; codified as ω′.1 lock-trigger. |

(Future Q-NNN sessions that invoke this rule should append their outcome here.)

Codified at ~/testatetech/docs-strategy/docs/superpowers/specs/2026-04-29-multi-phase-audit/refined-end-of-turn-directive.md v3.18 L4 closing-question sub-rule (a) “Defer-cost arithmetic mandatory” — this memory is the durable rule; the refined-prompt amendment is the in-flight invocation surface.

Source-attribution

  • Trigger: Q-044 session 2026-05-20T11:25 fork-analysis (orchestration-session methodology review)
  • Rich-decision: 2026-05-20T11:25 lock #1 — “defer-cost math supersedes thin-MVP-first when refactor-probability HIGH”
  • Locked-as: ω′.1 launch-prompt + this memory file
  • Sibling locks: ω′.2 hot-reload (paired); refined-prompt v3.17 → v3.18 L4 amendment