Phase-1 INHERIT v2 uses vendor-API LLMs (Scenario 2), not fine-tuned models. Rich locked this 2026-04-22 after reviewing LLM cost scenarios 0-5.

What’s committed

  • Scenario 2 (vendor API for NL extraction + narration + AI-tip + RAG help) is the Phase-1 default. Annual cost envelope £4K-£12K at Phase-1 scale.
  • Cloudflare AI Gateway specifically (NOT “Cloudflare or similar”) — single committed gateway for provider-swap, rate-limiting, cost metering, logging, fallback routing.
  • Haiku 4.5 default tier across all user-facing flows. Sonnet 4.6 is an upgrade requiring task-specific evidence (not speculative). No Opus / frontier-tier models in Phase-1 production. Dev-time use (Rich’s own Claude subscription for strategy work) is unaffected.
  • Hard monthly ceiling: £1,500/month at Phase 1. Alerts before breach; provisioning blocks further calls on breach until Rich or operations approves increase.
  • Per-session cost limit, configurable in Advisory BackOffice (not hard-coded in v2). Auto-fallback to form-based flow when exceeded. Implements new mega-spec claim A4.32b.

What’s NOT a Phase-1 possibility

Scenario 3 (fine-tuning per Q4 memo) is gated on any one of:

  • TT ARR ≥ £1m sustained for 12 consecutive months with ≥6 months post-spend runway, OR
  • Acquirer-greenlit (post-acquisition), OR
  • Customer-funded — a single customer pays ≥£40-60K upfront for the specific fine-tune; ships under InheritKit commercial license; TT retains IP.

Explicitly NOT a gate:

  • Competitive-response to a rival shipping a succession-specialist model (Q4 memo trigger #4) — TT does not burn capital chasing competitors without revenue.
  • Cross-task aggregation performance signal (Q4 memo trigger #5) — insufficient without revenue.

Q4 memo’s performance triggers become advisory only — they prioritize which task to fine-tune when the revenue gate opens, not whether to open it.

How to apply

  • Designing any LLM-touching feature in Phase 1: default to Haiku 4.5 via Cloudflare Gateway; upgrade to Sonnet only with task-specific evidence that Haiku plateaus.
  • Architecting any flow that uses LLM calls: include form-based fallback path + per-session cost tracking from Day 1. Per-session limit lives as a BackOffice-admin-editable config value.
  • Budget planning: treat £1,500/month as hard ceiling through Phase 1; model revenue-growth-to-£1m-ARR separately as the revenue-gate precondition for Scenario 3.
  • When asked about fine-tuning: always reply “revenue-gated, not a Phase-1 possibility.” Only revisit when ARR + runway gates test positive.
  • Partner / investor conversations: frame LLMs as an additive UX-boundary layer, NOT as the core product. The core is Catala + JSON Schema + JSON-LD + SPARQL. LLMs are swappable convenience.

Why

  • Solo-bootstrap (R3.21) — fine-tune infrastructure + eval-corpus annotation (£5-£20K/task) is significant capital. Should follow revenue, not lead it.
  • ISO/JTC 1 PAS fast-track evaluates the standard, not the AI layer. LLMs irrelevant to ISO story.
  • Clean-break discipline — LLM layer is swappable via Cloudflare Gateway; if a vendor API becomes unavailable or uneconomic, TT swaps without touching core artefacts.
  • Catala does the rule-work deterministically — LLMs add UX-boundary convenience, not computation. Scenario 0 (no LLMs in production) remains a defensible fallback at any time.

Cross-references

  • docs/superpowers/specs/2026-04-22-inherit-v2-architecture-proposal.md v1.3+ §“LLM posture — Phase-1 Scenario 2 commitment”
  • docs/superpowers/specs/2026-04-21-ambiguity-4-r34-llm-route.md v1.1+ §D5 (triggers downgraded to advisory only)
  • Mega-spec claims: A4.32 (Cloudflare specifically, Haiku default), A4.32b (per-session fallback, BackOffice-editable), A4.32c (£1,500/month ceiling)
  • R3.4 cascade: trigger-gated → revenue-gated

Do NOT

  • Pitch fine-tuning as a Phase-1 capability, even prospectively. R3.4 is strictly revenue-gated.
  • Default to Sonnet or Opus in any new flow. Haiku 4.5 is default.
  • Suggest building a custom AI gateway; Cloudflare is committed.
  • Exceed £1,500/month LLM budget at Phase 1 without Rich’s explicit approval.
  • Treat competitive-response or aggregation triggers as revenue-gate substitutes.