refined-prompt v3.12 → v3.13 released — Named-persona review pass + per-Q correctness-flavour audit

Released: 2026-05-17T15:30 BST (directive lastmod). Two-commit deployment on docs-strategy origin/main:

  • 159ba94 docs: peer-reviewer-personas v1.0 — 9 personas for refined-prompt v3.13 named-persona review pass
  • be5b19e docs: refined-prompt v3.12 → v3.13 — named-persona review pass + per-Q correctness-flavour audit

Push verified 2026-05-17T16:14 BST. PDFs auto-exported to Downloads (refined-end-of-turn-directive.pdf v3.13 34pp 210KB + peer-reviewer-personas.pdf v1.0 14pp 73KB; metadata embedded per §11).

Trigger

Rich-directive 2026-05-17 BST after methodology-review conversation locking “mathematical correctness of the standard” as primary optimisation target — with all three correctness-flavours co-equal:

  1. Formal verifiability (the proof-shaped flavour — F*/Coq/Mopsa/SMT)
  2. Internal consistency + completeness (the LinkML/SHACL/Catala/SKOS substrate self-consistency flavour)
  3. Faithful representation of law (the jurisdiction-fidelity flavour — Akoma Ntoso / ELI / statute-citation traceability)

External peer reviewers not available within Phase-2/Phase-3 timeline → internal-mimicry substrate required. Per feedback_actively_use_t_files_in_scorecard_authoring discipline, this is NOT a substitute for real external review; it widens the objection-set but does NOT close the gap.

What shipped

Step 7.5 NAMED-PERSONA REVIEW PASS (new sub-step between step 7 scorecard and step 8 closing question)

  • Each Q selects 3-5 personas from peer-reviewer-personas.md v1.0 library (9 personas across 3 flavours)
  • Per-persona projection produces 2-3 objections per persona (target ~10-15 distinct objections per Q)
  • Each objection classified:
    • (a) addressed — already covered in scorecard body / sub_clarifications_locked
    • (b) inline-addressable — surface-level gap fixable in the lock-answer without rework
    • (c) substantive gap — requires scorecard rework or new spike; (c) BLOCKS the lock
  • Sycophant-persona drift counter-pattern: 3-consecutive-zero-(c) triggers AMPLIFY mode (force at least one persona to produce an aggressive (c)-class objection) — the honest-floor
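The classification and gating rules above can be sketched in a few lines. This is a minimal illustration, not the directive's implementation: the names `ObjectionClass`, `Objection`, `lock_is_blocked` and `DriftCounter` are hypothetical, and the sketch assumes the "3-consecutive" streak is counted across consecutive review passes (one per Q). The source does not say whether the streak resets after AMPLIFY fires, so the sketch leaves it running.

```python
from dataclasses import dataclass
from enum import Enum


class ObjectionClass(Enum):
    ADDRESSED = "a"           # already covered in the scorecard body
    INLINE_ADDRESSABLE = "b"  # fixable in the lock-answer without rework
    SUBSTANTIVE_GAP = "c"     # needs scorecard rework or a new spike


@dataclass(frozen=True)
class Objection:
    persona: str
    text: str
    cls: ObjectionClass


def lock_is_blocked(objections):
    """A single (c)-class objection is enough to block the lock."""
    return any(o.cls is ObjectionClass.SUBSTANTIVE_GAP for o in objections)


class DriftCounter:
    """Counts consecutive review passes with zero (c)-class objections."""

    def __init__(self, threshold=3):
        self.threshold = threshold  # assumption: 3, per the counter-pattern
        self.streak = 0

    def record(self, objections):
        """Record one review pass; return True when AMPLIFY should trigger."""
        if lock_is_blocked(objections):
            self.streak = 0
        else:
            self.streak += 1
        return self.streak >= self.threshold
```

A caller would run `lock_is_blocked` on each Q's objection list before locking, and feed the same list to one long-lived `DriftCounter` so that a run of agreeable passes is detected.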

Step 7 amendment — correctness-flavour audit on criteria-set

Scorecard criteria-set MUST include ≥1 criterion per correctness-flavour with independent veto. If any flavour has zero criteria, the criteria-set is incomplete and must be widened before scoring begins.
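As a rough sketch, the completeness check could look like the following. The flavour identifiers, the `Criterion` type and both function names are hypothetical. The sketch also assumes one reading of the rule: a flavour counts as covered only when at least one of its criteria carries an independent veto.

```python
from dataclasses import dataclass

# Hypothetical identifiers for the three correctness-flavours.
FLAVOURS = (
    "formal_verifiability",
    "internal_consistency_completeness",
    "faithful_representation_of_law",
)


@dataclass(frozen=True)
class Criterion:
    name: str
    flavour: str
    has_veto: bool  # True if the criterion can veto the lock on its own


def missing_flavours(criteria):
    """Return the flavours that have no veto-carrying criterion."""
    covered = {c.flavour for c in criteria if c.has_veto}
    return [f for f in FLAVOURS if f not in covered]


def audit_criteria_set(criteria):
    """Raise if any flavour is uncovered, so scoring cannot begin."""
    missing = missing_flavours(criteria)
    if missing:
        raise ValueError(f"criteria-set incomplete; widen to cover: {missing}")
```

Running `audit_criteria_set` before any scoring makes the "widen before scoring begins" requirement a hard precondition instead of a convention.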

Companion peer-reviewer-personas.md v1.0 — 9 personas

3 personas per correctness-flavour (3 × 3 = 9):

Flavour | Personas (illustrative shape; verify against v1.0 source)
Formal verifiability | Proof-substrate / SMT-encoding / decidability-skeptic
Internal consistency + completeness | Schema-correctness / classifier-coverage / cross-module-reference-integrity
Faithful representation of law | Jurisdiction-fidelity / statute-citation-traceability / lay-intent-faithfulness

Source: ~/testatetech/docs-strategy/docs/superpowers/specs/2026-04-29-multi-phase-audit/peer-reviewer-personas.md (v1.0 frontmatter; 14pp PDF).

Cost + scope

  • Cost ~+10-20 min per Q (3-5 personas × ~3-5 min per persona projection)
  • No reversal of v3.0..v3.12 locks
  • Q-001..Q-026 unaffected — v3.13 forward-only (first production use Q-027 Wills.Bequest)
  • Step 6 sub-letters 6a-6g (substrate exploration) unchanged
  • Step 6f maturity vocabulary 3 sub-modes (locked v3.12) unchanged
  • Pre-flight substrate-completeness gate (locked v3.12 #1) unchanged
  • N≥5 consecutive-bold-synthesis counter-pattern check (locked v3.12 #7) unchanged
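The per-Q cost estimate in the first bullet can be checked by multiplying the two stated ranges. The helper `range_product` is a hypothetical name used only for this illustration.

```python
def range_product(a, b):
    """Multiply two (low, high) ranges of non-negative numbers."""
    return (a[0] * b[0], a[1] * b[1])


personas_per_q = (3, 5)       # 3-5 personas selected per Q
minutes_per_persona = (3, 5)  # ~3-5 min per persona projection

low, high = range_product(personas_per_q, minutes_per_persona)
print(f"added review time per Q: {low}-{high} min")
# prints "added review time per Q: 9-25 min"
```

The raw bounds (9 to 25 minutes) are slightly wider than the quoted ~10-20 minutes, so the quoted figure is best read as a typical-case band, not a hard limit.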

Known limit (codified in directive prose)

Internal simulation widens the objection-set relative to abstract counter-argument but does NOT close the gap to real external review. The sycophant-persona drift counter-pattern (3-consecutive-zero-(c) → amplify) is the honest-floor against the failure mode where Claude projects a persona that produces only agreeable objections.

Deferred to future versions

  • F4 closing-question polish — still queued (originally v3.12-deferred, now v3.14+ deferred)
  • F6 research-completeness scorecard — still queued

First production use

Q-027 Wills.Bequest class shape MVP: the natural-sequence next Q after the Q-026 ψ.η Wills.Will lock (per A-124 module sequencing). Q-027 will be the FIRST Q under the v3.13 step 7.5 NAMED-PERSONA REVIEW PASS regime + step 7 per-flavour criterion audit.

Substrate documents

  • Directive: ~/testatetech/docs-strategy/docs/superpowers/specs/2026-04-29-multi-phase-audit/refined-end-of-turn-directive.md v3.13 (commit be5b19e)
  • Personas: ~/testatetech/docs-strategy/docs/superpowers/specs/2026-04-29-multi-phase-audit/peer-reviewer-personas.md v1.0 (commit 159ba94)
  • Companion previous-release memory: [[refined_prompt_v3_11_released_harvard_uniform_2026_05_05]] (Harvard-depth uniform default — load-bearing for v3.13)
  • Production-use-9-improvements memory: [[feedback_v3_12_release_9_improvements_2026_05_05]] (v3.12 bundle that v3.13 builds on)
  • Discipline reference: [[feedback_actively_use_t_files_in_scorecard_authoring]] (named-persona review pass is internal-mimicry; T-file citation discipline still applies for prior-knowledge alternatives)