When a weighted scorecard produces a winner at 85-92% and the result feels “slightly off”, the impulse is to adjust the weights (“criterion X should be more important”, etc.). Resist that impulse. Instead, ask whether the question is framed correctly.

The pattern

Three times on 2026-04-23 and 2026-04-24, scorecards stalled in the mid-range until Rich reframed the question, at which point a structurally superior option emerged that hadn’t been in the original option set:

Instance 1 — Angle (vi) faceted classification

  • Initial framing: “hybrid per-facet matrix (SKOS for taxonomies + OWL class for classifications + DataProperty for enums)”
  • Stall at: vi-4 83.6% vs vi-1 82.8% (0.8pp margin; flat comparison)
  • Rich’s reframe: “find a permutation of standard concepts that is incredibly suitable; this feels like hybrid”
  • Emergent option: vi-6 (OMG Commons Classifiers 4-layer stack: cmns-cls + SKOS + XKOS + CIDOC E55 Type bridge)
  • New result: vi-6 94.6% vs vi-4 83.6% (+11pp)
  • Pattern: “Hybrid” was an admission of not finding the right standard. Reframing to “single coherent layered stack of standards” surfaced vi-6 as the answer that FIBO already uses.

Instance 2 — Phase-1 module authoring sequence (A-24)

  • Initial framing: “which sequence best enables commercial velocity + partner engagement?”
  • Stall at: Seq C (commercial velocity) 83.6% vs Seq D (balanced) 84.0% (0.4pp margin)
  • Rich’s reframe: “which approach is most suitable for helping us discover any gaps in our plan, and then respond to them?”
  • Emergent option: Seq G1 (Core → Assets → Wills E&W minimal → Delegation → Catalogue → Trusts → Probate; gap-discovery-optimal)
  • New result: Seq G1 94.0% vs Seq D 84.0% (+10pp)
  • Pattern: The question wasn’t “commercial velocity”; it was “gap discovery.” Reframing surfaced a sequence (G1) that wasn’t in the original A/B/C/D/E option set; it was synthesised from the gap-discovery framing.

Instance 3 — Role placement / facet placement / testamentary-instrument boundary

  • Initial baselines: R-1 (92%), AC-1 (already thought through as “current”), W-1 (94.2%)
  • Rich’s prompt: “can you find a way to improve R-1 to 100%?”
  • Refinements applied (orthogonal structural improvements, NOT weight adjustments):
      • R-1’: introduction-module rule + no artificial specialisation + SKOS/CIDOC annotations → 100%
      • AC-1: introduction-module + Core-fallback (honestly named per L2) → 100%
      • W-1’: TestamentaryInstrument abstract parent + Trusts-imports-LegacyLetter + Phase 1.5 subclass plan → 100%
  • Pattern: None of the refinements were weight-fudging. Each addressed a specific criterion weakness directly via structural improvement.

Why this works

  1. Scorecards constrain thinking to their option set. If the right answer isn’t in the original options, no weight adjustment can surface it.
  2. A stalled scorecard is diagnostic. A winner at 85-92% with a narrow margin often means the question framing is missing something.
  3. Reframing is cheap. Adding a new option to consider is faster than re-scoring 15 criteria across existing options.
  4. Research typically surfaces the reframing. vi-6 came from Agent 4 research. Seq G1 came from stepping back to ask “what is this decision actually for?”
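
Point 1 can be made concrete with a small sketch. It assumes a conventional weighted-average scorecard (criteria scored 0-10, non-negative weights); the option names, criteria, scores, and weights below are hypothetical and are not taken from any of the scorecards above. Under that assumption, an option's weighted score can never exceed its strongest criterion score, so re-weighting has a hard ceiling, while adding an option does not.

```python
# Illustrative only: option names, criteria, scores, and weights are
# hypothetical, not taken from any real scorecard.

def weighted_pct(scores, weights):
    """Weighted score as a percentage, with each criterion scored 0-10."""
    total = sum(weights.values())
    return round(100 * sum(scores[c] * w for c, w in weights.items()) / (10 * total), 1)

def winner(options, weights):
    """Return (name, pct) for the top-scoring option."""
    return max(((name, weighted_pct(s, weights)) for name, s in options.items()),
               key=lambda pair: pair[1])

def ceiling_pct(scores):
    """Best percentage any non-negative weighting can give this option:
    all of the weight placed on its strongest criterion."""
    return 100 * max(scores.values()) / 10

weights = {"fit": 10, "cost": 5, "coherence": 10}
original_options = {
    "option_a": {"fit": 9, "cost": 8, "coherence": 8},
    "option_b": {"fit": 8, "cost": 9, "coherence": 8},
}

before = winner(original_options, weights)
best_reachable = max(ceiling_pct(s) for s in original_options.values())

# A reframed question surfaces a new option; the weights are left untouched.
reframed_options = {**original_options,
                    "option_c": {"fit": 10, "cost": 8, "coherence": 10}}
after = winner(reframed_options, weights)

print(before, best_reachable, after)
```

In this toy data, no weighting of the original two options can push the winner past `best_reachable`, whereas the added option clears it with the original weights.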

How to apply

When a scorecard is being built:

  1. Before scoring, ask: “Is there an option I haven’t considered?”
  2. After scoring, if the winner scores below 95%, stop and ask: “Is the question framed correctly?”
  3. If the winner feels slightly off, do NOT rush to re-weight. Look for a reframing first.
  4. If the scorecard is stalled between two options, the question almost certainly has a missing axis or misaligned framing.
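
The post-scoring checks above could be sketched as a small triage helper. This is a minimal sketch, not an established tool: the 95% threshold comes from step 2, but the 2.0pp "narrow margin" cut-off and the names `triage` and `Triage` are assumptions made for illustration. The example inputs are the Instance 2 results.

```python
from typing import NamedTuple

class Triage(NamedTuple):
    winner: str
    winner_pct: float
    margin_pp: float
    reasons: list  # why the framing should be questioned; empty means accept

def triage(results, min_winner_pct=95.0, min_margin_pp=2.0):
    """Flag a scorecard whose framing should be questioned before re-weighting.

    results maps option name -> percentage score and needs at least two options.
    min_margin_pp is a hypothetical cut-off; the source gives no exact figure.
    """
    ranked = sorted(results.items(), key=lambda kv: kv[1], reverse=True)
    (winner, top), (_, runner_up) = ranked[0], ranked[1]
    margin = round(top - runner_up, 1)
    reasons = []
    if top < min_winner_pct:
        reasons.append("winner below threshold: is the question framed correctly?")
    if margin < min_margin_pp:
        reasons.append("narrow margin: look for a missing axis or misaligned framing")
    return Triage(winner, top, margin, reasons)

# Instance 2, before and after the reframe.
stalled = triage({"Seq C": 83.6, "Seq D": 84.0})
reframed = triage({"Seq G1": 94.0, "Seq D": 84.0})

for result in (stalled, reframed):
    print(result.winner, result.winner_pct, result.margin_pp, result.reasons)
```

The helper only reports reasons to revisit the framing; it deliberately offers no way to change weights.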

Anti-patterns

  • Weight-fudging: nudging weights to make a preferred option win. This is dishonest and makes the scorecard unauditable.
  • Adding criteria post-hoc: only acceptable when the new criterion is genuinely distinct and not already captured. (The added criterion in vi-6’s scorecard, “Single-standard-concept coherence”, was legitimate because Rich’s preference was explicit and distinct from existing criteria.)
  • Hybrid as escape hatch: “use both approaches” is often the wrong answer. The vi-6 4-layer stack isn’t a hybrid; it’s one coherent multi-standard solution.

When re-weighting IS correct

Not every stalled scorecard needs reframing. Re-weighting is legitimate when:

  • Rich explicitly states a new preference (e.g. “build cost is not important to me”)
  • New information changes the relative importance of existing criteria
  • A criterion was misweighted (e.g. “reasoner performance” was weighted 5 but should have been 10 given ISO-track scale expectations)
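
One way to keep a legitimate re-weight auditable is to record every weight change together with its stated reason. The sketch below is an assumption about how that might look, not an existing tool: `WeightChange` and `reweight` are hypothetical names, and the "build cost" weight is invented; only the "reasoner performance" change from 5 to 10 comes from the example above.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class WeightChange:
    criterion: str
    old_weight: int
    new_weight: int
    reason: str

def reweight(weights, criterion, new_weight, reason):
    """Return (new_weights, change_record); refuse a change with no stated reason."""
    if not reason.strip():
        raise ValueError("a re-weight needs a stated reason")
    if criterion not in weights:
        raise KeyError(criterion)
    change = WeightChange(criterion, weights[criterion], new_weight, reason)
    # Build a new dict so the original weights stay available for comparison.
    return {**weights, criterion: new_weight}, change

# "build cost" and its weight are hypothetical filler for the example.
weights = {"reasoner performance": 5, "build cost": 5}
new_weights, change = reweight(
    weights, "reasoner performance", 10,
    reason="misweighted given ISO-track scale expectations",
)
print(change)
```

A change with an empty reason raises `ValueError`, which is the point: a nudge with no stated justification is the weight-fudging anti-pattern.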

Cross-references

  • Scorecard vi-6 (docs/superpowers/scoping/2026-04-24-scorecards/vi-faceted-classification-A23-scorecard.md) — reframe from hybrid to layered-stack
  • Scorecard Seq G1 (docs/superpowers/scoping/2026-04-24-scorecards/seq-G1-module-authoring-A24-scorecard.md) — reframe from commercial to gap-discovery
  • Scorecards R-1’ / AC-1 / W-1’ — structural refinement (not reframe, but same “don’t re-weight, restructure” principle)
  • README at docs/superpowers/scoping/2026-04-24-scorecards/README.md — meta-pattern observed in honest critique section
  • feedback_introduction_module_rule.md — specific application of reframe-beats-re-weight (answer “where does this primitive live?” via introduction-module rule rather than re-weighting Jackson-vs-Core-leanness criteria)
  • feedback_iri_verification_before_lock.md — parallel verification discipline; both discipline memories prevent silent architectural drift