When a weighted scorecard produces a winner at 85-92% and the result feels “slightly off”, the impulse is to adjust weights (“criterion X should be more important”, etc.). Resist that impulse. Instead, ask whether the QUESTION is framed correctly.
The pattern
Three times on 2026-04-23/04-24, scorecards stalled in the mid-range until Rich reframed the question, at which point a structurally superior option emerged that hadn’t been in the original option set:
Instance 1 — Angle (vi) faceted classification
Initial framing: “hybrid per-facet matrix (SKOS for taxonomies + OWL class for classifications + DataProperty for enums)”
Stall at: vi-4 83.6% vs vi-1 82.8% (1.6pp margin; flat comparison)
Rich’s reframe: “find a permutation of standard concepts that is incredibly suitable; this feels like hybrid”
Emergent option: vi-6 (OMG Commons Classifiers 4-layer stack: cmns-cls + SKOS + XKOS + CIDOC E55 Type bridge)
New result: vi-6 94.6% vs vi-4 83.6% (+11pp)
Pattern: “Hybrid” was an admission of not finding the right standard. Reframing to “single coherent layered stack of standards” surfaced vi-6 as THE answer FIBO already uses.
Instance 2 — Phase-1 module authoring sequence (A-24)
Initial framing: “which sequence best enables commercial velocity + partner engagement?”
Stall at: Seq C (commercial velocity) 83.6% vs Seq D (balanced) 84.0% (0.4pp margin)
Rich’s reframe: “which approach is most suitable for helping us discover any gaps in our plan, and then respond to them?”
Emergent option: Seq G1 (Core → Assets → Wills E&W minimal → Delegation → Catalogue → Trusts → Probate; gap-discovery-optimal)
New result: Seq G1 94.0% vs Seq D 84.0% (+10pp)
Pattern: The question wasn’t “commercial velocity”; it was “gap discovery.” Reframing surfaced a sequence (G1) that wasn’t in the original A/B/C/D/E option set; it was synthesised from the gap-discovery framing.
Instance 3 — Role placement / facet placement / testamentary-instrument boundary
Initial baselines: R-1 (92%), AC-1 (already thought through as “current”), W-1 (94.2%)
Rich’s prompt: “can you find a way to improve R-1 to 100%?”
Refinements applied (orthogonal structural improvements, NOT weight adjustments):
- R-1’: introduction-module rule + no artificial specialisation + SKOS/CIDOC annotations → 100%
- AC-1: introduction-module + Core-fallback (honestly named per L2) → 100%
- W-1’: TestamentaryInstrument abstract parent + Trusts-imports-LegacyLetter + Phase 1.5 subclass plan → 100%
Pattern: None of the refinements were weight-fudging. Each addressed a specific criterion-weakness directly via structural improvement.
Why this works
- Scorecards constrain thinking to their option set. If the right answer isn’t in the original options, no weight adjustment can surface it.
- A stalled scorecard is diagnostic. An 85-92% winner with a narrow margin often means the question framing is missing something.
- Reframing is cheap. Adding a new option to consider is faster than re-scoring 15 criteria across existing options.
- Research typically surfaces the reframing. vi-6 came from Agent 4 research. Seq G1 came from stepping back to ask “what is this decision actually for?”
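The first point can be illustrated with a minimal sketch. Everything in it is hypothetical: the option names A/B/C, the three criteria, and all scores and weights are invented for illustration and are not taken from any of the scorecards above. It assumes a scorecard is a weighted average of per-criterion scores in [0, 1].

```python
from typing import Dict

Scores = Dict[str, float]  # criterion name -> raw score in [0, 1]


def weighted_pct(scores: Scores, weights: Dict[str, float]) -> float:
    """Weighted average of the criterion scores, expressed as a percentage."""
    total = sum(weights.values())
    return 100.0 * sum(w * scores[c] for c, w in weights.items()) / total


def winner(options: Dict[str, Scores], weights: Dict[str, float]) -> str:
    """Name of the highest-scoring option under the given weights."""
    return max(options, key=lambda name: weighted_pct(options[name], weights))


# Hypothetical scorecard: two options, three criteria.
weights = {"fit": 10, "cost": 5, "coherence": 5}
options = {
    "A": {"fit": 0.9, "cost": 0.8, "coherence": 0.7},
    "B": {"fit": 0.8, "cost": 0.9, "coherence": 0.9},
}
first_pick = winner(options, weights)

# Re-weighting can flip the pick, but only between options already listed.
reweighted = {"fit": 20, "cost": 5, "coherence": 5}
second_pick = winner(options, reweighted)

# A reframed question adds a new option; the weights stay untouched.
options_reframed = dict(options)
options_reframed["C"] = {"fit": 0.95, "cost": 0.9, "coherence": 1.0}
third_pick = winner(options_reframed, weights)
```

In this sketch, re-weighting moves the pick from B to A, yet neither clears the mid-80s. Only the added option C changes the picture, and it does so under the original weights.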
How to apply
When a scorecard is being built:
- Before scoring, ask: “Is there an option I haven’t considered?”
- After scoring, if winner is <95%: stop + ask “Is the question framed correctly?”
- If winner feels slightly off: do NOT rush to re-weight. Look for a reframing first.
- If stalled between two options: the question almost certainly has a missing axis or misaligned framing.
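The post-scoring checks above can be sketched as a small gate that runs before anyone touches the weights. The 95% floor comes from the checklist; the 2pp stall margin is an assumption, since the text only says “narrow”. The function name and the status labels are hypothetical.

```python
from typing import Dict, Tuple

WINNER_FLOOR = 95.0  # from the checklist: a winner below 95% triggers a framing check
STALL_MARGIN = 2.0   # assumption: the text does not quantify a "narrow" margin


def check_scorecard(results: Dict[str, float]) -> Tuple[str, str]:
    """Return (leading option, status) for a scored option set.

    Status is one of:
      "stalled"     - top two options are within STALL_MARGIN; look for a missing axis
      "below-floor" - clear leader, but under WINNER_FLOOR; ask whether the question is framed correctly
      "clear"       - leader is at or above the floor with a comfortable margin
    """
    ranked = sorted(results.items(), key=lambda kv: kv[1], reverse=True)
    top_name, top = ranked[0]
    margin = top - ranked[1][1] if len(ranked) > 1 else float("inf")
    if margin <= STALL_MARGIN:
        return top_name, "stalled"
    if top < WINNER_FLOOR:
        return top_name, "below-floor"
    return top_name, "clear"


# Instance 2 before the reframe: Seq C 83.6% vs Seq D 84.0%.
before = check_scorecard({"Seq C": 83.6, "Seq D": 84.0})

# Instance 2 after the reframe: Seq G1 94.0% vs Seq D 84.0%.
after = check_scorecard({"Seq G1": 94.0, "Seq D": 84.0})
```

Neither “stalled” nor “below-floor” suggests a weight change. Both are prompts to revisit the question and the option set first.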
Anti-patterns
- Weight-fudging — nudging weights to make a preferred option win. Dishonest + scorecard becomes unauditable.
- Adding criteria post-hoc — unless the new criterion is genuinely distinct and not already captured. (The added criterion in vi-6’s scorecard — “Single-standard-concept coherence” — was legitimate because Rich’s preference was explicit and distinct from existing criteria.)
- Hybrid as escape hatch — “use both approaches” is often the wrong answer. The vi-6 4-layer stack isn’t a hybrid; it’s one coherent multi-standard solution.
When re-weighting IS correct
Not every stalled scorecard needs reframing. Re-weighting is legitimate when:
- Rich explicitly states a new preference (e.g. “build cost is not important to me”)
- New information changes the relative importance of existing criteria
- A criterion was misweighted (e.g. “reasoner performance” was weighted 5 but should have been 10 given ISO-track scale expectations)
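One way to keep a legitimate re-weight auditable is to record each change together with which of the three reasons above justifies it. The following is a sketch only; the `Reason` enum, the `WeightChange` record, and `apply_change` are hypothetical names, not part of any existing scorecard tooling.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Dict, List


class Reason(Enum):
    EXPLICIT_PREFERENCE = "explicit preference stated"
    NEW_INFORMATION = "new information"
    MISWEIGHTED = "criterion was misweighted"


@dataclass(frozen=True)
class WeightChange:
    criterion: str
    old: float
    new: float
    reason: Reason
    note: str  # free-text justification, kept alongside the scorecard


def apply_change(weights: Dict[str, float], change: WeightChange,
                 log: List[WeightChange]) -> Dict[str, float]:
    """Return a new weight table with the change applied, and append it to the log."""
    if not change.note.strip():
        raise ValueError("a re-weight needs a written justification")
    if weights[change.criterion] != change.old:
        raise ValueError("recorded old weight does not match the scorecard")
    updated = dict(weights)  # the original table is left untouched
    updated[change.criterion] = change.new
    log.append(change)
    return updated


# The misweighting example from the list above; "build cost" at 5 is a hypothetical filler.
audit_log: List[WeightChange] = []
original = {"reasoner performance": 5, "build cost": 5}
revised = apply_change(
    original,
    WeightChange("reasoner performance", 5, 10, Reason.MISWEIGHTED,
                 "ISO-track scale expectations"),
    audit_log,
)
```

A change with no stated reason is rejected, which is the difference between a documented re-weight and weight-fudging.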
Cross-references
- Scorecard vi-6 (docs/superpowers/scoping/2026-04-24-scorecards/vi-faceted-classification-A23-scorecard.md): reframe from hybrid to layered-stack
- Scorecard Seq G1 (docs/superpowers/scoping/2026-04-24-scorecards/seq-G1-module-authoring-A24-scorecard.md): reframe from commercial to gap-discovery
- Scorecards R-1’ / AC-1 / W-1’: structural refinement (not reframe, but same “don’t re-weight, restructure” principle)
- README at docs/superpowers/scoping/2026-04-24-scorecards/README.md: meta-pattern observed in honest critique section
Related discipline
- feedback_introduction_module_rule.md: specific application of reframe-beats-re-weight (answer “where does this primitive live?” via introduction-module rule rather than re-weighting Jackson-vs-Core-leanness criteria)
- feedback_iri_verification_before_lock.md: parallel verification discipline; both discipline memories prevent silent architectural drift