When a scorecard’s top options are narrow-margin AND carry a sense of “imperfection” — even if above the reframe-trigger threshold — Claude must proactively initiate a deep-dive for refinements + reframings without waiting for Rich to request it explicitly.
Why
Rich’s directive on 2026-04-24 during T-015 G6-7 cultural-disposition adjudication:
“Despite the high scores, D and B feel ‘imperfect’. I feel this topic is potentially going to lead to some awkwardness. I would like you to dig deeper and try to identify improvements to B and D, to see if we can find a way for one to be the more obvious choice”
After Claude dug deeper, Option B’ emerged at 97.4%, a genuine clean winner over B’s 93.2% and D’s 93.8%. B’ was the right answer, but it would not have been surfaced without Rich’s intervention.
Rich followed:
“we will select B’ but i am worried that i needed to ask for a deep dive to discover this”
This is a meta-feedback point: the scorecard-first process is too mechanical. It presents narrow-margin leaders as settled when the underlying options may be imperfect approximations of a better-framed solution.
How to apply
Rule 1 — Detect imperfection signals
Before locking a scorecard, watch for:
- Narrow margin between top options (< 5 pp between #1 and #2) — suggests no option fully dominates
- Both top options scoring 3 or 4 on the SAME criterion — suggests the options share a structural limitation
- Top options are variants of the same approach — suggests a better alternative exists outside the current option set
- A recently adopted framework wasn’t applied (e.g., the T-022 F+ CIDOC E30 Right framework should inform later tension adjudications)
- Semantic fit 3-4/5 rather than 5/5 — suggests forcing the concept into the wrong shape
- Regression-completeness 4/5 — suggests the option doesn’t capture the primitive’s full structure
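The signal checks above can be sketched in code. This is a minimal illustration, not the actual scorecard tooling: the `Option` class, the `detect_signals` function, the criterion keys `semantic_fit` and `regression_completeness`, and the example options X and Y are all hypothetical. The two judgment-based signals (same-approach variants, framework not applied) are passed in as flags because they cannot be computed from scores alone.

```python
from dataclasses import dataclass, field

NARROW_MARGIN_PP = 5.0  # Rule 1: "< 5 pp between #1 and #2"


@dataclass
class Option:
    """Hypothetical scorecard row: overall percentage plus per-criterion 1-5 scores."""
    name: str
    total_pct: float
    scores: dict = field(default_factory=dict)


def detect_signals(options, same_approach_variants=False, framework_not_applied=False):
    """Return the Rule 1 imperfection signals present for the top two options."""
    ranked = sorted(options, key=lambda o: o.total_pct, reverse=True)
    first, second = ranked[0], ranked[1]
    top = (first, second)
    signals = []

    # Narrow margin between #1 and #2.
    if first.total_pct - second.total_pct < NARROW_MARGIN_PP:
        signals.append("narrow_margin")

    # Both top options score 3 or 4 on the same criterion.
    shared = [c for c in first.scores
              if first.scores[c] in (3, 4) and second.scores.get(c) in (3, 4)]
    if shared:
        signals.append("shared_weak_criterion")

    # Judgment calls supplied by the reviewer (assumed inputs).
    if same_approach_variants:
        signals.append("same_approach_variants")
    if framework_not_applied:
        signals.append("framework_not_applied")

    # Criterion-specific signals named in the list above.
    if any(o.scores.get("semantic_fit") in (3, 4) for o in top):
        signals.append("semantic_fit_below_5")
    if any(o.scores.get("regression_completeness") == 4 for o in top):
        signals.append("regression_completeness_4")

    return signals


# Hypothetical options, not taken from any real scorecard.
x = Option("X", 91.0, {"semantic_fit": 4, "regression_completeness": 5})
y = Option("Y", 89.5, {"semantic_fit": 4, "regression_completeness": 4})
signals = detect_signals([x, y], same_approach_variants=True)
```

Keeping the judgment-based signals as explicit arguments makes it visible that they were considered, rather than silently defaulting to "not present".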
Rule 2 — Automatic refinement search
If ≥2 of the above signals trigger:
- Pause scoring conclusion
- Run a refinement pass asking: “What’s the clean option the current set is approximating?”
- Apply recently-adopted frameworks to the current primitive (CIDOC E30 Right, AC-1 two-rule, prov:Activity subtyping, lean-to-InheritKit)
- Consider multi-dimensional framings (is this a Right? an Activity? a Constraint? a Facet? An entity with multiple aspects?)
- Consider rich-structure framings — if v6.6 primitive is >100 LOC, it likely has rich property structure that flat-classifier treatments miss
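The Rule 2 threshold can be expressed as a small gate. The sketch below assumes signals arrive as a list of short string labels; the names `MIN_SIGNALS`, `REFINEMENT_STEPS`, and `plan_refinement` are hypothetical, and the step wording is paraphrased from the list above.

```python
MIN_SIGNALS = 2  # Rule 2: "If ≥2 of the above signals trigger"

REFINEMENT_STEPS = [
    "Pause the scoring conclusion",
    "Ask: what is the clean option the current set is approximating?",
    "Apply recently adopted frameworks to the current primitive",
    "Consider multi-dimensional framings",
    "Consider rich-structure framings",
]


def plan_refinement(signals):
    """Return the refinement steps if enough distinct signals fired, else an empty list."""
    if len(set(signals)) >= MIN_SIGNALS:
        return list(REFINEMENT_STEPS)
    return []


one_signal = plan_refinement(["narrow_margin"])
two_signals = plan_refinement(["narrow_margin", "framework_not_applied"])
```

With a single signal the function returns an empty list, so scoring proceeds as usual; with two or more it returns the full checklist, which has to be worked through before the scorecard is locked.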
Rule 3 — Surface refinement candidates proactively
When a refinement candidate emerges, present it AS A NEW OPTION in the scorecard, not as a post-hoc “also consider”. Include:
- What the current top options are missing
- How the refinement addresses the gap
- Scoring against the same 15 criteria
- Whether the margin is decisive (≥5 pp) or still narrow
Rule 4 — If no refinement emerges, acknowledge imperfection
If deep-dive produces no clean-winner refinement:
- Honestly state “top options are genuinely imperfect compromises”
- Surface the remaining tension for Rich to decide
- Don’t pretend the top option is clean when it isn’t
Anti-patterns
- ❌ Presenting a narrow-margin B/D scorecard with “my lean: B” (or D) when a better B’ exists but hasn’t been surfaced
- ❌ Treating “above reframe-trigger 92%” as permission to stop exploring
- ❌ Scoring 5 options + recommending the highest without asking “is there a 6th option we haven’t considered?”
- ❌ Not applying recently-adopted architectural frameworks to later decisions
- ❌ Waiting for Rich to ask “dig deeper” before doing so
Triggers
Any scorecard where:
- Narrow margin (< 5 pp) between top options
- Top options score 3-4 on semantic-precision or regression-completeness
- Recent architectural framework (E30 Right, prov:Activity, AC-1, lean-to-InheritKit) not applied
- Top options are structural variants rather than categorically different
- v6.6 primitive with >100 LOC being treated as flat-value
When triggered: automatically deep-dive BEFORE presenting final scorecard to Rich.
Related memories
- feedback_reframe_beats_reweight.md: related discipline for 85-92% stalls
- feedback_always_display_full_scorecard.md: discipline for in-chat scorecard display
- feedback_always_save_scorecards.md: persistence discipline
- feedback_scorecards_one_at_a_time_optimal_sequence.md: one-at-a-time sub-decision discipline
Example application (T-015 G6-7 retrospective)
Signals that SHOULD have triggered auto-deep-dive:
- B 93.2% vs D 93.8% — only 0.6 pp margin ✓
- Both scored 4/5 on regression-completeness ✓
- T-022 F+ CIDOC E30 Right framework NOT applied to cultural-disposition ✓
- v6.6 cultural-disposition is 128 LOC — rich structure ✓
- B and D are structural variants of the same facet approach ✓
Under Rule 2, auto-refinement search would have produced B’ (97.4%) without Rich’s intervention.
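As a quick check, the retrospective can be tallied against the Rule 2 threshold. The figures (93.2%, 93.8%, 128 LOC) and the yes/no observations come from the list above; the dictionary keys and variable names are illustrative only.

```python
# Signals observed in the T-015 G6-7 retrospective, as listed above.
t015_signals = {
    "narrow_margin": abs(93.8 - 93.2) < 5.0,            # 0.6 pp between D and B
    "shared_4_of_5_regression_completeness": True,      # both B and D scored 4/5
    "recent_framework_not_applied": True,               # T-022 F+ CIDOC E30 Right
    "rich_structure_over_100_loc": 128 > 100,           # v6.6 cultural-disposition
    "structural_variants": True,                        # B and D share the facet approach
}

triggered = [name for name, present in t015_signals.items() if present]
auto_deep_dive = len(triggered) >= 2  # Rule 2 threshold

print(len(triggered), auto_deep_dive)  # 5 True
```

All five signals were present, well above the two needed to trigger the refinement pass.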