ζ-Q3 ε.ι S7 Mondo Disease Ontology precedent inspection — outcome-VALIDATED-WITH-NOTE 2026-05-02
Outcome
outcome-VALIDATED-WITH-NOTE — Mondo is a production-grade SSSOM precedent at scale (363,132 mapping rows across 73 TSV files; 109,384 rows in combined mondo.sssom.tsv; ODKfull-docker CI on Jenkins + GitHub Actions). Several patterns transfer cleanly to INHERIT v2’s ω.η lock; some patterns explicitly do not transfer.
Key surprise: Mondo does NOT use the SSSOM 1.1 curation_rule slot anywhere in 73 TSVs. INHERIT v2 ζ-Q5 ψ.γ’ lock is AHEAD of Mondo’s production usage. INHERIT v2’s curation_rule design is FORWARD-LOOKING; Mondo’s mapping_justification (e.g., semapv:ManualMappingCuration) is the current-production baseline.
Findings
| Aspect | Finding |
|---|---|
| SSSOM TSV count | 73 (1 combined + 71 per-source-per-predicate + 1 generated merged view) |
| Total mapping rows | 363,132 |
| Combined mondo.sssom.tsv | 109,384 rows |
| Predicate distribution (combined) | 99.92% skos:exactMatch + 0.08% skos:broadMatch |
| Mapping_justification (combined) | 99.95% semapv:UnspecifiedMatching (loses curation attribution that per-source preserves) |
| Mapping_justification (per-source) | semapv:ManualMappingCuration (curated) |
| Schema columns | Minimal 5-6: subject_id / subject_label / predicate_id / object_id / object_label / mapping_justification |
| curation_rule slot | NOT USED anywhere |
| confidence column | NOT USED |
| creator_id column | NOT USED |
| extension_definitions | NOT USED |
| CI runtime | obolibrary/odkfull:v1.6 container |
| GitHub Actions | 17 workflows; PR-blocking on master via main.yaml |
| Jenkins | Jenkinsfile runs make test with 100 GB RAM + 128 GB ROBOT_JAVA_ARGS |
| Per-source split distribution | 38 hasdbxref + 14 exactmatch + 5 each (broadmatch / narrowmatch / relatedmatch) + 4 closematch + 1 combined |
| License | CC0-1.0 |
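
The column-usage and predicate-distribution rows above can be reproduced with a short profiling pass. The following is a minimal sketch, assuming the SSSOM TSV carries its metadata as `#`-prefixed header lines ahead of the column row; `SAMPLE_TSV`, `read_sssom_rows` and `profile` are hypothetical names, and the rows are illustrative rather than copied from Mondo.

```python
import csv
from collections import Counter

# Hypothetical miniature SSSOM TSV (illustrative rows, not Mondo data).
SAMPLE_TSV = (
    "# curie_map:\n"
    "#   SRC: http://example.org/src/\n"
    "#   TGT: http://example.org/tgt/\n"
    "subject_id\tpredicate_id\tobject_id\tmapping_justification\n"
    "SRC:1\tskos:exactMatch\tTGT:1\tsemapv:ManualMappingCuration\n"
    "SRC:2\tskos:exactMatch\tTGT:2\tsemapv:ManualMappingCuration\n"
    "SRC:3\tskos:broadMatch\tTGT:3\tsemapv:ManualMappingCuration\n"
)


def read_sssom_rows(text):
    """Return (columns, rows), skipping the '#'-prefixed metadata header."""
    body = [line for line in text.splitlines() if line and not line.startswith("#")]
    reader = csv.DictReader(body, delimiter="\t")
    rows = list(reader)
    return list(reader.fieldnames or []), rows


def profile(text, optional=("curation_rule", "confidence", "creator_id")):
    """Count predicates and list which optional columns are absent."""
    columns, rows = read_sssom_rows(text)
    predicates = Counter(row["predicate_id"] for row in rows)
    unused = [name for name in optional if name not in columns]
    return predicates, unused


predicates, unused = profile(SAMPLE_TSV)
total = sum(predicates.values())
for predicate, count in predicates.most_common():
    print(f"{predicate}: {count / total:.2%}")
print("optional columns not used:", unused)
```

Run per file, the same pass yields a per-source table; the sketch only checks whether a column exists, not whether it is populated.
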
Transferable patterns (5)
- SSSOM-canonical pattern at production scale (109K mappings; CC0; CI-validated)
- Curie_map metadata block in TSV header (per-file scoped curie maps; SSSOM 1.0/1.1 spec compliant)
- PR-blocking SSSOM validation in CI (main.yaml runs validation on every PR to master)
- Combined-file canonical + per-source-files derived (Mondo: mondo.sssom.tsv canonical + 71 per-source generated views)
- mapping_set_version pinned to release date (NEW Phase-1 Sprint S2 task candidate ~½ day)
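
The curie_map and PR-blocking patterns combine naturally into one gate: fail the PR when a mapping row uses a prefix that the file's header does not declare. The sketch below is an assumption-laden illustration, not Mondo's actual validation step (which this note does not reproduce). It hand-parses the commented header instead of using a YAML library, and it treats every prefix as needing a declaration, which is stricter than a validator that knows built-in prefixes. `DEMO_TSV`, `declared_prefixes`, `undeclared_prefixes` and `gate` are hypothetical names.

```python
import csv

# Hypothetical file: 'semapv' is deliberately left out of the curie_map.
DEMO_TSV = (
    "# curie_map:\n"
    "#   SRC: http://example.org/src/\n"
    "#   TGT: http://example.org/tgt/\n"
    "#   skos: http://www.w3.org/2004/02/skos/core#\n"
    "# mapping_set_id: http://example.org/sets/demo.sssom.tsv\n"
    "subject_id\tpredicate_id\tobject_id\tmapping_justification\n"
    "SRC:1\tskos:exactMatch\tTGT:1\tsemapv:ManualMappingCuration\n"
)

CURIE_COLUMNS = ("subject_id", "predicate_id", "object_id", "mapping_justification")


def declared_prefixes(text):
    """Collect prefixes nested under 'curie_map:' in the commented header."""
    prefixes, in_map, map_indent = set(), False, 0
    for line in text.splitlines():
        if not line.startswith("#"):
            break  # header is over once the column row starts
        content = line[1:]
        stripped = content.strip()
        indent = len(content) - len(content.lstrip())
        if stripped == "curie_map:":
            in_map, map_indent = True, indent
        elif in_map and indent > map_indent and ":" in stripped:
            prefixes.add(stripped.split(":", 1)[0])
        else:
            in_map = False  # a sibling key ends the curie_map block
    return prefixes


def undeclared_prefixes(text):
    """Return sorted prefixes used in CURIE columns but missing from curie_map."""
    declared = declared_prefixes(text)
    body = [line for line in text.splitlines() if line and not line.startswith("#")]
    used = set()
    for row in csv.DictReader(body, delimiter="\t"):
        for column in CURIE_COLUMNS:
            value = row.get(column) or ""
            if ":" in value:
                used.add(value.split(":", 1)[0])
    return sorted(used - declared)


def gate(text):
    """Exit code a CI step could return: 0 passes, 1 blocks the PR."""
    return 1 if undeclared_prefixes(text) else 0


print(undeclared_prefixes(DEMO_TSV), gate(DEMO_TSV))
```

A workflow step would pass the return value of `gate` to `sys.exit`, so that branch protection turns a non-zero result into a blocked merge.
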
NOT-transferable patterns (5)
A. Minimal schema (no curation_rule / confidence / creator_id) — INHERIT v2 retains 7-column schema for forward-looking traceability
B. semapv:UnspecifiedMatching in combined file (loses per-mapping curation attribution)
C. ODKfull docker container (massively over-engineered at INHERIT v2’s 30-50 alignment scale)
D. Jenkins-based CI redundancy (over-belt-and-braces; PR-blocking GitHub Actions sufficient)
E. NO SSSOM 1.1 curation_rule slot — INHERIT v2 ζ-Q5 ψ.γ’ lock STANDS but acknowledges Mondo precedent gap; INHERIT v2 is forward-looking
Implications for Phase E Task 13 ε.ι lock-decision
- ω.η Q-004 lock STANDS — Mondo precedent confirms SSSOM-canonical pattern at production scale (109K mappings).
- A-22 single-file canonical lock STANDS — Mondo’s hybrid combined+per-source pattern shows single-file canonical is right for INHERIT v2’s 30-50 alignment scale.
- NEW Phase-1 Sprint S2 task candidate (lock-time): pin mapping_set_version to release date.
- NEW A-21 CI gate candidate (lock-time): G-SSSOM-PR-BLOCKING (already implied by branch protection but worth explicit gate-row).
- ζ-Q5 ψ.γ’ curation_rule lock STANDS but acknowledged forward-looking; Mondo precedent does NOT use curation_rule.
- Phase-1.5 trigger candidate: if INHERIT v2 alignment count exceeds 100-200 (Year-2+), evaluate ODKfull adoption + per-source-split.
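
The mapping_set_version pin from the list above could be enforced with a check like the one below. This is a minimal sketch under two assumptions not stated in this note: the version is written as an ISO `YYYY-MM-DD` date, and it sits as a top-level key in the `#`-prefixed header. `HEADER_DEMO`, `mapping_set_version` and `is_pinned_to` are hypothetical names.

```python
from datetime import date

# Hypothetical header; the date format is an assumption of this sketch.
HEADER_DEMO = (
    "# curie_map:\n"
    "#   skos: http://www.w3.org/2004/02/skos/core#\n"
    "# mapping_set_version: 2026-05-02\n"
    "subject_id\tpredicate_id\tobject_id\tmapping_justification\n"
)


def mapping_set_version(text):
    """Return the mapping_set_version value from the header, or None."""
    for line in text.splitlines():
        if not line.startswith("#"):
            break  # stop at the column row
        key, sep, value = line[1:].strip().partition(":")
        if sep and key.strip() == "mapping_set_version":
            return value.strip().strip("'\"")
    return None


def is_pinned_to(text, release):
    """True when the header version equals the release date in ISO form."""
    return mapping_set_version(text) == release.isoformat()


print(mapping_set_version(HEADER_DEMO))
print(is_pinned_to(HEADER_DEMO, date(2026, 5, 2)))
```

A release script would call `is_pinned_to` with the release date it is about to tag and refuse to continue on a mismatch.
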
Cross-references
- T-file: ~/off-github/library/projects/inherit/T-spike-eps-iota-S7-mondo-precedent-2026-05-02.md v1.0 (22 KB, 208 lines)
- arch-state v3.27 → v3.28 §11 S7 row LANDED + Changelog row
- Q-003 v1.9 → v1.10 §10 S7 row LANDED + CHANGELOG entry
- MEMORY.md +1 entry
- Active-work-log entry
- Companion artefacts: sample-and-grep on /tmp/spike-s7-mondo/ (1.1 GB; 0.007% sample sufficient per plan §5)
Methodological observations
- First negative finding from precedent inspection (Mondo does not use the SSSOM 1.1 curation_rule slot). NEGATIVE findings are first-class deliverables: they identify INHERIT v2 design choices that are FORWARD-LOOKING relative to current production practice.
- Sample-and-grep scope discipline followed per plan §5 (1.1 GB corpus; 0.007% sample sufficient for precedent assessment).
- 12th consecutive spike (S2+S2.5+S3+S2.10+S2.6+S4+S2.9+S2.9b+S5+S8+S6+S7) with the logging contract closed within the same session.