Trigger event: the Bratanič + Negro vocabulary probe found that full Option C (RDF/OWL + SHACL + JSON-LD + Rego + Akoma Ntoso + Oxigraph) does NOT have 2025 practitioner-community endorsement. Both books converge on Neo4j + Cypher + schema-in-prompt; SHACL, JSON-LD, triplestore products and competency questions all scored zero mentions. Negro engages with RDF/OWL as an ontology-publication format but sets handleVocabUris: IGNORE to actively discard Web-architectural URIs at runtime.

Rich’s 18 April 2026 directive (verbatim): “when the books have been read, we need to have a good think about coming up with option D, and potentially E, which utilise the technologies you can see are, as of April 2026, regarded as the most current and usable. I still like my ‘Lines of Code’ metric but am open to other ways of looking at it. I think some potential investors/buyers will expect Testate Technologies to be using our own LLMS in some capacity”

Rich’s 18 April 2026 reinforcement (after Tamò-Larrieux read): “I am delighted to see option D. I would like you to be brave and work on D and E (AI-native) more than C”

Directional consequence: the synthesis should now centre D and E as Rich’s preferred direction, with C demoted to “legacy comparison baseline” alongside B. Be brave in the Option E design: a TT fine-tuned LLM as canonical interpreter is explicitly on the table, not a hedging alternative.

Tamò-Larrieux adds (19 April 2026):

  • Catala DSL already has worked inheritance tax examples — direct domain adjacency for Option E’s formal-rules layer
  • OpenFisca is production rule-as-code infrastructure across 30+ countries — viable simulation-grade engine
  • ACE Attempto controlled language has empirical readability evidence (Flesch-Kincaid +14, comprehension improvement) — a drafting-layer option
  • ELI/ECLI IRI identifiers are the European-Commission-backed pattern for legal-document identity — stable URIs for 80-year legislative replay
  • Zero Akoma Ntoso across a 209pp specialist legal-informatics book — even more damning than Bratanič/Negro’s zeros; Akoma Ntoso is not current in the 2025 legal-informatics practitioner mainstream
  • Zero SHACL / SPARQL / JSON-LD / triplestores in Tamò-Larrieux — convergent disconfirmation across both KG+LLM and legal-informatics literatures
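
The zero-mention counts above are the output of a vocabulary probe. The notes do not record how the probe was run, so the following is only a minimal sketch of one way to do it, assuming each book is available as plain text. The term list and the sample sentence are invented for illustration and are not quotations from any of the books.

```python
import re
from collections import Counter

# Hypothetical term list; the real probe vocabulary is not given in these notes.
PROBE_TERMS = ["SHACL", "SPARQL", "JSON-LD", "Akoma Ntoso", "Cypher", "Neo4j"]


def probe(text: str, terms=PROBE_TERMS) -> Counter:
    """Count case-insensitive whole-term mentions of each probe term in text."""
    counts = Counter()
    for term in terms:
        # Lookarounds instead of \b so that terms containing hyphens or
        # spaces (JSON-LD, Akoma Ntoso) are still matched as whole units.
        pattern = r"(?<![A-Za-z0-9])" + re.escape(term) + r"(?![A-Za-z0-9])"
        counts[term] = len(re.findall(pattern, text, flags=re.IGNORECASE))
    return counts


# Made-up sample text, standing in for one book's extracted contents.
sample = (
    "We load the graph into Neo4j and query it with Cypher. "
    "Cypher templates go in the prompt."
)
result = probe(sample)
```

Running the same function over each book and tabulating the counters side by side would give the kind of "zero across all sources" comparison the bullets describe.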

Constraints carried into D/E synthesis:

  1. LOC metric preserved as primary elegance axis. Rich said “I still like my Lines of Code metric but am open to other ways of looking at it” — open to supplementary metrics (e.g. maintenance-weight ratio, time-to-new-jurisdiction, onboarding time, acquirer-valuation proxies) but LOC remains load-bearing.

  2. Own-LLM requirement is new. TT must have a credible “we operate our own LLM layer” story for acquirer appeal. This changes the Option set: the standard and implementation now need to accommodate (a) a TT-hosted fine-tuned LLM, (b) an MCP server exposing standard operations, (c) the possibility of an embedded model acting as reference interpreter, (d) a training-data strategy that converts the INHERIT corpus into an LLM moat.

  3. Currency filter (April 2026). Only technologies that the 2025-2026 practitioner literature endorses as mature + usable count as “current”. Disqualifies (by April 2026 evidence):

    • SHACL as runtime validator (zero mentions across Bratanič, Negro and Tamò-Larrieux)
    • Triplestore products (Oxigraph, Jena, Stardog, GraphDB, Blazegraph, Virtuoso) at runtime
    • Competency-question-driven design as primary methodology (zero mentions)
    • Full pure-RDF stack as production substrate

  4. Qualifies as current (April 2026 evidence):

    • Neo4j + Cypher + neosemantics (Negro, Bratanič: both centre here)
    • Pydantic + structured outputs for LLM extraction
    • RDF/OWL as ontology-publication format (Negro engages, HPO/SNOMED/UMLS)
    • Schema-in-prompt pattern
    • Graph-RAG (vector + graph hybrid retrieval)
    • MCP (Model Context Protocol) for agent interop
    • JSON-LD for Web interop (neutral: zero mentions in the probed books, but nothing argues against it as an optional interop layer)
    • Rego/OPA for policy-as-code (from Jimmy Ray, O’Reilly 2024 — already in library)
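
The schema-in-prompt and structured-output items above can be sketched in a few lines. This is a stdlib-only stand-in, assuming a hand-written JSON Schema dict; in a Pydantic v2 setup the schema would normally come from the model class and validation from the model itself. The bequest shape and its field names are hypothetical, not part of any agreed standard.

```python
import json

# Hypothetical wire shape for one extracted bequest; field names are illustrative only.
BEQUEST_SCHEMA = {
    "type": "object",
    "properties": {
        "beneficiary": {"type": "string"},
        "asset": {"type": "string"},
        "share_percent": {"type": "number"},
    },
    "required": ["beneficiary", "asset", "share_percent"],
}

# Mapping from the JSON Schema primitive names used above to Python types.
_PY_TYPES = {"string": str, "number": (int, float)}


def build_prompt(clause: str) -> str:
    """Embed the schema in the prompt so the model sees the exact output shape."""
    return (
        "Extract the bequest from the clause below.\n"
        "Reply with JSON matching this schema and nothing else:\n"
        f"{json.dumps(BEQUEST_SCHEMA, indent=2)}\n\n"
        f"Clause: {clause}"
    )


def validate_reply(raw: str) -> dict:
    """Parse a model reply and check required keys and primitive types."""
    data = json.loads(raw)
    for key in BEQUEST_SCHEMA["required"]:
        if key not in data:
            raise ValueError(f"missing field: {key}")
    for key, spec in BEQUEST_SCHEMA["properties"].items():
        if key in data and not isinstance(data[key], _PY_TYPES[spec["type"]]):
            raise ValueError(f"wrong type for field: {key}")
    return data


prompt = build_prompt("I leave half of my house to my daughter Ana.")
# A reply of the kind a model might return; written by hand here, not model output.
parsed = validate_reply('{"beneficiary": "Ana", "asset": "house", "share_percent": 50}')
```

The point of the pattern is that the same schema object drives both the prompt and the validation, so the two cannot drift apart.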

Likely shape of Option D (hypothesis, to be refined after books 3 + 4):

“LPG-first hybrid — property graph substrate, JSON Schema / Pydantic wire shape, RDF/OWL as published ontology layer for semantic interop, Rego for policy, TT fine-tuned LLM as model-layer reference implementation, MCP server for agent access.”

  • LPG (Neo4j / Memgraph / FalkorDB) as runtime substrate — endorsed
  • Schema: JSON Schema 2020-12 or Pydantic v2 (whichever Rich prefers; both are ergonomic for schema-in-prompt use)
  • RDF/OWL core ontology published once (like Negro’s HPO/SNOMED pattern) for semantic interop — does not require RDF at runtime
  • Rego for jurisdiction-specific policy
  • TT-hosted fine-tuned LLM as canonical interpreter for cross-jurisdiction semantics
  • MCP server as the primary consumer-facing interface
  • JSON-LD optional for Web-agent interop
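
The "published as RDF/OWL, not RDF at runtime" idea in the list above amounts to dropping namespaces when ontology terms become graph labels and property keys, which is roughly the effect these notes attribute to handleVocabUris: IGNORE. A minimal sketch follows, assuming a hypothetical https://example.org/inherit# namespace and a plain dict as the node shape; keeping the original IRI in a uri property is an assumption made here so the published identity can still be recovered.

```python
def local_name(iri: str) -> str:
    """Return the part of an IRI after the last '#' or '/'."""
    cut = max(iri.rfind("#"), iri.rfind("/"))
    return iri[cut + 1:]


def to_lpg_node(iri: str, rdf_types: list[str], props: dict[str, object]) -> dict:
    """Turn one RDF-described resource into a plain label/property node."""
    return {
        "uri": iri,  # assumption: keep the published identity for round-tripping
        "labels": [local_name(t) for t in rdf_types],
        "properties": {local_name(k): v for k, v in props.items()},
    }


# Hypothetical namespace and values, for illustration only.
NS = "https://example.org/inherit#"
node = to_lpg_node(
    "https://example.org/estate/42",
    [NS + "Estate"],
    {NS + "grossValue": 500000},
)
```

Runtime queries then use short names such as Estate and grossValue, while the full IRIs live only in the published ontology and the uri property.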

Likely shape of Option E (AI-native — Rich has explicitly asked this be developed, not hedged):

“AI-native standard — TT’s fine-tuned LLM IS the reference implementation; schema defines IO contracts; Catala DSL provides formal-rules layer; OpenFisca-style simulator provides what-if analysis; consumer interaction via MCP; conformance via model-based + DSL-executed test vectors.”

  • LLM-as-canonical-interpreter: TT fine-tuned LLM holds the cross-jurisdiction semantic knowledge; schema defines structured IO; model answers “what does this document mean under jurisdiction X?”
  • Catala DSL for formally-provable rules (inheritance tax, intestacy hierarchy, statutory legacy calculation) — Catala has worked inheritance-tax examples, direct domain match
  • OpenFisca-style simulation engine for what-if analysis across 21 jurisdictions — production pattern in 30+ countries
  • Controlled natural language (Attempto ACE or similar) for legal drafting layer — readable by non-lawyers, machine-parseable
  • ELI/ECLI IRI identifiers for legal-source grounding — stable across 80-year legislative replay
  • MCP server as the primary consumer-facing interface — agent-native
  • Training-data corpus becomes a commercial asset — fits AI-vendor commercial memo already in project context
  • Revenue model: MCP access fees + commercial embedded licence for InheritKit + training-data bundle licence to OpenAI/Anthropic/Google/Meta/xAI
  • Conformance: Catala-verified arithmetic + OpenFisca simulation test vectors + LLM-output comparison against golden reference
  • Moat: the fine-tuned LLM + training corpus + Catala rules library are three distinct commercial assets, each sellable separately or as a bundle
  • Highest acquirer-appeal (AI-native positioning) but highest execution risk — explicitly endorsed by Rich as the direction to be brave on
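
The conformance bullet above (formally verified arithmetic, simulation test vectors, LLM output compared against a golden reference) reduces to one mechanism: run a candidate interpreter over fixed inputs and compare against golden values. A minimal sketch, assuming numeric outputs and an absolute tolerance. GoldenVector, run_conformance and toy_rule are hypothetical names, and the toy rule is not any real jurisdiction's tax rule.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class GoldenVector:
    name: str
    inputs: dict
    expected: float  # golden value, e.g. produced by the formal-rules layer


def run_conformance(
    candidate: Callable[[dict], float],
    vectors: list[GoldenVector],
    tolerance: float = 0.005,
) -> dict[str, bool]:
    """Pass/fail per vector: candidate output must match golden within tolerance."""
    return {
        v.name: abs(candidate(v.inputs) - v.expected) <= tolerance
        for v in vectors
    }


def toy_rule(inputs: dict) -> float:
    """Illustrative only: a flat rate applied to the amount above a threshold."""
    taxable = max(0.0, inputs["estate"] - inputs["threshold"])
    return taxable * inputs["rate"]


vectors = [
    GoldenVector("below_threshold", {"estate": 100.0, "threshold": 200.0, "rate": 0.5}, 0.0),
    GoldenVector("above_threshold", {"estate": 300.0, "threshold": 200.0, "rate": 0.5}, 50.0),
]
report = run_conformance(toy_rule, vectors)
```

In the Option E shape, candidate would be a wrapper around the fine-tuned model's structured output, and the golden values would come from the DSL-executed rules rather than being typed in by hand.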

How to apply:

  1. After Raieli completes (last of the 4 books), do a vocabulary-probe summary across all 4 books plus the 19 round-1 books — what DOES the 2025-2026 literature endorse for Rich’s stack?
  2. Draft Option D spec with the LPG + RDF-published-ontology + LLM + MCP shape.
  3. Test whether Option E is distinct from D or an extension of it — may collapse into one option.
  4. Score B vs C vs D vs E on the existing 14-criterion scorecard, plus additions if Rich accepts them: maintenance-weight ratio, time-to-new-jurisdiction, acquirer-appeal-via-own-LLM, MCP-readiness.
  5. Produce LOC estimates for D and E using the same baseline methodology as the C estimate (~135-190k).
  6. Bring the 5-option comparison back to Rich for a decision gate, before any build.
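
The scoring step in the list above can be kept mechanical with a small weighted-scorecard helper. This is a sketch only: the weighting scheme is an assumption (the notes do not say whether the existing 14 criteria are weighted), and the option names X and Y and all numbers are placeholders, not real scores for B, C, D or E.

```python
def weighted_score(scores: dict[str, float], weights: dict[str, float]) -> float:
    """Weighted mean of per-criterion scores; every weighted criterion must be scored."""
    missing = set(weights) - set(scores)
    if missing:
        raise ValueError(f"unscored criteria: {sorted(missing)}")
    total_weight = sum(weights.values())
    return sum(scores[c] * w for c, w in weights.items()) / total_weight


def rank(
    options: dict[str, dict[str, float]], weights: dict[str, float]
) -> list[tuple[str, float]]:
    """Rank options from highest to lowest weighted score."""
    scored = [(name, weighted_score(s, weights)) for name, s in options.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Placeholder weights and scores, for illustration only.
weights = {"time_to_new_jurisdiction": 2, "mcp_readiness": 1}
options = {
    "X": {"time_to_new_jurisdiction": 3, "mcp_readiness": 5},
    "Y": {"time_to_new_jurisdiction": 4, "mcp_readiness": 2},
}
ranked = rank(options, weights)
```

Raising on unscored criteria is deliberate: if a new criterion is added to the scorecard, every option has to be rescored on it before the comparison can run.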