docs: add knowledge graph impact evidence across architecture and summary docs

Alex · claude · Alex · commit 43416e78d7e2 · 2026-02-22T10:23:42.000Z
Document that Terraphim uses symbolic graph reasoning (IS-A hierarchies,
relationship traversal, thesaurus grounding) with a measured 2.00x precision
improvement. Add the T790M error-prevention case study, precision benchmarks,
and KG grounding gate metrics. Clarify that no vector embeddings are used.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
diff --git a/.docs/summary-crates-terraphim-kg.md b/.docs/summary-crates-terraphim-kg.md
@@ -39,6 +39,15 @@ Medical knowledge graph library providing unified access to SNOMED CT hierarchie
 ## Dependencies
 petgraph, serde, csv, bincode, zstd, reqwest, tracing, anyhow
 
+## Measured Impact
+
+The KG provides a **2.00x overall precision improvement** over the raw LLM (source: `PIPELINE_RUN_REPORT.md`):
+- Entity extraction: 18.3% -> 37.4% (2.04x)
+- Treatment relevance: 13.3% -> 25.0% (1.88x)
+- T790M case: the KG prevented an incorrect drug recommendation (Crizotinib) and returned the correct treatment (Osimertinib, per the AURA3 trial)
+
+The `get_treatments()` method in `graph.rs` follows Treats edges and also inherits treatments from IS-A ancestors, enabling evidence-based recommendations that raw LLM inference can miss.
+
 ## Files
 - `src/graph.rs` - Core KnowledgeGraph with petgraph
 - `src/isa_hierarchy.rs` - Fast hierarchical queries
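
Note on treatment inheritance: the description of `get_treatments()` above can be illustrated with a minimal sketch. This is not the code in `graph.rs`. It assumes a simplified graph built on standard-library maps rather than petgraph, and `SketchGraph`, `add_is_a`, `add_treats`, and the placeholder concept names are all hypothetical.

```rust
use std::collections::{BTreeSet, HashMap, HashSet};

/// Hypothetical stand-in for the real graph; field and method names are
/// illustrative and not taken from `graph.rs`.
#[derive(Default)]
struct SketchGraph {
    /// child concept -> direct IS-A parents
    is_a: HashMap<String, Vec<String>>,
    /// condition -> drugs that have a direct Treats edge to it
    treated_by: HashMap<String, Vec<String>>,
}

impl SketchGraph {
    fn add_is_a(&mut self, child: &str, parent: &str) {
        self.is_a
            .entry(child.to_string())
            .or_default()
            .push(parent.to_string());
    }

    fn add_treats(&mut self, drug: &str, condition: &str) {
        self.treated_by
            .entry(condition.to_string())
            .or_default()
            .push(drug.to_string());
    }

    /// Collect treatments attached to `condition` itself and to every
    /// concept reachable from it through IS-A edges.
    fn get_treatments(&self, condition: &str) -> BTreeSet<String> {
        let mut found = BTreeSet::new();
        let mut visited = HashSet::new();
        let mut stack = vec![condition.to_string()];
        while let Some(concept) = stack.pop() {
            if !visited.insert(concept.clone()) {
                continue; // already expanded; also guards against cycles
            }
            if let Some(drugs) = self.treated_by.get(&concept) {
                found.extend(drugs.iter().cloned());
            }
            if let Some(parents) = self.is_a.get(&concept) {
                stack.extend(parents.iter().cloned());
            }
        }
        found
    }
}
```

In this sketch, a drug attached to `parent_condition` is returned for `child_condition` as well, while a drug attached only to the child is not returned for the parent.
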
diff --git a/.docs/summary-crates-terraphim-medical-agents.md b/.docs/summary-crates-terraphim-medical-agents.md
@@ -52,6 +52,13 @@ Phase 4: Validation (SafetyValidation)
 ## Dependencies
 terraphim-automata, terraphim-kg, terraphim-pgx, terraphim-thesaurus, medgemma-client, terraphim-medical-learning, tokio, serde, async-trait, uuid, chrono
 
+## Validation Impact Metrics
+
+- KG Grounding Gate: 90% pass rate, 0.95 average grounding score (source: evaluation reports)
+- Two-layer validation catches incoherent responses (e.g., "moon causes diabetes") and incomplete responses (missing required Treatment category)
+- Safety gate: 100% pass rate, Critical severity (non-bypassable)
+- 231 tests pass across workspace, 0 failures
+
 ## Files
 - `src/agents/` - 6 specialized agents + mod.rs
 - `src/protocol.rs` - Message types (Call, Cast, Info)
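
Note on two-layer validation: the coherence and completeness checks listed above might look roughly like the sketch below. The real rules live in the validation module and may differ. Here coherence is assumed to mean "every extracted entity shares a KG edge with at least one other extracted entity", and `Verdict`, `Entity`, `connected`, and `validate` are hypothetical names.

```rust
use std::collections::HashSet;

#[derive(Debug, PartialEq)]
enum Verdict {
    Pass,
    Incoherent,
    Incomplete,
}

/// An entity extracted from a response, tagged with its category.
struct Entity<'a> {
    name: &'a str,
    category: &'a str,
}

/// True if the KG holds an edge between `a` and `b` in either direction.
fn connected<'a>(edges: &HashSet<(&'a str, &'a str)>, a: &'a str, b: &'a str) -> bool {
    edges.contains(&(a, b)) || edges.contains(&(b, a))
}

fn validate<'a>(
    entities: &[Entity<'a>],
    kg_edges: &HashSet<(&'a str, &'a str)>,
    required_categories: &[&str],
) -> Verdict {
    // Layer 1 (coherence, assumed rule): each entity must be linked in the
    // KG to at least one other entity mentioned in the same response.
    if entities.len() > 1 {
        for e in entities {
            let linked = entities
                .iter()
                .any(|other| other.name != e.name && connected(kg_edges, e.name, other.name));
            if !linked {
                return Verdict::Incoherent;
            }
        }
    }
    // Layer 2 (completeness): every required category must be covered.
    for required in required_categories {
        if !entities.iter().any(|e| e.category == *required) {
            return Verdict::Incomplete;
        }
    }
    Verdict::Pass
}
```

Under these assumptions, a response pairing two entities with no KG edge between them (the "moon causes diabetes" pattern) is rejected in layer 1, and a response with no entity in a required category such as Treatment is rejected in layer 2.
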
diff --git a/.docs/summary.md b/.docs/summary.md
@@ -143,6 +143,20 @@ terraphim-evaluation (testing harness)
 - **Python Bindings**: PyO3/maturin
 - **Testing**: criterion (benchmarks), built-in test framework
 
+## Knowledge Graph Impact Evidence
+
+The Terraphim KG uses **symbolic graph reasoning** (IS-A hierarchies, relationship traversal, thesaurus grounding) -- not vector embeddings. Measured impact (source: `PIPELINE_RUN_REPORT.md`, `REAL_INFERENCE_RESULTS.md`):
+
+| Metric | Raw LLM | With Terraphim KG | Improvement |
+|--------|---------|-------------------|-------------|
+| Entity Extraction Precision | 18.3% | 37.4% | **2.04x** |
+| Treatment Relevance | 13.3% | 25.0% | **1.88x** |
+| Confidence Score | 0.45 | 0.95 | **2.11x** |
+| Medical Accuracy (T790M) | 50% | 100% | Error prevented |
+| KG Grounding Gate | N/A | 90% pass | 0.95 avg score |
+
+Key graph features: IS-A hierarchy traversal (~20ns), treatment inheritance via ancestor edges, SNOMED thesaurus grounding (0.98 exact / 0.80 substring confidence), two-layer validation (coherence + completeness), role-specific search with synonym expansion.
+
 ## Documentation Structure
 
 ```
diff --git a/docs/ARCHITECTURE.md b/docs/ARCHITECTURE.md
@@ -636,5 +636,47 @@ gauge
 
 ---
 
-*Documentation Version: 1.0*  
-*Last Updated: 2026-02-19*
+## Knowledge Graph Impact Evidence
+
+### Terminology: Symbolic Graph Reasoning (Not Vector Embeddings)
+
+The Terraphim knowledge graph uses **symbolic graph structure** -- IS-A hierarchies, explicit relationship edges, and thesaurus-based grounding -- rather than learned vector embeddings (e.g., node2vec or GNN-based representations). This design prioritizes deterministic, auditable clinical decisions over probabilistic similarity.
+
+### Graph Features Used in Production
+
+| Feature | Source File | Mechanism | Latency |
+|---------|-----------|-----------|---------|
+| IS-A hierarchy traversal | `terraphim-kg/src/isa_hierarchy.rs` | Pre-computed transitive closure, O(1) lookup | ~20ns |
+| Treatment inheritance | `terraphim-kg/src/graph.rs:get_treatments()` | Follow Treats edges + inherit from ancestor concepts | <1ms |
+| Contraindication checking | `terraphim-kg/src/graph.rs:check_contraindication()` | Traverse Contraindicates edges for drug-disease pairs | <1ms |
+| Entity grounding (SNOMED) | `terraphim-medical-agents/src/agents/knowledge_graph.rs` | Thesaurus lookup: exact (0.98) / substring (0.80) confidence | <1ms |
+| Two-layer validation | `terraphim-medical-agents/src/validation/two_layer.rs` | Coherence (entity connectivity) + Completeness (category coverage) | <1ms |
+| Role graph search | `terraphim-medical-agents/src/agents/role_graph_search.rs` | Role-specific synonym expansion + KG treatment traversal | <5ms |
+
+### Measured Precision Improvement (2.00x Overall)
+
+Source: `PIPELINE_RUN_REPORT.md` (2026-02-20, RTX 2070, MedGemma 4B Q4_K_M)
+
+| Metric | Raw LLM | With Terraphim KG | Improvement |
+|--------|---------|-------------------|-------------|
+| Entity Extraction Precision | 18.3% | 37.4% | **2.04x** |
+| Treatment Relevance | 13.3% | 25.0% | **1.88x** |
+| Confidence Score | 0.45 | 0.95 | **2.11x** |
+| KG Grounding Gate | N/A | 90% pass rate | 0.95 avg score |
+
+### Critical Medical Error Prevention: T790M Case
+
+Source: `REAL_INFERENCE_RESULTS.md`
+
+| | Raw MedGemma | Terraphim + MedGemma |
+|--|-------------|---------------------|
+| Recommendation | Crizotinib (ALK/ROS1/MET inhibitor) | Osimertinib (EGFR inhibitor) |
+| Medical accuracy | **Incorrect** -- wrong pathway for T790M | **Correct** -- per AURA3 trial, 71% ORR |
+| Clinical impact | Potential patient harm | Evidence-based treatment |
+
+The knowledge graph prevented a clinically dangerous drug recommendation: it grounded the T790M mutation entity against SNOMED CT and retrieved the evidence-based treatment (Osimertinib) from PrimeKG relationship edges, rather than relying on the LLM's parametric knowledge alone.
+
+---
+
+*Documentation Version: 1.1*
+*Last Updated: 2026-02-22*
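
Note on the IS-A lookup: the "pre-computed transitive closure, O(1) lookup" entry in the feature table can be sketched as follows. This is only an illustration of the technique, not the contents of `isa_hierarchy.rs`. `IsaClosure`, its methods, and the `u64` concept IDs are hypothetical, and the lookup is O(1) in the expected hash-table sense.

```rust
use std::collections::{HashMap, HashSet};

/// For each concept, the full set of its IS-A ancestors, computed once.
struct IsaClosure {
    ancestors: HashMap<u64, HashSet<u64>>,
}

impl IsaClosure {
    /// Build the closure from direct (child, parent) IS-A pairs.
    fn build(direct: &[(u64, u64)]) -> Self {
        // Direct parents per concept; every concept gets an entry, even roots.
        let mut parents: HashMap<u64, Vec<u64>> = HashMap::new();
        for &(child, parent) in direct {
            parents.entry(child).or_default().push(parent);
            parents.entry(parent).or_default();
        }
        // Walk upward from each concept once and store everything reached.
        let mut ancestors = HashMap::new();
        for &concept in parents.keys() {
            let mut seen = HashSet::new();
            let mut stack = parents[&concept].clone();
            while let Some(p) = stack.pop() {
                if seen.insert(p) {
                    stack.extend(parents[&p].iter().copied());
                }
            }
            ancestors.insert(concept, seen);
        }
        IsaClosure { ancestors }
    }

    /// Query time is one map lookup plus one set membership test.
    fn is_a(&self, concept: u64, ancestor: u64) -> bool {
        self.ancestors
            .get(&concept)
            .map_or(false, |set| set.contains(&ancestor))
    }
}
```

The trade-off is memory for speed: the closure stores every ancestor of every concept up front so that queries never traverse the hierarchy.
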