Commit 43416e7

Alex and claude committed
docs: add knowledge graph impact evidence across architecture and summary docs

Document that Terraphim uses symbolic graph reasoning (IS-A hierarchies, relationship traversal, thesaurus grounding) with measurable 2.00x precision improvement. Add T790M error prevention case study, precision benchmarks, and KG grounding gate metrics. Clarify: no vector embeddings are used.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

1 parent c3e0263 commit 43416e7

4 files changed: +74 -2 lines changed

.docs/summary-crates-terraphim-kg.md

Lines changed: 9 additions & 0 deletions
@@ -39,6 +39,15 @@ Medical knowledge graph library providing unified access to SNOMED CT hierarchie
 ## Dependencies
 petgraph, serde, csv, bincode, zstd, reqwest, tracing, anyhow

+## Measured Impact
+
+The KG provides **2.00x overall precision improvement** vs raw LLM (source: `PIPELINE_RUN_REPORT.md`):
+- Entity extraction: 18.3% -> 37.4% (2.04x)
+- Treatment relevance: 13.3% -> 25.0% (1.88x)
+- T790M case: KG prevented wrong drug (Crizotinib) and returned correct treatment (Osimertinib per AURA3 trial)
+
+The `get_treatments()` method in `graph.rs` follows Treats edges AND inherits treatments from IS-A ancestors, enabling evidence-based recommendations that raw LLM inference misses.
+
 ## Files
 - `src/graph.rs` - Core KnowledgeGraph with petgraph
 - `src/isa_hierarchy.rs` - Fast hierarchical queries
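
The treatment inheritance described for `get_treatments()` can be sketched roughly as follows. This is a minimal illustration, not the code in `graph.rs`: `ToyGraph`, `add_isa`, `add_treats`, and the plain `HashMap` storage are assumptions made for the example (the real crate builds on petgraph), and only the behavior stated above is modeled: collect direct Treats edges, then walk IS-A ancestors and collect theirs too.

```rust
use std::collections::{BTreeSet, HashMap, HashSet, VecDeque};

/// Hypothetical concept identifier (stand-in for a SNOMED CT code).
type ConceptId = &'static str;

/// Toy stand-in for the knowledge graph: IS-A parents and direct Treats edges.
#[derive(Default)]
struct ToyGraph {
    /// child -> direct IS-A parents
    parents: HashMap<ConceptId, Vec<ConceptId>>,
    /// disease -> drugs that have a direct Treats edge to it
    treats: HashMap<ConceptId, Vec<ConceptId>>,
}

impl ToyGraph {
    fn add_isa(&mut self, child: ConceptId, parent: ConceptId) {
        self.parents.entry(child).or_default().push(parent);
    }

    fn add_treats(&mut self, drug: ConceptId, disease: ConceptId) {
        self.treats.entry(disease).or_default().push(drug);
    }

    /// Collect treatments attached to `disease` itself and to every
    /// concept reachable from it through IS-A edges.
    fn get_treatments(&self, disease: ConceptId) -> BTreeSet<ConceptId> {
        let mut found = BTreeSet::new();
        let mut seen = HashSet::new();
        let mut queue = VecDeque::from([disease]);
        while let Some(concept) = queue.pop_front() {
            if !seen.insert(concept) {
                continue; // already visited (diamond-shaped hierarchies, cycles)
            }
            if let Some(drugs) = self.treats.get(concept) {
                found.extend(drugs.iter().copied());
            }
            if let Some(ancestors) = self.parents.get(concept) {
                queue.extend(ancestors.iter().copied());
            }
        }
        found
    }
}
```

With a child concept that has its own Treats edge and a parent concept that has another, `get_treatments` on the child returns both drugs, while the parent only returns its own.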

.docs/summary-crates-terraphim-medical-agents.md

Lines changed: 7 additions & 0 deletions
@@ -52,6 +52,13 @@ Phase 4: Validation (SafetyValidation)
 ## Dependencies
 terraphim-automata, terraphim-kg, terraphim-pgx, terraphim-thesaurus, medgemma-client, terraphim-medical-learning, tokio, serde, async-trait, uuid, chrono

+## Validation Impact Metrics
+
+- KG Grounding Gate: 90% pass rate, 0.95 average grounding score (source: evaluation reports)
+- Two-layer validation catches incoherent responses (e.g., "moon causes diabetes") and incomplete responses (missing required Treatment category)
+- Safety gate: 100% pass rate, Critical severity (non-bypassable)
+- 231 tests pass across workspace, 0 failures
+
 ## Files
 - `src/agents/` - 6 specialized agents + mod.rs
 - `src/protocol.rs` - Message types (Call, Cast, Info)
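
The two-layer validation mentioned above (coherence, then completeness) might look roughly like this. It is a sketch under stated assumptions, not the code in `two_layer.rs`: `Kg`, `Verdict`, and `validate` are hypothetical names, coherence is simplified to "every mentioned entity has a KG edge to at least one other mentioned entity", and completeness to "every required category is covered by some mentioned entity".

```rust
use std::collections::{HashMap, HashSet};

#[derive(Debug, PartialEq)]
enum Verdict {
    Pass,
    /// Layer 1 failed: the mentioned entities are not connected in the graph.
    Incoherent,
    /// Layer 2 failed: a required category (e.g. "Treatment") is missing.
    Incomplete,
}

/// Toy graph: undirected edges between entities, plus one category per entity.
struct Kg {
    edges: HashSet<(&'static str, &'static str)>,
    category: HashMap<&'static str, &'static str>,
}

impl Kg {
    fn connected(&self, a: &str, b: &str) -> bool {
        self.edges
            .iter()
            .any(|(x, y)| (*x == a && *y == b) || (*x == b && *y == a))
    }
}

fn validate(kg: &Kg, entities: &[&str], required: &[&str]) -> Verdict {
    // Layer 1 (coherence): each entity must link to another mentioned entity.
    let coherent = entities.len() < 2
        || entities
            .iter()
            .all(|a| entities.iter().any(|b| a != b && kg.connected(a, b)));
    if !coherent {
        return Verdict::Incoherent;
    }
    // Layer 2 (completeness): required categories must all be covered.
    let covered: HashSet<&str> = entities
        .iter()
        .filter_map(|e| kg.category.get(*e).copied())
        .collect();
    if required.iter().all(|c| covered.contains(*c)) {
        Verdict::Pass
    } else {
        Verdict::Incomplete
    }
}
```

Under these simplifications, a response mentioning only "moon" and "diabetes" fails layer 1 because the graph has no edge between them, and a response that names a finding but no entity in the Treatment category fails layer 2.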

.docs/summary.md

Lines changed: 14 additions & 0 deletions
@@ -143,6 +143,20 @@ terraphim-evaluation (testing harness)
 - **Python Bindings**: PyO3/maturin
 - **Testing**: criterion (benchmarks), built-in test framework

+## Knowledge Graph Impact Evidence
+
+The Terraphim KG uses **symbolic graph reasoning** (IS-A hierarchies, relationship traversal, thesaurus grounding) -- not vector embeddings. Measured impact (source: `PIPELINE_RUN_REPORT.md`, `REAL_INFERENCE_RESULTS.md`):
+
+| Metric | Raw LLM | With Terraphim KG | Improvement |
+|--------|---------|-------------------|-------------|
+| Entity Extraction Precision | 18.3% | 37.4% | **2.04x** |
+| Treatment Relevance | 13.3% | 25.0% | **1.88x** |
+| Confidence Score | 0.45 | 0.95 | **2.11x** |
+| Medical Accuracy (T790M) | 50% | 100% | Error prevented |
+| KG Grounding Gate | N/A | 90% pass | 0.95 avg score |
+
+Key graph features: IS-A hierarchy traversal (~20ns), treatment inheritance via ancestor edges, SNOMED thesaurus grounding (0.98/0.80 confidence), two-layer validation (coherence + completeness), role-specific search with synonym expansion.
+
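
The "0.98/0.80 confidence" thesaurus grounding listed among the key graph features can be illustrated with a small sketch. The two confidence values come from the text; everything else is an assumption for the example: the `ground` function, the `Grounding` struct, lowercase thesaurus keys, the concept ids, and the choice to treat "substring" as "a thesaurus term occurs inside the mention", preferring the longest such term.

```rust
use std::cmp::Reverse;
use std::collections::HashMap;

const EXACT_CONFIDENCE: f64 = 0.98;
const SUBSTRING_CONFIDENCE: f64 = 0.80;

#[derive(Debug, PartialEq)]
struct Grounding {
    concept_id: String,
    confidence: f64,
}

/// Ground a free-text mention against a thesaurus of lowercase term -> concept id.
fn ground(thesaurus: &HashMap<String, String>, mention: &str) -> Option<Grounding> {
    let needle = mention.trim().to_lowercase();
    if needle.is_empty() {
        return None;
    }
    // Exact match on the normalized mention.
    if let Some(id) = thesaurus.get(&needle) {
        return Some(Grounding {
            concept_id: id.clone(),
            confidence: EXACT_CONFIDENCE,
        });
    }
    // Substring match: longest term contained in the mention wins;
    // ties go to the lexicographically smallest term so the result is deterministic.
    thesaurus
        .iter()
        .filter(|(term, _)| !term.is_empty() && needle.contains(term.as_str()))
        .max_by_key(|(term, _)| (term.len(), Reverse(term.to_string())))
        .map(|(_, id)| Grounding {
            concept_id: id.clone(),
            confidence: SUBSTRING_CONFIDENCE,
        })
}
```

A mention with no exact or substring hit returns `None`, which lets a caller treat the entity as ungrounded instead of assigning it a low score.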
 ## Documentation Structure

 ```

docs/ARCHITECTURE.md

Lines changed: 44 additions & 2 deletions
@@ -636,5 +636,47 @@ gauge

 ---

-*Documentation Version: 1.0*
-*Last Updated: 2026-02-19*
+## Knowledge Graph Impact Evidence
+
+### Terminology: Symbolic Graph Reasoning (Not Vector Embeddings)
+
+The Terraphim knowledge graph uses **symbolic graph structure** -- IS-A hierarchies, explicit relationship edges, and thesaurus-based grounding -- rather than learned vector embeddings (node2vec, GNN, etc.). This design prioritizes deterministic, auditable clinical decisions over probabilistic similarity.
+
+### Graph Features Used in Production
+
+| Feature | Source File | Mechanism | Latency |
+|---------|-----------|-----------|---------|
+| IS-A hierarchy traversal | `terraphim-kg/src/isa_hierarchy.rs` | Pre-computed transitive closure, O(1) lookup | ~20ns |
+| Treatment inheritance | `terraphim-kg/src/graph.rs:get_treatments()` | Follow Treats edges + inherit from ancestor concepts | <1ms |
+| Contraindication checking | `terraphim-kg/src/graph.rs:check_contraindication()` | Traverse Contraindicates edges for drug-disease pairs | <1ms |
+| Entity grounding (SNOMED) | `terraphim-medical-agents/src/agents/knowledge_graph.rs` | Thesaurus lookup: exact (0.98) / substring (0.80) confidence | <1ms |
+| Two-layer validation | `terraphim-medical-agents/src/validation/two_layer.rs` | Coherence (entity connectivity) + Completeness (category coverage) | <1ms |
+| Role graph search | `terraphim-medical-agents/src/agents/role_graph_search.rs` | Role-specific synonym expansion + KG treatment traversal | <5ms |
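
The "pre-computed transitive closure, O(1) lookup" mechanism in the first table row can be sketched as below. This is an illustration of the general technique only: `IsaClosure`, `build`, and `is_a` are hypothetical names, numeric ids stand in for SNOMED CT codes, and the closure is stored as a hash set of (descendant, ancestor) pairs, so each query is a single hash probe (expected constant time). The layout actually used in `isa_hierarchy.rs` is not shown in this commit.

```rust
use std::collections::{HashMap, HashSet};

/// All (descendant, ancestor) pairs implied by the direct IS-A edges.
struct IsaClosure {
    ancestors: HashSet<(u64, u64)>,
}

impl IsaClosure {
    /// Build the closure once from direct (child, parent) edges.
    fn build(direct: &[(u64, u64)]) -> Self {
        let mut parents: HashMap<u64, Vec<u64>> = HashMap::new();
        for &(child, parent) in direct {
            parents.entry(child).or_default().push(parent);
        }
        let mut ancestors = HashSet::new();
        for &start in parents.keys() {
            // Depth-first walk upward from `start`, recording every ancestor.
            let mut stack = parents[&start].clone();
            while let Some(a) = stack.pop() {
                // `insert` returns false for pairs already recorded, which
                // also stops the walk if the input contains a cycle.
                if a != start && ancestors.insert((start, a)) {
                    if let Some(next) = parents.get(&a) {
                        stack.extend(next.iter().copied());
                    }
                }
            }
        }
        Self { ancestors }
    }

    /// Query time: one hash lookup, no graph traversal.
    /// A concept is treated as IS-A itself (an assumption of this sketch).
    fn is_a(&self, child: u64, ancestor: u64) -> bool {
        child == ancestor || self.ancestors.contains(&(child, ancestor))
    }
}
```

The trade-off is memory for speed: the closure can hold many more pairs than there are direct edges, but a query never has to walk the hierarchy.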
+
+### Measured Precision Improvement (2.00x Overall)
+
+Source: `PIPELINE_RUN_REPORT.md` (2026-02-20, RTX 2070, MedGemma 4B Q4_K_M)
+
+| Metric | Raw LLM | With Terraphim KG | Improvement |
+|--------|---------|-------------------|-------------|
+| Entity Extraction Precision | 18.3% | 37.4% | **2.04x** |
+| Treatment Relevance | 13.3% | 25.0% | **1.88x** |
+| Confidence Score | 0.45 | 0.95 | **2.11x** |
+| KG Grounding Gate | N/A | 90% pass rate | 0.95 avg score |
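
The per-row figures in the Improvement column are the ratio of the KG-assisted value to the raw-LLM value. The helper below is a hypothetical check written for this note and is not part of the repository. The commit does not state how the headline 2.00x "overall" figure is aggregated, so only the per-row ratios are reproduced.

```rust
/// Ratio of the KG-assisted score to the raw-LLM baseline.
/// Returns `None` when there is no usable baseline (e.g. the "N/A" row).
fn improvement_factor(baseline: f64, with_kg: f64) -> Option<f64> {
    if baseline > 0.0 {
        Some(with_kg / baseline)
    } else {
        None
    }
}

/// Format a factor the way the table does, e.g. "2.04x".
fn format_factor(factor: f64) -> String {
    format!("{:.2}x", factor)
}
```

For example, `improvement_factor(18.3, 37.4)` formats as `2.04x`, matching the Entity Extraction Precision row.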
+
+### Critical Medical Error Prevention: T790M Case
+
+Source: `REAL_INFERENCE_RESULTS.md`
+
+| | Raw MedGemma | Terraphim + MedGemma |
+|--|-------------|---------------------|
+| Recommendation | Crizotinib (MET inhibitor) | Osimertinib (EGFR inhibitor) |
+| Medical accuracy | **Incorrect** -- wrong pathway for T790M | **Correct** -- per AURA3 trial, 71% ORR |
+| Clinical impact | Potential patient harm | Evidence-based treatment |
+
+The knowledge graph prevented a clinically dangerous drug recommendation by grounding the T790M mutation entity against SNOMED CT and retrieving the evidence-based treatment (Osimertinib) from PrimeKG relationship edges rather than relying on the LLM's parametric knowledge alone.
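
One way to picture the correction step is a reconciliation between the LLM's suggestion and the graph's Treats edges for the grounded entity. The sketch below is an assumption about the shape of that logic, not the pipeline's actual code: `Recommendation`, `reconcile`, and the string-keyed `treats` map are hypothetical, and the grounded entity is passed in as an already-resolved key.

```rust
use std::collections::HashMap;

#[derive(Debug, PartialEq)]
enum Recommendation {
    /// The LLM suggestion matches a Treats edge in the graph.
    Confirmed(String),
    /// No supporting edge for the LLM suggestion; graph treatments are returned instead.
    Overridden {
        rejected: String,
        kg_treatments: Vec<String>,
    },
    /// The graph has no treatment for this entity, so nothing can be checked.
    Unverified(String),
}

/// Compare an LLM-proposed drug against the treatments the graph
/// records for the grounded entity (entity key -> drugs with a Treats edge).
fn reconcile(
    treats: &HashMap<String, Vec<String>>,
    entity: &str,
    llm_drug: &str,
) -> Recommendation {
    match treats.get(entity) {
        Some(drugs) if drugs.iter().any(|d| d.eq_ignore_ascii_case(llm_drug)) => {
            Recommendation::Confirmed(llm_drug.to_string())
        }
        Some(drugs) if !drugs.is_empty() => Recommendation::Overridden {
            rejected: llm_drug.to_string(),
            kg_treatments: drugs.clone(),
        },
        _ => Recommendation::Unverified(llm_drug.to_string()),
    }
}
```

In the case above, a graph holding a Treats edge from Osimertinib to the T790M entity would turn an LLM suggestion of Crizotinib into `Overridden`, carrying Osimertinib as the graph-backed alternative.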
+
+---
+
+*Documentation Version: 1.1*
+*Last Updated: 2026-02-22*