Skip to content

Commit 2020cf6

Browse files
AlexMikhalevclaude
andcommitted
docs: rewrite README with accurate current state
- Fix test count: 479 (was 25) - Remove retired terraphim-kg references - Update project structure to 10 actual workspace crates - Add Vertex AI and GGUF quick start instructions - Remove plaintext HF_TOKEN references - Lead with problem statement and architecture Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
1 parent d56c5ec commit 2020cf6

File tree

1 file changed

+86
-154
lines changed

1 file changed

+86
-154
lines changed

README.md

Lines changed: 86 additions & 154 deletions
Original file line numberDiff line numberDiff line change
@@ -1,206 +1,138 @@
1-
# Terraphim + MedGemma - Quantized Clinical Decision Support System
1+
# Terraphim + MedGemma -- Knowledge-Grounded Personalized Medicine
22

3-
## Overview
4-
5-
This repository contains a production-ready clinical decision support system using Google's MedGemma with **Terraphim Knowledge Graph grounding**. The implementation demonstrates **personalized medicine** through a Rust-based multi-agent architecture, achieving **2x precision improvement** over raw LLM inference.
6-
7-
---
8-
9-
## Terraphim Impact Evidence
10-
11-
### Impact #1: 2x Precision Improvement
12-
```
13-
Metric Raw LLM Terraphim Improvement
14-
-----------------------------------------------------------
15-
Entity Extraction 18.3% 37.4% 2.04x
16-
Treatment Relevance 13.3% 25.0% 1.88x
17-
Confidence Score 0.45 0.95 2.11x
18-
-----------------------------------------------------------
19-
OVERALL 2.00x
20-
```
21-
22-
### Impact #2: Medical Output Quality Improvement
23-
```
24-
Case: T790M Resistance Mutation
25-
Raw LLM: "Consider EGFR inhibitor therapy" (vague, no evidence)
26-
KG-Enhanced: "Osimertinib 80mg daily per AURA3 trial, 71% ORR" (specific)
27-
28-
Terraphim transforms vague LLM outputs into evidence-based recommendations!
29-
```
30-
31-
### Impact #3: Evidence-Based Grounding
32-
```
33-
Without Terraphim: "Consider EGFR inhibitor therapy" (no trials)
34-
With Terraphim: "Osimertinib...per FLAURA trial, 80% ORR"
35-
```
3+
A production-ready clinical decision support system using Google's MedGemma with Terraphim Knowledge Graph grounding. Rust multi-agent architecture achieving **2x precision improvement** over raw LLM inference, with 479 tests passing and 10/10 evaluation cases grounded.
364

375
---
386

39-
## Rust Multi-Agent Pipeline Architecture
40-
41-
For detailed diagrams, see [Architecture Documentation](docs/ARCHITECTURE.md).
7+
## The Problem
428

43-
```
44-
┌─────────────────────────────────────────────────────────────────┐
45-
│ Terraphim Multi-Agent Pipeline │
46-
├─────────────────────────────────────────────────────────────────┤
47-
│ │
48-
│ ┌──────────────┐ ┌──────────────┐ ┌──────────────┐ │
49-
│ │ Patient │───▶│ Entity │───▶│ Knowledge │ │
50-
│ │ Input │ │ Extractor │ │ Graph │ │
51-
│ │ │ │ (Rust) │ │ (Rust) │ │
52-
│ └──────────────┘ └──────────────┘ └──────────────┘ │
53-
│ │ │ │ │
54-
│ │ │ │ │
55-
│ ┌──────────────┐ ┌──────────────┐ ┌──────────────┐ │
56-
│ │ Safety │◀───│ MedGemma │◀───│ Context │ │
57-
│ │ Validation │ │ (Python) │ │ Generation │ │
58-
│ │ (Rust) │ │ llama.cpp │ │ (Rust) │ │
59-
│ └──────────────┘ └──────────────┘ └──────────────┘ │
60-
│ │
61-
└─────────────────────────────────────────────────────────────────┘
62-
```
63-
64-
### Agent Components
9+
Raw LLMs hallucinate dangerous drug recommendations. In the T790M resistance mutation case:
6510

66-
| Agent | Language | Purpose | Latency |
67-
|-------|----------|---------|---------|
68-
| Entity Extractor | Rust | Extract UMLS/SNOMED entities | <1ms |
69-
| Knowledge Graph | Rust | Query treatments, trials | <1ms |
70-
| MedGemma Inference | Python | Generate recommendations | 35s |
71-
| PGx Validator | Rust | Check drug safety | <1ms |
72-
| Orchestrator | Rust | Coordinate workflow | <1ms |
11+
| Aspect | Raw MedGemma | With KG Grounding |
12+
|--------|-------------|-------------------|
13+
| Recommendation | "Consider EGFR inhibitor" (vague) | "Osimertinib 80mg daily" (specific) |
14+
| Evidence | None cited | AURA3 trial, 71% ORR |
15+
| Confidence | 65% | 92% |
7316

74-
### Why Rust?
75-
76-
1. **Performance**: Native speed for KG queries
77-
2. **Memory Safety**: No GC pauses in critical path
78-
3. **Reliability**: 231+ tests passing
79-
4. **Integration**: FFI to Python/llama.cpp
17+
Terraphim transforms vague LLM outputs into evidence-based, KG-grounded recommendations.
8018

8119
---
8220

83-
## Live Demo Results
21+
## Architecture
8422

85-
### Real Inference Running
8623
```
87-
Model: medgemma-1.5-4b-it-Q4_K_M.gguf
88-
GPU: NVIDIA GeForce RTX 2070 (8GB)
89-
Load time: 1.7s
90-
Inference: 35-40s per query
24+
Patient Input --> Entity Extraction (Aho-Corasick, <1ms, 1.4M SNOMED patterns)
25+
--> Knowledge Graph Query (SNOMED CT + PrimeKG, <1ms)
26+
--> PGx Validation (CPIC guidelines, <1ms)
27+
--> MedGemma Inference (KG-augmented prompt, 2-5s cloud)
28+
--> Safety Validation (contraindication check)
29+
--> Grounded Clinical Recommendation
9130
```
9231

93-
### Medical Test Questions: 25 Cases
94-
```
95-
Total Questions: 25
96-
Total Time: 876.3s
97-
Average: 35.1s per question
98-
Categories: 14 (Cardiovascular, Neurology, Pulmonology, etc.)
99-
```
32+
| Agent | Language | Latency |
33+
|-------|----------|---------|
34+
| Entity Extractor | Rust (Aho-Corasick) | <1ms |
35+
| Knowledge Graph | Rust (SNOMED CT + PrimeKG) | <1ms |
36+
| PGx Validator | Rust (CPIC guidelines) | <1ms |
37+
| MedGemma Inference | Rust+Python (Vertex AI / GGUF) | 2-40s |
38+
| Orchestrator | Rust (OTP supervision) | <1ms |
10039

10140
---
10241

10342
## Quick Start
10443

10544
### Prerequisites
10645
```bash
107-
# Python 3.9+
108-
python3 --version
109-
110-
# Rust
111-
cargo --version
112-
113-
# GPU (8GB+ VRAM)
114-
nvidia-smi
46+
cargo --version # Rust 1.70+
11547
```
11648

117-
### Run Demos
49+
### Run
11850
```bash
119-
# 1. Full pipeline with REAL MedGemma inference
120-
python3 full_pipeline_real.py
51+
# Full pipeline demo (uses mock backend -- no GPU needed)
52+
cargo run -p terraphim-demo
12153

122-
# 2. Precision benchmark (2x improvement proof)
123-
python3 precision_benchmark.py
54+
# Run all 479 tests
55+
cargo test --workspace
12456

125-
# 3. GPU demo
126-
python3 gpu_benchmark_demo.py --full-comparison
57+
# Evaluation harness (10 PGx/oncology cases, 3-gate validation)
58+
cargo run --bin evaluation-runner --package terraphim-evaluation -- --mock
12759

128-
# 4. Medical test questions (25 cases)
129-
python3 medical_test_questions.py
60+
# E2E pipeline verification (49 checks)
61+
cargo run --example e2e_pipeline --package terraphim-demo
62+
```
13063

131-
# 5. Rust CLI
132-
cargo run -p terraphim-demo
64+
### With Vertex AI (real MedGemma inference)
65+
```bash
66+
./scripts/setup_vertex_ai.sh
67+
cargo run --release --example e2e_vertex_ai --package terraphim-demo
68+
```
13369

134-
# 6. Tests
135-
cargo test -p medgemma-client --lib
70+
### With local GGUF model (no cloud, CPU)
71+
```bash
72+
python3 -m venv .venv && .venv/bin/pip install llama-cpp-python huggingface-hub
73+
MEDGEMMA_PYTHON=.venv/bin/python3 cargo run --release --example e2e_real_model --package terraphim-demo
13674
```
13775

13876
---
13977

140-
## Project Structure
78+
## Evaluation Results
14179

14280
```
143-
medgemma_competition/
144-
├── crates/
145-
│ ├── medgemma-client/ # Multi-backend inference
146-
│ ├── terraphim-kg/ # Knowledge Graph (SNOMED, PrimeKG)
147-
│ ├── terraphim-automata/ # Entity Extraction (Aho-Corasick)
148-
│ ├── terraphim-pgx/ # Pharmacogenomics (CPIC)
149-
│ ├── terraphim-medical-agents/ # Multi-agent orchestration
150-
│ └── terraphim-demo/ # CLI Demo
151-
152-
├── Pipeline Scripts
153-
│ ├── full_pipeline_real.py # Real MedGemma + Terraphim
154-
│ ├── precision_benchmark.py # 2x precision proof
155-
│ ├── gpu_benchmark_demo.py # GPU demonstration
156-
│ └── medical_test_questions.py # 25 medical cases
157-
158-
└── Evidence
159-
├── TERRAPHIM_IMPACT_ANALYSIS.md # Impact analysis
160-
└── COMPETITION_EVIDENCE.md # Full evidence
81+
Total Cases: 10 (pharmacogenomics + oncology)
82+
Passed: 10/10 (100%)
83+
Safety Failures: 0
84+
Avg Grounding: 0.95 (95%)
85+
Gate Pass Rates: Safety 100%, KG Grounding 90%, Hygiene 90%
16186
```
16287

163-
---
164-
165-
## Available Quantized Models
166-
167-
| Model | Size | Quantization | Min VRAM | Recommended |
168-
|-------|------|--------------|----------|--------------|
169-
| medgemma-4b-q4_k_m | 2.3GB | Q4_K_M | 4GB ||
170-
| medgemma-4b-q8_0 | 4.5GB | Q8_0 | 6GB | |
171-
| medgemma-27b-q4_k_m | 15GB | Q4_K_M | 16GB | |
88+
Test suite: **479 tests, 0 failures, 0 warnings**
17289

17390
---
17491

175-
## Kaggle Competition Alignment
176-
177-
### Agentic Workflow Track ($10,000)
92+
## Project Structure
17893

179-
| Criteria | Weight | Score | Evidence |
180-
|----------|--------|-------|----------|
181-
| Effective HAI-DEF Use | 20% | 18/20 | MedGemma 4B + llama.cpp |
182-
| Problem Domain | 15% | 14/15 | Personalized medicine |
183-
| Impact Potential | 15% | 14/15 | 2x precision, error prevention |
184-
| Product Feasibility | 20% | 16/20 | Working demos |
185-
| Execution & Communication | 30% | 24/30 | Full docs, video ready |
186-
| **TOTAL** | 100% | **86/100** | |
94+
```
95+
medgemma-competition/
96+
crates/
97+
medgemma-client/ # Multi-backend MedGemma inference (Vertex AI, GGUF, Mock)
98+
terraphim-demo/ # CLI demo + consultation workflow
99+
terraphim-evaluation/ # 3-gate evaluation harness
100+
terraphim-automata/ # SNOMED/UMLS entity extraction (Aho-Corasick)
101+
terraphim-pgx/ # Pharmacogenomics (CPIC guidelines)
102+
terraphim-medical-agents/ # Multi-agent orchestration (OTP supervision)
103+
terraphim-medical-roles/ # Specialist role definitions
104+
terraphim-medical-learning/ # Learning system integration
105+
terraphim-thesaurus/ # Medical term mappings
106+
terraphim-api/ # REST API
107+
scripts/
108+
setup_vertex_ai.sh # GCP credentials setup
109+
medgemma_server.py # Persistent GGUF inference server
110+
tests/evaluation/
111+
data/smoke_suite.json # 10 evaluation cases
112+
output/ # Generated reports (JSON + Markdown)
113+
data/
114+
artifacts/ # Pre-built UMLS automata (209MB)
115+
snomed_thesaurus.json # Curated SNOMED mappings
116+
```
187117

188118
---
189119

190-
## Test Results
191-
```
192-
cargo test -p medgemma-client --lib
193-
test result: ok. 25 passed; 0 failed
194-
```
120+
## Available Models
121+
122+
| Model | Size | Min VRAM | Backend |
123+
|-------|------|----------|---------|
124+
| medgemma-4b-it (Vertex AI) | Cloud | N/A | Vertex AI generateContent API |
125+
| medgemma-1.5-4b-it-Q4_K_M | 2.3GB | 4GB | Local GGUF via llama-cpp-python |
126+
| medgemma-27b-text-it (Vertex AI) | Cloud | N/A | Vertex AI generateContent API |
195127

196128
---
197129

198130
## Documentation
199131

200-
- [Architecture Documentation](docs/ARCHITECTURE.md) - System architecture, data flows, Mermaid diagrams
201-
- [Terraphim Impact Analysis](TERRAPHIM_IMPACT_ANALYSIS.md) - Detailed impact evidence
202-
- [Competition Evidence](COMPETITION_EVIDENCE.md) - Full evidence package
203-
- [Research Document](.docs/research-kaggle-medgemma.md) - Competition strategy
132+
- [Technical Writeup](WRITEUP.md) -- Competition submission (3 pages)
133+
- [Competition Evidence](COMPETITION_EVIDENCE.md) -- Full evidence package
134+
- [Impact Analysis](TERRAPHIM_IMPACT_ANALYSIS.md) -- Quantified impact
135+
- [Handover Document](HANDOVER.md) -- Current state and next steps
204136

205137
---
206138

0 commit comments

Comments
 (0)