omgs-nccn

This repository builds typed NCCN graph assets for omgs_engine.

It serves as a topology-constrained, path-first guideline graph workflow for downstream clinician-in-the-loop retrieval and decision support.

Licensed NCCN source files, PDFs, and extracted guideline content are not distributed in this repository.

Physician-led typed NCCN graph review

You are free to build your own NCCN graph-RAG pipeline.

Our view is that guideline graphs and automated retrieval should support, not replace, clinician judgement. Real-world oncology decision-making often depends not only on published guideline content, but also on the latest practice changes, emerging or not-yet-published clinical trial signals, and region-specific experience. For that reason, we intentionally adopt a semi-automated approach that keeps final knowledge interpretation and decision-making in the hands of physicians.

Our framework normalises NCCN flowcharts into a typed directed graph with four core node classes—Condition, Evaluation, Treatment, and Page Jump—and a constrained set of relations, including is followed by, requires, and indicates. This representation preserves decision topology and path constraints instead of flattening the guideline into isolated text chunks.
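A minimal sketch of such a typed graph in Python follows. The class names, field names, and snake_case relation identifiers are illustrative assumptions, not the schema used under src/omgs_nccn/. Only the four node classes and the three relations named above come from this README, and the real relation set may be larger.

```python
from dataclasses import dataclass, field
from enum import Enum


class NodeType(Enum):
    """The four core node classes named in this README."""
    CONDITION = "Condition"
    EVALUATION = "Evaluation"
    TREATMENT = "Treatment"
    PAGE_JUMP = "Page Jump"


# Assumption: only the three relations the README names; the real set may be larger.
ALLOWED_RELATIONS = frozenset({"is_followed_by", "requires", "indicates"})


@dataclass(frozen=True)
class Node:
    node_id: str
    node_type: NodeType
    label: str


@dataclass
class TypedGraph:
    nodes: dict = field(default_factory=dict)
    edges: list = field(default_factory=list)

    def add_node(self, node: Node) -> None:
        self.nodes[node.node_id] = node

    def add_edge(self, source: str, relation: str, target: str) -> None:
        # Reject relations outside the constrained set.
        if relation not in ALLOWED_RELATIONS:
            raise ValueError(f"relation not allowed: {relation}")
        # Reject edges that point at nodes the graph does not contain.
        for node_id in (source, target):
            if node_id not in self.nodes:
                raise KeyError(f"unknown node: {node_id}")
        self.edges.append((source, relation, target))

    def successors(self, node_id: str, relation: str = "is_followed_by") -> list:
        return [t for s, r, t in self.edges if s == node_id and r == relation]
```

Validating the relation and both endpoints at insertion time is one way to keep the topology constrained: an edge that does not fit the typed schema never enters the graph.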

Each treatment option is further linked to reviewed footnotes, principle statements, and reference pages, allowing information that is otherwise dispersed across the flowchart, annotations, and main text to be assembled into auditable, page-grounded knowledge units. These graph assets provide the substrate for topology-constrained, path-first retrieval and downstream clinician-in-the-loop decision support.
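One possible reading of "path-first" is that retrieval starts from complete decision paths rather than from isolated nodes. The sketch below enumerates every simple path between two nodes of a directed graph. The function name and the edge representation (plain source/target pairs) are assumptions made for illustration; this is not the retrieval code of omgs_engine.

```python
def enumerate_paths(edges, start, goal):
    """Return every simple path from start to goal over directed edges."""
    adjacency = {}
    for source, target in edges:
        adjacency.setdefault(source, []).append(target)

    paths = []
    stack = [(start, [start])]
    while stack:
        node, path = stack.pop()
        if node == goal:
            paths.append(path)
            continue
        for nxt in adjacency.get(node, []):
            if nxt not in path:  # never revisit a node, so cycles cannot loop forever
                stack.append((nxt, path + [nxt]))
    return sorted(paths)
```

Each returned path is an ordered list of node ids, so footnotes or reference pages attached to a treatment node can be reported together with the exact route that led to it.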

License (this repository)

The source code and tooling in this repository are licensed under the MIT License.

That license applies only to what is actually stored here (for example Python/JS under src/, scripts/, review/). It does not grant any rights in third-party materials such as NCCN guideline PDFs; see the next section.

NCCN Guidelines®: permissions and disclaimer

English

This repository does not include, redistribute, or sublicense NCCN Clinical Practice Guidelines in Oncology (NCCN Guidelines®) PDFs, full text, or NCCN-owned artwork. You must obtain licensed copies and any required permissions directly from NCCN or through your institution’s agreement.
NCCN Guidelines® and related materials are protected by copyright and trademark (National Comprehensive Cancer Network®). Use of those materials is governed by NCCN’s terms and by your own license or subscription. This project is an engineering workspace only; you are responsible for compliance with NCCN’s terms, your contracts, and applicable laws (including clinical use and any commercial or research restrictions).
Official entry points: NCCN Guidelines by Cancer Type, Recently Updated Guidelines. For permissions or business use, follow the contact and legal information published on nccn.org.

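Summary (translated from Chinese)

This repository does not include, redistribute, or sublicense NCCN guideline PDFs, full text, or NCCN proprietary materials; you must obtain lawful copies and any required permissions yourself, through NCCN or your institution’s agreement.
NCCN guidelines and related content are protected by copyright and trademark; their use is governed by NCCN’s terms and by your license with NCCN or your institution. This repository provides engineering tools and workflows only. It does not constitute medical advice, and it does not replace your own judgement of, or responsibility for, license compliance and local regulations (including clinical use and commercial or research restrictions).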

1. Create The Environment

conda env create -f environment.yml
conda activate omgs_nccn
pip install -r requirements.txt
pip install -e .

Linux and macOS are the validated primary platforms for the full 00-06 workflow.

For Linux production use, the recommended setup for bash scripts/00_prepare_local_inputs.sh is an NVIDIA GPU with a CUDA-enabled torch==2.11.0 build.

requirements.txt pins the Torch version but not the Linux compute flavor (CPU or CUDA). A local CUDA build such as 2.11.0+cuXXX still satisfies the repository pin torch==2.11.0.
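The reason a local CUDA build passes the pin is the PEP 440 rule that a local version label (the part after "+") is ignored when the specifier itself carries none. A simplified sketch of that comparison is shown here; the function name is hypothetical and the check is a plain string comparison, not pip's full version normalisation.

```python
def satisfies_exact_pin(installed: str, pinned: str) -> bool:
    """Compare versions while ignoring a local label such as '+cuXXX'."""
    # Drop everything after the first '+', which PEP 440 calls the local version label.
    public_version = installed.split("+", 1)[0]
    return public_version == pinned
```

With this rule, satisfies_exact_pin("2.11.0+cuXXX", "2.11.0") is true, while a different public version such as "2.10.0" is rejected.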

bash scripts/00_prepare_local_inputs.sh auto-selects cuda, mps, or cpu for Marker based on the local machine.
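The selection logic inside the script is not shown in this README. The sketch below is one plausible way to pick among the same three devices with PyTorch; the function names and the priority order (cuda, then mps, then cpu) are assumptions.

```python
def pick_device(cuda_available: bool, mps_available: bool) -> str:
    """Prefer CUDA, then Apple MPS, then fall back to CPU (assumed order)."""
    if cuda_available:
        return "cuda"
    if mps_available:
        return "mps"
    return "cpu"


def detect_device() -> str:
    """Probe the local machine; returns 'cpu' when torch is not installed."""
    try:
        import torch
    except ImportError:
        return "cpu"
    # Older torch builds may lack the MPS backend module entirely.
    mps_backend = getattr(torch.backends, "mps", None)
    mps_available = bool(mps_backend and mps_backend.is_available())
    return pick_device(torch.cuda.is_available(), mps_available)
```

Keeping the decision in a pure function separate from the hardware probe makes the priority order easy to test on a machine without a GPU.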

For Linux with NVIDIA GPU, use this install order:

conda create -n omgs_nccn python=3.10.16 pip=24.3.1 setuptools=75.8.0 wheel=0.45.1 sqlite=3.47.2
conda activate omgs_nccn
# Install the CUDA-enabled torch==2.11.0 build from the official PyTorch Linux selector.
pip install -r requirements.txt
pip install -e . --no-deps

Official PyTorch install selector:

https://docs.pytorch.org/get-started/locally/

2. Put Your Licensed NCCN Files Under `data/`

Place your privately licensed NCCN guideline PDF(s) under data/ref/.

Examples:

data/ref/nccn_ovarian_cancer_v3_2025.pdf
data/ref/nccn_ovarian_cancer_v3_2026.pdf

You may also use your own licensed NCCN version, file name, and disease site. This repository is not limited to ovarian cancer; the same workflow can be adapted to different tumour types as long as you provide the corresponding NCCN source file locally.

Official NCCN entry points: NCCN Guidelines by Cancer Type, Recently Updated Guidelines (both on nccn.org).

Licensed NCCN source files are not included in this repository.

These repository-owned, manually curated manifests are already included. They are pipeline inputs, not generated outputs:

data/manifests/ov_2025_stitch_map.json

3. Prepare The Local NCCN Raw Inputs

Run this from the repository root if you want the repo-owned bootstrap path:

bash scripts/00_prepare_local_inputs.sh

This writes:

data/raw/ov_2025/page_assets/
data/raw/ov_2025/page_assets/page_inventory.json
data/raw/ov_2025/text_extraction/22_nccn_ovarian_cancer_v3_2025/raw/primary.md
data/raw/ov_2025/text_extraction/22_nccn_ovarian_cancer_v3_2025/raw/native/pages.json
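A small preflight check can confirm that the bootstrap step produced these paths before phase 1 is run. This helper is not part of the repository; the function name is hypothetical, and the path list is copied from the listing above.

```python
from pathlib import Path

# Paths copied from the listing above, relative to the repository root.
EXPECTED_OUTPUTS = (
    "data/raw/ov_2025/page_assets",
    "data/raw/ov_2025/page_assets/page_inventory.json",
    "data/raw/ov_2025/text_extraction/22_nccn_ovarian_cancer_v3_2025/raw/primary.md",
    "data/raw/ov_2025/text_extraction/22_nccn_ovarian_cancer_v3_2025/raw/native/pages.json",
)


def missing_outputs(repo_root):
    """Return the expected paths that do not exist under repo_root."""
    root = Path(repo_root)
    return [rel for rel in EXPECTED_OUTPUTS if not (root / rel).exists()]
```

Calling missing_outputs(".") from the repository root returns an empty list when every expected path is present.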

4. Export The API Keys You Want To Use

Phase 1, phase 2, and phase 6 are LLM-backed.

Example:

export AZURE_OPENAI_API_KEY=...
export AZURE_OPENAI_ENDPOINT=...
export AZURE_OPENAI_GPT5_DEPLOYMENT=...

export QWEN_COMPAT_API_KEY=...
export QWEN_COMPAT_BASE_URL=...
export QWEN_COMPAT_MODEL=qwen3-max

You can also use a repo-root .env file for the phase-6 wrappers.
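Before starting an LLM-backed phase, it can help to confirm that the variables for the chosen provider are set. The sketch below is a hypothetical helper, not repository code; the variable names come from the example above, while the provider group keys are illustrative.

```python
import os

# Variable names from the example above; the group keys are illustrative.
PROVIDER_VARS = {
    "azure_openai": (
        "AZURE_OPENAI_API_KEY",
        "AZURE_OPENAI_ENDPOINT",
        "AZURE_OPENAI_GPT5_DEPLOYMENT",
    ),
    "qwen_compat": (
        "QWEN_COMPAT_API_KEY",
        "QWEN_COMPAT_BASE_URL",
        "QWEN_COMPAT_MODEL",
    ),
}


def missing_vars(provider, env=None):
    """Return the variables for one provider that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in PROVIDER_VARS[provider] if not env.get(name)]
```

An empty return value means the provider is fully configured in the current shell. Values loaded from a .env file are only visible here if something has already exported them into the environment.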

5. Install The Doctor Review App

The physician review surface lives under:

review/tldraw_app/

Install exact locked frontend dependencies:

cd review/tldraw_app
npm ci
cd ../..

6. Generate Page Drafts

Run phase 1 for the page you want to review:

bash scripts/01_build_phase1_drafts.sh OV-1

This creates the page draft graph under:

data/processed/ov_2025/pages/OV-1/

It first extracts nodes, then extracts edges; edge extraction uses the nodes from the previous step.
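The two-stage order can be sketched as follows. The function name, the dictionary keys ("id", "source", "target"), and the two extractor callables are hypothetical stand-ins for the LLM-backed steps. Dropping edges that reference unknown nodes is an assumption of this sketch, not documented behaviour of the script.

```python
def build_page_draft(page_text, extract_nodes, extract_edges):
    """Run node extraction first, then edge extraction conditioned on the nodes."""
    nodes = extract_nodes(page_text)
    known_ids = {node["id"] for node in nodes}

    # The edge step receives the nodes produced by the previous step.
    candidate_edges = extract_edges(page_text, nodes)

    # Assumption: keep only edges whose endpoints were actually extracted.
    edges = [
        edge for edge in candidate_edges
        if edge["source"] in known_ids and edge["target"] in known_ids
    ]
    return {"nodes": nodes, "edges": edges}
```

Passing the extractors in as arguments keeps the ordering logic independent of any particular model provider.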

7. Physician Review The Page Draft

Open the review app:

cd review/tldraw_app
npm run dev:draft -- OV-1 --host 127.0.0.1 --port 4173

Inside the app:

review the draft nodes and edges
edit the graph as needed
use Export Review JSON

The exported file name is:

page_graph.reviewed.json

Place that exported file at:

data/processed/ov_2025/pages/OV-1/page_graph.reviewed.json
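Copying the export into place can be scripted. The helper below is a hypothetical convenience, not part of the repository; it assumes the export is a JSON file and that the target layout matches the path shown above, with the dataset folder defaulting to ov_2025.

```python
import json
import shutil
from pathlib import Path


def install_reviewed_page(exported_file, repo_root, page_id, dataset="ov_2025"):
    """Validate an exported review file and copy it to its page folder."""
    exported = Path(exported_file)
    # Parse once so that a truncated or malformed export fails before copying.
    with exported.open(encoding="utf-8") as handle:
        json.load(handle)

    target_dir = Path(repo_root) / "data" / "processed" / dataset / "pages" / page_id
    target_dir.mkdir(parents=True, exist_ok=True)
    target = target_dir / "page_graph.reviewed.json"
    shutil.copyfile(exported, target)
    return target
```

For example, install_reviewed_page("page_graph.reviewed.json", ".", "OV-1") places a file exported to the current directory at the location expected by the later phases.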

Repeat phase 1 plus physician review for each page you want to promote.

8. Build Page Semantics From Physician-Reviewed Pages

After physician-reviewed page graphs are in place:

bash scripts/02_build_page_semantics.sh

If needed, you can re-open a reviewed page for recheck:

cd review/tldraw_app
npm run dev:reviewed -- OV-1 --host 127.0.0.1 --port 4173

9. Stitch The Reviewed Global Graph

bash scripts/03_build_reviewed_global_graph.sh

To inspect the stitched global reviewed graph:

cd review/tldraw_app
npm run dev:global -- --host 127.0.0.1 --port 4173

Use this mode for graph inspection, not page-level export.

10. Build Rule Graph And Engine Handoff Assets

bash scripts/04_build_rule_graph.sh
bash scripts/05_build_engine_handoff_assets.sh

Formal graph and handoff assets are written under data/processed/.

Reports, freeze copies, and runtime side effects are written under tmp/.

If a physician changes a reviewed page graph, rerun:

bash scripts/02_build_page_semantics.sh
bash scripts/03_build_reviewed_global_graph.sh
bash scripts/04_build_rule_graph.sh
bash scripts/05_build_engine_handoff_assets.sh
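The rerun sequence can be wrapped so that it stops at the first failing phase. This is a sketch of one way to do that; the function name and the injectable run parameter are assumptions, and only the four script paths come from this README.

```python
import subprocess

# Script paths from the list above, in the order they must run.
DOWNSTREAM_SCRIPTS = (
    "scripts/02_build_page_semantics.sh",
    "scripts/03_build_reviewed_global_graph.sh",
    "scripts/04_build_rule_graph.sh",
    "scripts/05_build_engine_handoff_assets.sh",
)


def rerun_downstream(repo_root=".", run=subprocess.run):
    """Run phases 02-05 in order; check=True aborts on the first failure."""
    completed = []
    for script in DOWNSTREAM_SCRIPTS:
        # A non-zero exit raises CalledProcessError, so later phases never run on stale inputs.
        run(["bash", script], cwd=repo_root, check=True)
        completed.append(script)
    return completed
```

Stopping early matters here because each phase consumes the output of the previous one.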

11. Optional Query Smoke

Public examples are already included:

example/query_cases.json
example/query_test.json

Start a local Neo4j container:

docker run -d \
  --name omgs-nccn-neo4j \
  -p 7474:7474 \
  -p 7687:7687 \
  -e NEO4J_AUTH=neo4j/omgs-nccn-dev \
  neo4j:5.26

Then load the phase-5 CSV exports and run a smoke query:

bash scripts/06_load_neo4j_for_query_smoke.sh
bash scripts/06_run_query_smoke.sh --case-id 0

The loader copies the phase-5 CSV exports from data/processed/ov_2025/query/ into the Neo4j container import directory before loading them.

The default loader settings are:

container: omgs-nccn-neo4j
password: omgs-nccn-dev
import dir inside the container: /import

If you use a different container name or password, set:

export OMGS_NCCN_NEO4J_CONTAINER=your-container-name
export OMGS_NCCN_NEO4J_PASSWORD=your-password
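The override behaviour described above amounts to reading each variable with a documented default. A sketch of that resolution is shown here; the function name and dictionary keys are hypothetical, and how the loader script itself reads these variables is not shown in this README.

```python
import os


def neo4j_loader_settings(env=None):
    """Resolve loader settings, falling back to the documented defaults."""
    env = os.environ if env is None else env
    return {
        "container": env.get("OMGS_NCCN_NEO4J_CONTAINER", "omgs-nccn-neo4j"),
        "password": env.get("OMGS_NCCN_NEO4J_PASSWORD", "omgs-nccn-dev"),
        # No override variable is documented for the import directory.
        "import_dir": "/import",
    }
```

With neither variable exported, the result matches the docker run example above, so the default container and the loader agree without extra configuration.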

Included In This Snapshot

LICENSE
src/omgs_nccn/
scripts/
review/tldraw_app/
data/
example/
fig/ (README screenshot)
