Thanks for considering a contribution. Cortex is a persistent memory engine built on 20 biological mechanisms with 41 academic citations backing the algorithms. Every change is held to that bar.
Cortex is a Python 3.10+ MCP server with 47 tools and 9 automatic hooks, persisting to PostgreSQL + pgvector. It implements rate-distortion forgetting, predictive-coding write gating, retrieval-induced reconsolidation, pattern separation, sleep-cycle consolidation, emotional-valence weighting, and more. See the README for the full architecture and benchmark results (LongMemEval Recall@10 = 97.8%, LoCoMo Recall@10 = 92.6%, BEAM-10M +33.4% over the published baseline).
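To give a flavor of what a mechanism looks like in code, here is a minimal sketch of predictive-coding write gating, assuming only the idea itself (the function name, threshold, and similarity measure are illustrative, not the actual implementation): a memory is persisted only when it is poorly predicted by what is already stored.

```python
# Minimal sketch of predictive-coding write gating. Illustrative only:
# the names, threshold, and similarity measure are NOT Cortex internals.
import numpy as np

def should_write(candidate: np.ndarray, stored: list[np.ndarray],
                 surprise_threshold: float = 0.35) -> bool:
    """Persist a memory only if its prediction error exceeds a threshold."""
    if not stored:
        return True  # nothing stored yet, so everything is surprising

    def cos(a: np.ndarray, b: np.ndarray) -> float:
        return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

    # Prediction error approximated as 1 - best match against existing memories.
    surprise = 1.0 - max(cos(candidate, m) for m in stored)
    return surprise > surprise_threshold
```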
Prerequisites:
- Python 3.10+
- PostgreSQL 17 with the pgvector extension
- `uvx` (`pip install uv` or `pipx install uv`)
```bash
git clone https://github.com/cdeust/Cortex.git
cd Cortex

# Install with dev + benchmark extras
pip install -e ".[postgresql,benchmarks,dev]"

# Or use the setup script (handles PostgreSQL + pgvector + DB init)
bash scripts/setup.sh  # macOS / Linux

# Verify everything is wired
uvx --python 3.13 --from "neuro-cortex-memory[postgresql]" cortex-doctor

# Run tests (2500+ tests across functional + benchmark suites)
pytest

# Run a benchmark
python benchmarks/longmemeval/run_benchmark.py --variant s
```

`main` is the integration branch.

- Branch naming: `feature/<short-slug>`, `fix/<short-slug>`, `docs/<short-slug>`, `mechanism/<name>` (for new biological mechanisms), `benchmark/<name>` (for new benchmark integrations).
- One mechanism per PR when adding new biological mechanisms.
- Conventional commit messages preferred.
Cortex's twenty mechanisms are not metaphors — each maps to a specific neuroscience finding with a specific algorithmic implementation. A new mechanism PR must include:
- Primary citation. What published neuroscience or cognitive-science work motivates this mechanism? Include the paper's bibliographic reference in `docs/papers/science.md`.
- The mathematical form. Equations or pseudocode showing the exact computation. If you're adapting an algorithm from the literature, call out the divergence and justify it.
- The biological grounding. Which brain region / circuit / molecular pathway does this mirror? A one-paragraph mapping is required.
- Empirical validation. A benchmark or unit test demonstrating the mechanism behaves as predicted. Quantitative claims need numbers.
- Ablation. A test showing the system's behavior with the mechanism disabled, so its contribution is observable.
A mechanism PR without these five elements does not pass review.
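For the validation and ablation requirements, reviewers expect a test shaped roughly like the sketch below; `make_engine`, the mechanism flag, and the `remember`/`recall` calls are hypothetical stand-ins, not the real Cortex test API.

```python
# Hypothetical ablation-test shape. make_engine, the mechanism flag, and the
# engine methods are illustrative, not Cortex's actual fixtures or API.
import pytest

from tests.helpers import make_engine  # assumed helper

@pytest.mark.parametrize("enabled", [True, False])
def test_valence_weighting_ablation(enabled: bool) -> None:
    engine = make_engine(emotional_valence=enabled)
    engine.remember("routine note: watered the plants")
    engine.remember("critical incident: production database corrupted!")
    top = engine.recall("database incident", k=1)[0]
    if enabled:
        # With valence weighting on, the high-arousal memory should rank first.
        assert "critical incident" in top.text
    else:
        # With it off, assert the baseline so the contribution is observable.
        assert top is not None
```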
Cortex fuses five retrieval signals (vector similarity, full-text search, trigram matching, thermodynamic heat, recency) plus a cross-encoder reranker. Changes here:
- Run the full benchmark suite. LongMemEval, LoCoMo, BEAM at both 100K and 10M scales. A regression on any of those is blocking unless explicitly justified.
- Document the delta. A markdown row in `benchmarks/results.md` showing before/after MRR + Recall@10 per category.
- Cite the source. If you're adding a new signal, reference the IR literature (BM25 → Robertson; pgvector HNSW → Malkov et al.; trigram → Lehmann; etc.).
- Preserve the 22MB embedding-model footprint. Cortex runs entirely on the user's machine; bringing in a 1GB model is out of scope.
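For orientation, the fusion step has roughly the shape below; the weights, field names, and normalization are made up for illustration, and the real values live in the retrieval module.

```python
# Sketch of five-signal score fusion. Weights and normalization are
# illustrative placeholders, not Cortex's tuned values.
from dataclasses import dataclass

@dataclass
class Signals:
    vector: float    # cosine similarity from pgvector
    fulltext: float  # PostgreSQL full-text rank
    trigram: float   # pg_trgm similarity
    heat: float      # thermodynamic heat (access-driven)
    recency: float   # exponentially decayed age

WEIGHTS = Signals(vector=0.45, fulltext=0.20, trigram=0.10, heat=0.15, recency=0.10)

def fused_score(s: Signals) -> float:
    """Linear fusion of the five normalized signals; the top candidates
    are then re-scored by the cross-encoder reranker."""
    fields = ("vector", "fulltext", "trigram", "heat", "recency")
    return sum(getattr(WEIGHTS, f) * getattr(s, f) for f in fields)
```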
Standard Python style (`black`, `ruff`, `mypy --strict`) plus project-specific rules:
- No `Any` in production code. Use `Protocol` or generic typing.
- §8 Source discipline. Every numeric constant with ≥3 significant digits needs a `# source:` annotation.
- No mutable default arguments. No globals except for read-once configuration objects.
- No bare `except:`. Catch the specific exception you mean.
- Type-checked at `--strict`. `mypy --strict src/cortex/` must pass.
- §4.1 File ≤500 lines; §4.2 function ≤50 lines.
The full standard lives in the zetetic coding standards.
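A compact illustration of several rules at once; the constant, its source comment, and every name here are placeholders, not code from the repository.

```python
# Illustrates: Protocol instead of Any, a source-annotated constant,
# no mutable default arguments, and no bare except. All values are placeholders.
from typing import Protocol

class Embedder(Protocol):
    def embed(self, text: str) -> list[float]: ...

HEAT_HALF_LIFE_HOURS = 36.5  # source: <paper, section, equation> (placeholder)

def embed_all(embedder: Embedder, texts: list[str] | None = None) -> list[list[float]]:
    texts = texts or []  # None default instead of a mutable default argument
    try:
        return [embedder.embed(t) for t in texts]
    except ValueError as exc:  # specific exception, never a bare except:
        raise RuntimeError(f"embedding failed: {exc}") from exc
```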
```bash
pytest                            # full suite (~2500 tests)
pytest tests/unit                 # unit only
pytest tests/integration          # PostgreSQL-backed integration
pytest tests/benchmark -k locomo  # subset
pytest -x --ff                    # stop on first fail, run failures first
```

Tests run against a local PostgreSQL instance. CI provisions a fresh DB per run.
47 tools currently. Adding a new one:
- Define the JSON schema in the tool's module-level decorator.
- Implement the handler following the `BaseTool` protocol.
- Add to the tool registry at the canonical registration site.
- Document in `docs/MCP-TOOLS.md` with the tool's purpose, inputs, outputs, and an example call.
- Add a unit test for the tool's contract.
- Add an integration test if the tool touches the database.
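Put together, a new tool might look like the sketch below; the decorator, schema shape, and `BaseTool` surface are assumptions inferred from the steps above, not the verbatim Cortex API.

```python
# Hypothetical MCP tool sketch. The @tool decorator, import path, and
# BaseTool signature are assumptions, not Cortex's actual interfaces.
from cortex.tools import BaseTool, tool  # assumed import path

@tool(
    name="memory_stats",
    description="Counts and heat distribution for stored memories.",
    input_schema={
        "type": "object",
        "properties": {"namespace": {"type": "string"}},
        "required": [],
    },
)
class MemoryStatsTool(BaseTool):
    async def run(self, namespace: str | None = None) -> dict:
        stats = await self.db.fetch_stats(namespace)  # assumed DB helper
        return {"count": stats.count, "mean_heat": stats.mean_heat}
```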
- Don't claim a benchmark improvement without committing the actual benchmark output. Numbers without a reproducible run are unverified.
- Don't add a mechanism without academic grounding. "It seems brain-like" is not a citation.
- Don't introduce a heavy ML model dependency that breaks the runs-on-your-machine guarantee.
- Don't bypass `mypy --strict`. The type system is the contract.
- Don't relax a test that fails on your branch. The test exists for a reason; understand the reason before changing it.
This project follows CODE_OF_CONDUCT.md.
See SECURITY.md. The memory engine handles potentially sensitive user data (PII in conversation transcripts); any data-exposure or injection issue is high-priority. The pre-tool-secret-shield hook already gates `.env`, `.aws/credentials`, `*.pem`, `*.key`, and shell history, but new code paths that touch the filesystem need similar review.
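If a new code path reads files, a gate in the same spirit should sit in front of it; a minimal sketch of the idea follows, with the pattern list taken from above and everything else illustrative.

```python
# Illustrative path gate in the spirit of pre-tool-secret-shield;
# the patterns and function are an example, not the hook's actual code.
import fnmatch
from pathlib import Path

BLOCKED_NAMES = [".env", "*.pem", "*.key", ".bash_history", ".zsh_history"]
BLOCKED_SUFFIXES = [".aws/credentials"]

def is_sensitive(path: str) -> bool:
    """Return True if a path matches a known secret-bearing pattern."""
    p = Path(path)
    if any(fnmatch.fnmatch(p.name, pat) for pat in BLOCKED_NAMES):
        return True
    return any(str(p).endswith(suffix) for suffix in BLOCKED_SUFFIXES)
```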
MIT. Contributions are licensed under the same. See LICENSE.
The neuroscience and IR algorithms remain attributable to the cited
sources; the MIT license covers this implementation.