
Oblivion

OBLIVION: A Self-Adaptive Memory Activation Framework for AI agents.

A self-adaptive memory activation framework that enables AI agents to detect utility decay in their current memory, autonomously activate relevant memories, and refine recall through feedback. Further experimental details are provided in our paper; implementation details and an interactive workflow can be found on the package's website.

Version: 0.1.0 · Status: Core modules implemented (Decayer, Activator, Recognizer, Manager)

Overview

Oblivion provides a three-tier memory hierarchy (L1/L2/L3) for LLM agents with incremental learning capabilities, supported by routing, temporal reasoning, and curation mechanisms:

  • L1 (Topic Summaries): Clustering topic metadata — summaries, utility scores, access frequency (fastest access)
  • L2 (Semantic Memory): Time-bounded semantic facts keyed by topic
  • L3 (Episodic Memory): Preemptive episodic memories — experiences, strategies, traces (largest)
  • Three-Level Routing: Progressive retrieval depth (summaries → buffer → deep retrieval) minimizes cost
  • Temporal Reasoning: Elapsed-time-aware memory matching and time-bounded queries
  • RRF Curation: Reciprocal Rank Fusion of LLM-based and heuristic buffer curation rankings
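The RRF curation step merges the LLM-based and heuristic rankings by summing reciprocal-rank scores per item. A minimal sketch of Reciprocal Rank Fusion (k = 60 is the conventional constant from the RRF literature; the memory IDs and the exact constant Oblivion uses are illustrative):

```python
def rrf_merge(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Reciprocal Rank Fusion: score each item by sum of 1/(k + rank) over rankings."""
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, item in enumerate(ranking, start=1):
            scores[item] = scores.get(item, 0.0) + 1.0 / (k + rank)
    # Highest fused score first
    return sorted(scores, key=scores.get, reverse=True)

llm_ranking = ["mem_a", "mem_b", "mem_c"]        # hypothetical LLM curation order
heuristic_ranking = ["mem_b", "mem_d", "mem_a"]  # hypothetical heuristic order
merged = rrf_merge([llm_ranking, heuristic_ranking])
```

Items ranked well by both sources (here `mem_b`) float to the top, which is why RRF is a robust way to combine rankings produced by incomparable scoring schemes.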

Installation

Python 3.12.9 (if needed)

The repo pins 3.12.9 in .python-version. If you need that interpreter, use pyenv:

curl https://pyenv.run | bash
# Add pyenv to PATH per the installer output, then:
pyenv install 3.12.9

After cloning (see below), run pyenv local 3.12.9 from the repo root, then run poetry env use "$(command -v python3.12)" once so Poetry's venv uses that interpreter (skip both steps if you already use the right python3.12).


Oblivion uses Python ^3.12. Dependencies are managed with Poetry (install Poetry via pipx — pipx does not install the project’s Python):

On macOS:

brew install pipx
pipx ensurepath

On Linux:

sudo apt update
sudo apt install pipx
pipx ensurepath

Then install Poetry:

pipx install poetry

Finally, clone the repo and install dependencies:

git clone https://github.com/nec-research/oblivion.git
cd oblivion
poetry env use "$(command -v python3.12)"   # omit if already using the right 3.12.x
poetry install

Installation Variants

Oblivion uses a four-tier incremental dependency model. Each higher tier includes all dependencies from the previous tiers:

| Tier | Command | What You Get |
|------|---------|--------------|
| 1 — Core | `poetry install` | Bare minimum to use the oblivion framework as a package (OpenAI, Azure OpenAI, OpenRouter clients, Qdrant storage, Pydantic models) |
| 2 — Local Models + Deployment | `poetry install --extras local` | Tier 1 + sentence-transformers for local embeddings. Use `--extras deployer` for full vLLM/PyTorch deployment (Linux/CUDA only) |
| 3 — GoodAI-LTM Benchmark | `poetry install --extras goodai-benchmark` | Tier 1 + Hydra, Streamlit, Plotly, Matplotlib, Pandas, and other visualization/benchmark dependencies for running the GoodAI Long-Term Memory benchmark |
| 4 — LongMemEval Benchmark | `poetry install --extras lme-benchmark` | Tier 3 + Rich, Backoff, NumPy, sentence-transformers for running the LongMemEval benchmark pipeline |

Additional install options:

# All experiments combined (Tier 3 + Tier 4)
poetry install --extras experiments

# Development installation (includes dev tools: pytest, ruff, mypy, pre-commit)
poetry install --with dev

Note: The core poetry install (Tier 1) is sufficient for using the oblivion framework as a standalone package. Optional dependencies are only needed for running experiments and benchmarks. See pyproject.toml for the full dependency list.

Environment Configuration

You need to provide configuration information, such as API keys. Create a .keys.ini file in the project root directory:

cp .keys.ini.sample .keys.ini
# Edit .keys.ini with your API keys

Local Model URL Configuration

Important: All local LLM server endpoint URLs throughout the codebase have been replaced with placeholder values (e.g., YOUR_LOCAL_LLM_HOST, PLACEHOLDER_*). These placeholders must be replaced with actual server URLs before using local model features. If placeholders are detected at runtime, a RuntimeError with specific instructions will be raised.

Placeholder URLs appear in the following locations:

  • src/oblivion/llm/local_servers.py — local LLM endpoint registry
  • deployer/ scripts — deployment configuration
  • experiments/longmemeval_benchmark/config/ — YAML experiment configurations

Method 1: Environment variables (recommended):

# Phi-4-mini endpoints (comma-separated for multi-endpoint)
export LOCAL_LLM_PHI4_URLS="http://your-server:8101/v1,http://your-server:8102/v1"

# Qwen3-30B endpoints
export LOCAL_LLM_QWEN_URLS="http://your-server:8002/v1"

# Deployer host
export DEPLOYER_HOST="your-deploy-server"
export DEPLOYER_PHI4_URL="http://your-deploy-server:8101/v1"
export DEPLOYER_QWEN3_URL="http://your-deploy-server:8002/v1"

Method 2: Edit source directly in src/oblivion/llm/local_servers.py.

See deployer/README.md for deployer-specific placeholder configuration.
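To illustrate how the environment-variable method and the placeholder RuntimeError described above might fit together, here is a sketch (the helper `get_endpoint_urls` and the placeholder-marker list are hypothetical, not the actual code in `src/oblivion/llm/local_servers.py`):

```python
import os

# Markers that indicate an unconfigured placeholder URL (illustrative list)
PLACEHOLDER_MARKERS = ("YOUR_LOCAL_LLM_HOST", "PLACEHOLDER_")

def get_endpoint_urls(env_var: str, default: str) -> list[str]:
    """Read comma-separated endpoint URLs from the environment,
    rejecting any URL that still contains a placeholder marker."""
    raw = os.environ.get(env_var, default)
    urls = [u.strip() for u in raw.split(",") if u.strip()]
    for url in urls:
        if any(marker in url for marker in PLACEHOLDER_MARKERS):
            raise RuntimeError(
                f"{env_var} still contains a placeholder URL ({url}); "
                "replace it with a real server address."
            )
    return urls

os.environ["LOCAL_LLM_PHI4_URLS"] = (
    "http://your-server:8101/v1,http://your-server:8102/v1"
)
phi4_urls = get_endpoint_urls(
    "LOCAL_LLM_PHI4_URLS", default="http://YOUR_LOCAL_LLM_HOST:8101/v1"
)
```

Reading multiple comma-separated URLs this way is what allows a single variable such as LOCAL_LLM_PHI4_URLS to describe a multi-endpoint deployment.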

Pre-commit

Install hooks so commits run the configured checks:

poetry run pre-commit install --install-hooks

This will run a set of checks whenever you commit. You can run the checks manually using:

poetry run pre-commit run --all-files

Before committing, you may need to run this command two or three times: each pass applies automatic fixes where possible, and subsequent passes confirm that all checks succeed.

If you are using VS Code and want the pre-commit environment to be picked up correctly, launch VS Code from the terminal:

cd oblivion
poetry shell
code .

Commit messages

conventional-pre-commit runs on the commit-msg hook (see .pre-commit-config.yaml). Messages must follow Conventional Commits.

A commit message should consist of a header, an optional body, and an optional footer. The header is mandatory and should follow a specific structure:

<type>(<scope>): <subject>

Types:

  • feat: A new feature
  • fix: A bug fix
  • docs: Documentation only changes
  • style: Changes that do not affect the meaning of the code
  • refactor: A code change that neither fixes a bug nor adds a feature
  • perf: A code change that improves performance
  • test: Adding missing tests or correcting existing tests
  • chore: Changes to the build process or auxiliary tools
  • build: Changes that affect the build system or external dependencies
  • ci: Changes to CI configuration files and scripts

Examples:

  • feat(decayer): add decay-aware memory modeling
  • fix(activator): correct query expansion logic
  • docs(readme): update installation instructions
  • refactor(core): simplify data processing logic

Configuration

Environment Variables

Store sensitive data in .env:

# LLM API Keys
OPENAI_API_KEY=sk-your-key-here

# Optional: OpenRouter API
OPENROUTER_API_KEY=sk-or-your-api-key-here

Azure OpenAI Configuration

For Azure OpenAI, create .keys.ini in the project root (not committed to git):

[Azure_OpenAI]
api_key = your_azure_openai_api_key
azure_endpoint = https://your-resource-name.openai.azure.com/
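Because `.keys.ini` is standard INI syntax, it can be read with Python's stdlib `configparser`. A sketch using the section shown above (how Oblivion actually loads the file may differ):

```python
import configparser

# Inline sample mirroring the .keys.ini snippet above;
# in practice you would call config.read(".keys.ini") instead.
SAMPLE = """
[Azure_OpenAI]
api_key = your_azure_openai_api_key
azure_endpoint = https://your-resource-name.openai.azure.com/
"""

config = configparser.ConfigParser()
config.read_string(SAMPLE)
api_key = config["Azure_OpenAI"]["api_key"]
endpoint = config["Azure_OpenAI"]["azure_endpoint"]
```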

Basic Usage

from oblivion.config import load_config, OblivionConfig

# Load configuration
config = load_config("config/config.yaml")

# Use individual modules
from oblivion.agents.executor import ExecutorAgent
from oblivion.memory.working_memory import WorkingMemory

Architecture

Core Modules

src/oblivion/
├── core/                    # Entry point
├── config/                  # Config loader + models
├── llm/                     # LLM client factory
├── agents/                  # ExecutorAgent orchestration + prompts
├── memory/
│   ├── decayer/             # Decayer: Uncertainty assessment + routing
│   ├── activator/           # Activator: Query expansion + buffer curation
│   ├── recognizer/          # Recognizer: Memory extraction + utility assessment
│   └── manager/             # Manager: Qdrant storage + CRUD operations

Three-Level Retrieval Routing

Decayer assesses uncertainty and selects retrieval depth to minimize cost:

| Level | Name | Content Used | Token Cost | Use Case |
|-------|------|--------------|------------|----------|
| 1 | `cluster_summaries` | L1 topic summaries only | ~500 | Query fully answerable from summaries |
| 2 | `cluster_memories_buffer` | Summaries + L2/L3 memories in buffer | ~2.5k | Partial match — need specific memories |
| 3 | `memory_manager_retrieval` | Full vector search via Manager | Variable | Buffer insufficient — deep retrieval needed |
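A hypothetical sketch of that depth selection in code. The threshold values and function names here are illustrative assumptions; the Decayer's actual uncertainty assessment is LLM-based and may differ:

```python
from enum import Enum

class RetrievalLevel(Enum):
    CLUSTER_SUMMARIES = 1         # L1 topic summaries only (~500 tokens)
    CLUSTER_MEMORIES_BUFFER = 2   # summaries + buffered L2/L3 memories (~2.5k tokens)
    MEMORY_MANAGER_RETRIEVAL = 3  # full vector search via the Manager (variable cost)

def select_level(uncertainty: float,
                 low: float = 0.3, high: float = 0.7) -> RetrievalLevel:
    """Map an uncertainty estimate to the cheapest sufficient retrieval depth."""
    if uncertainty < low:
        return RetrievalLevel.CLUSTER_SUMMARIES
    if uncertainty < high:
        return RetrievalLevel.CLUSTER_MEMORIES_BUFFER
    return RetrievalLevel.MEMORY_MANAGER_RETRIEVAL
```

The design point is that deeper levels are only paid for when shallower ones are judged insufficient, which keeps the expected token cost per query low.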

Experiments and Benchmarks

This repository includes evaluation infrastructure for two memory benchmarks. See experiments/README.md for the full overview.

| Benchmark | Description | Documentation |
|-----------|-------------|---------------|
| GoodAI-LTM | Dynamic benchmark, 33 test cases, 7 categories, 1K–500K context lengths | `experiments/goodai_ltm_benchmark/README.md` |
| LongMemEval | Static benchmark, 500 test cases, 6 categories, oracle & systematic splits | `experiments/longmemeval_benchmark/README.md` |
| Ablation Experiments | Decayer temperature, hallucination analysis, error analysis | `experiments/longmemeval_ablation_experiments/` |

Experiment Directory Structure

experiments/
├── longmemeval_data_utils/           # Shared LongMemEval data loading utilities
├── longmemeval_benchmark/           # LongMemEval evaluation framework
│   ├── runner/                      # Benchmark runner, preparation & query pipelines
│   ├── preparation/                 # Memory preparation strategies
│   │   ├── strategies/              # Strategy implementations (inspired by Recognizer)
│   │   └── prompts/                 # Per-strategy prompt templates
│   ├── metrics/                     # Retrieval metrics, cost estimation
│   ├── cache/                       # Preparation cache management
│   ├── llm/                         # Async structured calls, throttling
│   ├── models/                      # Pydantic response/execution models
│   ├── config/                      # YAML experiment configurations
│   └── analysis/                    # Pipeline trace, exclusions
├── goodai_ltm_benchmark/            # GoodAI-LTM benchmark integration
└── longmemeval_ablation_experiments/ # Archived ablation experiments

Benchmark Datasets

Both datasets are managed as optional git submodules under data/benchmarks/. See data/benchmarks/README.md for dataset details, licenses, and setup.

Hardware Configuration

Local model experiments were conducted on a compute node with 8× NVIDIA GeForce RTX 5090 GPUs (32 GB VRAM each), used for self-hosting LLMs via vLLM. Typical deployment: one or more GPUs per model instance, with multiple vLLM endpoints behind load-balanced routing. Azure experiments use the Azure OpenAI API and do not require local GPU resources.

Hyperparameter Sweeps

Automated sweep scripts are available for large-scale experimentation:

| Script | Purpose |
|--------|---------|
| `run_hyperparameter_sweep.py` | Multi-model sweep across temperature, buffer, linking, episodic mode |
| `run_large_scale_hyperparameter_sweep.py` | Endpoint-pinned parallelization for 32K and 120K benchmarks |
| `run_topics_and_rewards_sweep.py` | Sweep over taxonomy, reward criteria, and topic initialization |

Note: Hyperparameter sweep scripts that target local models require local LLM server URLs to be configured via environment variables before execution. See Local Model URL Configuration.

Deployment

Server deployment scripts are in deployer/. All deployment URLs use placeholder values that must be configured before use.

Documentation

Project documentation is available at the Oblivion website (static HTML).

Testing

Test Directory Structure

tests/
├── unit/                    # Unit tests (no API calls)
│   ├── decayer/            # Decayer module tests
│   ├── recognizer/         # Recognizer module tests
│   ├── activator/          # Activator module tests
│   ├── manager/            # Manager module tests
│   ├── test_executor.py    # ExecutorAgent tests
│   └── test_url_placeholders.py  # URL anonymization tests
├── integration/            # Integration tests (require LLM API)
│   ├── test_decayer_routing_modes.py
│   ├── test_recognizer_parallel.py
│   ├── test_executor_respond.py
│   ├── test_pipeline.py
│   └── ...                 # ~25 integration test files
└── data/                   # Test data files

Running Tests

# Run unit tests only (fast, no API calls)
poetry run pytest tests/unit/

# Run integration tests (requires API key + --integration flag)
export OPENAI_API_KEY="your-key"
poetry run pytest tests/integration/ --integration

# Run all tests (skips integration if no API key)
poetry run pytest tests/

# Run specific module tests
poetry run pytest tests/unit/decayer/
poetry run pytest tests/unit/test_executor.py

Test Categories

| Category | Location | LLM Required | Run Command |
|----------|----------|--------------|-------------|
| Unit | `tests/unit/` | No | `pytest tests/unit/` |
| Integration | `tests/integration/` | Yes | `pytest tests/integration/ --integration` |

Integration Tests

Integration tests in tests/integration/ require:

  • OPENAI_API_KEY or AZURE_OPENAI_API_KEY environment variable
  • the --integration command-line flag

These tests make actual LLM API calls and incur costs.
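Gates like the --integration flag are conventionally wired up in a conftest.py via pytest's option hooks. A sketch of the standard pattern (an assumption about the wiring, not necessarily this repo's actual conftest.py):

```python
# conftest.py (illustrative)
import pytest

def pytest_addoption(parser):
    # Register the custom --integration command-line flag
    parser.addoption(
        "--integration", action="store_true", default=False,
        help="run integration tests that make real LLM API calls",
    )

def pytest_collection_modifyitems(config, items):
    # Without the flag, mark every collected integration test as skipped
    if config.getoption("--integration"):
        return
    skip = pytest.mark.skip(reason="needs --integration flag")
    for item in items:
        if "integration" in item.nodeid:
            item.add_marker(skip)
```

With this pattern, `pytest tests/` stays safe to run by default and only opts into paid API calls when the flag is passed explicitly.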

Optional Git Submodules

Benchmark datasets are managed as optional git submodules:

# Initialize all submodules
git submodule update --init

# Or initialize individually
git submodule update --init data/benchmarks/longmemeval
git submodule update --init data/benchmarks/goodai-ltm

Submodules are only required for running specific benchmark experiments. See data/benchmarks/README.md for details.

Langfuse (Deprecated)

Langfuse integration is deprecated and will be removed in a future version. The oblivion framework supports Azure OpenAI, OpenAI, and OpenRouter clients independently without Langfuse. If you have a [Langfuse] section in your .keys.ini, a DeprecationWarning will be emitted. Install via poetry install --extras deprecated if you still need it.

Citation

If you find Oblivion useful in your research, please consider citing us:

@misc{rana2026oblivionselfadaptiveagenticmemory,
      title={Oblivion: Self-Adaptive Agentic Memory Control through Decay-Driven Activation},
      author={Ashish Rana and Chia-Chien Hung and Qumeng Sun and Julian Martin Kunkel and Carolin Lawrence},
      year={2026},
      eprint={2604.00131},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2604.00131},
}

Authors and Acknowledgment

  • Ashish Rana
  • Qumeng Sun
  • Chia-Chien Hung

License

This code is proprietary software of NEC Laboratories Europe GmbH. See LICENSE for details.

If you plan to use the code outside the NEC R&D environment, please contact the authors.

Project Status

This is an active research project. Core modules are implemented:

  • Decayer: Interaction-based Ebbinghaus decay, uncertainty assessment, three-level retrieval routing
  • Activator: DAG query expansion, RRF-merged buffer curation (LLM + heuristic)
  • Recognizer: Single LLM call memory extraction, topic construction, reward criteria
  • Manager: Qdrant local storage, namespace-based organization, temporal filtering
  • ExecutorAgent: Thinking loop orchestration, configurable module composition
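The Ebbinghaus decay the Decayer bullet refers to follows the classic forgetting curve R = e^(−t/S): retention R after elapsed time t, given memory strength S. A sketch of the curve; how Oblivion derives strength from interactions (e.g., from access frequency) is an assumption here, not the framework's actual formula:

```python
import math

def retention(elapsed: float, strength: float) -> float:
    """Ebbinghaus forgetting curve: R = exp(-t / S).

    elapsed:  time since the memory was last reinforced
    strength: how resistant the memory is to decay (higher = slower decay)
    """
    return math.exp(-elapsed / strength)

# Same strength, different ages: utility decays as time passes
fresh = retention(elapsed=1.0, strength=5.0)
stale = retention(elapsed=10.0, strength=5.0)
```

Under this model, a frequently accessed memory could be assigned a larger strength S, so its utility score decays more slowly than that of a rarely touched one.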

About

The source for the Oblivion package and experimentation pipeline corresponding to the "Oblivion: Self-Adaptive Agentic Memory Control through Decay-Driven Activation" paper.
