
Oblivion

OBLIVION: A Self-Adaptive Memory Activation Framework for AI agents.

A self-adaptive memory activation framework that enables AI agents to detect utility decay in their current memory, autonomously activate relevant memories, and refine recall through feedback. Further experimental details are provided in our paper; implementation details and an interactive workflow can be found on the package's website.

Version: 0.1.0 · Status: Core modules implemented (Decayer, Activator, Recognizer, Manager)

Overview

Oblivion provides a three-tier memory hierarchy (L1/L2/L3) for LLM agents with incremental learning capabilities, supported by routing, temporal reasoning, and curation mechanisms:

  • L1 (Topic Summaries): Clustering topic metadata — summaries, utility scores, access frequency (fastest access)
  • L2 (Semantic Memory): Time-bounded semantic facts keyed by topic
  • L3 (Episodic Memory): Preemptive episodic memories — experiences, strategies, traces (largest)
  • Three-Level Routing: Progressive retrieval depth (summaries → buffer → deep retrieval) minimizes cost
  • Temporal Reasoning: Elapsed-time-aware memory matching and time-bounded queries
  • RRF Curation: Reciprocal Rank Fusion of LLM-based and heuristic buffer curation rankings
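The RRF curation step merges the LLM-based and heuristic rankings by summing reciprocal-rank scores per item. A minimal sketch of Reciprocal Rank Fusion (k = 60 is the conventional constant from the RRF literature; the memory IDs and the exact constant Oblivion uses are illustrative):

```python
def rrf_merge(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Reciprocal Rank Fusion: score each item by sum of 1/(k + rank) over rankings."""
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, item in enumerate(ranking, start=1):
            scores[item] = scores.get(item, 0.0) + 1.0 / (k + rank)
    # Highest fused score first
    return sorted(scores, key=scores.get, reverse=True)

llm_ranking = ["mem_a", "mem_b", "mem_c"]        # hypothetical LLM curation order
heuristic_ranking = ["mem_b", "mem_d", "mem_a"]  # hypothetical heuristic order
merged = rrf_merge([llm_ranking, heuristic_ranking])
```

Items ranked well by both sources (here `mem_b`) float to the top, which is why RRF is a robust way to combine rankings produced by incomparable scoring schemes.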

Installation

Python 3.12.9 (if needed)

The repo pins 3.12.9 in .python-version. If you need that interpreter, use pyenv:

curl https://pyenv.run | bash
# Add pyenv to PATH per the installer output, then:
pyenv install 3.12.9

After cloning (see below), run pyenv local 3.12.9 from the repo root, then run poetry env use "$(command -v python3.12)" once so Poetry's venv uses that interpreter (skip both steps if you already use the right python3.12).


Oblivion uses Python ^3.12. Dependencies are managed with Poetry (install Poetry via pipx — pipx does not install the project’s Python):

On macOS:

brew install pipx
pipx ensurepath

On Linux:

sudo apt update
sudo apt install pipx
pipx ensurepath

Then install Poetry:

pipx install poetry

Finally, clone the repo and install dependencies:

git clone https://github.com/nec-research/oblivion.git
cd oblivion
poetry env use "$(command -v python3.12)"   # omit if already using the right 3.12.x
poetry install

Installation Variants

Oblivion uses a four-tier incremental dependency model. Each higher tier includes all dependencies from the previous tiers:

| Tier | Command | What You Get |
|------|---------|--------------|
| 1 — Core | `poetry install` | Bare minimum to use the oblivion framework as a package (OpenAI, Azure OpenAI, OpenRouter clients, Qdrant storage, Pydantic models) |
| 2 — Local Models + Deployment | `poetry install --extras local` | Tier 1 + sentence-transformers for local embeddings. Use `--extras deployer` for full vLLM/PyTorch deployment (Linux/CUDA only) |
| 3 — GoodAI-LTM Benchmark | `poetry install --extras goodai-benchmark` | Tier 1 + Hydra, Streamlit, Plotly, Matplotlib, Pandas, and other visualization/benchmark dependencies for running the GoodAI Long-Term Memory benchmark |
| 4 — LongMemEval Benchmark | `poetry install --extras lme-benchmark` | Tier 3 + Rich, Backoff, NumPy, sentence-transformers for running the LongMemEval benchmark pipeline |

Additional install options:

# All experiments combined (Tier 3 + Tier 4)
poetry install --extras experiments

# Development installation (includes dev tools: pytest, ruff, mypy, pre-commit)
poetry install --with dev

Note: The core poetry install (Tier 1) is sufficient for using the oblivion framework as a standalone package. Optional dependencies are only needed for running experiments and benchmarks. See pyproject.toml for the full dependency list.

Environment Configuration

You need to provide configuration information, such as API keys. Create a .keys.ini file in the project root directory:

cp .keys.ini.sample .keys.ini
# Edit .keys.ini with your API keys

Local Model URL Configuration

Important: All local LLM server endpoint URLs throughout the codebase have been replaced with placeholder values (e.g., YOUR_LOCAL_LLM_HOST, PLACEHOLDER_*). These placeholders must be replaced with actual server URLs before using local model features. If placeholders are detected at runtime, a RuntimeError with specific instructions will be raised.

Placeholder URLs appear in the following locations:

  • src/oblivion/llm/local_servers.py — local LLM endpoint registry
  • deployer/ scripts — deployment configuration
  • experiments/longmemeval_benchmark/config/ — YAML experiment configurations

Method 1: Environment variables (recommended):

# Phi-4-mini endpoints (comma-separated for multi-endpoint)
export LOCAL_LLM_PHI4_URLS="http://your-server:8101/v1,http://your-server:8102/v1"

# Qwen3-30B endpoints
export LOCAL_LLM_QWEN_URLS="http://your-server:8002/v1"

# Deployer host
export DEPLOYER_HOST="your-deploy-server"
export DEPLOYER_PHI4_URL="http://your-deploy-server:8101/v1"
export DEPLOYER_QWEN3_URL="http://your-deploy-server:8002/v1"

Method 2: Edit source directly in src/oblivion/llm/local_servers.py.

See deployer/README.md for deployer-specific placeholder configuration.
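To illustrate how the environment-variable method and the placeholder RuntimeError described above might fit together, here is a sketch (the helper `get_endpoint_urls` and the placeholder-marker list are hypothetical, not the actual code in `src/oblivion/llm/local_servers.py`):

```python
import os

# Markers that indicate an unconfigured placeholder URL (illustrative list)
PLACEHOLDER_MARKERS = ("YOUR_LOCAL_LLM_HOST", "PLACEHOLDER_")

def get_endpoint_urls(env_var: str, default: str) -> list[str]:
    """Read comma-separated endpoint URLs from the environment,
    rejecting any URL that still contains a placeholder marker."""
    raw = os.environ.get(env_var, default)
    urls = [u.strip() for u in raw.split(",") if u.strip()]
    for url in urls:
        if any(marker in url for marker in PLACEHOLDER_MARKERS):
            raise RuntimeError(
                f"{env_var} still contains a placeholder URL ({url}); "
                "replace it with a real server address."
            )
    return urls

os.environ["LOCAL_LLM_PHI4_URLS"] = (
    "http://your-server:8101/v1,http://your-server:8102/v1"
)
phi4_urls = get_endpoint_urls(
    "LOCAL_LLM_PHI4_URLS", default="http://YOUR_LOCAL_LLM_HOST:8101/v1"
)
```

Reading multiple comma-separated URLs this way is what allows a single variable such as LOCAL_LLM_PHI4_URLS to describe a multi-endpoint deployment.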

Pre-commit

Install hooks so commits run the configured checks:

poetry run pre-commit install --install-hooks

This will run a set of checks whenever you commit. You can run the checks manually using:

poetry run pre-commit run --all-files

Before committing, you may need to run this command two or three times: each pass applies automatic fixes where possible, and subsequent passes confirm that all checks succeed.

If you are using VS Code and want the pre-commit environment to be picked up correctly, launch VS Code from the terminal:

cd oblivion
poetry shell
code .

Commit messages

conventional-pre-commit runs on the commit-msg hook (see .pre-commit-config.yaml). Messages must follow Conventional Commits.

A commit message should consist of a header, an optional body, and an optional footer. The header is mandatory and should follow a specific structure:

<type>(<scope>): <subject>

Types:

  • feat: A new feature
  • fix: A bug fix
  • docs: Documentation only changes
  • style: Changes that do not affect the meaning of the code
  • refactor: A code change that neither fixes a bug nor adds a feature
  • perf: A code change that improves performance
  • test: Adding missing tests or correcting existing tests
  • chore: Changes to the build process or auxiliary tools
  • build: Changes that affect the build system or external dependencies
  • ci: Changes to CI configuration files and scripts

Examples:

  • feat(decayer): add decay-aware memory modeling
  • fix(activator): correct query expansion logic
  • docs(readme): update installation instructions
  • refactor(core): simplify data processing logic

Configuration

Environment Variables

Store sensitive data in .env:

# LLM API Keys
OPENAI_API_KEY=sk-your-key-here

# Optional: OpenRouter API
OPENROUTER_API_KEY=sk-or-your-api-key-here

Azure OpenAI Configuration

For Azure OpenAI, create .keys.ini in the project root (not committed to git):

[Azure_OpenAI]
api_key = your_azure_openai_api_key
azure_endpoint = https://your-resource-name.openai.azure.com/
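Because `.keys.ini` is standard INI syntax, it can be read with Python's stdlib `configparser`. A sketch using the section shown above (how Oblivion actually loads the file may differ):

```python
import configparser

# Inline sample mirroring the .keys.ini snippet above;
# in practice you would call config.read(".keys.ini") instead.
SAMPLE = """
[Azure_OpenAI]
api_key = your_azure_openai_api_key
azure_endpoint = https://your-resource-name.openai.azure.com/
"""

config = configparser.ConfigParser()
config.read_string(SAMPLE)
api_key = config["Azure_OpenAI"]["api_key"]
endpoint = config["Azure_OpenAI"]["azure_endpoint"]
```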

Basic Usage

from oblivion.config import load_config, OblivionConfig

# Load configuration
config = load_config("config/config.yaml")

# Use individual modules
from oblivion.agents.executor import ExecutorAgent
from oblivion.memory.working_memory import WorkingMemory

Architecture

Core Modules

src/oblivion/
├── core/                    # Entry point
├── config/                  # Config loader + models
├── llm/                     # LLM client factory
├── agents/                  # ExecutorAgent orchestration + prompts
├── memory/
│   ├── decayer/             # Decayer: Uncertainty assessment + routing
│   ├── activator/           # Activator: Query expansion + buffer curation
│   ├── recognizer/          # Recognizer: Memory extraction + utility assessment
│   └── manager/             # Manager: Qdrant storage + CRUD operations

Three-Level Retrieval Routing

Decayer assesses uncertainty and selects retrieval depth to minimize cost:

| Level | Name | Content Used | Token Cost | Use Case |
|-------|------|--------------|------------|----------|
| 1 | `cluster_summaries` | L1 topic summaries only | ~500 | Query fully answerable from summaries |
| 2 | `cluster_memories_buffer` | Summaries + L2/L3 memories in buffer | ~2.5k | Partial match — need specific memories |
| 3 | `memory_manager_retrieval` | Full vector search via Manager | Variable | Buffer insufficient — deep retrieval needed |
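A hypothetical sketch of that depth selection in code. The threshold values and function names here are illustrative assumptions; the Decayer's actual uncertainty assessment is LLM-based and may differ:

```python
from enum import Enum

class RetrievalLevel(Enum):
    CLUSTER_SUMMARIES = 1         # L1 topic summaries only (~500 tokens)
    CLUSTER_MEMORIES_BUFFER = 2   # summaries + buffered L2/L3 memories (~2.5k tokens)
    MEMORY_MANAGER_RETRIEVAL = 3  # full vector search via the Manager (variable cost)

def select_level(uncertainty: float,
                 low: float = 0.3, high: float = 0.7) -> RetrievalLevel:
    """Map an uncertainty estimate to the cheapest sufficient retrieval depth."""
    if uncertainty < low:
        return RetrievalLevel.CLUSTER_SUMMARIES
    if uncertainty < high:
        return RetrievalLevel.CLUSTER_MEMORIES_BUFFER
    return RetrievalLevel.MEMORY_MANAGER_RETRIEVAL
```

The design point is that deeper levels are only paid for when shallower ones are judged insufficient, which keeps the expected token cost per query low.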

Experiments and Benchmarks

This repository includes evaluation infrastructure for two memory benchmarks. See experiments/README.md for the full overview.

| Benchmark | Description | Documentation |
|-----------|-------------|---------------|
| GoodAI-LTM | Dynamic benchmark, 33 test cases, 7 categories, 1K–500K context lengths | `experiments/goodai_ltm_benchmark/README.md` |
| LongMemEval | Static benchmark, 500 test cases, 6 categories, oracle & systematic splits | `experiments/longmemeval_benchmark/README.md` |
| Ablation Experiments | Decayer temperature, hallucination analysis, error analysis | `experiments/longmemeval_ablation_experiments/` |

Experiment Directory Structure

experiments/
├── longmemeval_data_utils/           # Shared LongMemEval data loading utilities
├── longmemeval_benchmark/           # LongMemEval evaluation framework
│   ├── runner/                      # Benchmark runner, preparation & query pipelines
│   ├── preparation/                 # Memory preparation strategies
│   │   ├── strategies/              # Strategy implementations (inspired by Recognizer)
│   │   └── prompts/                 # Per-strategy prompt templates
│   ├── metrics/                     # Retrieval metrics, cost estimation
│   ├── cache/                       # Preparation cache management
│   ├── llm/                         # Async structured calls, throttling
│   ├── models/                      # Pydantic response/execution models
│   ├── config/                      # YAML experiment configurations
│   └── analysis/                    # Pipeline trace, exclusions
├── goodai_ltm_benchmark/            # GoodAI-LTM benchmark integration
└── longmemeval_ablation_experiments/ # Archived ablation experiments

Benchmark Datasets

Both datasets are managed as optional git submodules under data/benchmarks/. See data/benchmarks/README.md for dataset details, licenses, and setup.

Hardware Configuration

Local model experiments were conducted on a compute node with 8× NVIDIA GeForce RTX 5090 GPUs (32 GB VRAM each), used for self-hosting LLMs via vLLM. Typical deployment: one or more GPUs per model instance, with multiple vLLM endpoints behind load-balanced routing. Azure experiments use the Azure OpenAI API and do not require local GPU resources.

Hyperparameter Sweeps

Automated sweep scripts are available for large-scale experimentation:

| Script | Purpose |
|--------|---------|
| `run_hyperparameter_sweep.py` | Multi-model sweep across temperature, buffer, linking, episodic mode |
| `run_large_scale_hyperparameter_sweep.py` | Endpoint-pinned parallelization for 32K and 120K benchmarks |
| `run_topics_and_rewards_sweep.py` | Sweep over taxonomy, reward criteria, and topic initialization |

Note: Hyperparameter sweep scripts that target local models require local LLM server URLs to be configured via environment variables before execution. See Local Model URL Configuration.

Deployment

Server deployment scripts are in deployer/. All deployment URLs use placeholder values that must be configured before use.

Documentation

Project documentation is available at the Oblivion website (static HTML).

Testing

Test Directory Structure

tests/
├── unit/                    # Unit tests (no API calls)
│   ├── decayer/            # Decayer module tests
│   ├── recognizer/         # Recognizer module tests
│   ├── activator/          # Activator module tests
│   ├── manager/            # Manager module tests
│   ├── test_executor.py    # ExecutorAgent tests
│   └── test_url_placeholders.py  # URL anonymization tests
├── integration/            # Integration tests (require LLM API)
│   ├── test_decayer_routing_modes.py
│   ├── test_recognizer_parallel.py
│   ├── test_executor_respond.py
│   ├── test_pipeline.py
│   └── ...                 # ~25 integration test files
└── data/                   # Test data files

Running Tests

# Run unit tests only (fast, no API calls)
poetry run pytest tests/unit/

# Run integration tests (requires API key + --integration flag)
export OPENAI_API_KEY="your-key"
poetry run pytest tests/integration/ --integration

# Run all tests (skips integration if no API key)
poetry run pytest tests/

# Run specific module tests
poetry run pytest tests/unit/decayer/
poetry run pytest tests/unit/test_executor.py

Test Categories

| Category | Location | LLM Required | Run Command |
|----------|----------|--------------|-------------|
| Unit | `tests/unit/` | No | `pytest tests/unit/` |
| Integration | `tests/integration/` | Yes | `pytest tests/integration/ --integration` |

Integration Tests

Integration tests in tests/integration/ require:

  • OPENAI_API_KEY or AZURE_OPENAI_API_KEY environment variable
  • the --integration command-line flag

These tests make actual LLM API calls and incur costs.
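Gates like the --integration flag are conventionally wired up in a conftest.py via pytest's option hooks. A sketch of the standard pattern (an assumption about the wiring, not necessarily this repo's actual conftest.py):

```python
# conftest.py (illustrative)
import pytest

def pytest_addoption(parser):
    # Register the custom --integration command-line flag
    parser.addoption(
        "--integration", action="store_true", default=False,
        help="run integration tests that make real LLM API calls",
    )

def pytest_collection_modifyitems(config, items):
    # Without the flag, mark every collected integration test as skipped
    if config.getoption("--integration"):
        return
    skip = pytest.mark.skip(reason="needs --integration flag")
    for item in items:
        if "integration" in item.nodeid:
            item.add_marker(skip)
```

With this pattern, `pytest tests/` stays safe to run by default and only opts into paid API calls when the flag is passed explicitly.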

Optional Git Submodules

Benchmark datasets are managed as optional git submodules:

# Initialize all submodules
git submodule update --init

# Or initialize individually
git submodule update --init data/benchmarks/longmemeval
git submodule update --init data/benchmarks/goodai-ltm

Submodules are only required for running specific benchmark experiments. See data/benchmarks/README.md for details.

Langfuse (Deprecated)

Langfuse integration is deprecated and will be removed in a future version. The oblivion framework supports Azure OpenAI, OpenAI, and OpenRouter clients independently without Langfuse. If you have a [Langfuse] section in your .keys.ini, a DeprecationWarning will be emitted. Install via poetry install --extras deprecated if you still need it.

Citation

If you find Oblivion useful in your research, please consider citing us:

@misc{rana2026oblivionselfadaptiveagenticmemory,
      title={Oblivion: Self-Adaptive Agentic Memory Control through Decay-Driven Activation},
      author={Ashish Rana and Chia-Chien Hung and Qumeng Sun and Julian Martin Kunkel and Carolin Lawrence},
      year={2026},
      eprint={2604.00131},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2604.00131},
}

Authors and Acknowledgment

  • Ashish Rana
  • Qumeng Sun
  • Chia-Chien Hung

License

This code is proprietary software of NEC Laboratories Europe GmbH. See LICENSE for details.

If you plan to use the code outside the NEC R&D environment, please contact the authors.

Project Status

This is an active research project. Core modules are implemented:

  • Decayer: Interaction-based Ebbinghaus decay, uncertainty assessment, three-level retrieval routing
  • Activator: DAG query expansion, RRF-merged buffer curation (LLM + heuristic)
  • Recognizer: Single LLM call memory extraction, topic construction, reward criteria
  • Manager: Qdrant local storage, namespace-based organization, temporal filtering
  • ExecutorAgent: Thinking loop orchestration, configurable module composition
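The Ebbinghaus decay the Decayer bullet refers to follows the classic forgetting curve R = e^(−t/S): retention R after elapsed time t, given memory strength S. A sketch of the curve; how Oblivion derives strength from interactions (e.g., from access frequency) is an assumption here, not the framework's actual formula:

```python
import math

def retention(elapsed: float, strength: float) -> float:
    """Ebbinghaus forgetting curve: R = exp(-t / S).

    elapsed:  time since the memory was last reinforced
    strength: how resistant the memory is to decay (higher = slower decay)
    """
    return math.exp(-elapsed / strength)

# Same strength, different ages: utility decays as time passes
fresh = retention(elapsed=1.0, strength=5.0)
stale = retention(elapsed=10.0, strength=5.0)
```

Under this model, a frequently accessed memory could be assigned a larger strength S, so its utility score decays more slowly than that of a rarely touched one.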

About

The source for the Oblivion package and experimentation pipeline corresponding to the "Oblivion: Self-Adaptive Agentic Memory Control through Decay-Driven Activation" paper.
