policy: add HistoryBuffer for past-observation context by lucas-maes · Pull Request #199 · galilai-group/stable-worldmodel

lucas-maes · 2026-04-25T03:46:13Z

Summary

New HistoryBuffer: per-env ring buffer over batched info dicts, with strided history retrieval and macro-block aggregation for action keys (block_keys).
WorldModelPolicy now maintains a HistoryBuffer when history_len > 1 and feeds strided history into the planner. The auto-derived max_len is history_len * action_block, the smallest size that yields history_len full action blocks (the strided formula was one short for block keys).
_prepare_info runs before the buffer append, so the buffer stores already-processed tensors. This avoids re-applying the action scaler to block-aggregated shapes, which previously raised "X has 10 features, but StandardScaler is expecting 2 features as input".
Docs: new docs/api/buffer.md page wired into nav; short "Observation history" subsection in quick_start.md.
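
To make the buffer behaviour in the summary concrete, here is a minimal pure-Python sketch of a per-env ring buffer with strided retrieval and macro-block aggregation. It is not the HistoryBuffer added in this PR: the class name ToyHistoryBuffer, its method signatures, the use of plain lists instead of tensors, and the exact alignment of blocks with stride points are all assumptions made for illustration.

```python
from collections import deque


class ToyHistoryBuffer:
    """Hypothetical stand-in for HistoryBuffer: one ring buffer per env."""

    def __init__(self, num_envs, max_len, block_keys=()):
        self.block_keys = set(block_keys)
        # deque(maxlen=...) drops the oldest step once full (ring-buffer behaviour).
        self.buffers = [deque(maxlen=max_len) for _ in range(num_envs)]

    def append(self, info):
        """Store one step; info maps each key to a list with one entry per env."""
        for env_idx, buf in enumerate(self.buffers):
            buf.append({key: values[env_idx] for key, values in info.items()})

    def get(self, env_idx, history_len, stride):
        """Return up to history_len stride points for one env, oldest first."""
        steps = list(self.buffers[env_idx])
        if not steps:
            return {}
        out = {}
        for key in steps[-1]:
            if key in self.block_keys:
                # Macro-blocks: consecutive, non-overlapping chunks of `stride`
                # steps, counted back from the newest step.
                n_blocks = min(history_len, len(steps) // stride)
                start = len(steps) - n_blocks * stride
                out[key] = [
                    [s[key] for s in steps[i:i + stride]]
                    for i in range(start, len(steps), stride)
                ]
            else:
                # Strided points: the newest step, then every stride-th step before it.
                picked = steps[::-1][::stride][:history_len]
                out[key] = [s[key] for s in reversed(picked)]
        return out


# Two envs, history_len = 3, action_block (stride) = 2, so max_len = 3 * 2 = 6.
buf = ToyHistoryBuffer(num_envs=2, max_len=6, block_keys=("action",))
for t in range(8):
    buf.append({"obs": [t, 100 + t], "action": [t, -t]})

history = buf.get(env_idx=0, history_len=3, stride=2)
print(history["obs"])     # [3, 5, 7]
print(history["action"])  # [[2, 3], [4, 5], [6, 7]]
```

In this sketch a buffer of history_len * action_block steps holds exactly history_len full action blocks, which is the sizing the summary describes.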

Test plan

pytest tests/ (733 passed, 6 skipped)
mkdocs build (no new warnings)
Eval with history_len > 1, action_block > 1 to confirm action history reaches history_len blocks (previously capped at history_len - 1).
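
The off-by-one that the last check targets can be reproduced with plain arithmetic. The sketch below assumes the earlier strided sizing was (history_len - 1) * action_block + 1; the PR text only says it was "one short for block keys", so that formula and the helper names are hypothetical.

```python
def strided_len(history_len, action_block):
    """Assumed earlier sizing: just enough steps to reach history_len stride points."""
    return (history_len - 1) * action_block + 1


def block_len(history_len, action_block):
    """Sizing described in this PR: history_len * action_block."""
    return history_len * action_block


def full_blocks(buffer_len, action_block):
    """Number of complete action blocks that fit in buffer_len steps."""
    return buffer_len // action_block


history_len, action_block = 3, 5
old = strided_len(history_len, action_block)  # 11 steps
new = block_len(history_len, action_block)    # 15 steps
print(full_blocks(old, action_block))  # 2, i.e. history_len - 1
print(full_blocks(new, action_block))  # 3, i.e. history_len
```

Under this assumption the two sizings coincide when action_block == 1, which is consistent with the eval only exercising history_len > 1 together with action_block > 1.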

🤖 Generated with Claude Code

Introduce HistoryBuffer, a per-env ring buffer over batched info dicts, used by WorldModelPolicy when history_len > 1 to feed strided history into the planner. Actions are aggregated as macro-blocks of length action_block via the block_keys argument so the planner sees one block per stride point.

Auto-derived history buffer max_len uses history_len * action_block, which is the smallest size that yields history_len full action blocks (the strided formula was one short for block keys).

_prepare_info now runs before the buffer append so processed tensors are stored, which avoids re-applying the action scaler to block-aggregated shapes.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
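
The reason for running _prepare_info before the buffer append can be shown with a toy scaler. This is a sketch under stated assumptions: ToyScaler and flatten_block are hypothetical helpers, not code from this repository, and the split of the reported 10 features into 5 steps of 2 features each is a guess that merely matches the numbers in the error message.

```python
class ToyScaler:
    """Hypothetical per-feature scaler that rejects inputs of the wrong width,
    mimicking the feature-count check in scikit-learn's StandardScaler."""

    def __init__(self, mean, scale):
        self.mean, self.scale = list(mean), list(scale)

    def transform(self, row):
        if len(row) != len(self.mean):
            raise ValueError(
                f"X has {len(row)} features, but ToyScaler is expecting "
                f"{len(self.mean)} features as input"
            )
        return [(x - m) / s for x, m, s in zip(row, self.mean, self.scale)]


def flatten_block(block):
    """Concatenate the per-step actions of one macro-block into a single row."""
    return [x for action in block for x in action]


scaler = ToyScaler(mean=[0.0, 0.0], scale=[2.0, 4.0])
raw_actions = [[2.0, 4.0] for _ in range(5)]  # assumed: 5 steps, 2 features each

# Order described in this PR: scale each step first, then aggregate stored results.
scaled_block = flatten_block([scaler.transform(a) for a in raw_actions])

# Previous order: aggregate raw steps, then scale the 10-wide block.
try:
    scaler.transform(flatten_block(raw_actions))
    error = None
except ValueError as exc:
    error = str(exc)

print(len(scaled_block))  # 10
print(error)  # X has 10 features, but ToyScaler is expecting 2 features as input
```

Storing already-scaled steps means the aggregated block never passes through the scaler, so its width no longer matters.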

lucas-maes wants to merge 1 commit into main from buffer
