CHANGELOG

v0.9.1 (2026-04-10)

Bug Fixes

  • Preconditioner bug in attributor.py (6232d1e)

v0.9.0 (2026-03-18)

Bug Fixes

Features

  • Add flag to enable TF32 (35ab164)

v0.8.1 (2026-03-18)

Bug Fixes

  • Release bergson without pinned transformers (ef9dc9a)

v0.8.0 (2026-03-08)

Features

  • Set default precision to fp32 in IndexConfig and ScoreConfig (92d4807)

Co-authored-by: Lucia Quirke luciaquirke@users.noreply.github.com

v0.7.2 (2026-03-04)

v0.7.1 (2026-03-03)

Bug Fixes

  • Always compute mixing coefficient in Trackstar pipeline (c990375)

Remove the conditional guard: lambda is now always auto-computed from the preconditioner eigenvalues, since the cost is negligible.

Co-Authored-By: Claude Opus 4.6 noreply@anthropic.com

v0.7.0 (2026-03-03)

Bug Fixes

  • Standardize trace collector preconditioning (6a14e53)

Features

v0.6.2 (2026-03-02)

Bug Fixes

  • Convert PyArrow Column to list in allocate_batches (7fe4dd3)

HuggingFace Dataset column access (ds["length"]) returns a PyArrow Column, not a Python list. Iterating over it element by element (via sorted() or random indexing) is ~1000x slower than on a native list. For 10M items, this caused allocate_batches to hang for 13+ hours instead of completing in ~17 seconds.

Co-Authored-By: Claude Opus 4.6 noreply@anthropic.com

  • Convert PyArrow columns to list at callsites of allocate_batches (5d734dc)

Move the list conversion out of allocate_batches (which types doc_lengths as list[int]) to the callsites that pass HF Dataset columns. Use ds["length"][:] which returns a plain list[int].

Co-Authored-By: Claude Opus 4.6 noreply@anthropic.com

  • Remove redundant zero-fill loop in MemmapSequenceScoreWriter (558829f)

np.memmap in w+ mode already creates a zero-filled file, so the loop that initialized the per-field written flag was unnecessary. For large datasets (10M+ items) with many query scores, the strided writes through the structured dtype caused multi-hour hangs.

Co-Authored-By: Claude Opus 4.6 noreply@anthropic.com
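
The zero-fill behavior described in this entry can be checked with a small sketch. The record layout below (a score field plus a written flag) and the file name are assumptions made for illustration, not the actual dtype used by MemmapSequenceScoreWriter. On common filesystems, a file freshly created by np.memmap in w+ mode reads back as zeros without any initialization loop.

```python
import os
import tempfile

import numpy as np

# Hypothetical record layout: one score plus a "written" flag per item.
dtype = np.dtype([("score", np.float32), ("written", np.bool_)])

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "scores.bin")

    # mode="w+" creates (or overwrites) the file at the full requested size.
    scores = np.memmap(path, dtype=dtype, mode="w+", shape=(1000,))

    # No initialization loop has run, yet every field already reads as zero.
    all_unwritten = not scores["written"].any()
    all_zero = bool((scores["score"] == 0).all())

    # Writing one record flips only that record's flag.
    scores[3] = (0.5, True)
    written_count = int(scores["written"].sum())

    scores.flush()
    del scores  # release the mapping before the temporary directory is removed
```

Skipping the explicit fill avoids touching every record through a strided field view, which is the access pattern the entry identifies as the source of the hang.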

  • Use [:] instead of list() for consistency (c76d131)

Co-Authored-By: Claude Opus 4.6 noreply@anthropic.com

v0.6.1 (2026-03-02)

Bug Fixes

  • Unpin transformers by explicitly setting float32 dtype in tests (0b6c226)

Transformers 4.56+ changed from_config() to honor the config's torch_dtype field, causing test models (tiny-GPTNeoX, tiny-Phi3) to be created in float16 instead of float32. This caused gradient comparison tests to fail from reduced precision, not from any actual change in gradient collection logic.

Co-Authored-By: Claude Opus 4.6 noreply@anthropic.com

v0.6.0 (2026-02-17)

Bug Fixes

  • Use _csv._writer type for csv_recorder annotation (6e6289c)

csv.writer is a function, not a class, so it cannot be used as a type annotation. Import the private _writer type from _csv and use it for the Generator yield type. Also fix the None check to use if not path since QueryConfig.record uses empty string as the sentinel value.

Co-authored-by: Lucia Quirke luciaquirke@users.noreply.github.com
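
A minimal sketch of the pattern this entry describes, assuming a simplified signature: the name csv_recorder comes from the entry, but the path parameter, the file handling, and the example rows are illustrative guesses rather than the project's code. The private _writer type is imported only under TYPE_CHECKING and referenced through a string annotation, so nothing depends on it existing at runtime.

```python
import csv
import os
import tempfile
from contextlib import contextmanager
from typing import TYPE_CHECKING, Generator, Optional

if TYPE_CHECKING:
    # Private writer type from the type stubs; used for annotations only.
    from _csv import _writer


@contextmanager
def csv_recorder(path: str) -> "Generator[Optional[_writer], None, None]":
    """Yield a CSV writer for `path`, or None when recording is disabled."""
    # An empty string is the "no recording" sentinel, so test truthiness
    # instead of comparing against None.
    if not path:
        yield None
        return
    with open(path, "w", newline="") as f:
        yield csv.writer(f)


# Usage: record two rows, then read them back.
with tempfile.TemporaryDirectory() as tmp:
    out = os.path.join(tmp, "results.csv")
    with csv_recorder(out) as writer:
        if writer is not None:
            writer.writerow(["query", "score"])
            writer.writerow(["example", 0.5])
    with open(out, newline="") as f:
        rows = list(csv.reader(f))

# With the empty-string sentinel, no file is opened and None is yielded.
with csv_recorder("") as writer:
    disabled = writer is None
```

Annotating with csv.writer directly would not type-check, because csv.writer is a factory function and the object it returns has a separate type.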

Continuous Integration

  • Pin pyright version and fix faiss type error (b9f54cf)

Co-Authored-By: Claude Opus 4.6 noreply@anthropic.com

  • Use Python 3.11 for typechecking (9ef4122)

Co-Authored-By: Claude Opus 4.6 noreply@anthropic.com

  • Use Python 3.11 for typechecking (ea50dd8)

Co-Authored-By: Claude Opus 4.6 noreply@anthropic.com

Features

  • Add --record flag to query CLI for saving results to CSV (59770ff)

Co-Authored-By: Claude Opus 4.6 noreply@anthropic.com

Refactoring

  • Replace try/finally CSV block with context manager (6431320)

Co-authored-by: Lucia Quirke luciaquirke@users.noreply.github.com

v0.5.2 (2026-02-17)

Bug Fixes

  • Pass batches to CollectorComputer in fit_normalizers (c95d5d4)

Co-Authored-By: Claude Opus 4.6 noreply@anthropic.com

Continuous Integration

  • Improve Claude workflows (fetch-depth, timeout, max-turns, pip install) (7a315e5)

  • Run tests and typechecking in parallel (e690fc0)

v0.5.1 (2026-01-30)

Bug Fixes

v0.5.0 (2026-01-08)

Features

  • Add optimizer-aware gradients (497edab)

v0.4.6 (2026-01-06)

Bug Fixes

v0.4.5 (2026-01-06)

Bug Fixes

  • Always use unstructured gradients in score (595ed92)

v0.4.4 (2026-01-05)

Bug Fixes

v0.4.3 (2026-01-05)

Bug Fixes

v0.4.2 (2025-12-22)

Bug Fixes

  • Unit normalize in float32 (cae8352)

v0.4.1 (2025-12-20)

Bug Fixes

  • Pin transformers to avoid fp error bug (9feac20)

v0.4.0 (2025-12-03)

Features

  • Enable specifying a custom tokenizer (9781a55)

v0.3.0 (2025-12-03)

Features

v0.2.0 (2025-11-13)

Features

v0.1.1 (2025-10-16)

Bug Fixes

v0.1.0 (2025-10-16)

Features

v0.0.0 (2025-10-07)