96 changes: 60 additions & 36 deletions .github/workflows/ci.yml
@@ -443,6 +443,64 @@ jobs:
- name: Run AVX-512 SIMD tests under SDE
run: bash ci/sde_avx512.sh

# NEON correctness on a pinned native arm64 runner.
#
# The AVX SDE jobs cover x86 SIMD; the cross/test matrix
# `macos-latest` happens to be Apple Silicon today, but its
# architecture is a label-resolution detail that GitHub can change.
# Miri forces `--cfg diarization_force_scalar`, so it does not
# exercise the unsafe NEON kernels either. Without an arm64-pinned
# job, a load/deinterleave/tail bug in `dot_neon`, `window_mul_neon`,
# or `power_neon` could ship while every required safety job stayed
# green.
#
# Pattern mirrors `avx2-sde` / `avx512-sde`. Sets
# `--cfg diarization_assert_neon` so
# `dispatch_selects_neon_under_native_arm64` (in
# `ops::backend_selection_tests`) fails the build if NEON dispatch
# ever falls back to scalar on this runner.
neon-native:
name: NEON (native arm64)
runs-on: ubuntu-24.04-arm
steps:
- uses: actions/checkout@v6
- name: Cache cargo build and registry
uses: actions/cache@v5
with:
path: |
~/.cargo/registry
~/.cargo/git
target
key: ${{ runner.os }}-neon-native-${{ hashFiles('**/Cargo.lock') }}
restore-keys: |
${{ runner.os }}-neon-native-
- name: Install Rust
run: rustup update stable && rustup default stable
- name: "Run fbank + ops:: tests on arm64 (NEON dispatched)"
# `--cfg diarization_assert_neon` enables the
# `dispatch_selects_neon_under_native_arm64` test in
# `ops::backend_selection_tests`, failing if the runner image
# ever stops detecting NEON. Test scope mirrors `ci/sde_avx*.sh`:
# ops:: + embed::fbank::tests + the parity test modules whose
# threshold-sensitive decisions could flip under reduction-order
# drift.
env:
RUSTFLAGS: "-Dwarnings --cfg diarization_assert_neon"
run: |
cargo test \
--lib --no-default-features \
-- \
ops:: \
embed::fbank::tests \
pipeline::parity_tests \
cluster::ahc::parity_tests \
cluster::vbx::parity_tests \
cluster::centroid::parity_tests \
offline::parity_tests \
reconstruct::parity_tests \
aggregate::parity_tests \
plda::parity_tests
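
The `--cfg diarization_assert_neon` guard described in the job comments can be pictured with a small Rust sketch. The real `dispatch_selects_neon_under_native_arm64` test lives in `ops::backend_selection_tests` and is not part of this diff, so the `Backend` enum and `selected_backend` function below are hypothetical stand-ins; only the cfg name and the test name are taken from the PR.

```rust
/// Hypothetical backend tag; the crate's real dispatcher types are not shown in this diff.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Backend {
    Scalar,
    Neon,
}

/// Picks NEON when running on aarch64 with the feature detected at runtime,
/// otherwise falls back to the scalar reference path.
pub fn selected_backend() -> Backend {
    #[cfg(target_arch = "aarch64")]
    {
        if std::arch::is_aarch64_feature_detected!("neon") {
            return Backend::Neon;
        }
    }
    Backend::Scalar
}

// Compiled only when RUSTFLAGS contains `--cfg diarization_assert_neon`,
// which the `neon-native` job sets. On any other build the module vanishes,
// so x86 runners are not forced to report NEON.
#[cfg(all(test, diarization_assert_neon))]
mod backend_selection_tests {
    use super::*;

    #[test]
    fn dispatch_selects_neon_under_native_arm64() {
        assert_eq!(
            selected_backend(),
            Backend::Neon,
            "runner was expected to dispatch to NEON but fell back to scalar"
        );
    }
}

fn main() {
    println!("selected backend: {:?}", selected_backend());
}
```

The point of gating the test behind a dedicated cfg, rather than `target_arch` alone, is that a silent fallback to scalar would otherwise leave every parity test green while the NEON kernels went unexercised.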

sanitizer:
name: sanitizer
runs-on: ubuntu-latest
@@ -535,42 +593,7 @@ jobs:
${{ runner.os }}-miri-
- name: Miri
run: |
bash ci/miri_sb.sh "${{ matrix.target }}"

# The previous `loom` job was carried over from the colconv ci.yml
# template but never wired — diarization has no concurrency primitives
# to verify with `loom`. Cargo would have rejected `--features loom`
# on every run because no such feature exists in `Cargo.toml`. Removed
# rather than adding a placeholder feature with no actual loom tests.

# valgrind:
# name: valgrind
# runs-on: ubuntu-latest
# steps:
# - uses: actions/checkout@v6
# - name: Cache cargo build and registry
# uses: actions/cache@v5
# with:
# path: |
# ~/.cargo/registry
# ~/.cargo/git
# target
# key: ubuntu-latest-valgrind-${{ hashFiles('**/Cargo.lock') }}
# restore-keys: |
# ubuntu-latest-valgrind-
# - name: Install Rust
# run: rustup update stable && rustup default stable
# - name: Install Valgrind
# run: |
# sudo apt-get update -y
# sudo apt-get install -y valgrind
# # Uncomment and customize when you have binaries to test:
# # - name: cargo build foo
# # run: cargo build --bin foo
# # working-directory: integration
# # - name: Run valgrind foo
# # run: valgrind --error-exitcode=1 --leak-check=full --show-leak-kinds=all ./target/debug/foo
# # working-directory: integration
bash ci/miri_sb.sh "${{ matrix.target }}"

coverage:
name: coverage
@@ -586,6 +609,7 @@ jobs:
- miri-sb
- avx2-sde
- avx512-sde
- neon-native
steps:
- uses: actions/checkout@v6
- name: Install Rust
8 changes: 6 additions & 2 deletions .gitignore
@@ -41,8 +41,12 @@ spikes/kaldi_fbank/python/uv.lock
spikes/kaldi_fbank/rust.csv
spikes/kaldi_fbank/python.csv

# Phase-0 parity capture: large local artifacts.
tests/parity/fixtures/*/clip_16k.wav
# `tests/parity/fixtures/*/clip_16k.wav` was previously gitignored to
# keep the repo lightweight, but the end-to-end parity test suite
# (`tests/parity_fixtures_endtoend.rs`) and the docstring
# parity-claim need every wav reproducible from a clean checkout. Now
# tracked in plain git (~295 MB total across 14 fixtures, comparable
# to existing `*.npz` capture artifacts).
# verify_capture.py writes a backup before re-running.
tests/parity/fixtures/.*.backup/

16 changes: 16 additions & 0 deletions CHANGELOG.md
@@ -1,5 +1,21 @@
# UNRELEASED

BREAKING (pre-1.0):

- `diarization::embed::Error` is now `#[non_exhaustive]`. Callers
with exhaustive `match` arms must add a `_ =>` wildcard. The
attribute is forward-looking — variants in this enum represent
low-level numerical / boundary conditions whose set evolves as
new failure modes are surfaced or as internal kernels stop
emitting one. The attribute lets future variant additions /
retirements stay non-breaking after this point.
- `diarization::embed::Error::Fbank(String)` variant removed. The
variant was tied to the previous `kaldi-native-fbank` C++ backend,
which has been replaced by an in-tree torchaudio-compliance fbank
port (no `Result<_, String>` boundary to wrap). Code that matched
the variant directly will not compile.
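
For callers affected by the `#[non_exhaustive]` change, the migration amounts to adding a wildcard arm. The sketch below uses a hypothetical stand-in enum, `EmbedError`, with illustrative variant names; the actual variants of `diarization::embed::Error` are not listed in this changelog entry.

```rust
/// Hypothetical stand-in for `diarization::embed::Error`.
/// Variant names here are illustrative assumptions, not the crate's real set.
#[non_exhaustive]
#[derive(Debug)]
pub enum EmbedError {
    InvalidClipLength { len: usize },
    NonFiniteInput,
    DegenerateEmbedding,
}

/// Maps an error to a user-facing message.
pub fn describe(err: &EmbedError) -> String {
    match err {
        EmbedError::InvalidClipLength { len } => {
            format!("clip of {len} samples rejected")
        }
        EmbedError::NonFiniteInput => "input contained NaN or infinity".to_string(),
        // In a downstream crate this arm is mandatory once the enum is
        // `#[non_exhaustive]`. Inside the defining crate (as in this
        // single-file sketch) the compiler does not enforce it, but the
        // arm still catches `DegenerateEmbedding` and any later additions.
        _ => "other embedding error".to_string(),
    }
}

fn main() {
    println!("{}", describe(&EmbedError::InvalidClipLength { len: 12 }));
    println!("{}", describe(&EmbedError::NonFiniteInput));
    println!("{}", describe(&EmbedError::DegenerateEmbedding));
}
```

Code that previously matched the removed `Fbank(String)` variant has no direct replacement arm; under this pattern it would simply fall into the wildcard.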


The pyannote-community-1 offline + streaming-offline pipelines now
ship in full: VBx clustering, PLDA, AHC, centroid + Hungarian
assignment, reconstruction, RTTM emission. The crate exposes both
15 changes: 9 additions & 6 deletions Cargo.toml
@@ -14,7 +14,7 @@ rust-version = "1.95"
# for attribution. Downstream redistributors of any binary linking `dia`
# MUST reproduce both the MIT segmentation attribution and the CC-BY-4.0
# PLDA attribution.
license = "(MIT OR Apache-2.0) AND MIT AND CC-BY-4.0"
license = "(MIT OR Apache-2.0) AND MIT AND CC-BY-4.0 AND BSD-3-Clause"
repository = "https://github.com/al8n/diarization"
homepage = "https://github.com/al8n/diarization"
documentation = "https://docs.rs/diarization"
@@ -157,15 +157,17 @@ thiserror = "2"
ort = { version = "2.0.0-rc.12", optional = true }
tch = { version = "0.24", optional = true }

kaldi-native-fbank = "0.1"
# Real-valued FFT for the bit-exact torchaudio.compliance.kaldi.fbank
# port (see `src/embed/fbank.rs`). PyTorch's `torch.fft.rfft`
# routes to pocketfft on CPU; `realfft` wraps `rustfft`'s
# Cooley-Tukey radix-2 path which produces the same spectrum within
# ~1e-7 relative — small enough that the resnet+pooling output stays
# within sub-ULP of pyannote on the 14-audio bench.
realfft = "3"
nalgebra = "0.34"
rand = { version = "0.10", default-features = false }
rand_chacha = { version = "0.10", default-features = false }

# Constrained Hungarian assignment.
ordered-float = "5.3"
pathfinding = "4.15"

# AHC initialization (centroid-method linkage).
kodama = "0.3"

@@ -347,4 +349,5 @@ unexpected_cfgs = { level = "warn", check-cfg = [
'cfg(diarization_disable_avx512)',
'cfg(diarization_assert_avx2)',
'cfg(diarization_assert_avx512)',
'cfg(diarization_assert_neon)',
] }
51 changes: 51 additions & 0 deletions NOTICE
@@ -71,3 +71,54 @@ therefore NOT shipped with the crate. Callers obtain it via
`scripts/download-embed-model.sh` (Apache-2.0 source from the
WeSpeaker project; ONNX export from the `onnx-community` HuggingFace
organization).

────────────────────────────────────────────────────────────────────────
4. SciPy `rectangular_lsap.cpp` — direct Rust port

The file `src/cluster/hungarian/lsap.rs` is a Rust port of SciPy's
`rectangular_lsap.cpp`, the C++ reference implementation backing
`scipy.optimize.linear_sum_assignment`. Used for bit-for-bit
tie-break parity with pyannote's `constrained_argmax`.

Source:
scipy/scipy@main:scipy/optimize/rectangular_lsap/rectangular_lsap.cpp
https://github.com/scipy/scipy/blob/main/scipy/optimize/rectangular_lsap/rectangular_lsap.cpp

Authors:
PM Larsen (port author, original SciPy contribution)
DF Crouse (algorithm, IEEE TAES 52(4):1679–1696, 2016 —
doi:10.1109/TAES.2016.140952)

License: BSD-3-Clause

Copyright (c) 2008-2024, SciPy developers.
All rights reserved.

Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions
are met:

1. Redistributions of source code must retain the above copyright
notice, this list of conditions and the following disclaimer.

2. Redistributions in binary form must reproduce the above
copyright notice, this list of conditions and the following
disclaimer in the documentation and/or other materials provided
with the distribution.

3. Neither the name of the copyright holder nor the names of its
contributors may be used to endorse or promote products derived
from this software without specific prior written permission.

THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
"AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS
FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE
COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT,
INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING,
BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES;
LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN
ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
POSSIBILITY OF SUCH DAMAGE.
4 changes: 2 additions & 2 deletions README.md
@@ -30,8 +30,8 @@ diarization output.

```sh
# Pinned upstream revision + expected SHA-256 of the FP32 single-file ONNX.
DIA_EMBED_MODEL_REV="38168b544a562dec24d49e63786c16e80782eeaf"
DIA_EMBED_MODEL_SHA256="4c15c6be4235318d092c9d347e00c68ba476136d6172f675f76ad6b0c2661f01"
DIA_EMBED_MODEL_REV="6eef479c954ec180e79cee316af2f16d5f7720bd"
DIA_EMBED_MODEL_SHA256="f23f04aa9d0f6b8b0a28de016d226dcbe92d7461a6e58045401acfbed623838a"
mkdir -p models
TMP="$(mktemp "${TMPDIR:-/tmp}/wespeaker_resnet34_lm.XXXXXXXXXX")"
```
30 changes: 24 additions & 6 deletions ci/miri_sb.sh
@@ -35,10 +35,28 @@ cargo miri setup

export MIRIFLAGS="-Zmiri-strict-provenance -Zmiri-disable-isolation -Zmiri-symbolic-alignment-check"

# Same scope and configuration as `miri_tb.sh`: SIMD-only test filter
# (`ops::`), scalar dispatcher forced via `diarization_force_scalar`
# (miri can't evaluate intrinsics), `--no-default-features` (skips ort
# C++ runtime that miri can't FFI-call). See `miri_tb.sh` for the full
# rationale.
# Same scope and configuration as `miri_tb.sh`: SIMD-only test
# filter (`ops::` + `embed::fbank::tests`), scalar dispatcher forced
# via `diarization_force_scalar` (miri can't evaluate intrinsics),
# `--no-default-features` (skips ort C++ runtime that miri can't
# FFI-call), per-backend direct unsafe-call tests skipped because
# they call NEON/SSE2/AVX2/AVX-512F kernels directly (no SIMD
# evaluation under miri). See `miri_tb.sh` for the full rationale.
export RUSTFLAGS="${RUSTFLAGS:-} --cfg diarization_force_scalar"
cargo miri test --lib --target "$TARGET" --no-default-features ops::
# See `miri_tb.sh` for the rationale on the explicit fbank
# allowlist. Same set of tests under stacked-borrows.
cargo miri test \
--lib --target "$TARGET" --no-default-features \
-- \
ops:: \
embed::fbank::tests::dot_panics_on_length_mismatch_in_release \
embed::fbank::tests::window_panics_on_length_mismatch_in_release \
embed::fbank::tests::power_panics_on_length_mismatch_in_release \
embed::fbank::tests::dot_kernels_agree_with_scalar \
embed::fbank::tests::nan_propagates_through_log_floor \
embed::fbank::tests::force_scalar_cfg_routes_through_scalar_when_set \
embed::fbank::tests::shrink_before_resize_drops_oversized_when_call_small \
embed::fbank::tests::shrink_before_resize_keeps_buffer_when_call_huge \
embed::fbank::tests::shrink_before_resize_leaves_bounded_buffer \
embed::fbank::tests::shrink_after_loop_drops_oversized \
embed::fbank::tests::shrink_after_loop_keeps_bounded_buffer
54 changes: 43 additions & 11 deletions ci/miri_tb.sh
@@ -37,28 +37,60 @@ export MIRIFLAGS="-Zmiri-strict-provenance -Zmiri-disable-isolation -Zmiri-symbo

# Scope and configuration:
#
# 1. Test filter `ops::` — every `unsafe` block in this crate's
# production source lives under `src/ops/` (verified by
# `grep -rn "unsafe " src/ --include='*.rs'`). The rest is safe
# Rust, so miri adds no signal there.
# 1. Test filters `ops::` and `embed::fbank::tests` — every `unsafe`
# block in this crate's production source lives under either
# `src/ops/` (cluster + embed numerical primitives) or
# `src/embed/fbank.rs` (NEON/SSE2/AVX2/AVX-512F window-mul,
# power-spectrum, dot kernels added with the torchaudio fbank
# port). The rest is safe Rust, so miri adds no signal there.
#
# 2. `--cfg diarization_force_scalar` — miri can't evaluate foreign
# LLVM intrinsics like `llvm.aarch64.neon.faddv.f64.v2f64` (NEON)
# or `llvm.x86.avx2.*`. Without this cfg, the dispatcher hits its
# arch-specific path and miri errors `unsupported operation`. With
# this cfg every `*_available()` helper short-circuits to `false`
# and the dispatcher falls through to the scalar reference. The
# and the dispatcher falls through to the scalar reference. Inside
# `src/embed/fbank.rs` the same `if cfg!(diarization_force_scalar)`
# guard at the top of `fma_dot_f32_to_f64` / `apply_window_inplace`
# / `power_spectrum` ensures miri sees the scalar path. The
# intrinsic paths themselves are exercised natively under SDE
# (AVX2 and AVX-512 — see ci/sde_avx2.sh, ci/sde_avx512.sh) and on
# the regular test job (NEON on aarch64 hosts; AVX2 on Linux x86
# hosts that have it).
# hosts that have it). Per-backend direct unsafe-call tests in
# `embed::fbank::tests` (e.g. `dot_neon_agrees_with_scalar_directly`)
# are filtered out under force_scalar because they call the unsafe
# SIMD kernels directly — miri only exercises the dispatcher /
# scratch / scalar paths.
#
# 3. `--no-default-features` — skips `ort` (the default feature) and
# its `ort-sys` C++ runtime, plus the transitive
# `kaldi-native-fbank` C bindings. miri can't execute foreign
# function calls anyway, so these would error before our test
# code runs.
# its `ort-sys` C++ runtime. miri can't execute foreign function
# calls anyway, so this would error before our test code runs.
#
# — pattern mirrors siglip2's miri job.
export RUSTFLAGS="${RUSTFLAGS:-} --cfg diarization_force_scalar"
cargo miri test --lib --target "$TARGET" --no-default-features ops::
# Explicit allowlist for `embed::fbank::tests` rather than the whole
# module: realfft (`= 3` with default features) pulls rustfft, whose
# default planners select NEON/SSE/AVX kernels at runtime. Miri can't
# evaluate those intrinsics. The tests in the allowlist below DO NOT
# call into the FFT path under force-scalar — they exercise the
# scalar dot/window/power/log paths, length-mismatch guards, NaN
# propagation, and TLS scratch capacity bookkeeping. The
# `caps_oversized_scratch_capacity` test does call
# `compute_full_fbank` once with a single-frame input (one size-512
# FFT) — Miri tolerates that at the time of writing, but if rustfft
# regresses on Miri-supported intrinsics this is the test to drop.
cargo miri test \
--lib --target "$TARGET" --no-default-features \
-- \
ops:: \
embed::fbank::tests::dot_panics_on_length_mismatch_in_release \
embed::fbank::tests::window_panics_on_length_mismatch_in_release \
embed::fbank::tests::power_panics_on_length_mismatch_in_release \
embed::fbank::tests::dot_kernels_agree_with_scalar \
embed::fbank::tests::nan_propagates_through_log_floor \
embed::fbank::tests::force_scalar_cfg_routes_through_scalar_when_set \
embed::fbank::tests::shrink_before_resize_drops_oversized_when_call_small \
embed::fbank::tests::shrink_before_resize_keeps_buffer_when_call_huge \
embed::fbank::tests::shrink_before_resize_leaves_bounded_buffer \
embed::fbank::tests::shrink_after_loop_drops_oversized \
embed::fbank::tests::shrink_after_loop_keeps_bounded_buffer
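
The force-scalar guard that the `miri_tb.sh` comments rely on can be sketched as follows. This is an illustration under stated assumptions, not the code in `src/embed/fbank.rs`: the function names `fma_dot_f32_to_f64` and `dot_neon` appear in the PR, but their signatures, the kernel body, and the `dot_scalar` helper are guesses made for the example.

```rust
/// Scalar reference: widen each f32 to f64, multiply, accumulate in order.
fn dot_scalar(a: &[f32], b: &[f32]) -> f64 {
    let mut acc = 0.0_f64;
    for (&x, &y) in a.iter().zip(b) {
        acc += f64::from(x) * f64::from(y);
    }
    acc
}

/// Hypothetical NEON kernel: four f32 lanes per step, widened to two
/// f64x2 halves and fused-multiply-added into one f64x2 accumulator.
#[cfg(target_arch = "aarch64")]
#[target_feature(enable = "neon")]
unsafe fn dot_neon(a: &[f32], b: &[f32]) -> f64 {
    use std::arch::aarch64::*;
    let n = a.len().min(b.len());
    let body_len = n - n % 4;
    let mut total = unsafe {
        let mut acc = vdupq_n_f64(0.0);
        let mut i = 0;
        while i < body_len {
            // In bounds: i + 4 <= body_len <= both slice lengths.
            let va = vld1q_f32(a.as_ptr().add(i));
            let vb = vld1q_f32(b.as_ptr().add(i));
            acc = vfmaq_f64(
                acc,
                vcvt_f64_f32(vget_low_f32(va)),
                vcvt_f64_f32(vget_low_f32(vb)),
            );
            acc = vfmaq_f64(acc, vcvt_high_f64_f32(va), vcvt_high_f64_f32(vb));
            i += 4;
        }
        vaddvq_f64(acc)
    };
    // Tail elements (fewer than four) go through the scalar path.
    total += dot_scalar(&a[body_len..n], &b[body_len..n]);
    total
}

/// Dispatcher: length check first, then the force-scalar short-circuit,
/// then runtime feature detection.
#[allow(unexpected_cfgs)]
pub fn fma_dot_f32_to_f64(a: &[f32], b: &[f32]) -> f64 {
    // `assert!` rather than `debug_assert!`, so the check survives release builds.
    assert_eq!(a.len(), b.len(), "dot: length mismatch");
    // Under `--cfg diarization_force_scalar` (set by the Miri scripts) this
    // returns before any intrinsic is reached.
    if cfg!(diarization_force_scalar) {
        return dot_scalar(a, b);
    }
    #[cfg(target_arch = "aarch64")]
    {
        if std::arch::is_aarch64_feature_detected!("neon") {
            // SAFETY: NEON was detected at runtime; lengths were checked above.
            return unsafe { dot_neon(a, b) };
        }
    }
    dot_scalar(a, b)
}

fn main() {
    let a = [1.0_f32, 2.0, 3.0, 4.0, 5.0];
    let b = [2.0_f32; 5];
    println!("dot = {}", fma_dot_f32_to_f64(&a, &b));
}
```

Because the guard sits at the top of the dispatcher, Miri only ever sees safe scalar code, while tests that call a kernel such as `dot_neon` directly bypass the guard and therefore have to be left out of the Miri allowlist.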