…e A)

Foundation for the planned dia-ort / dia-tch / dia-burn split. This commit is mechanical: move all current crate contents into `crates/dia-core/`, set up the workspace at the repo root, and preserve every API + behavior. No test regressions.

Layout change:
- `src/` → `crates/dia-core/src/`
- `tests/` → `crates/dia-core/tests/`
- `examples/` → `crates/dia-core/examples/`
- `benches/` → `crates/dia-core/benches/`
- `build.rs` → `crates/dia-core/build.rs`
- `models/` → `crates/dia-core/models/`
- `README.md` → `crates/dia-core/README.md`
- `Cargo.toml` → `crates/dia-core/Cargo.toml`
- New root `Cargo.toml` declares `[workspace]` + `[workspace.lints]` (extracted from the package-level `[workspace.lints.rust]` that was inadvertently making `dia-core` its own workspace root).
- `crates/dia-core/tests/parity` stays excluded as a sub-Cargo (uv-managed parity harness, lives outside workspace resolution).

Why now: enables Phase B/C, i.e. extracting `dia-ort`, `dia-tch`, and the new `dia-burn` (pure-Rust burn-onnx-backed inference) into sibling crates with isolated dep graphs. The current single-crate layout makes burn integration impossible because cargo's `links = "tch"` collision check evaluates burn 0.21's optional `burn-tch` (links to tch ^0.22) against our `tch = "0.24"` even when neither feature activates. Per-crate isolation breaks that graph-level conflict.

Phase A in this commit. Phase B (extract per-backend crates) + Phase C (working dia-burn segmentation backend) are follow-up commits on this branch.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds two empty workspace members alongside `crates/dia-core/`:
- `crates/dia-ort/`: ONNX Runtime backend. Will host `SegmentModel` + `EmbedModel` (ort path) + `ep` + `ort_serde` in subsequent commits.
- `crates/dia-tch/`: TorchScript / libtorch backend. Will host `EmbedModel::from_torchscript_file` + the `tch` `EmbedInner` variant.
`tch` is gated behind a feature in `dia-tch` so a contributor
without `LIBTORCH` can still build the workspace; `torch-sys` only
links when the feature activates. Same pattern as the previous
top-level `tch` feature.
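The feature gating described above can be sketched roughly as follows. This is a minimal illustration, not the crate's actual code: the `torchscript` module, `backend_name`, and `compiled_backends` are hypothetical names, and the real gated code would call into `tch` rather than return a string.

```rust
/// Exists only when the crate is built with `--features tch`.
/// (Hypothetical module name, for illustration.)
#[cfg(feature = "tch")]
pub mod torchscript {
    /// Placeholder for code that would call into libtorch via `tch`.
    pub fn backend_name() -> &'static str {
        "tch"
    }
}

/// Lists the backends compiled into this build.
/// Without the `tch` feature, nothing libtorch-related is referenced,
/// so the build does not need `LIBTORCH` to be set.
pub fn compiled_backends() -> Vec<&'static str> {
    #[allow(unused_mut)] // `mut` is only needed when the feature is on
    let mut backends = vec!["core"];
    #[cfg(feature = "tch")]
    backends.push(torchscript::backend_name());
    backends
}
```

Because the gated module is removed before name resolution, a default build never names anything from `tch`, which is what lets the optional dependency stay unlinked.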
Both crates depend on `diarization = { path = "../dia-core" }` —
the dia-core package is still named `diarization` mid-migration so
internal `use crate::*` and downstream `use diarization::*` keep
compiling at every step. The package rename to `dia-core` lands
when the meta-crate `diarization` arrives in a later step.
No code moves in this commit — empty `lib.rs` placeholders only.
Phase B step 2 starts moving ort-coupled code from `crates/dia-core/`
into `crates/dia-ort/`.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
… on burn-onnx upstream)

Adds `crates/dia-burn` as a workspace-excluded sibling crate: a separate resolution graph dodges the `links = "tch"` collision between the parent's `tch = 0.24` and `burn 0.21.0-pre.5`'s optional `burn-tch ^0.22`.

The crate is intentionally a documented stub today. Both dia ONNX models hit upstream `burn-onnx` 0.21.0-pre.5 codegen bugs:
- pyannote/segmentation-3.0: `If`-op rank propagation gap makes the first `Conv1d` translator see rank-4 instead of rank-3 → codegen exits with no Rust emitted.
- wespeaker_resnet34_lm: codegen succeeds (606 LoC + 25 MB .bpk) but emits an OOB array index in the `Resize` lowering → does not compile.

Both are upstream-track. The full codegen + weight-load pipeline is preserved behind an `unstable-onnx-codegen` feature so contributors can flip one flag and reproduce the failures.

Default builds get a working stub: `BurnEmbedModel::from_embedded()` constructs cleanly, `embed_chunk_with_frame_mask` returns `NotYetImplemented`, and the public types/consts mirror dia-ort's contract so the eventual swap won't be a breaking change.

README.md walks through the failures + fix paths in detail.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
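A rough sketch of what such a stub could look like is shown below. Only the names `BurnEmbedModel`, `from_embedded`, `embed_chunk_with_frame_mask`, and `NotYetImplemented` come from the commit message; the `BurnError` type, the argument and return types, and the `EMBEDDING_DIM` value are assumptions made for illustration and may not match the crate.

```rust
use std::fmt;

/// Hypothetical error type; only the variant name is taken from the commit message.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum BurnError {
    NotYetImplemented(&'static str),
}

impl fmt::Display for BurnError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            BurnError::NotYetImplemented(what) => {
                write!(f, "{what} is not yet implemented in the burn backend")
            }
        }
    }
}

impl std::error::Error for BurnError {}

/// Assumed embedding width, for illustration only.
pub const EMBEDDING_DIM: usize = 256;

/// Stub model: constructible, but carries no weights yet.
#[derive(Debug)]
pub struct BurnEmbedModel {
    _private: (),
}

impl BurnEmbedModel {
    /// Always succeeds in the stub; a real backend would load embedded weights here.
    pub fn from_embedded() -> Result<Self, BurnError> {
        Ok(Self { _private: () })
    }

    /// Same shape as a working backend's method (signature assumed),
    /// but reports that inference is not wired up.
    pub fn embed_chunk_with_frame_mask(
        &self,
        _samples: &[f32],
        _frame_mask: &[bool],
    ) -> Result<[f32; EMBEDDING_DIM], BurnError> {
        Err(BurnError::NotYetImplemented("embed_chunk_with_frame_mask"))
    }
}
```

Keeping the signatures fixed while returning an error means callers can compile against the final API today and only need to handle one extra error case until the upstream fixes land.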
…e through CI

Phase B step 1 follow-up: dia-ort and dia-tch are now thin re-export shims over `diarization` (a.k.a. dia-core) with the relevant features pre-activated. Downstream users get a single `cargo add dia-ort` / `cargo add dia-tch` entry-point instead of feature-flag wrangling on the umbrella crate; the cfg-gated ORT and tch modules still live physically in dia-core for now and migrate across in step 2.

CI changes for the workspace move:
- All workspace-rooted cargo invocations targeting dia-core's feature surface now pass `-p diarization` explicitly (cargo-hack sweeps, cross builds, ep-link-check, docs.rs-equivalent doc build, tarpaulin coverage, the AVX2/AVX512 SDE jobs, both miri jobs, the sanitizer pass, and the AVX-asserting parity test runs). Without `-p`, a virtual workspace `--no-default-features` becomes a no-op and `--features X` resolves against a feature-less manifest.
- New `dia-burn (standalone)` CI job that builds + tests `dia-burn` from its own crate dir. It's workspace-excluded by design (`links = "tch"` collision between the parent's `tch = 0.24` and burn's optional `burn-tch ^0.22`) so a plain `cargo build --workspace` never touches it; without this job, regressions there would ship undetected.
- Drop the stale `silero-vad` feature reference from the docs.rs-equivalent doc build and tarpaulin coverage run (the feature was removed when silero became a registry dev-dep on `chore/cleanup-ci`, but the workflow lines were missed).
- `tch-compile-check` now points at `-p dia-tch` to match the new shim layout.

Profile cleanup:
- Move `[profile.bench]` from `crates/dia-core/Cargo.toml` to the workspace root. Cargo only honors `[profile.*]` at the workspace root once a package becomes a member; the in-crate copy was being silently ignored, surfacing as a "profiles for the non root package will be ignored" warning on every workspace build.

Local verification: 532 dia-core tests pass, dia-burn standalone tests pass, dia-ort/dia-tch clippy clean, fmt --check clean.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
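The re-export shim idea can be illustrated with a single-file sketch in which modules stand in for crates. Everything here is hypothetical scaffolding: the `diarization` and `dia_ort` modules mimic the two packages, and the `SegmentModel` fields and `sample_rate_of` helper are invented for the example.

```rust
/// Stand-in for the `diarization` (dia-core) package.
pub mod diarization {
    /// Hypothetical model type with an invented field.
    #[derive(Debug, PartialEq)]
    pub struct SegmentModel {
        pub sample_rate: u32,
    }

    impl SegmentModel {
        pub fn new(sample_rate: u32) -> Self {
            Self { sample_rate }
        }
    }
}

/// Stand-in for the `dia-ort` shim: no code of its own, only a glob re-export.
/// In the real crate this would re-export the dependency with its
/// ORT-related features already switched on in the shim's manifest.
pub mod dia_ort {
    pub use super::diarization::*;
}

/// A function written against the core path.
pub fn sample_rate_of(model: &diarization::SegmentModel) -> u32 {
    model.sample_rate
}
```

A value built through the shim path, such as `dia_ort::SegmentModel::new(16_000)`, is the same type as `diarization::SegmentModel`, so it can be passed to `sample_rate_of` unchanged. That type identity is what would let the gated modules move between crates later without breaking downstream code that imports through the shim.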
Summary
- Move the crate into `crates/dia-core`, with phase-A backend split groundwork: `crates/dia-{ort,tch,burn}` siblings around the algorithm core. The package keeps its `diarization` name; only the directory moved.
- Add `dia-burn`, a documented placeholder for a pure-Rust burn-onnx backend targeting platforms ORT can't ship prebuilts to (`powerpc64`, `riscv64`, `s390x`, `i686`, `wasm32-*`). Inference itself is not yet wired up: `burn-onnx 0.21.0-pre.5` rejects pyannote/segmentation-3.0 at codegen and emits non-compiling Rust for wespeaker (full repro recipe in the crate README + `build.rs`).
- Turn `dia-ort` / `dia-tch` into thin re-export shims over `diarization` with the right features pre-activated. Downstream gets a single `cargo add dia-ort` / `cargo add dia-tch` entry-point; the actual cfg-gated modules still live in dia-core for now and migrate physically in a follow-up.
- Pass `-p diarization` through every workspace-rooted cargo invocation (cargo-hack sweeps, cross builds, ep-link-check, docs.rs-equivalent doc build, tarpaulin, both miri jobs, both SDE jobs, sanitizer, AVX-asserting parity tests). Without this, a virtual workspace `--no-default-features` becomes a no-op and feature flags resolve against an empty manifest. New `dia-burn (standalone)` CI job builds + tests it from its own crate dir (workspace-excluded: `links = "tch"` collision with parent `tch = 0.24` vs burn's optional `burn-tch ^0.22`).
- Drop the stale `silero-vad` feature reference from the docs.rs and tarpaulin commands, and lift `[profile.bench]` up to the workspace root so it's actually honored.

Scope deliberately deferred
- Physically moving `crates/dia-core/src/embed/` and `segment/` into the sibling crates. The current setup tightens the public contract (downstream stops feature-flag wrangling) without an atomic 2k-line move. Follow-up PR.
- `burn-onnx` codegen failures (model-side: `If`-op rank propagation; runtime-side: `Resize`-op codegen) are upstream-track. The crate's public surface is stable so the eventual swap won't be a breaking change.

Not planned
- Renaming the `diarization` package to `dia-core`. Per maintainer call we keep the `diarization` name; the directory layout (`crates/dia-core/`) is purely workspace-organizational.

Test plan
- `cargo build --workspace` clean (no profile-warning noise after the bench-profile move)
- `cargo test --workspace --lib`: 532 dia-core tests pass, dia-ort/dia-tch shims import cleanly, dia-burn lib (workspace-excluded) builds + 2 stub tests pass when run from `crates/dia-burn/`
- `cargo clippy -p dia-ort -p dia-tch` clean
- `cargo clippy` inside `crates/dia-burn` clean
- `cargo fmt --check` clean
- `dia-burn (standalone)` CI job goes green
- … (`build`, `test`, `clippy`) still terminate in reasonable time on the runners
- `tch-compile-check` (`-p dia-tch`) and `ep-link-check` matrix (`-p diarization`) succeed
- `cargo build --features unstable-onnx-codegen` from `crates/dia-burn/` reproduces the wespeaker codegen → compile failure documented in the README

🤖 Generated with Claude Code