feat: optional arbitrary + quickcheck support for the descriptor vocabulary#8
Merged
Conversation
Codecov Report❌ Patch coverage is 📢 Thoughts on this report? Let us know! |
uqio
added a commit
that referenced
this pull request
May 22, 2026
…ded enums + all 12 SampleFormat slugs Codex round-2 findings on PR #8: 1. [high] `Matrix`/`Primaries`/`Transfer`/`PixelFormat` were still on `arb_via_code!` (uniform u32 → from_u32). Their named codes cluster in low integers (Matrix 0..=17, Primaries 0..=22, Transfer 0..=18, PixelFormat 270 codes over 0..=947), so uniform u32 reaches a named variant essentially never — fuzz/property tests over colour + pixel descriptors ran almost entirely on the `Unknown(_)` path. Fix: new `arb_via_code_weighted_range!(Ty, max_named = N)` macro — 50/50 between `int_in_range(0..=max_named)` (lands on named variants most of the time; gaps fall on `Unknown`, fine) and full-range u32 (broad `Unknown` exercise). Applied to the four large enums; `max_named` read from each `to_u32` match. `TrackDisposition` stays on `arb_via_code!` — it's bitflags, every u32 bit pattern is meaningful. 2. [medium] The bespoke `SampleFormat` generator's named branch only sampled 6 of the 12 named slugs, so `dbl`/`u8p`/`s32p`/`dblp`/`s64`/ `s64p` were reachable only via the rare numeric branch drawing their exact 0..=11 code. Fix: `SLUGS` const now lists all 12 canonical slugs. Tests: - `reachability_sample_format_all_named_plus_arms` — strengthened to assert all 12 named variants observed (HashSet of `as_str()`), plus `Unknown(_)` and `Other(_)`, over 4096 rounds. - `reachability_range_weighted_enums_hit_named_codes` (new) — asserts the range-weighted enums hit a broad named-code set: Matrix ≥10, Primaries ≥8, Transfer ≥10, PixelFormat ≥40 distinct named codes over 8192 rounds. Matrix (RUSTFLAGS=-Dwarnings cargo +1.95.0): test -p mediaframe --features arbitrary --lib 163 pass check -p mediaframe / --features arbitrary,serde / --all-features ✓
uqio
added a commit
that referenced
this pull request
May 22, 2026
…d enums + all 12 SampleFormat slugs Codex round-2 findings on PR #9 (mirror of the PR #8 fixes): 1. [medium] `matrix`/`primaries`/`transfer`/`pixel_format` helpers still decoded a uniform `u32` through `from_u32`. Their named codes cluster in low integers (Matrix 0..=17, Primaries 0..=22, Transfer 0..=18, PixelFormat 270 codes over 0..=947), so QuickCheck `u32::arbitrary` reaches a named variant essentially never — properties over colour / pixel descriptors ran almost entirely on the `Unknown(_)` path. Fix: new `qc_via_code_weighted_range!(fn => Ty, max_named = N)` macro — 50/50 between an in-range code (`u32 % (max+1)`, since `quickcheck:: Gen` has no `int_in_range`) and a full-range `u32`. `track_disposition` stays on `arb_via_code!` — bitflags, every bit pattern meaningful. 2. [medium] The bespoke `sample_format` helper's named arm only sampled 6 of 12 slugs; `dbl`/`u8p`/`s32p`/`dblp`/`s64`/`s64p` were reachable only via the rare numeric arm. `SLUGS` now lists all 12. Tests: - `reachability_sample_format_all_named_plus_arms` — strengthened to assert all 12 named variants observed + `Unknown(_)` + `Other(_)`. - `reachability_range_weighted_enums_hit_named_codes` (new) — Matrix ≥10, Primaries ≥8, Transfer ≥10, PixelFormat ≥40 distinct named codes. Matrix (RUSTFLAGS=-Dwarnings cargo +1.95.0): test -p mediaframe --features quickcheck --lib 163 pass check -p mediaframe / --features arbitrary,quickcheck / --all-features ✓
uqio
added a commit
that referenced
this pull request
May 22, 2026
…ded enums + all 12 SampleFormat slugs Codex round-2 findings on PR #8: 1. [high] `Matrix`/`Primaries`/`Transfer`/`PixelFormat` were still on `arb_via_code!` (uniform u32 → from_u32). Their named codes cluster in low integers (Matrix 0..=17, Primaries 0..=22, Transfer 0..=18, PixelFormat 270 codes over 0..=947), so uniform u32 reaches a named variant essentially never — fuzz/property tests over colour + pixel descriptors ran almost entirely on the `Unknown(_)` path. Fix: new `arb_via_code_weighted_range!(Ty, max_named = N)` macro — 50/50 between `int_in_range(0..=max_named)` (lands on named variants most of the time; gaps fall on `Unknown`, fine) and full-range u32 (broad `Unknown` exercise). Applied to the four large enums; `max_named` read from each `to_u32` match. `TrackDisposition` stays on `arb_via_code!` — it's bitflags, every u32 bit pattern is meaningful. 2. [medium] The bespoke `SampleFormat` generator's named branch only sampled 6 of the 12 named slugs, so `dbl`/`u8p`/`s32p`/`dblp`/`s64`/ `s64p` were reachable only via the rare numeric branch drawing their exact 0..=11 code. Fix: `SLUGS` const now lists all 12 canonical slugs. Tests: - `reachability_sample_format_all_named_plus_arms` — strengthened to assert all 12 named variants observed (HashSet of `as_str()`), plus `Unknown(_)` and `Other(_)`, over 4096 rounds. - `reachability_range_weighted_enums_hit_named_codes` (new) — asserts the range-weighted enums hit a broad named-code set: Matrix ≥10, Primaries ≥8, Transfer ≥10, PixelFormat ≥40 distinct named codes over 8192 rounds. Matrix (RUSTFLAGS=-Dwarnings cargo +1.95.0): test -p mediaframe --features arbitrary --lib 163 pass check -p mediaframe / --features arbitrary,serde / --all-features ✓
uqio
added a commit
that referenced
this pull request
May 22, 2026
…d enums + all 12 SampleFormat slugs Codex round-2 findings on PR #9 (mirror of the PR #8 fixes): 1. [medium] `matrix`/`primaries`/`transfer`/`pixel_format` helpers still decoded a uniform `u32` through `from_u32`. Their named codes cluster in low integers (Matrix 0..=17, Primaries 0..=22, Transfer 0..=18, PixelFormat 270 codes over 0..=947), so QuickCheck `u32::arbitrary` reaches a named variant essentially never — properties over colour / pixel descriptors ran almost entirely on the `Unknown(_)` path. Fix: new `qc_via_code_weighted_range!(fn => Ty, max_named = N)` macro — 50/50 between an in-range code (`u32 % (max+1)`, since `quickcheck:: Gen` has no `int_in_range`) and a full-range `u32`. `track_disposition` stays on `arb_via_code!` — bitflags, every bit pattern meaningful. 2. [medium] The bespoke `sample_format` helper's named arm only sampled 6 of 12 slugs; `dbl`/`u8p`/`s32p`/`dblp`/`s64`/`s64p` were reachable only via the rare numeric arm. `SLUGS` now lists all 12. Tests: - `reachability_sample_format_all_named_plus_arms` — strengthened to assert all 12 named variants observed + `Unknown(_)` + `Other(_)`. - `reachability_range_weighted_enums_hit_named_codes` (new) — Matrix ≥10, Primaries ≥8, Transfer ≥10, PixelFormat ≥40 distinct named codes. Matrix (RUSTFLAGS=-Dwarnings cargo +1.95.0): test -p mediaframe --features quickcheck --lib 163 pass check -p mediaframe / --features arbitrary,quickcheck / --all-features ✓
uqio
added a commit
that referenced
this pull request
May 22, 2026
…ly values Codex round-4 finding on PR #9: `quickcheck_helpers::composite::tags` populated 12 of the 13 `Tags` fields but never `language`, so every generated `Tags` had `language == None` — property tests could not catch language-tag loss / corruption. Fix: the `tags` helper now generates `language` too, 50/50 between `None` and `Some(<arbitrary string>)`. Also carried the PR #8 round-4 canonical-value fix into the quickcheck helpers (same latent bug, separate file tree): - `qc_open_string_enum!` and the bespoke `sample_format` helper built `Other(SmolStr::from(s))` directly; a string equal to a named slug produced a malformed `Other("h264")` that serde canonicalises to `H264` — not round-trippable. - Both now route string construction through `FromStr` (the canonicalising constructor), so every generated value is canonical. Tests: - `reachability_tags_language_hits_none_and_some` — asserts both `None` and `Some(_)` are observed. - `arbitrary_values_survive_serde_round_trip` (gated on `serde`) — 4096 rounds, `SampleFormat` + `VideoCodec` round-trip through `serde_json`. This branch was also rebased onto the updated `feat/arbitrary` (which itself rebased onto the serde-fixed `feat/serde-support`) — keeps the stack coherent so reviews see the real combined state. Matrix (RUSTFLAGS=-Dwarnings cargo +1.95.0): test -p mediaframe --features quickcheck --lib 164 pass test -p mediaframe --features quickcheck,serde --lib 174 pass check -p mediaframe --all-features ✓
Adds the `arbitrary` feature + `arbitrary 1` optional dep and the empty
module skeleton: `arbitrary_impls/{mod.rs, strings.rs, coded.rs, composite.rs}`.
Cluster ownership (each owns its own file; cross-cluster type cascades resolve
through the module's items):
strings.rs — open string enums w/ `Other(SmolStr)` (codec×3, container,
subtitle::Format, audio open formats).
coded.rs — closed `from_u32` coded enums + colour / frame / pixel-format
/ disposition structs and enums.
composite.rs — audio composite metadata, capture, language.
Shared `arb_via_code!` and `arb_open_string_enum!` macros live in `mod.rs`.
This is the baseline commit; the three cluster files are stub-only — the
actual impls land in three parallel follow-up commits, one per cluster.
…tructs
Coded enums via the `arb_via_code!` macro (single `from_u32` round-trip
covers every named variant + the `Unknown(u32)` / `Reserved(_)` arms):
- color::{Matrix, Primaries, Transfer, DynamicRange, ChromaLocation,
DcpTargetGamut}
- pixel_format::PixelFormat
- frame::{Rotation, FieldOrder, StereoMode}
- subtitle::TrackOrigin
- audio::BitRateMode
- disposition::TrackDisposition (bitflags w/ lossless to_u32/from_u32)
Colour structs (hand-written via public `new`):
- color::Info (Primaries, Transfer, Matrix, DynamicRange,
ChromaLocation — cascades through cluster's
own coded-enum impls)
- color::ContentLightLevel (u32, u32)
- color::ChromaCoord (u32, u32)
- color::MasteringDisplay ([ChromaCoord; 3], ChromaCoord, u32, u32)
- color::HdrStaticMetadata (Option<MasteringDisplay>,
Option<ContentLightLevel>)
- color::DolbyVisionConfig (u8, u8, bool, bool, u8)
Frame structs (hand-written via public `new`):
- frame::Dimensions (u32, u32)
- frame::Rect (u32, u32, u32, u32)
- frame::Rational (u32, NonZeroU32 — uses
`NonZeroU32::arbitrary`, supplied by the
`arbitrary` crate, so the fuzzer skips
the zero corpus byte instead of clamping)
- frame::SampleAspectRatio (u32, NonZeroU32)
- frame::FrameRate (Rational, bool)
No constructor shapes differed from the brief.
Types covered:
* audio::Loudness — plain Loudness::new(f32×4) over arbitrary f32s
* audio::Fingerprint — try_new(algorithm, value); algorithm
guaranteed non-empty via 'x' fallback so the
expect() is sound; empty value allowed
* audio::CoverArt — try_new(mime, data); mime falls back to
'application/octet-stream' and data to [0u8]
so both non-empty invariants hold by
construction
* audio::Tags — Tags::new() + maybe_*/with_* builders
covering 12 fields: title, artist,
album_artist, album, composer, genre,
comment (SmolStr empty=absent) + year,
track_number, track_total, disc_number,
disc_total (Option<u16>, None≠Some(0))
* capture::Device — Device::new() + with_make + with_model;
SmolStr empty=absent passes through
* capture::GeoLocation — try_new(lat, lon, Option<altitude>);
lat/lon built with int_in_range so they're
in [-90,90]/[-180,180] by construction
(1/100° resolution); altitude is always
finite f32 when Some, so the expect()
holds
* lang::Language — from_bcp47 on a curated 12-tag list
(und, en, en-US, es, fr, de, ja, zh-Hant-TW,
pt-BR, ar, ru, ko) covering language-only,
language+region, language+script+region,
and the und sentinel
Validation strategy: for every try_new type, valid inputs are
produced FIRST (in-range ints via int_in_range, non-empty strings
via fallback constants) before .expect() — fuzz input never reaches
a validator in a way that could panic.
Verified: cargo fmt --check / RUSTFLAGS=-Dwarnings cargo check -p
mediaframe --features arbitrary / RUSTFLAGS=-Dwarnings cargo check
-p mediaframe (default).
…tle/audio) Fills in mediaframe/src/arbitrary_impls/strings.rs with hand-written `impl arbitrary::Arbitrary` blocks for the eight open string enums in the descriptor vocabulary. Strategy: `super::arb_open_string_enum!` with ~6 curated canonical FFmpeg / file-extension slugs per type (drawn from each type's own `as_str()` match). The macro flips a coin between round-tripping a named slug through the type's total `FromStr` and constructing the `Other(SmolStr)` lossless-escape arm from an arbitrary string — every type covered here satisfies the macro's preconditions (`Other(SmolStr)` arm + `FromStr` with `Err = core::convert::Infallible`). Types covered: - codec::VideoCodec (h264, hevc, av1, vp9, mpeg4, prores) - codec::AudioCodec (aac, mp3, opus, flac, ac3, alac) - codec::SubtitleCodec (srt, ass, ssa, webvtt, mov_text, dvb_subtitle) - container::Format (mp4, mkv, webm, mov, avi, mpegts) - subtitle::Format (srt, webvtt, ass, ssa, mov_text, ttml) - audio::ChannelLayout (mono, stereo, 5.1, 7.1, quad, 5.0) - audio::SampleFormat (s16, s32, flt, s16p, fltp, u8) - audio::ContainerFormat (mp3, aac, flac, wav, m4a, opus) Nothing skipped: every type listed in the cluster brief is an open string enum (audio::SampleFormat is mixed — has both `Unknown(u32)` and `Other(SmolStr)` — but the brief specifies open-string strategy for the mixed case so the `Other` arm gets exercised, since the closed-coded path is covered by cluster B types). Coded-only enums (no `Other`) are left for cluster B's `arb_via_code!`.
Adds a deterministic test harness (`drive`) that mixes a seed into a 4 KiB buffer and feeds it through `Unstructured` across N rounds. Five tests gated on `#[cfg(test)]` (the test module is unconditional, but the wider `arbitrary_impls` module is `#[cfg(feature = "arbitrary")]`): - geo_location_invariant_lat_lon_in_range (256 rounds) - fingerprint_invariant_algorithm_non_empty (256 rounds) - cover_art_invariant_mime_and_data_non_empty (256 rounds) - smoke_yields_values_for_representative_types (64 rounds, 5 types) - coded_enums_roundtrip_through_code (128 rounds, 4 enums) Verifies what hand-written `Arbitrary` impls can get wrong: validated constructors' invariants survive arbitrary fuzz bytes (no `.unwrap()` panics on attacker input), and coded enums round-trip through their `to_u32` / `from_u32` code losslessly.
…nums + bespoke SampleFormat 3-way
Two Codex round-1 findings on the descriptor-vocabulary 'arbitrary::Arbitrary'
impls:
[medium] arb_via_code! almost never reaches named variants of low-
cardinality coded enums. Feeding a uniform u32 into 'from_u32' for an
enum whose named codes are 0/1/2 lands on a named variant
~3-in-4-billion of the time; everything else collapses to the default
(closed enums) or 'Unknown(_)' (open enums). Most named variants are
effectively unreachable in fuzz / property-driven runs.
[medium] SampleFormat::Unknown(u32) is unreachable. SampleFormat
carries BOTH 'Unknown(u32)' AND 'Other(SmolStr)'; the shared
'arb_open_string_enum!' macro only exercises curated slugs and
'Other', skipping the numeric-escape arm entirely.
Fix — two new macros in arbitrary_impls/mod.rs:
arb_via_named_variants! strictly closed coded enums (no
'Unknown(u32)' arm). Picks uniformly from
the named variants via 'Unstructured::choose'.
Applied to 'audio::BitRateMode' and
'subtitle::TrackOrigin'.
arb_via_code_weighted! closed coded enums WITH 'Unknown(u32)' and
< 10 named variants. 50/50 between named
'choose' and arbitrary-u32 'from_u32'.
Applied to 'color::{DynamicRange,
ChromaLocation, DcpTargetGamut}' and
'frame::{Rotation, FieldOrder, StereoMode}'.
'arb_via_code!' is retained for the large coded enums where uniform u32
genuinely exercises the named ranges ('color::{Matrix, Primaries,
Transfer}', 'pixel_format::PixelFormat') and for the bitflags
('disposition::TrackDisposition').
For 'SampleFormat' a bespoke 3-way generator lives next to the open-
string-enum invocations in strings.rs: pick a curated slug via
'FromStr', a uniform u32 via 'from_u32' (so 'Unknown(v)' is hit on
every non-canonical code, ~99.99% of the value space), or an arbitrary
'String' wrapped in 'SmolStr::from' for 'Other'.
Reachability tests live next to the shared 'drive' helper in
arbitrary_impls/mod.rs. A new 'drive_per_round' constructs a fresh
'Unstructured' (seeded by a SplitMix64-ish per-round state) for every
round, avoiding the single-buffer exhaustion that biases every per-
round decode to the all-zero fallback. Tests assert:
* 'BitRateMode' and 'TrackOrigin' visit every named variant.
* 'Rotation' visits every named variant AND 'Unknown(_)'.
* 'SampleFormat' visits all three arms (named / Unknown / Other).
Verified: 'cargo fmt -p mediaframe --check', 'cargo test -p mediaframe
--features arbitrary --lib' (162 passed), and 'cargo check' across
default / arbitrary+serde / no-default+arbitrary+alloc / all-features,
all with '-Dwarnings'.
…ded enums + all 12 SampleFormat slugs Codex round-2 findings on PR #8: 1. [high] `Matrix`/`Primaries`/`Transfer`/`PixelFormat` were still on `arb_via_code!` (uniform u32 → from_u32). Their named codes cluster in low integers (Matrix 0..=17, Primaries 0..=22, Transfer 0..=18, PixelFormat 270 codes over 0..=947), so uniform u32 reaches a named variant essentially never — fuzz/property tests over colour + pixel descriptors ran almost entirely on the `Unknown(_)` path. Fix: new `arb_via_code_weighted_range!(Ty, max_named = N)` macro — 50/50 between `int_in_range(0..=max_named)` (lands on named variants most of the time; gaps fall on `Unknown`, fine) and full-range u32 (broad `Unknown` exercise). Applied to the four large enums; `max_named` read from each `to_u32` match. `TrackDisposition` stays on `arb_via_code!` — it's bitflags, every u32 bit pattern is meaningful. 2. [medium] The bespoke `SampleFormat` generator's named branch only sampled 6 of the 12 named slugs, so `dbl`/`u8p`/`s32p`/`dblp`/`s64`/ `s64p` were reachable only via the rare numeric branch drawing their exact 0..=11 code. Fix: `SLUGS` const now lists all 12 canonical slugs. Tests: - `reachability_sample_format_all_named_plus_arms` — strengthened to assert all 12 named variants observed (HashSet of `as_str()`), plus `Unknown(_)` and `Other(_)`, over 4096 rounds. - `reachability_range_weighted_enums_hit_named_codes` (new) — asserts the range-weighted enums hit a broad named-code set: Matrix ≥10, Primaries ≥8, Transfer ≥10, PixelFormat ≥40 distinct named codes over 8192 rounds. Matrix (RUSTFLAGS=-Dwarnings cargo +1.95.0): test -p mediaframe --features arbitrary --lib 163 pass check -p mediaframe / --features arbitrary,serde / --all-features ✓
…bility Codex round-3 finding: `Matrix` has a mediaframe-domain variant `Bt601` at `DOMAIN_EXT_BASE` (`0x8000_0000`) — far outside the H.273 `0..=17` range the round-2 `arb_via_code_weighted_range!` macro biases toward, so `Bt601` was reachable only via the full-`u32` fallback (~1-in-8.6-billion); the reachability test missed it (counted only codes ≤ 17). Fix: `Matrix` is now hand-written (out of the range macro) — a 3-way `int_in_range(0..=2)` dispatch: in-range H.273 code / the domain-ext code (`DOMAIN_EXT_BASE`, decodes to `Bt601`) / full-range `u32`. `Matrix` was the only coded enum with a domain variant — Primaries / Transfer / PixelFormat have none, so they stay on the range macro. Test `reachability_range_weighted_enums_hit_named_codes` strengthened to assert `Matrix` observed `DOMAIN_EXT_BASE` specifically. 163 tests pass; `--all-features` clean.
…rippable) values
Codex round-4 finding: arbitrary-generated `SampleFormat` values commonly
lost identity through a serde round-trip — two causes:
1. The bespoke `SampleFormat` arbitrary impl built `Other(SmolStr::from(
<arbitrary String>))` directly. When the string happened to equal a
named slug (e.g. `"s16"`), that produced a malformed `Other("s16")`
which serde canonicalises to `S16` — not round-trippable.
Fix: the `Other` branch now routes through `FromStr` (the canonicalising
constructor — a named slug → the named variant, only a non-named slug →
`Other`). Every generated value is now canonical. The `arb_open_string_
enum!` macro had the identical latent bug for `codec::*` / `container::
Format` / `subtitle::Format` etc. (`Other("h264")` → `H264` on
round-trip) — fixed the same way: both macro branches go through
`FromStr`.
2. The `Unknown(u32)` concern in the finding was an artefact of the stale
PR base — `feat/arbitrary` was branched off the pre-fix `feat/serde-
support`. Rebasing onto the current serde tip brings in the bespoke
`SampleFormat` serde (commit "preserve SampleFormat::Unknown(u32)")
which preserves `Unknown(v)` numerically. No code change needed for
this half — the rebase resolves it.
New test `arbitrary_values_survive_serde_round_trip` (gated on
`feature = "serde"`) — 4096 rounds asserting `SampleFormat` and
`VideoCodec` arbitrary values round-trip through `serde_json` unchanged.
Matrix (RUSTFLAGS=-Dwarnings cargo +1.95.0):
test -p mediaframe --features arbitrary,serde --lib 173 pass
check -p mediaframe / --no-default-features arbitrary,alloc / --all-features ✓
uqio
added a commit
that referenced
this pull request
May 22, 2026
…d enums + all 12 SampleFormat slugs Codex round-2 findings on PR #9 (mirror of the PR #8 fixes): 1. [medium] `matrix`/`primaries`/`transfer`/`pixel_format` helpers still decoded a uniform `u32` through `from_u32`. Their named codes cluster in low integers (Matrix 0..=17, Primaries 0..=22, Transfer 0..=18, PixelFormat 270 codes over 0..=947), so QuickCheck `u32::arbitrary` reaches a named variant essentially never — properties over colour / pixel descriptors ran almost entirely on the `Unknown(_)` path. Fix: new `qc_via_code_weighted_range!(fn => Ty, max_named = N)` macro — 50/50 between an in-range code (`u32 % (max+1)`, since `quickcheck:: Gen` has no `int_in_range`) and a full-range `u32`. `track_disposition` stays on `arb_via_code!` — bitflags, every bit pattern meaningful. 2. [medium] The bespoke `sample_format` helper's named arm only sampled 6 of 12 slugs; `dbl`/`u8p`/`s32p`/`dblp`/`s64`/`s64p` were reachable only via the rare numeric arm. `SLUGS` now lists all 12. Tests: - `reachability_sample_format_all_named_plus_arms` — strengthened to assert all 12 named variants observed + `Unknown(_)` + `Other(_)`. - `reachability_range_weighted_enums_hit_named_codes` (new) — Matrix ≥10, Primaries ≥8, Transfer ≥10, PixelFormat ≥40 distinct named codes. Matrix (RUSTFLAGS=-Dwarnings cargo +1.95.0): test -p mediaframe --features quickcheck --lib 163 pass check -p mediaframe / --features arbitrary,quickcheck / --all-features ✓
uqio
added a commit
that referenced
this pull request
May 22, 2026
…ly values Codex round-4 finding on PR #9: `quickcheck_helpers::composite::tags` populated 12 of the 13 `Tags` fields but never `language`, so every generated `Tags` had `language == None` — property tests could not catch language-tag loss / corruption. Fix: the `tags` helper now generates `language` too, 50/50 between `None` and `Some(<arbitrary string>)`. Also carried the PR #8 round-4 canonical-value fix into the quickcheck helpers (same latent bug, separate file tree): - `qc_open_string_enum!` and the bespoke `sample_format` helper built `Other(SmolStr::from(s))` directly; a string equal to a named slug produced a malformed `Other("h264")` that serde canonicalises to `H264` — not round-trippable. - Both now route string construction through `FromStr` (the canonicalising constructor), so every generated value is canonical. Tests: - `reachability_tags_language_hits_none_and_some` — asserts both `None` and `Some(_)` are observed. - `arbitrary_values_survive_serde_round_trip` (gated on `serde`) — 4096 rounds, `SampleFormat` + `VideoCodec` round-trip through `serde_json`. This branch was also rebased onto the updated `feat/arbitrary` (which itself rebased onto the serde-fixed `feat/serde-support`) — keeps the stack coherent so reviews see the real combined state. Matrix (RUSTFLAGS=-Dwarnings cargo +1.95.0): test -p mediaframe --features quickcheck --lib 164 pass test -p mediaframe --features quickcheck,serde --lib 174 pass check -p mediaframe --all-features ✓
…Loudness floats Two codex round-5 findings on PR #8: 1. [high] The reachability tests used `std::collections::HashSet`. Under `--no-default-features --features arbitrary` the crate is no-std (`arbitrary` pulls only `alloc`), where `lib.rs` aliases `alloc as std` — and `alloc` has no `HashSet`. The cargo-hack feature-powerset CI path `--no-default-features --features arbitrary` failed to compile the test build. Fix: switched all reachability-test sets to `BTreeSet` (in `alloc::collections`, works on every tier). `BitRateMode` / `TrackOrigin` aren't `Ord`, so those two sets key on `to_u32()`. Also replaced a `&str::to_string()` (needs `ToString`, not in the no-std prelude) with `String::from`. 2. [medium] `Loudness`'s `arbitrary` impl filled its four `f32` fields via `f32::arbitrary`, which builds floats from raw bits — NaN / ±inf are common. `Loudness` derives serde; JSON serializes a non-finite `f32` as `null`, which then fails to deserialize back into `f32`. Arbitrary `Loudness` values did not reliably survive a serde round-trip. Fix: `Loudness` fields are now generated finite — a bounded integer `[-10_000_000, 10_000_000]` mapped to `f32 / 100.0` (finite, range [-100_000, 100_000], covers every real EBU R128 scalar). Extended `arbitrary_values_survive_serde_round_trip` to round-trip `Loudness` too (it previously covered only `SampleFormat` + `VideoCodec`). Verified (RUSTFLAGS=-Dwarnings cargo +1.95.0): test -p mediaframe --no-default-features --features arbitrary --lib 159 pass test -p mediaframe --features arbitrary,serde --lib 173 pass hack --each-feature check -p mediaframe ✓ check -p mediaframe --all-features ✓
uqio
added a commit
that referenced
this pull request
May 22, 2026
…d enums + all 12 SampleFormat slugs Codex round-2 findings on PR #9 (mirror of the PR #8 fixes): 1. [medium] `matrix`/`primaries`/`transfer`/`pixel_format` helpers still decoded a uniform `u32` through `from_u32`. Their named codes cluster in low integers (Matrix 0..=17, Primaries 0..=22, Transfer 0..=18, PixelFormat 270 codes over 0..=947), so QuickCheck `u32::arbitrary` reaches a named variant essentially never — properties over colour / pixel descriptors ran almost entirely on the `Unknown(_)` path. Fix: new `qc_via_code_weighted_range!(fn => Ty, max_named = N)` macro — 50/50 between an in-range code (`u32 % (max+1)`, since `quickcheck:: Gen` has no `int_in_range`) and a full-range `u32`. `track_disposition` stays on `arb_via_code!` — bitflags, every bit pattern meaningful. 2. [medium] The bespoke `sample_format` helper's named arm only sampled 6 of 12 slugs; `dbl`/`u8p`/`s32p`/`dblp`/`s64`/`s64p` were reachable only via the rare numeric arm. `SLUGS` now lists all 12. Tests: - `reachability_sample_format_all_named_plus_arms` — strengthened to assert all 12 named variants observed + `Unknown(_)` + `Other(_)`. - `reachability_range_weighted_enums_hit_named_codes` (new) — Matrix ≥10, Primaries ≥8, Transfer ≥10, PixelFormat ≥40 distinct named codes. Matrix (RUSTFLAGS=-Dwarnings cargo +1.95.0): test -p mediaframe --features quickcheck --lib 163 pass check -p mediaframe / --features arbitrary,quickcheck / --all-features ✓
uqio
added a commit
that referenced
this pull request
May 22, 2026
…ly values Codex round-4 finding on PR #9: `quickcheck_helpers::composite::tags` populated 12 of the 13 `Tags` fields but never `language`, so every generated `Tags` had `language == None` — property tests could not catch language-tag loss / corruption. Fix: the `tags` helper now generates `language` too, 50/50 between `None` and `Some(<arbitrary string>)`. Also carried the PR #8 round-4 canonical-value fix into the quickcheck helpers (same latent bug, separate file tree): - `qc_open_string_enum!` and the bespoke `sample_format` helper built `Other(SmolStr::from(s))` directly; a string equal to a named slug produced a malformed `Other("h264")` that serde canonicalises to `H264` — not round-trippable. - Both now route string construction through `FromStr` (the canonicalising constructor), so every generated value is canonical. Tests: - `reachability_tags_language_hits_none_and_some` — asserts both `None` and `Some(_)` are observed. - `arbitrary_values_survive_serde_round_trip` (gated on `serde`) — 4096 rounds, `SampleFormat` + `VideoCodec` round-trip through `serde_json`. This branch was also rebased onto the updated `feat/arbitrary` (which itself rebased onto the serde-fixed `feat/serde-support`) — keeps the stack coherent so reviews see the real combined state. Matrix (RUSTFLAGS=-Dwarnings cargo +1.95.0): test -p mediaframe --features quickcheck --lib 164 pass test -p mediaframe --features quickcheck,serde --lib 174 pass check -p mediaframe --all-features ✓
uqio
added a commit
that referenced
this pull request
May 22, 2026
…de round-trip coverage Carries the PR #8 round-5 `Loudness` fix into the quickcheck helpers (same latent bug, separate file tree): the `loudness` helper filled its four `f32` fields via `f32::arbitrary`, which yields NaN / ±inf. `Loudness` derives serde; JSON serializes a non-finite `f32` as `null`, which fails to deserialize back into `f32` — quickcheck-generated `Loudness` values did not reliably survive a serde round-trip. Fix: the `loudness` helper generates finite fields — a bounded `i32` (`rem_euclid` → non-negative, shifted into `[-10_000_000, 10_000_000]`) mapped to `f32 / 100.0`, finite in [-100_000, 100_000]. Extended `arbitrary_values_survive_serde_round_trip` to round-trip `Loudness` too. (The `HashSet` → `BTreeSet` half of the round-5 finding is PR #8-only — `feat/quickcheck`'s tests run under the `quickcheck` feature, which implies `std`; the no-std-capable `arbitrary` tests are inherited from `feat/arbitrary` already fixed there.) Also rebased onto the round-5-fixed `feat/arbitrary` to keep the stack coherent. Verified (RUSTFLAGS=-Dwarnings cargo +1.95.0): test -p mediaframe --features quickcheck,serde --lib 174 pass hack --each-feature check -p mediaframe ✓ check -p mediaframe --all-features ✓
…are it honestly Codex round-6 finding: the `arbitrary` feature was declared `["alloc", "dep:arbitrary"]` with comments describing an alloc-tier / no-std setup — but the `arbitrary` crate (1.4.2) is genuinely std-based: it exposes no `alloc`/`no_std` feature and its crate root uses `std` directly. So `--no-default-features --features arbitrary` only kept *mediaframe's own* crate root `#![no_std]` on a host target; a real no-std target would still fail to compile the `arbitrary` dependency. The advertised no-std support was illusory. Fix: `arbitrary` is honestly std-only. The feature now declares `["std", "dep:arbitrary"]` (std includes alloc, so the String/Vec/byte generation is unaffected) and the comment states it serves host-side fuzzing only, never an embedded target. No code change to the impls — they already used `std`-aliased paths that now resolve to real `std`. Dropped the now-false "no-std" rationale from two round-5 test comments (`BTreeSet`, `String::from`) — both remain correct under `std`, only the justification needed correcting. Verified (RUSTFLAGS=-Dwarnings cargo +1.95.0): check -p mediaframe --no-default-features --features arbitrary ✓ test -p mediaframe --features arbitrary,serde --lib 173 pass hack --each-feature check -p mediaframe ✓ check -p mediaframe --all-features ✓
The round-6 commit fixed the `arbitrary` *feature* comment but missed the `arbitrary` *dependency* comment, which still claimed the crate is "no-alloc-capable by default". It is not — `arbitrary` 1.x is std-based. Comment now matches reality; no code / manifest-value change.
uqio
added a commit
that referenced
this pull request
May 22, 2026
…d enums + all 12 SampleFormat slugs Codex round-2 findings on PR #9 (mirror of the PR #8 fixes): 1. [medium] `matrix`/`primaries`/`transfer`/`pixel_format` helpers still decoded a uniform `u32` through `from_u32`. Their named codes cluster in low integers (Matrix 0..=17, Primaries 0..=22, Transfer 0..=18, PixelFormat 270 codes over 0..=947), so QuickCheck `u32::arbitrary` reaches a named variant essentially never — properties over colour / pixel descriptors ran almost entirely on the `Unknown(_)` path. Fix: new `qc_via_code_weighted_range!(fn => Ty, max_named = N)` macro — 50/50 between an in-range code (`u32 % (max+1)`, since `quickcheck:: Gen` has no `int_in_range`) and a full-range `u32`. `track_disposition` stays on `arb_via_code!` — bitflags, every bit pattern meaningful. 2. [medium] The bespoke `sample_format` helper's named arm only sampled 6 of 12 slugs; `dbl`/`u8p`/`s32p`/`dblp`/`s64`/`s64p` were reachable only via the rare numeric arm. `SLUGS` now lists all 12. Tests: - `reachability_sample_format_all_named_plus_arms` — strengthened to assert all 12 named variants observed + `Unknown(_)` + `Other(_)`. - `reachability_range_weighted_enums_hit_named_codes` (new) — Matrix ≥10, Primaries ≥8, Transfer ≥10, PixelFormat ≥40 distinct named codes. Matrix (RUSTFLAGS=-Dwarnings cargo +1.95.0): test -p mediaframe --features quickcheck --lib 163 pass check -p mediaframe / --features arbitrary,quickcheck / --all-features ✓
uqio
added a commit
that referenced
this pull request
May 22, 2026
…ly values Codex round-4 finding on PR #9: `quickcheck_helpers::composite::tags` populated 12 of the 13 `Tags` fields but never `language`, so every generated `Tags` had `language == None` — property tests could not catch language-tag loss / corruption. Fix: the `tags` helper now generates `language` too, 50/50 between `None` and `Some(<arbitrary string>)`. Also carried the PR #8 round-4 canonical-value fix into the quickcheck helpers (same latent bug, separate file tree): - `qc_open_string_enum!` and the bespoke `sample_format` helper built `Other(SmolStr::from(s))` directly; a string equal to a named slug produced a malformed `Other("h264")` that serde canonicalises to `H264` — not round-trippable. - Both now route string construction through `FromStr` (the canonicalising constructor), so every generated value is canonical. Tests: - `reachability_tags_language_hits_none_and_some` — asserts both `None` and `Some(_)` are observed. - `arbitrary_values_survive_serde_round_trip` (gated on `serde`) — 4096 rounds, `SampleFormat` + `VideoCodec` round-trip through `serde_json`. This branch was also rebased onto the updated `feat/arbitrary` (which itself rebased onto the serde-fixed `feat/serde-support`) — keeps the stack coherent so reviews see the real combined state. Matrix (RUSTFLAGS=-Dwarnings cargo +1.95.0): test -p mediaframe --features quickcheck --lib 164 pass test -p mediaframe --features quickcheck,serde --lib 174 pass check -p mediaframe --all-features ✓
uqio
added a commit
that referenced
this pull request
May 22, 2026
…de round-trip coverage Carries the PR #8 round-5 `Loudness` fix into the quickcheck helpers (same latent bug, separate file tree): the `loudness` helper filled its four `f32` fields via `f32::arbitrary`, which yields NaN / ±inf. `Loudness` derives serde; JSON serializes a non-finite `f32` as `null`, which fails to deserialize back into `f32` — quickcheck-generated `Loudness` values did not reliably survive a serde round-trip. Fix: the `loudness` helper generates finite fields — a bounded `i32` (`rem_euclid` → non-negative, shifted into `[-10_000_000, 10_000_000]`) mapped to `f32 / 100.0`, finite in [-100_000, 100_000]. Extended `arbitrary_values_survive_serde_round_trip` to round-trip `Loudness` too. (The `HashSet` → `BTreeSet` half of the round-5 finding is PR #8-only — `feat/quickcheck`'s tests run under the `quickcheck` feature, which implies `std`; the no-std-capable `arbitrary` tests are inherited from `feat/arbitrary` already fixed there.) Also rebased onto the round-5-fixed `feat/arbitrary` to keep the stack coherent. Verified (RUSTFLAGS=-Dwarnings cargo +1.95.0): test -p mediaframe --features quickcheck,serde --lib 174 pass hack --each-feature check -p mediaframe ✓ check -p mediaframe --all-features ✓
Codex round-7 finding: the `arbitrary::Arbitrary` impl for `Tags` started from `Tags::new()` and populated 12 of the 13 fields but never `language` (`Option<SmolStr>`, serialized as buffa field 13). Every arbitrary-driven `Tags` had `language == None`, leaving that wire/storage path outside the fuzzing surface. (The quickcheck-helper `Tags` generator was already fixed for this in round 4; this is the same gap in the separate `arbitrary_impls` tree.) Fix: the `Tags` impl now generates `language` — an arbitrary `Option<String>` mapped to `Option<SmolStr>` through `.maybe_language(...)`, covering both `None` and `Some(_)`. New test `reachability_tags_language_hits_none_and_some` asserts both states are observed. Verified (RUSTFLAGS=-Dwarnings cargo +1.95.0): test -p mediaframe --features arbitrary --lib 164 pass test -p mediaframe --features arbitrary,serde --lib 174 pass check -p mediaframe --all-features ✓
uqio
added a commit
that referenced
this pull request
May 22, 2026
…d enums + all 12 SampleFormat slugs Codex round-2 findings on PR #9 (mirror of the PR #8 fixes): 1. [medium] `matrix`/`primaries`/`transfer`/`pixel_format` helpers still decoded a uniform `u32` through `from_u32`. Their named codes cluster in low integers (Matrix 0..=17, Primaries 0..=22, Transfer 0..=18, PixelFormat 270 codes over 0..=947), so QuickCheck `u32::arbitrary` reaches a named variant essentially never — properties over colour / pixel descriptors ran almost entirely on the `Unknown(_)` path. Fix: new `qc_via_code_weighted_range!(fn => Ty, max_named = N)` macro — 50/50 between an in-range code (`u32 % (max+1)`, since `quickcheck:: Gen` has no `int_in_range`) and a full-range `u32`. `track_disposition` stays on `arb_via_code!` — bitflags, every bit pattern meaningful. 2. [medium] The bespoke `sample_format` helper's named arm only sampled 6 of 12 slugs; `dbl`/`u8p`/`s32p`/`dblp`/`s64`/`s64p` were reachable only via the rare numeric arm. `SLUGS` now lists all 12. Tests: - `reachability_sample_format_all_named_plus_arms` — strengthened to assert all 12 named variants observed + `Unknown(_)` + `Other(_)`. - `reachability_range_weighted_enums_hit_named_codes` (new) — Matrix ≥10, Primaries ≥8, Transfer ≥10, PixelFormat ≥40 distinct named codes. Matrix (RUSTFLAGS=-Dwarnings cargo +1.95.0): test -p mediaframe --features quickcheck --lib 163 pass check -p mediaframe / --features arbitrary,quickcheck / --all-features ✓
uqio
added a commit
that referenced
this pull request
May 22, 2026
…ly values Codex round-4 finding on PR #9: `quickcheck_helpers::composite::tags` populated 12 of the 13 `Tags` fields but never `language`, so every generated `Tags` had `language == None` — property tests could not catch language-tag loss / corruption. Fix: the `tags` helper now generates `language` too, 50/50 between `None` and `Some(<arbitrary string>)`. Also carried the PR #8 round-4 canonical-value fix into the quickcheck helpers (same latent bug, separate file tree): - `qc_open_string_enum!` and the bespoke `sample_format` helper built `Other(SmolStr::from(s))` directly; a string equal to a named slug produced a malformed `Other("h264")` that serde canonicalises to `H264` — not round-trippable. - Both now route string construction through `FromStr` (the canonicalising constructor), so every generated value is canonical. Tests: - `reachability_tags_language_hits_none_and_some` — asserts both `None` and `Some(_)` are observed. - `arbitrary_values_survive_serde_round_trip` (gated on `serde`) — 4096 rounds, `SampleFormat` + `VideoCodec` round-trip through `serde_json`. This branch was also rebased onto the updated `feat/arbitrary` (which itself rebased onto the serde-fixed `feat/serde-support`) — keeps the stack coherent so reviews see the real combined state. Matrix (RUSTFLAGS=-Dwarnings cargo +1.95.0): test -p mediaframe --features quickcheck --lib 164 pass test -p mediaframe --features quickcheck,serde --lib 174 pass check -p mediaframe --all-features ✓
uqio
added a commit
that referenced
this pull request
May 22, 2026
…de round-trip coverage Carries the PR #8 round-5 `Loudness` fix into the quickcheck helpers (same latent bug, separate file tree): the `loudness` helper filled its four `f32` fields via `f32::arbitrary`, which yields NaN / ±inf. `Loudness` derives serde; JSON serializes a non-finite `f32` as `null`, which fails to deserialize back into `f32` — quickcheck-generated `Loudness` values did not reliably survive a serde round-trip. Fix: the `loudness` helper generates finite fields — a bounded `i32` (`rem_euclid` → non-negative, shifted into `[-10_000_000, 10_000_000]`) mapped to `f32 / 100.0`, finite in [-100_000, 100_000]. Extended `arbitrary_values_survive_serde_round_trip` to round-trip `Loudness` too. (The `HashSet` → `BTreeSet` half of the round-5 finding is PR #8-only — `feat/quickcheck`'s tests run under the `quickcheck` feature, which implies `std`; the no-std-capable `arbitrary` tests are inherited from `feat/arbitrary` already fixed there.) Also rebased onto the round-5-fixed `feat/arbitrary` to keep the stack coherent. Verified (RUSTFLAGS=-Dwarnings cargo +1.95.0): test -p mediaframe --features quickcheck,serde --lib 174 pass hack --each-feature check -p mediaframe ✓ check -p mediaframe --all-features ✓
Codex round-8 finding: the round-7 `Tags.language` generator mapped an
arbitrary `Option<String>` straight to `Option<SmolStr>`, which permits
`Some("")`. buffa writes field 13 only for a non-empty language and
decodes empty input back to `None` — so a generated `Some("")` would not
survive a wire round-trip and the reachability test's `Some(_)` was not
equivalent to the serialized-present state it claimed to cover.
Fix: the generator now `.filter(|s| !s.is_empty())` before `.maybe_
language(...)`, so it yields only `None` or `Some(non-empty)`. The
reachability test is tightened to assert each generated `Some(s)` has
`!s.is_empty()` and to require a non-empty `Some` was observed.
Verified (RUSTFLAGS=-Dwarnings cargo +1.95.0):
test -p mediaframe --features arbitrary,serde --lib 174 pass
Drops the previous byte-bridge approach in favor of native `quickcheck::Arbitrary` impls via the `quickcheck-richderive` derive. Per-type `#[quickcheck(with = "…")]` points at `fn(&mut Gen) -> Self` helpers that own the generation strategy; the derive emits the actual `impl` blocks. Feature wiring: quickcheck = ["std", "dep:quickcheck", "dep:quickcheck-richderive"] Deps: quickcheck 1 + quickcheck-richderive 0.1, both optional. `mediaframe::quickcheck_helpers/` module is `pub` (the derive's `with` paths must be reachable from each type's definition site). Sub-modules: strings.rs — cluster A (open string enums) coded.rs — cluster B (closed coded enums + colour/frame structs) composite.rs — cluster C (audio composite + capture + lang) Tests live in `mod.rs` (5 — invariant + roundtrip + smoke; mirror the `arbitrary_impls/` test set but drive `quickcheck::Gen` directly). The cluster files are stub-only here; the actual derives + helpers land in three parallel follow-up commits, one per cluster, before the matrix is re-verified. The two features are now independent — `quickcheck` does NOT pull `arbitrary`.
…rive Adds container-level `#[cfg_attr(feature = "quickcheck", derive(::quickcheck_richderive::Arbitrary), quickcheck(with = "…"))]` to every cluster-A open string enum, plus the matching `fn(&mut Gen) -> T` helper bodies in `quickcheck_helpers/strings.rs`. The derive emits the `impl quickcheck::Arbitrary` blocks; the helpers run a 50/50 split between round-tripping a curated slug through total `FromStr` and constructing `Other(SmolStr::from(<arbitrary String>))`. Types covered (8) → helpers / curated-slug count: - codec::VideoCodec → strings::video_codec (6 slugs) - codec::AudioCodec → strings::audio_codec (6 slugs) - codec::SubtitleCodec → strings::subtitle_codec (6 slugs) - container::Format → strings::container_format (6 slugs) - subtitle::Format → strings::subtitle_format (6 slugs) - audio::ChannelLayout → strings::channel_layout (6 slugs) - audio::SampleFormat → strings::sample_format (6 slugs) - audio::ContainerFormat → strings::audio_container_format (6 slugs) Slug picks mirror `arbitrary_impls/strings.rs` for consistency. A small `qc_open_string_enum!` macro inside `strings.rs` factors the shared body; per-type `pub(crate) fn` surface is preserved so each `with = "…"` path resolves.
…via quickcheck-richderive
Adds the native `quickcheck::Arbitrary` derive surface for the 24
descriptor types in cluster B, via container-level
`#[cfg_attr(feature = "quickcheck", derive(::quickcheck_richderive::Arbitrary),
quickcheck(with = "crate::quickcheck_helpers::coded::<snake>"))]`.
Helper bodies live in `mediaframe/src/quickcheck_helpers/coded.rs` (the
former comment-only stub).
Strategy per group:
- 13 coded enums via the `arb_via_code!` macro — each helper is
`T::from_u32(u32::arbitrary(g))`, total coverage of
`Unknown(u32)` / `Reserved(_)` arms in one line per type.
- 11 structs build via the type's public `new(...)` ctor with each
field via `<FieldT>::arbitrary(g)`. Cascade structs (`Info`,
`MasteringDisplay`, `HdrStaticMetadata`, `FrameRate`) call sister
helpers within the file. `Rational` / `SampleAspectRatio` use
`core::num::NonZeroU32::arbitrary(g)` (quickcheck 1.x has the
impl), preserving the non-zero denom invariant by construction.
`HdrStaticMetadata` rolls each `Option<…>` via a coin flip rather
than going through a not-yet-in-scope `Option<T>: Arbitrary`.
Coded enums (13):
color::{Matrix, Primaries, Transfer, DynamicRange,
ChromaLocation, DcpTargetGamut},
pixel_format::PixelFormat,
frame::{Rotation, FieldOrder, StereoMode},
subtitle::TrackOrigin,
audio::BitRateMode,
disposition::TrackDisposition (bitflags struct, `from_u32` =
`from_bits_retain` — unknown bits round-trip).
Structs (11):
color::{Info, ContentLightLevel, ChromaCoord, MasteringDisplay,
HdrStaticMetadata, DolbyVisionConfig},
frame::{Dimensions, Rect, Rational, SampleAspectRatio, FrameRate}.
Default builds are unaffected (everything is gated by
`cfg(feature = "quickcheck")`). Under `--features quickcheck` the
type-by-type surface compiles clean; the lib-test failures that
remain are all in cluster A/C types referenced by the shared smoke
test in `quickcheck_helpers/mod.rs` and will resolve when those
subagents land their helpers.
…ickcheck-richderive
Wires native quickcheck::Arbitrary on the 7 cluster-C composite types via
quickcheck-richderive's container-level `#[quickcheck(with = "...")]` →
`crate::quickcheck_helpers::composite::<name>` helper functions.
Types:
- audio::Loudness — plain `new(f32, f32, f32, f32)`; fields pass through.
- audio::Fingerprint — `try_new(algo, value)`; algo non-empty fallback "x"
so `expect` is sound; empty `value` is allowed.
- audio::CoverArt — `try_new(mime, data)`; mime fallback
"application/octet-stream", data fallback `[0u8]`
so both non-empty preconditions hold.
- audio::Tags — `new()` + 12 builder setters (title/artist/
album_artist/album/composer/genre/comment +
year/track_number/track_total/disc_number/
disc_total); same representative subset as the
arbitrary_impls cluster.
- capture::Device — `new()` + with_make / with_model; empty SmolStr
means absent so arbitrary strings pass through.
- capture::GeoLocation — `try_new(lat, lon, altitude)`; coordinates built
in-range via `i32.rem_euclid(...) - shift`
(quickcheck::Gen has no int_in_range);
altitude 50/50 None vs in-band finite f32.
- lang::Language — `from_bcp47(...)` with a curated 12-tag table
via `g.choose(...)` — und sentinel + language-only
+ language+region + language+script+region.
Validation strategy is identical to the arbitrary_impls Cluster C: build
VALID inputs FIRST, then call `try_new(...).expect(...)`. Never feed
attacker-controlled fuzz into a fallible ctor + `unwrap`.
Default build unaffected — everything is gated by `feature = "quickcheck"`.
…ary`)
richderive 0.2 reshuffles the attribute surface to mirror serde: `with` is now
a module pair (`mod::arbitrary` + `mod::shrink`); the single-fn override moved
to `arbitrary`. All 39 descriptor types use the gen-half-only override, so the
mechanical sweep is `quickcheck(with = "…")` → `quickcheck(arbitrary = "…")`
across 17 type-def files + 3 doc-comment references in `quickcheck_helpers/*`.
The helpers in `quickcheck_helpers/{strings,coded,composite}.rs` are unchanged
— they're still single `fn(&mut Gen) -> T` functions, which is the contract
for the new `arbitrary = "fn"` attribute.
Dep flipped to a git ref on the branch pending the 0.2.0 publish — once that
lands, this becomes `quickcheck-richderive = { version = "0.2", optional = true }`
in a follow-up.
Matrix (RUSTFLAGS=-Dwarnings):
cargo fmt -p mediaframe -- --check ✓
cargo test -p mediaframe --features quickcheck --lib 159 pass
cargo check -p mediaframe --all-features ✓
…enums + bespoke SampleFormat 3-way
The first-pass quickcheck helpers all routed coded enums through
`<Ty>::from_u32(u32::arbitrary(g))`, which is fine for large enums
(matrix / primaries / transfer / pixel format) but collapses small ones:
- `BitRateMode` (3 named codes 0/1/2, no `Unknown` arm) hits a named
variant only when the random u32 happens to be 0/1/2 — ~3-in-4 G —
so `Vbr` / `Abr` were effectively unreachable.
- `TrackOrigin` (same shape, 3 named, no `Unknown`).
- `Rotation` / `FieldOrder` / `StereoMode` / `DynamicRange` /
`ChromaLocation` / `DcpTargetGamut` — closed + `Unknown(u32)` arm,
< 10 named variants; uniform u32 decode virtually never names them.
`SampleFormat` was also misclassified: routed through the 2-arm
curated-slug-vs-`Other` macro (`qc_open_string_enum!`) even though it
has a *third* `Unknown(u32)` arm, so the `Unknown` arm was unreachable.
Two new internal helper macros next to `arb_via_code!`:
- `qc_via_named_variants!` — strictly-closed enums; pick uniformly
from a `const NAMED: &[Ty]` via `g.choose`.
- `qc_via_code_weighted!` — closed + `Unknown(u32)`; 50/50 between
`choose` over named and `from_u32(u32::arbitrary(g))` (so the
`Unknown` arm stays reachable).
`arb_via_code!` stays for the large enums (matrix / primaries /
transfer / pixel format / disposition bitflags) where uniform-u32
decode hits named code points densely.
`SampleFormat` gets a bespoke 3-way: `g.choose(&[0u8, 1, 2])` over a
sentinel arm-tag slice (quickcheck::Gen has no `int_in_range`) →
named slug via `FromStr` / `from_u32(u32::arbitrary(g))` /
`Other(SmolStr)` from an arbitrary String.
New reachability tests in `quickcheck_helpers::tests` (use the existing
`drive(size, rounds, body)` harness):
- reachability_small_coded_enums_hit_all_named
- reachability_weighted_rotation_hits_named_and_unknown
- reachability_sample_format_reaches_all_three_arms
All 162 lib tests pass under `--features quickcheck`; default /
`quickcheck,serde` / `arbitrary,quickcheck` / `--all-features` all
clean under `RUSTFLAGS=-Dwarnings`.
…d enums + all 12 SampleFormat slugs Codex round-2 findings on PR #9 (mirror of the PR #8 fixes): 1. [medium] `matrix`/`primaries`/`transfer`/`pixel_format` helpers still decoded a uniform `u32` through `from_u32`. Their named codes cluster in low integers (Matrix 0..=17, Primaries 0..=22, Transfer 0..=18, PixelFormat 270 codes over 0..=947), so QuickCheck `u32::arbitrary` reaches a named variant essentially never — properties over colour / pixel descriptors ran almost entirely on the `Unknown(_)` path. Fix: new `qc_via_code_weighted_range!(fn => Ty, max_named = N)` macro — 50/50 between an in-range code (`u32 % (max+1)`, since `quickcheck:: Gen` has no `int_in_range`) and a full-range `u32`. `track_disposition` stays on `arb_via_code!` — bitflags, every bit pattern meaningful. 2. [medium] The bespoke `sample_format` helper's named arm only sampled 6 of 12 slugs; `dbl`/`u8p`/`s32p`/`dblp`/`s64`/`s64p` were reachable only via the rare numeric arm. `SLUGS` now lists all 12. Tests: - `reachability_sample_format_all_named_plus_arms` — strengthened to assert all 12 named variants observed + `Unknown(_)` + `Other(_)`. - `reachability_range_weighted_enums_hit_named_codes` (new) — Matrix ≥10, Primaries ≥8, Transfer ≥10, PixelFormat ≥40 distinct named codes. Matrix (RUSTFLAGS=-Dwarnings cargo +1.95.0): test -p mediaframe --features quickcheck --lib 163 pass check -p mediaframe / --features arbitrary,quickcheck / --all-features ✓
…ability Codex round-3 finding: `Matrix` has a mediaframe-domain variant `Bt601` at `DOMAIN_EXT_BASE` (`0x8000_0000`) — far outside the H.273 `0..=17` range the round-2 `qc_via_code_weighted_range!` macro biases toward. So `Bt601` was only reachable via the full-`u32` fallback (~1-in-8.6-billion), and the reachability test missed it entirely (it only counted codes ≤ 17). Fix: `Matrix` is now hand-written (out of the range macro) — a 3-way pick: in-range H.273 code (`u32 % 18`) / the domain-ext code (`DOMAIN_EXT_BASE`, decodes to `Bt601`) / full-range `u32` (broad `Unknown` exercise). `Matrix` was the only coded enum with a domain variant — Primaries / Transfer / PixelFormat have none, so they stay on the range macro. Test `reachability_range_weighted_enums_hit_named_codes` strengthened to assert `matrix` observed `DOMAIN_EXT_BASE` specifically. 163 tests pass; `--all-features` clean.
…ly values Codex round-4 finding on PR #9: `quickcheck_helpers::composite::tags` populated 12 of the 13 `Tags` fields but never `language`, so every generated `Tags` had `language == None` — property tests could not catch language-tag loss / corruption. Fix: the `tags` helper now generates `language` too, 50/50 between `None` and `Some(<arbitrary string>)`. Also carried the PR #8 round-4 canonical-value fix into the quickcheck helpers (same latent bug, separate file tree): - `qc_open_string_enum!` and the bespoke `sample_format` helper built `Other(SmolStr::from(s))` directly; a string equal to a named slug produced a malformed `Other("h264")` that serde canonicalises to `H264` — not round-trippable. - Both now route string construction through `FromStr` (the canonicalising constructor), so every generated value is canonical. Tests: - `reachability_tags_language_hits_none_and_some` — asserts both `None` and `Some(_)` are observed. - `arbitrary_values_survive_serde_round_trip` (gated on `serde`) — 4096 rounds, `SampleFormat` + `VideoCodec` round-trip through `serde_json`. This branch was also rebased onto the updated `feat/arbitrary` (which itself rebased onto the serde-fixed `feat/serde-support`) — keeps the stack coherent so reviews see the real combined state. Matrix (RUSTFLAGS=-Dwarnings cargo +1.95.0): test -p mediaframe --features quickcheck --lib 164 pass test -p mediaframe --features quickcheck,serde --lib 174 pass check -p mediaframe --all-features ✓
…de round-trip coverage Carries the PR #8 round-5 `Loudness` fix into the quickcheck helpers (same latent bug, separate file tree): the `loudness` helper filled its four `f32` fields via `f32::arbitrary`, which yields NaN / ±inf. `Loudness` derives serde; JSON serializes a non-finite `f32` as `null`, which fails to deserialize back into `f32` — quickcheck-generated `Loudness` values did not reliably survive a serde round-trip. Fix: the `loudness` helper generates finite fields — a bounded `i32` (`rem_euclid` → non-negative, shifted into `[-10_000_000, 10_000_000]`) mapped to `f32 / 100.0`, finite in [-100_000, 100_000]. Extended `arbitrary_values_survive_serde_round_trip` to round-trip `Loudness` too. (The `HashSet` → `BTreeSet` half of the round-5 finding is PR #8-only — `feat/quickcheck`'s tests run under the `quickcheck` feature, which implies `std`; the no-std-capable `arbitrary` tests are inherited from `feat/arbitrary` already fixed there.) Also rebased onto the round-5-fixed `feat/arbitrary` to keep the stack coherent. Verified (RUSTFLAGS=-Dwarnings cargo +1.95.0): test -p mediaframe --features quickcheck,serde --lib 174 pass hack --each-feature check -p mediaframe ✓ check -p mediaframe --all-features ✓
Carries the PR #8 round-8 fix into the quickcheck `tags` helper (same latent bug, separate file tree): the helper produced `Some(SmolStr::from(<arbitrary string>))`, which permits `Some("")`. buffa writes field 13 only for a non-empty language and decodes empty back to `None`, so a generated `Some("")` would not survive a wire round-trip. Fix: the helper now yields `Some` only for a non-empty string (`(!s.is_empty()).then(...)`), else `None`. The reachability test is tightened to assert each generated `Some(s)` has `!s.is_empty()`. Verified: cargo +1.95.0 test -p mediaframe --features quickcheck,serde --lib — 174 pass; --all-features clean.
…guage>` for `language`
Resolves the codex `Some(0)` finding fundamentally — by fixing the `Tags`
type rather than constraining the generators — and closes the `language`
`TODO(lang)`.
Numeric fields (`year` / `track_number` / `track_total` / `disc_number` /
`disc_total`): `Option<u16>` → bare `u16` with `0` = absent. The buffa
wire codec already proto-zero-elided these fields (`v != 0` guards), so
the old `Option` was a type/codec contradiction: `Some(0)` encoded to
nothing and decoded back to `None`. Bare `u16` makes the type agree with
the wire — `0` is unambiguously "absent" (track/disc numbers are 1-based;
no real release tags year 0). Matches the `SmolStr` empty-is-absent
convention the string fields already use. Drops the now-needless
`maybe_* / update_* / clear_*` numeric vocabulary; `set_year(0)` clears.
`language`: `Option<SmolStr>` placeholder → `Option<crate::lang::Language>`
(the `TODO(lang)`). Parsed BCP-47 instead of a raw string; `None` = no
tag, `Some(Language)` = a parsed tag (possibly `und`).
Why not `jiff` for `year`: `jiff` is std-only (mediaframe is no-std-
capable) and has no plain calendar-year type (smallest is `civil::Date`,
which would force a fake month/day). A release-year metadata tag is just
a number — bare `u16` is the honest representation.
Touched:
- `audio/tags.rs` — struct, `new`, accessors (`year() -> u16`,
`language() -> Option<Language>`), builders/setters; tests rewritten.
- `buffa.rs` — `Tags` codec: numeric encode/decode now bare `u16`;
`language` encodes `Language::to_bcp47()`, decodes via `from_bcp47`
(lenient `unwrap_or_default`, mirroring the standalone `Language`
codec). Round-trip test gains a `0`-sentinel case.
- `arbitrary_impls/composite.rs` + `quickcheck_helpers/composite.rs` —
`Tags` generators emit bare `u16` (incl. `0`, now wire-canonical) and
`Option<Language>` from the curated BCP-47 generator. The codex
`Some(0)` / `Some("")` findings are gone — there is no `Option<u16>`
and no empty-string language.
- reachability tests updated for the new shape.
Matrix (RUSTFLAGS=-Dwarnings cargo +1.95.0):
test -p mediaframe --all-features --lib 875 pass
check -p mediaframe {default, no-default+arbitrary, no-default+serde,
quickcheck+serde+buffa} ✓
hack --each-feature check -p mediaframe ✓
arbitrary support for the descriptor vocabularyarbitrary + quickcheck support for the descriptor vocabulary
… (golden rule §9)
Applies the serde golden rule (rust-type-conventions §9): an absent
`Option` field serializes to an *omitted key*, never an explicit `null`.
- `Tags.language`, `HdrStaticMetadata.{mastering, content_light}` —
`#[cfg_attr(feature = "serde", serde(skip_serializing_if =
"Option::is_none"))]`. `HdrStaticMetadata` additionally gains
container-level `serde(default)` (whole struct has a meaningful
all-`None` `Default`), so an omitted field still deserializes —
`skip_serializing_if` + `serde(default)` are a pair.
- `GeoLocation` has a hand-written `Serialize`; it unconditionally wrote
`altitude` (emitting `"altitude":null` when absent). Now skips the
field when `None` via `SerializeStruct::skip_field`. Its deserialize
`Shadow` already had `#[serde(default)]` on `altitude`.
This also closes the round-9/10 concern about old `Option<u16>`-shape
`null` payloads: with the no-`null` rule there is no `null` to migrate —
and the `Tags` numerics are bare `u16` now anyway.
`Tags.language` doc notes the `Language` normalization (variant /
extension subtags fold to the language/script/region core) — lossless
for real embedded-metadata language tags.
New test `absent_option_fields_are_omitted_not_null` asserts the emitted
JSON for a default `Tags` / `HdrStaticMetadata` / `GeoLocation` contains
no `null` and no absent-field key (`HdrStaticMetadata::default()` → `{}`).
Matrix (RUSTFLAGS=-Dwarnings cargo +1.95.0):
test -p mediaframe --all-features --lib 876 pass
check -p mediaframe --all-features ✓
…ec.rs`
CI `xtask-check` failed: `mediaframe/src/codec.rs is stale vs the
vendored FFmpeg table`.
`codec.rs` is generated by `cargo xtask gen-codec`. The `quickcheck`
feature derives `quickcheck_richderive::Arbitrary` on the three codec
enums via a `#[cfg_attr(feature = "quickcheck", derive(…),
quickcheck(arbitrary = "…"))]` on each type — but that attribute was
hand-added directly to `codec.rs`, which the generator does not emit. So
the on-disk file diverged from `gen-codec`'s output and the byte-for-byte
freshness diff in `xtask check` failed.
Fix: `build_codec_enum` now emits the `quickcheck` `cfg_attr` itself,
routing each enum to its `quickcheck_helpers::strings::{video,audio,
subtitle}_codec` helper. Re-running `gen-codec` produces a `codec.rs`
byte-identical to the current file (the hand-added attr matched), so
`xtask check` is green and future regenerations preserve it.
(`serde` / `arbitrary` need no codegen change — codec support for those
is macro-based in `serde_impls.rs` / `arbitrary_impls/`, not a derive on
the type, so it never touched `codec.rs`.)
Verified: `cargo xtask gen-codec` → no `codec.rs` diff; `cargo xtask
check` → "codec enums and FFmpeg n8.1 are in two-way sync".
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Adds two optional test-data-generation features for the 39 descriptor-vocabulary types, plus a
audio::Tagstype fix. (PR #9 was folded into this branch —feat/quickcheckretired; this is the single reviewable PR.)Features
arbitrary— hand-writtenarbitrary::Arbitraryimpls via the types' public constructors. Std-only (thearbitrarycrate is std-based). Modulearbitrary_impls/.quickcheck— nativequickcheck::Arbitraryvia the publishedquickcheck-richderive 0.2derive (#[quickcheck(arbitrary = "…")]→fn(&mut Gen) -> Selfhelpers). Modulequickcheck_helpers/. Independent ofarbitrary.Both cover the identical 39-type surface as the
serdefeature.audio::Tagsredesign (folded in)Resolves a type/wire-codec contradiction surfaced during review:
year/track_number/track_total/disc_number/disc_total):Option<u16>→ bareu16,0= absent. The buffa codec already proto-zero-elided these, soSome(0)silently decoded toNone— theOptionwas a lie. Bareu16makes type + wire agree.language:Option<SmolStr>placeholder →Option<crate::lang::Language>(closes theTODO(lang)) — parsed BCP-47.Generation strategies
from_u32; small enums weighted (named-variants/code-weighted); large enums range-weighted (Matrixhand-written for theDOMAIN_EXT_BASEBt601domain variant).Other, all construction throughFromStrso generated values are canonical (no malformedOther("h264")).try_newtypes built valid-by-construction;Loudnessfloats kept finite (NaN/inf break JSON round-trip);SampleFormat3-way (named /Unknown(u32)/Other).Codex adversarial review
Hardened through a multi-round
/codex:adversarial-reviewloop on both feature halves (enum-reachability, canonical-value, no-std-feature-honesty, finite-float,Tags.language, and finally theTagsnumeric redesign).Matrix (
RUSTFLAGS=-Dwarnings cargo +1.95.0)test -p mediaframe --all-features— 875 passcheck— default /no-default+arbitrary/no-default+serde/quickcheck,serde,buffahack --each-feature checkfmt --check