feat(regen): v2 language_hints, profanity filter, word timings, diarize_model #730

Open
GregHolmes wants to merge 17 commits into main from gh/sdk-gen-2026-06-15

@GregHolmes GregHolmes commented Jun 15, 2026

Contributor

SDK regeneration 2026-06-15

Fern generator output (commit dc04625) reconciled against hand-maintained patches.

Versioning: classified feat (minor). The regen adds new fields/params (additive) and corrects the never-functional V2 language_hint field. No working public behavior is removed, so this is intentionally not a major release.

New / additive (non-breaking)

  • ListenV2ProfanityFilter — new type, re-exported from the package root + types.
  • ListenV2TurnInfoWordsItem.start / .end — word-level timing on streaming turn info, Optional[float] (default None). The API models these as optional (stem: Option<f32> with skip_serializing_if), so the server may omit them on some words/paths and the SDK matches that contract. As a response model parsed from the wire, this is additive and non-breaking — readers just see new optional fields; no construction changes are required.
  • diarize_model param + DiarizeModel type on listen/v1. The legacy diarize param is now deprecated but still present — no break.
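Since start / end may be absent on some words, readers need a None check before doing arithmetic on them. A minimal sketch of that guard, using a hypothetical Word dataclass as a stand-in for ListenV2TurnInfoWordsItem (the word field name and the helper are assumptions for illustration, not the SDK's API):

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Word:
    """Hypothetical stand-in for ListenV2TurnInfoWordsItem."""

    word: str
    start: Optional[float] = None  # the server may omit this
    end: Optional[float] = None    # the server may omit this


def timed_duration(words: List[Word]) -> float:
    """Sum the durations of words that carry both timestamps; skip the rest."""
    total = 0.0
    for w in words:
        if w.start is not None and w.end is not None:
            total += w.end - w.start
    return total


words = [Word("hello", 0.0, 0.5), Word("world"), Word("again", 1.0, 1.25)]
print(timed_duration(words))
```

Words without timing are skipped rather than treated as zero-length at time 0, which avoids skewing any downstream alignment.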

Corrected (fix, not a break)

  • V2 listen provider language_hint → language_hints (DeepgramListenProviderV2, its request TypedDict, and the agent …ListenProviderV2 aliases). The singular language_hint (Union[str, List[str]]) was never accepted by the API: the SDK serialized a field the server discarded, so this corrects a non-functional field rather than breaking a working one. Use a list: language_hints=["en", "es"].

Re-applied manual patches (still needed)

  • Socket clients (agent/v1, listen/v1, listen/v2, speak/v1): broad except Exception, optional control-message params, _sanitize_numeric_types (agent), and the listen/v2 send_configure typing.Any/raw-_send shim.
  • core/query_encoder.py: Python bool → lowercase "true"/"false" coercion for websocket query strings.
  • Agent Settings 2026-05-05 backward-compat (agent_v1settings, agent_v1settings_agent, agent_v1settings_agent_context, types + requests): callable AgentV1SettingsAgent, legacy messages= kwarg, .messages property, nested-context migration.
  • Legacy alias re-exports in the 7 package __init__.py files.
  • tests/wire/test_manage_v1_projects_keys.py: legacy CreateKeyV1RequestOneParams wire coverage.
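The query_encoder patch matters because Python renders booleans as "True"/"False" when stringified. A rough sketch of the coercion, assuming a flat parameter dict (coerce_bools and the interim_results parameter name are hypothetical, not the contents of core/query_encoder.py):

```python
from typing import Any, Dict
from urllib.parse import urlencode


def coerce_bools(params: Dict[str, Any]) -> Dict[str, Any]:
    """Replace Python bools with lowercase strings; leave other values alone."""
    return {
        key: ("true" if value else "false") if isinstance(value, bool) else value
        for key, value in params.items()
    }


# Without coercion, urlencode would emit "interim_results=True".
query = urlencode(coerce_bools({"interim_results": True, "sample_rate": 16000}))
print(query)
```

The isinstance check targets bool only, so integers such as 0 and 1 pass through unchanged.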

Dropped (generator caught up)

  • agent_v1settings_audio_output.container: adopt the generated AgentV1SettingsAudioOutputContainer enum. It's Union[Literal["none","wav","ogg"], Any], so arbitrary strings still validate — runtime-non-breaking. Removed from .fernignore.

New compat shims (generator regression)

  • The regen removed ListenV2CloseStreamType (docs #946 narrowed the v2 CloseStream type enum). The original generated type wrongly allowed Union[Literal["Finalize","CloseStream","KeepAlive"], Any] — v2 had copied v1's control-message enum, but a CloseStream message's type can only ever be "CloseStream". Recreated by hand as the corrected Literal["CloseStream"], re-exported from the three listen __init__.py files, and frozen in .fernignore — preserving the public import path without resurrecting the invalid values. This removes the only hard breaking change, so the PR stays a minor (feat) bump.
  • The regen removed the top-level DeepgramListenProviderV2LanguageHint/...Params type (Union[str, List[str]]); the → diarize_model.py git "rename" is a Fern false-match (the unrelated DiarizeModel). Its removal broke two permanently-frozen *V2LanguageHint listen-provider alias files. Recreated both modules by hand and froze them in .fernignore so the public import paths keep working.
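For reference, the narrowed CloseStream shim described above amounts to a single-value Literal alias. A sketch under that reading (the real module's layout and docstrings are not shown in this PR text, and is_valid_close_stream_type is a hypothetical helper):

```python
from typing import Literal, get_args

# Hand-written alias narrowed to the only value a CloseStream message can carry.
ListenV2CloseStreamType = Literal["CloseStream"]


def is_valid_close_stream_type(value: str) -> bool:
    """Runtime check against the alias's allowed values."""
    return value in get_args(ListenV2CloseStreamType)


print(is_valid_close_stream_type("CloseStream"))
print(is_valid_close_stream_type("KeepAlive"))
```

Keeping the alias name unchanged preserves existing imports, while type checkers now reject the values that were copied from v1.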

Validation

  • mypy src/ — clean (802 files) — CI gate
  • pytest — 225 passed, 1 skipped — CI gate
  • ruff check src/deepgram: clean; not run by CI. (Pre-existing I001 import-order nits remain in some hand-written examples/ and tests/manual/ files, untouched by this regen.)

All .bak files removed; .fernignore paths restored; AGENTS.md frozen-file list updated.


Second reconciliation pass — Fern run f6c1612 (origin 8dd6f48)

A follow-up generator run landed on this branch after the first reconciliation above. Re-ran the full review-regen flow against it.

Re-applied — all 25 frozen patches still needed (none dropped)

Every temporarily-frozen file's diff against its .bak consisted only of our hand-patches being stripped; the generator introduced no new content into them (all 10 __init__.py diffs were pure deletions of our legacy re-exports). All re-applied verbatim:

  • 4 socket clients, 6 agent_v1settings* files, 3 language_hint→language_hints validators, 10 package __init__.py re-exports, core/query_encoder.py, and the test_manage_v1_projects_keys.py wire test.

Generator changes flagged (judgment calls — kept our patches)

  • listen/v2 send_configure: the generator now does emit real ListenV2Configure / ListenV2ConfigureSuccess types and wires them in (response Union included). Decision: keep the typing.Any/raw-_send shim — no public signature change; adopting the generated types is deferred as a separate deliberate change.
  • Socket-client except: the generator now narrows to (websockets.WebSocketException, JSONDecodeError) (added JSONDecodeError). Decision: keep broad except Exception — custom transports can raise non-websocket exceptions.
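The case for keeping the broad except can be illustrated with a toy receive loop. This is a sketch only: TransportError, flaky_messages, and listen are hypothetical names, and the real socket clients are not structured exactly like this.

```python
import json
from typing import Callable, Iterable, List


class TransportError(Exception):
    """Hypothetical error raised by a custom, non-websockets transport."""


def flaky_messages() -> Iterable[str]:
    """Yield one message, then fail the way a custom transport might."""
    yield '{"type": "Connected"}'
    raise TransportError("proxy dropped the connection")


def listen(messages: Iterable[str], on_error: Callable[[Exception], None]) -> List[dict]:
    """Parse messages until the stream ends or any exception is raised."""
    received: List[dict] = []
    try:
        for raw in messages:
            received.append(json.loads(raw))
    except Exception as exc:  # broad on purpose: the transport type is unknown
        on_error(exc)
    return received


errors: List[Exception] = []
events = listen(flaky_messages(), errors.append)
print(events, errors)
```

A handler limited to websockets.WebSocketException and JSONDecodeError would let TransportError escape the loop instead of reaching the error callback.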

Generator-owned change (auto-accepted, not frozen)

  • ListenV2TurnInfoWordsItem.start / .end are Optional[float] (default None) — matching the API contract (the upstream spec was corrected to drop them from required). The file isn't frozen, so Fern owns it; accepted as-is. The "New / additive" section above already reflects this final optional state.

Validation (refreshed)

  • pytest: 237 passed, 1 skipped
  • mypy src/deepgram: clean (804 files)
  • ruff check src/deepgram: clean; the 25 re-applied files are also clean. Pre-existing I001 import-order nits remain only in hand-written tests/custom/ + tests/manual/ files, untouched by this regen.

All .bak files removed; .fernignore paths restored to originals.

🤖 Generated with Claude Code

@GregHolmes GregHolmes changed the title chore: SDK regeneration 2026-06-15 feat: SDK regeneration 2026-06-15 — V2 language_hints, profanity filter, word timings, diarize_model Jun 15, 2026
@GregHolmes GregHolmes changed the title feat: SDK regeneration 2026-06-15 — V2 language_hints, profanity filter, word timings, diarize_model feat(regen): V2 language_hints, profanity filter, word timings, diarize_model Jun 15, 2026
@GregHolmes GregHolmes changed the title feat(regen): V2 language_hints, profanity filter, word timings, diarize_model feat(regen): v2 language_hints, profanity filter, word timings, diarize_model Jun 15, 2026
@GregHolmes GregHolmes self-assigned this Jun 18, 2026
GregHolmes and others added 10 commits June 18, 2026 14:19
Reconcile hand-maintained patches against Fern's 2026-06-15 output (dc04625):

Re-applied (still needed):
- socket clients (agent/listen-v1/listen-v2/speak): broad except, optional
  control-message params, _sanitize_numeric_types (agent), listen/v2
  send_configure typing.Any shim
- core/query_encoder.py: bool -> lowercase coercion
- Agent Settings 2026-05-05 backward-compat patches (types+requests)
- legacy alias re-exports in package __init__ files
- tests/wire legacy CreateKeyV1RequestOneParams coverage

Dropped (generator caught up):
- agent_v1settings_audio_output.container: adopt generated
  AgentV1SettingsAudioOutputContainer enum (Union[Literal, Any] is
  runtime-non-breaking); removed from .fernignore

New compat shims (generator regression):
- recreate top-level DeepgramListenProviderV2LanguageHint/...Params that the
  regen removed; required by the *V2LanguageHint listen-provider aliases.
  Frozen in .fernignore.

Kept generator additions: ListenV2ProfanityFilter re-exports.

Validation: ruff clean, mypy clean (802 files), pytest 225 passed / 1 skipped.
Re-prepare for another regeneration on this branch (no new branch/PR). Swapped 19 temporarily-frozen files to .bak and repointed .fernignore.
Second regeneration on this branch. Re-applied all 19 manual patches (socket clients, agent settings backward-compat, query_encoder bool coercion, legacy alias re-exports across 7 __init__.py, create-key wire test). Restored .fernignore originals; removed .bak files.

Notable generator change (Fern-owned): ListenV2TurnInfoWordsItem start/end are now required float (was Optional[float]) — Flux word-timestamp parity, same as the Java SDK.

Verified: mypy src/ clean (802 files), pytest 225 passed / 1 skipped.
Fern removed ListenV2CloseStreamType in the 2026-06-15 regen (docs #946 narrowed the
v2 CloseStream type enum). The original generated type wrongly allowed
Union[Literal['Finalize','CloseStream','KeepAlive'], Any] — v2 copied v1's
control-message enum, but a CloseStream message's type can only ever be 'CloseStream'.

Recreate it by hand as the corrected Literal['CloseStream'] and re-export from the
three listen __init__.py files so the public import path keeps working without
resurrecting the invalid values. Freeze the shim + the 3 __init__ files in .fernignore
and document in AGENTS.md. This removes the only hard breaking change in the regen, so
the PR stays a minor (feat) bump.

Verified: mypy src/ clean (803 files), pytest 225 passed / 1 skipped, all 3 import
paths resolve to the narrowed Literal.
The 2026-06-15 regen renamed the provider field language_hint -> language_hints
(matching the API, which uses language_hints and rejects unknown fields). Removing
the singular name is a source-level break for existing callers, so add a
model_validator(mode='before') / root_validator(pre=True) on the three provider
models that remaps a legacy language_hint= kwarg (str or list) to language_hints
and drops the dead singular key. Freeze the three files in .fernignore so the
validator survives future regens; remove when the alias is retired in a major.
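
The remapping performed by that before-validator can be sketched as a plain function over the raw input dict. This is an illustration rather than the SDK code: remap_language_hint is a hypothetical name, and the handling of None, empty values, and an explicit plural is an assumption modeled on the behaviors this PR's tests are described as asserting.

```python
from typing import Any, Dict


def remap_language_hint(data: Dict[str, Any]) -> Dict[str, Any]:
    """Map a legacy language_hint value onto language_hints and drop the old key."""
    out = dict(data)
    legacy = out.pop("language_hint", None)
    if not legacy:
        # None or empty: nothing to map, the singular key is simply dropped.
        return out
    if out.get("language_hints") is None:
        # A bare string becomes a one-item list; a list passes through.
        out["language_hints"] = [legacy] if isinstance(legacy, str) else list(legacy)
    # An explicit language_hints value, when present, takes precedence.
    return out


print(remap_language_hint({"language_hint": "en"}))
print(remap_language_hint({"language_hint": ["en", "es"]}))
```

In a pydantic model this logic would sit inside the before-validator so it runs on the raw kwargs prior to field validation.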
@GregHolmes GregHolmes force-pushed the gh/sdk-gen-2026-06-15 branch from 55fa132 to 22face6 June 18, 2026 13:23
fern-api Bot and others added 7 commits June 18, 2026 13:25
The before-validator fixed the runtime drop but language_hint was not a
declared field, so language_hint= still failed type-checking (pyright /
mypy+pydantic plugin reported 'No parameter named language_hint').

Re-add language_hint as a deprecated Optional[Union[str, List[str]]] field
with exclude=True on the three V2 provider models, so legacy call sites
type-check. The validator still remaps its value to language_hints, and
exclude=True keeps the field off the wire (the API rejects unknown fields).
Verified: pyright clean + maps + never serialized, on pydantic v1 and v2.
Matches the JS SDK's deprecated-field approach.
Adds regression coverage for the regen shims/constraints, which had none:

- tests/custom/test_language_hint_compat.py: exhaustive language_hint ->
  language_hints mapping across all 5 surfaces (3 models + 2 aliases) and the
  discriminated unions, via kwargs and dict construction. Asserts str->[str],
  list passthrough, plural unchanged, explicit-plural precedence, None/empty
  no-op, v1-member unaffected, and that language_hint never serializes.
- tests/custom/test_listen_v2_regen_constraints.py: word start/end optional,
  CloseStream type shim = Literal['CloseStream'], v2 profanity_filter literals.
- tests/typecheck/compat_aliases.py: type-level assertions that the deprecated
  language_hint kwarg (str/list) still type-checks.
- ci.yml: run 'mypy tests/typecheck' so those type assertions actually gate
  (previously written but never type-checked in CI).

Verified: 50 pytest pass; mypy tests/typecheck clean.
The prior tests exercised language_hints only as the back-compat *target*.
Add dedicated forward-feature coverage for language_hints as the real field:

- ListenV2Configure (Flux STT reconfigure, client-sent) — previously zero
  coverage; single + multiple codes, kwargs + dict, asserting it serializes
  onto the wire (contrast with the excluded language_hint shim).
- DeepgramListenProviderV2 — multi-code language_hints serializes + dict parse.

6 tests, all passing.
…figure)

Adds the missing automated coverage for the hand-maintained socket-client
patches frozen in .fernignore (previously only exercised in tests/manual):

- _sanitize_numeric_types: whole-float->int, recursion, passthrough, and that
  agent _send_model pipes payloads through it (sample_rate 16000.0 -> 16000).
- Optional message param on no-payload control sends: send_close_stream() and
  send_keep_alive() callable with no arg emit the correct default control type.
- listen/v2 send_configure(typing.Any): raw passthrough sent verbatim.

Driven via a fake websocket capturing the .send() payload. 9 tests, passing.
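
The behaviors listed for _sanitize_numeric_types (whole-float to int, recursion, passthrough) can be approximated as follows. A sketch only: the function below is a hypothetical re-implementation for illustration, not the patched SDK helper.

```python
from typing import Any


def sanitize_numeric_types(value: Any) -> Any:
    """Convert whole-number floats to ints, recursing into dicts and lists."""
    if isinstance(value, float) and value.is_integer():
        return int(value)
    if isinstance(value, dict):
        return {key: sanitize_numeric_types(item) for key, item in value.items()}
    if isinstance(value, list):
        return [sanitize_numeric_types(item) for item in value]
    return value  # strings, bools, ints, and fractional floats pass through


payload = sanitize_numeric_types({"audio": {"sample_rate": 16000.0, "gain": 0.5}})
print(payload)
```

Fractional floats are left untouched, so only values that are numerically whole change type.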
The new 'mypy tests/typecheck' CI step ran the compat_aliases type-check across
the full 3.10-3.13 matrix and exposed a latent bug: typing.assert_type is 3.11+,
so compile (3.10) failed. Use the typing_extensions backport (already a dep),
which works on all supported versions. Verified with mypy --python-version 3.10.