feat(regen): v2 language_hints, profanity filter, word timings, diarize_model #730
Open
GregHolmes wants to merge 17 commits into
Reconcile hand-maintained patches against Fern's 2026-06-15 output (dc04625):

Re-applied (still needed):
- socket clients (agent/listen-v1/listen-v2/speak): broad except, optional control-message params, _sanitize_numeric_types (agent), listen/v2 send_configure typing.Any shim
- core/query_encoder.py: bool -> lowercase coercion
- Agent Settings 2026-05-05 backward-compat patches (types+requests)
- legacy alias re-exports in package __init__ files
- tests/wire legacy CreateKeyV1RequestOneParams coverage

Dropped (generator caught up):
- agent_v1settings_audio_output.container: adopt generated AgentV1SettingsAudioOutputContainer enum (Union[Literal, Any] is runtime-non-breaking); removed from .fernignore

New compat shims (generator regression):
- recreate top-level DeepgramListenProviderV2LanguageHint/...Params that the regen removed; required by the *V2LanguageHint listen-provider aliases. Frozen in .fernignore.

Kept generator additions: ListenV2ProfanityFilter re-exports.

Validation: ruff clean, mypy clean (802 files), pytest 225 passed / 1 skipped.
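The bool -> lowercase coercion mentioned for core/query_encoder.py can be illustrated with a small standalone sketch. This is not the SDK's encoder: encode_query is a hypothetical helper, and the parameter names in the usage note are only examples.

```python
from urllib.parse import urlencode


def encode_query(params):
    """Hypothetical helper: render bools as "true"/"false" in a query string."""
    pairs = []
    for key, value in params.items():
        # str(True) would yield "True"; coerce bools to lowercase first.
        if isinstance(value, bool):
            value = "true" if value else "false"
        pairs.append((key, str(value)))
    return urlencode(pairs)
```

For example, encode_query({"interim_results": True, "sample_rate": 16000}) yields a query string carrying interim_results=true rather than interim_results=True.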
Re-prepare for another regeneration on this branch (no new branch/PR). Swapped 19 temporarily-frozen files to .bak and repointed .fernignore.
Second regeneration on this branch. Re-applied all 19 manual patches (socket clients, agent settings backward-compat, query_encoder bool coercion, legacy alias re-exports across 7 __init__.py, create-key wire test). Restored .fernignore originals; removed .bak files. Notable generator change (Fern-owned): ListenV2TurnInfoWordsItem start/end are now required float (was Optional[float]) — Flux word-timestamp parity, same as the Java SDK. Verified: mypy src/ clean (802 files), pytest 225 passed / 1 skipped.
Fern removed ListenV2CloseStreamType in the 2026-06-15 regen (docs #946 narrowed the v2 CloseStream type enum). The original generated type wrongly allowed Union[Literal['Finalize','CloseStream','KeepAlive'], Any] — v2 copied v1's control-message enum, but a CloseStream message's type can only ever be 'CloseStream'. Recreate it by hand as the corrected Literal['CloseStream'] and re-export from the three listen __init__.py files so the public import path keeps working without resurrecting the invalid values. Freeze the shim + the 3 __init__ files in .fernignore and document in AGENTS.md. This removes the only hard breaking change in the regen, so the PR stays a minor (feat) bump. Verified: mypy src/ clean (803 files), pytest 225 passed / 1 skipped, all 3 import paths resolve to the narrowed Literal.
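A minimal sketch of what the narrowed shim amounts to, assuming it is a plain typing.Literal alias as described above. The is_close_stream_type helper is hypothetical and only demonstrates that the v1 control-message values are no longer admitted.

```python
from typing import Literal, get_args

# Narrowed shim: a CloseStream message's type can only be "CloseStream".
ListenV2CloseStreamType = Literal["CloseStream"]


def is_close_stream_type(value: object) -> bool:
    """Hypothetical runtime check against the Literal's allowed values."""
    return value in get_args(ListenV2CloseStreamType)
```

With this alias a type checker rejects "Finalize" or "KeepAlive" where ListenV2CloseStreamType is expected, while existing imports of the name keep resolving.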
The 2026-06-15 regen renamed the provider field language_hint -> language_hints (matching the API, which uses language_hints and rejects unknown fields). Removing the singular name is a source-level break for existing callers, so add a model_validator(mode='before') / root_validator(pre=True) on the three provider models that remaps a legacy language_hint= kwarg (str or list) to language_hints and drops the dead singular key. Freeze the three files in .fernignore so the validator survives future regens; remove when the alias is retired in a major.
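The remapping step can be sketched as a plain function of the kind a before-validator would call on the raw input. This is an illustration under assumptions, not the code frozen in .fernignore: remap_language_hint is a hypothetical name, and the rule that an explicit language_hints wins over the legacy key is assumed here.

```python
from typing import Any


def remap_language_hint(data: Any) -> Any:
    """Sketch of a 'before' hook: move legacy language_hint into language_hints."""
    if not isinstance(data, dict) or "language_hint" not in data:
        return data  # nothing to remap; pass through untouched
    remapped = dict(data)  # copy so the caller's dict is not mutated
    legacy = remapped.pop("language_hint")  # always drop the dead singular key
    if not legacy:
        return remapped  # None or empty: no-op apart from dropping the key
    if remapped.get("language_hints") is None:
        # A str becomes a one-element list; a list is copied through.
        remapped["language_hints"] = [legacy] if isinstance(legacy, str) else list(legacy)
    return remapped
```

Running the hook before field validation means the model only ever sees language_hints, so the singular key cannot reach the wire.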
55fa132 to 22face6
The before-validator fixed the runtime drop but language_hint was not a declared field, so language_hint= still failed type-checking (pyright / mypy+pydantic plugin reported 'No parameter named language_hint'). Re-add language_hint as a deprecated Optional[Union[str, List[str]]] field with exclude=True on the three V2 provider models, so legacy call sites type-check. The validator still remaps its value to language_hints, and exclude=True keeps the field off the wire (the API rejects unknown fields). Verified: pyright clean + maps + never serialized, on pydantic v1 and v2. Matches the JS SDK's deprecated-field approach.
Adds regression coverage for the regen shims/constraints, which had none:
- tests/custom/test_language_hint_compat.py: exhaustive language_hint -> language_hints mapping across all 5 surfaces (3 models + 2 aliases) and the discriminated unions, via kwargs and dict construction. Asserts str->[str], list passthrough, plural unchanged, explicit-plural precedence, None/empty no-op, v1-member unaffected, and that language_hint never serializes.
- tests/custom/test_listen_v2_regen_constraints.py: word start/end optional, CloseStream type shim = Literal['CloseStream'], v2 profanity_filter literals.
- tests/typecheck/compat_aliases.py: type-level assertions that the deprecated language_hint kwarg (str/list) still type-checks.
- ci.yml: run 'mypy tests/typecheck' so those type assertions actually gate (previously written but never type-checked in CI).

Verified: 50 pytest pass; mypy tests/typecheck clean.
The prior tests exercised language_hints only as the back-compat *target*. Add dedicated forward-feature coverage for language_hints as the real field:
- ListenV2Configure (Flux STT reconfigure, client-sent): previously zero coverage; single + multiple codes, kwargs + dict, asserting it serializes onto the wire (contrast with the excluded language_hint shim).
- DeepgramListenProviderV2: multi-code language_hints serializes + dict parse.

6 tests, all passing.
…figure)

Adds the missing automated coverage for the hand-maintained socket-client patches frozen in .fernignore (previously only exercised in tests/manual):
- _sanitize_numeric_types: whole-float->int, recursion, passthrough, and that agent _send_model pipes payloads through it (sample_rate 16000.0 -> 16000).
- Optional message param on no-payload control sends: send_close_stream() and send_keep_alive() callable with no arg emit the correct default control type.
- listen/v2 send_configure(typing.Any): raw passthrough sent verbatim.

Driven via a fake websocket capturing the .send() payload. 9 tests, passing.
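The three behaviours listed for _sanitize_numeric_types (whole-float to int, recursion, passthrough) can be sketched as follows. This is an assumed reconstruction from the description, not the SDK's implementation; the function name without the leading underscore marks it as illustrative.

```python
from typing import Any


def sanitize_numeric_types(value: Any) -> Any:
    """Illustrative sketch: convert whole floats to ints, recursing into containers."""
    if isinstance(value, float) and value.is_integer():
        return int(value)  # 16000.0 -> 16000
    if isinstance(value, dict):
        return {key: sanitize_numeric_types(item) for key, item in value.items()}
    if isinstance(value, list):
        return [sanitize_numeric_types(item) for item in value]
    return value  # strings, ints, fractional floats, None pass through unchanged
```

A payload such as {"audio": {"sample_rate": 16000.0}} would come out with sample_rate as the int 16000, while a fractional value like 0.5 is left alone.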
The new 'mypy tests/typecheck' CI step ran the compat_aliases type-check across the full 3.10-3.13 matrix and exposed a latent bug: typing.assert_type is 3.11+, so compile (3.10) failed. Use the typing_extensions backport (already a dep), which works on all supported versions. Verified with mypy --python-version 3.10.
SDK regeneration 2026-06-15

Fern generator output (commit dc04625) reconciled against hand-maintained patches.

New / additive (non-breaking)
- ListenV2ProfanityFilter: new type, re-exported from the package root + types.
- ListenV2TurnInfoWordsItem.start / .end: word-level timing on streaming turn info, Optional[float] (default None). The API models these as optional (stem: Option<f32> with skip_serializing_if), so the server may omit them on some words/paths and the SDK matches that contract. As a response model parsed from the wire, this is additive and non-breaking: readers just see new optional fields, and no construction changes are required.
- diarize_model param + DiarizeModel type on listen/v1. The legacy diarize param is now deprecated but still present, so there is no break.

Corrected (fix, not a break)
- language_hint → language_hints (DeepgramListenProviderV2, its request TypedDict, and the agent …ListenProviderV2 aliases). The singular language_hint (Union[str, List[str]]) was never accepted by the API (the SDK serialized a field the server discarded), so this corrects a non-functional field rather than breaking a working one. Use a list: language_hints=["en", "es"].

Re-applied manual patches (still needed)
- Socket clients (agent/v1, listen/v1, listen/v2, speak/v1): broad except Exception, optional control-message params, _sanitize_numeric_types (agent), and the listen/v2 send_configure typing.Any / raw-_send shim.
- core/query_encoder.py: Python bool → lowercase "true"/"false" coercion for websocket query strings.
- Agent Settings 2026-05-05 backward-compat patches (agent_v1settings, agent_v1settings_agent, agent_v1settings_agent_context, types + requests): callable AgentV1SettingsAgent, legacy messages= kwarg, .messages property, nested-context migration.
- Legacy alias re-exports in package __init__.py files.
- tests/wire/test_manage_v1_projects_keys.py: legacy CreateKeyV1RequestOneParams wire coverage.

Dropped (generator caught up)
- agent_v1settings_audio_output.container: adopt the generated AgentV1SettingsAudioOutputContainer enum. It's Union[Literal["none","wav","ogg"], Any], so arbitrary strings still validate (runtime-non-breaking). Removed from .fernignore.

New compat shims (generator regression)
- Fern removed ListenV2CloseStreamType (docs #946 narrowed the v2 CloseStream type enum). The original generated type wrongly allowed Union[Literal["Finalize","CloseStream","KeepAlive"], Any]: v2 had copied v1's control-message enum, but a CloseStream message's type can only ever be "CloseStream". Recreated by hand as the corrected Literal["CloseStream"], re-exported from the three listen __init__.py files, and frozen in .fernignore, preserving the public import path without resurrecting the invalid values. This removes the only hard breaking change, so the PR stays a minor (feat) bump.
- Fern removed the top-level DeepgramListenProviderV2LanguageHint / ...Params type (Union[str, List[str]]); the → diarize_model.py git "rename" is a Fern false-match (the unrelated DiarizeModel). Its removal broke two permanently-frozen *V2LanguageHint listen-provider alias files. Recreated both modules by hand and froze them in .fernignore so the public import paths keep working.

Validation
- mypy src/: clean (802 files), CI gate
- pytest: 225 passed, 1 skipped, CI gate
- ruff check: src/deepgram clean; not run by CI. (Pre-existing I001 import-order nits remain in some hand-written examples/ and tests/manual/ files, untouched by this regen.)

All .bak files removed; .fernignore paths restored; AGENTS.md frozen-file list updated.

Second reconciliation pass: Fern run f6c1612 (origin 8dd6f48)

A follow-up generator run landed on this branch after the first reconciliation above. Re-ran the full review-regen flow against it.
Re-applied: all 25 frozen patches still needed (none dropped)

Every temporarily-frozen file's diff against its .bak consisted only of our hand-patches being stripped; the generator introduced no new content into them (all 10 __init__.py diffs were pure deletions of our legacy re-exports). All re-applied verbatim: agent_v1settings* files, 3 language_hint → language_hints validators, 10 package __init__.py re-exports, core/query_encoder.py, and the test_manage_v1_projects_keys.py wire test.

Generator changes flagged (judgment calls; kept our patches)
- listen/v2 send_configure: the generator now does emit real ListenV2Configure / ListenV2ConfigureSuccess types and wires them in (response Union included). Decision: keep the typing.Any / raw-_send shim (no public signature change); adopting the generated types is deferred as a separate deliberate change.
- except: the generator now narrows to (websockets.WebSocketException, JSONDecodeError) (added JSONDecodeError). Decision: keep broad except Exception, because custom transports can raise non-websocket exceptions.

Generator-owned change (auto-accepted, not frozen)
- ListenV2TurnInfoWordsItem.start / .end are Optional[float] (default None), matching the API contract (the upstream spec was corrected to drop them from required). The file isn't frozen, so Fern owns it; accepted as-is. The "New / additive" section above already reflects this final optional state.

Validation (refreshed)
- pytest: 237 passed, 1 skipped
- mypy src/deepgram: clean (804 files)
- ruff check src/deepgram: clean; the 25 re-applied files also clean. Pre-existing I001 import-order nits remain only in hand-written tests/custom/ + tests/manual/ files, untouched by this regen.

All .bak files removed; .fernignore paths restored to originals.

🤖 Generated with Claude Code