fix: make Kokoro TTS multilingual-ready and fix JSON parsing for podc…#1451
guangyang1206 wants to merge 1 commit
…ast generation
## Summary
Fixes two issues with podcast generation when using local/kokoro TTS:
1. **JSON parsing fails on multilingual content**: LLM-generated transcripts
   containing backslashes (common in Chinese/Asian text) caused
   `json.loads()` to raise `Invalid \escape` errors.
   - Fix: pre-escape backslashes and use `strict=False` in `json.loads()`
2. **Kokoro language code hardcoded to English**: `lang_code="a"`
   caused Chinese (and other non-English) podcast audio to speak
   "Chinese letter, Chinese letter..." instead of actual content.
   - Fix: read `KOKORO_LANG_CODE` from environment variable, default "a"
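
For reviewers who want to reproduce the first issue, here is a minimal sketch of one way to repair invalid escapes before parsing. It is not the code in this PR: `repair_invalid_escapes` and `parse_llm_json` are hypothetical names, and the regex approach is an assumption about how "pre-escape backslashes" could be done without corrupting escapes that are already valid. Note that `strict=False` on its own only permits raw control characters (such as literal newlines) inside strings; it does not accept an invalid escape like `\%`, which is why the repair step is needed.

```python
import json
import re

# Either a valid JSON escape (captured, left untouched) or a lone backslash.
# Valid JSON escapes: \" \\ \/ \b \f \n \r \t and \uXXXX.
_ESCAPE = re.compile(r'\\(u[0-9a-fA-F]{4}|["\\/bfnrt])|\\')


def repair_invalid_escapes(raw: str) -> str:
    """Double any backslash that does not start a valid JSON escape.

    Hypothetical helper for illustration only.
    """
    def _fix(match: re.Match) -> str:
        if match.group(1) is not None:
            return match.group(0)  # valid escape: keep as-is
        return "\\\\"              # lone backslash: escape it

    return _ESCAPE.sub(_fix, raw)


def parse_llm_json(raw: str) -> dict:
    """Parse LLM output leniently, repairing escapes only if needed.

    Hypothetical helper; assumes the payload is a JSON object.
    """
    try:
        # strict=False allows raw control characters inside strings.
        return json.loads(raw, strict=False)
    except json.JSONDecodeError:
        return json.loads(repair_invalid_escapes(raw), strict=False)


if __name__ == "__main__":
    broken = r'{"text": "50\% off"}'  # "\%" is not a valid JSON escape
    print(parse_llm_json(broken)["text"])
```

Trying the plain parse first means well-formed transcripts are never rewritten; the repair only runs when `json.loads()` has already failed.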
## Changes
- `surfsense_backend/app/agents/podcaster/nodes.py`:
  - Add backslash escaping before `json.loads()`
  - Pass `strict=False` to `json.loads()` for lenient parsing
  - Read `lang_code` from `os.getenv("KOKORO_LANG_CODE", "a")`
  - Document valid language codes in comments
## Usage
To generate Chinese podcasts, set in `.env`:
```
TTS_SERVICE=local/kokoro
KOKORO_LANG_CODE=z
```
Valid codes: `a` (American English), `b` (British English),
`z` (Chinese), `j` (Japanese), `k` (Korean), etc.
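
As a rough illustration of how the setting above is picked up, the sketch below reads the variable with the same default the PR describes. `kokoro_lang_code` is a hypothetical helper (the PR reads the variable inline with `os.getenv`), and the whitespace trimming, lowercasing, and blank-value fallback are assumptions added here, not behavior confirmed by the PR.

```python
import os


def kokoro_lang_code(default: str = "a") -> str:
    """Return the Kokoro language code from the environment.

    Hypothetical helper: falls back to `default` when KOKORO_LANG_CODE
    is unset or blank. Trimming and lowercasing are assumptions.
    """
    value = os.getenv("KOKORO_LANG_CODE", default).strip().lower()
    return value or default


if __name__ == "__main__":
    os.environ["KOKORO_LANG_CODE"] = "z"
    print(kokoro_lang_code())
```

With the `.env` above loaded, the helper returns `z`; with the variable unset it returns `a`, which preserves the previous English behavior.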
Fixes MODSetter#1440
@guangyang1206 Thanks for the quick fix here, but I want the LLM to select the lang code (as it generates the transcripts anyway), and if for some reason that fails then we should fall back to … Updated the main issue with some thoughts: #1440
## Original Issue
#1440
## What I Did
- **Made Kokoro TTS language configurable**: Added `KOKORO_LANG_CODE` and `KOKORO_DEFAULT_VOICE` environment variables so users don't need to modify source code for multilingual support.
- **Fixed hardcoded English default**: Changed `lang_code="a"` (American English) to use `app_config.KOKORO_LANG_CODE` (defaults to `"a"` for backward compatibility).
- **Improved JSON parsing for multilingual content**: Enhanced the fallback JSON parsing logic to handle escape characters in multilingual content (Chinese, Japanese, Korean, etc.) that were causing "Invalid \escape" errors.
## Why I Did It
**Problem**: Users generating Chinese podcasts with local/kokoro TTS had to manually edit source code to change `lang_code="a"` to `lang_code="z"` and install Chinese dependencies. This is not user-friendly for a self-hosted application.

**Solution**: Made Kokoro configurable through environment variables, following the principle of "configuration over code modification".

Anticipating reviewer questions:
## Changed Files
- `surfsense_backend/app/config/__init__.py`: Added `KOKORO_LANG_CODE` and `KOKORO_DEFAULT_VOICE` environment variables with defaults
- `surfsense_backend/app/agents/podcaster/nodes.py`:
  - `create_merged_podcast_audio()` to use configurable `lang_code` and `voice`
  - `create_podcast_transcript()` to handle escape characters in multilingual content
## Testing
`lang_code="a"`, empty `KOKORO_DEFAULT_VOICE` falls back to `get_voice_for_provider()`).
## Notes for Reviewers
- `KOKORO_LANG_CODE` for their language.
- `strict=False` parsing.
## Related Issues
Fixes #1440
## Environment Variables Added
## Dependencies
No new dependencies added. Users need to install language-specific Misaki dependencies separately (as documented in the issue).
## High-level PR Summary
This PR fixes two issues with podcast generation when using Kokoro TTS: it makes the language code configurable via the `KOKORO_LANG_CODE` environment variable (previously hardcoded to English) to support multilingual podcasts, and it improves JSON parsing to handle backslash escape characters that commonly appear in LLM-generated transcripts for Chinese and other Asian languages. The changes default to American English (`"a"`) for backward compatibility.

⏱️ Estimated Review Time: 5-15 minutes

💡 Review Order Suggestion
1. `surfsense_backend/app/agents/podcaster/nodes.py`