A Manim-based video generator for explaining Targeted Maximum Likelihood Estimation (TMLE) with Chinese TTS narration.
# Install dependencies
make install
# Build the video
make
# Preview
make preview

TMLE-explain/
├── src/tmle_explain/
│   ├── scenes.py      # Manim scene definitions
│   └── narration.py   # TTS narration scripts
├── scripts/
│   └── build.py       # Build orchestration
├── audio/             # Generated TTS audio files
├── media/             # Rendered Manim videos
├── output/            # Combined video+audio clips
├── final.mp4          # Final output video
├── Makefile
└── pyproject.toml

Edit Makefile or use manim flags:
- -ql (480p15): Fast preview
- -qm (720p30): Medium quality
- -qh (1080p60): High quality (default)
- -qk (4K60): 4K quality
In src/tmle_explain/narration.py:
voice = "zh-TW-HsiaoChenNeural" # Taiwan Mandarin female
# Other options:
# "zh-CN-XiaoxiaoNeural" # Mainland Mandarin female
# "zh-CN-YunxiNeural" # Mainland Mandarin male
# "zh-TW-YunJheNeural" # Taiwan Mandarin male

Create a visual explanation video for [TOPIC] using Manim with Chinese TTS narration.
Requirements:
1. Use Manim for visualization (scenes.py)
2. Use edge-tts for Chinese narration (narration.py)
3. Animation should pause to let narration finish
4. Output 1080p video
5. Use ffmpeg to combine video and audio
Scene structure:
- Scene01: Introduction
- Scene02-N: Main content (one concept per scene)
- SceneN: Summary
For each scene, provide:
- Manim animations
- Corresponding Chinese narration text
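
For the edge-tts requirement, a minimal sketch of generating one audio file per scene is given here. The `NARRATION` dictionary, its sample text, the `audio_path` helper, and the `.mp3` naming are assumptions made for illustration, not the contents of the project's narration.py. The `edge_tts` import is kept inside the function so the module can be loaded without the package installed.

```python
from pathlib import Path

VOICE = "zh-TW-HsiaoChenNeural"  # voice name taken from this README
AUDIO_DIR = Path("audio")        # matches the audio/ directory in the tree

# Hypothetical narration table: scene class name -> Chinese narration text.
NARRATION = {
    "Scene01_Intro": "歡迎觀看本影片。",  # "Welcome to this video."
}


def audio_path(scene_name):
    """Return the assumed output path for a scene's narration audio."""
    return AUDIO_DIR / f"{scene_name}.mp3"


async def synthesize(scene_name, text, voice=VOICE):
    """Generate one narration file with edge-tts (requires network access)."""
    import edge_tts  # third-party: pip install edge-tts

    out = audio_path(scene_name)
    out.parent.mkdir(parents=True, exist_ok=True)
    await edge_tts.Communicate(text, voice).save(str(out))
    return out


async def synthesize_all(narration=NARRATION):
    """Generate audio for every scene, one after another."""
    return [await synthesize(name, text) for name, text in narration.items()]
```

To generate all files, run `asyncio.run(synthesize_all())` after `import asyncio`.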
The Manim frame is 14.2 x 8 units (aspect ratio 16:9). Always leave margins:
# Safe area: keep content within 80% of frame
# Title: top edge with buff=0.5
title.to_edge(UP)
# Content: below title with buff=0.4-0.6
content.next_to(title, DOWN, buff=0.5)

| Element | Font Size | Use Case |
|---|---|---|
| Title | 44-56 | Scene titles |
| Section Header | 28-36 | Section titles |
| Body Text | 22-28 | Main content |
| Small Text | 18-22 | Labels, notes |
| Minimum | 18 | Anything smaller is hard to read |
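
The ranges in the table can be kept in one place and checked before rendering. This is only a sketch: the `FONT_SIZE_RANGES` dictionary, its role names, and `check_font_size` are hypothetical helpers, not part of scenes.py.

```python
# Font size ranges copied from the table above; the role names are illustrative.
FONT_SIZE_RANGES = {
    "title": (44, 56),
    "section_header": (28, 36),
    "body": (22, 28),
    "small": (18, 22),
}
MIN_FONT_SIZE = 18  # anything smaller is hard to read


def check_font_size(role, size):
    """Return True if size is readable and within the range for the role."""
    if size < MIN_FONT_SIZE:
        return False
    low, high = FONT_SIZE_RANGES[role]
    return low <= size <= high
```

For example, `check_font_size("body", 24)` passes, while `check_font_size("small", 16)` fails because 16 is below the minimum.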
Problem: Content overflows bottom of screen.
Solution: Calculate total height before placing elements.
# Bad: content goes off-screen
problems = VGroup(...).arrange(DOWN, buff=0.4)
solution = VGroup(...)
solution.next_to(problems, DOWN, buff=0.6) # May overflow!
# Good: use smaller buffs and font sizes
problems = VGroup(...).arrange(DOWN, buff=0.25)
solution = VGroup(...)
solution.next_to(problems, DOWN, buff=0.35)

Rule of thumb for vertical content:
- Title: ~1 unit
- Each text line: ~0.5-0.8 units (depending on font size)
- Buffs: 0.2-0.4 between items
- Total available: ~6 units (leaving margins)
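
The rule of thumb can be turned into a quick budget check before a scene is laid out. The sketch below works on estimated heights only, not on measured Mobject sizes, and the function names and the default buff are assumptions.

```python
# Rough vertical budget check based on the rule of thumb above.
# All values are estimates in Manim units.
TITLE_HEIGHT = 1.0       # "Title: ~1 unit"
AVAILABLE_HEIGHT = 6.0   # "Total available: ~6 units"


def estimate_height(line_heights, buff=0.3, title=True):
    """Estimate the stacked height of an optional title plus text lines."""
    lines = list(line_heights)
    items = len(lines) + (1 if title else 0)
    total = sum(lines)
    # One buff between each pair of vertically adjacent items.
    total += buff * max(items - 1, 0)
    if title:
        total += TITLE_HEIGHT
    return total


def fits(line_heights, buff=0.3, title=True):
    """Return True if the estimated height stays within the available space."""
    return estimate_height(line_heights, buff, title) <= AVAILABLE_HEIGHT
```

Five lines of about 0.6 units with buff 0.3 come to roughly 5.5 units and fit; six lines of 0.8 units with buff 0.4 come to roughly 8.2 units and should be split into two scenes.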
Problem: Ending a scene with FadeOut leaves a black final frame, and that black frame is what gets held when the video is extended to match the audio.
Bad:
self.wait(5)
self.play(FadeOut(all_content)) # Ends with black frame!
Good:
# Keep content visible, just wait
self.wait(10) # Content stays on screen

Manim's Text does not wrap long lines automatically. Keep text short or split it manually:
# Bad: long line may overflow
# (the string means "This is a long piece of text that may exceed the frame boundary")
Text("這是一段很長的文字可能會超出畫面邊界", font_size=28)
# Good: split into multiple lines
VGroup(
    Text("這是一段很長的文字", font_size=28),
    Text("可能會超出畫面邊界", font_size=28),
).arrange(DOWN, buff=0.2)

Tables need careful scaling:
table = Table(data).scale(0.5) # Start at 0.5, adjust as needed
# For 5+ rows, use scale 0.4-0.5
# For 3-4 rows, use scale 0.5-0.6
# For 2 rows, use scale 0.6-0.7

Always specify Chinese font:
Text("中文內容", font="PingFang TC", font_size=28)
# Other options: "Noto Sans TC", "Microsoft YaHei"

Use ffmpeg to freeze last frame when audio is longer:
if audio_duration > video_duration:
    # Extra time to hold the last frame, plus a 0.5 s safety margin
    padding = audio_duration - video_duration + 0.5
    # Use the tpad filter to clone the last frame for `padding` seconds
    video_filter = f"[0:v]tpad=stop_mode=clone:stop_duration={padding}[v]"

Get audio duration first, then adjust scene timing:
# In scenes.py, add wait time based on audio duration
# Scene01_Intro: audio=30.2s
# Calculate: total_animation_time + wait_time >= audio_duration
self.wait(10) # Adjust to match audio

If you see dvisvgm errors, either:
- Install LaTeX: brew install --cask mactex
- Use Text() instead of MathTex() for formulas
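
Whether MathTex() is usable can be checked up front, since it needs both the `latex` and `dvisvgm` executables on PATH. The helper below is a sketch using only the standard library; `latex_available` is a hypothetical name, and the injectable `which` parameter is there only to make the check easy to test.

```python
import shutil


def latex_available(which=shutil.which):
    """Return True if both `latex` and `dvisvgm` are found on PATH."""
    return all(which(tool) is not None for tool in ("latex", "dvisvgm"))


# Possible use in a scene: fall back to Text() when LaTeX is missing.
formula_class_name = "MathTex" if latex_available() else "Text"
```

Running the check once at import time lets a scene choose between MathTex() and Text() without failing halfway through a render.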
Problem: Black frame at the end of a scene.
Solution: Remove FadeOut() at scene end. Keep content visible.
Problem: Content overflows the screen.
Solution:
- Reduce font sizes (minimum 18-20)
- Reduce buffs (0.2-0.3)
- Shorten text
- Split into multiple scenes
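
Because Text() does not wrap long lines, the manual splitting recommended earlier can be automated for Chinese strings, which have no spaces to break on. This is a minimal sketch: `split_text` and the `max_chars` limit are assumptions, and a suitable limit depends on font size and frame width.

```python
def split_text(text, max_chars=12):
    """Split text into consecutive chunks of at most max_chars characters."""
    if max_chars < 1:
        raise ValueError("max_chars must be at least 1")
    return [text[i:i + max_chars] for i in range(0, len(text), max_chars)]


# The 18-character example string, split into two 9-character lines.
lines = split_text("這是一段很長的文字可能會超出畫面邊界", max_chars=9)
```

Each chunk can then be passed to Text(...) and stacked with VGroup(...).arrange(DOWN, buff=0.2). A fixed character count ignores punctuation and word boundaries, so long narration lines may still need manual adjustment.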
Problem: Narration is cut off at the end of a clip.
Solution: Ensure video duration >= audio duration. The build script auto-extends the video using the tpad filter.
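
The extension step can be sketched as three small functions that compute the padding and build the command lines. This is an illustration under stated assumptions, not the contents of scripts/build.py: the function names, the libx264/aac codec choices, and the three-decimal rounding are all hypothetical. The ffprobe and ffmpeg options themselves are standard.

```python
def tpad_padding(audio_duration, video_duration, margin=0.5):
    """Seconds of frozen last frame needed; 0.0 if the video is long enough."""
    if audio_duration > video_duration:
        return audio_duration - video_duration + margin
    return 0.0


def ffprobe_duration_cmd(path):
    """Build an ffprobe command that prints a media file's duration in seconds."""
    return [
        "ffprobe", "-v", "error",
        "-show_entries", "format=duration",
        "-of", "default=noprint_wrappers=1:nokey=1",
        path,
    ]


def combine_cmd(video, audio, output, padding):
    """Build an ffmpeg command that muxes video and audio, padding if needed."""
    cmd = ["ffmpeg", "-y", "-i", video, "-i", audio]
    if padding > 0:
        # Clone the last video frame for `padding` extra seconds.
        graph = f"[0:v]tpad=stop_mode=clone:stop_duration={padding:.3f}[v]"
        cmd += ["-filter_complex", graph, "-map", "[v]"]
    else:
        cmd += ["-map", "0:v"]
    # Codec choices are assumptions; adjust to the project's needs.
    cmd += ["-map", "1:a", "-c:v", "libx264", "-c:a", "aac", output]
    return cmd
```

Each command list can be passed to `subprocess.run(cmd, check=True)`. With a 30.2 s narration and a 25 s render, `tpad_padding(30.2, 25.0)` gives about 5.7 s of held frame.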
MIT