Speech segments bypass cache — VEED videos re-render every run #161

@SecurityQQ

Description

Problem

Running speech-segments.tsx or speech-segments-voiceover.tsx multiple times with the same text/voice/model always re-generates VEED videos from scratch. The cache never hits for the Video elements that depend on speech segments.

Root Cause

Three levels of non-determinism prevent cache hits:

1. Speech generation itself is NOT cached (root cause)

experimental_generateSpeech from the Vercel ai SDK does not accept a cacheKey parameter — it's silently ignored. Every run calls ElevenLabs from scratch. Neural TTS is inherently non-deterministic: same text/voice/model produces different audio bytes each time. This means different alignment timings → different segment boundaries → different sliced audio bytes on every run.

2. Segment audio bytes are embedded in the Video cache key

computeCacheKey for a Video element serializes the entire prompt prop via serializeValue(). When prompt.audio is a ResolvedElement (segment), serializeValue recursively walks Object.entries() and reaches meta.file._data — the raw Uint8Array — which gets base64-encoded into the cache key string. Any byte-level difference in the audio produces a different key.
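A minimal illustration of the failure mode. This `serializeValue` is a simplified stand-in for the real implementation, not the project's code, but it shows how a recursive `Object.entries()` walk that base64-encodes any `Uint8Array` it reaches makes the key sensitive to every byte of the audio:

```typescript
// Simplified stand-in for serializeValue: recursively walks objects and
// base64-encodes any Uint8Array it reaches, mirroring how the real
// implementation ends up embedding meta.file._data in the cache key.
function serializeValue(value: unknown): string {
  if (value instanceof Uint8Array) {
    return Buffer.from(value).toString("base64");
  }
  if (value !== null && typeof value === "object") {
    return Object.entries(value as Record<string, unknown>)
      .map(([k, v]) => `${k}:${serializeValue(v)}`)
      .join(",");
  }
  return JSON.stringify(value);
}

// Two TTS runs of the same text: a single differing byte changes the key.
const runA = { prompt: { audio: { meta: { file: { _data: new Uint8Array([1, 2, 3]) } } } } };
const runB = { prompt: { audio: { meta: { file: { _data: new Uint8Array([1, 2, 4]) } } } } };

console.log(serializeValue(runA) === serializeValue(runB)); // false -> cache miss
```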

3. Non-deterministic floats in the cache key

The serialized segment also includes meta.duration (ffprobe float), meta.words (ElevenLabs timing floats), and start/end properties — all of which change per API call.

Suggested Fix

Fix A — Cache speech generation. Wrap the ElevenLabs /with-timestamps fetch with withCache, keyed on text + voice + model. This prevents redundant API calls and produces stable audio bytes on cache hit.
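A sketch of the Fix A shape. The names here (`generateSpeechCached`, the in-memory `Map`, the `generate` callback standing in for the ElevenLabs `/with-timestamps` fetch) are hypothetical, not the project's actual `withCache` API:

```typescript
import { createHash } from "node:crypto";

// Hypothetical sketch: key speech generation on the semantic inputs
// (text + voice + model), never on the resulting audio bytes.
type SpeechResult = { audio: Uint8Array; alignment: unknown };

const speechCache = new Map<string, SpeechResult>(); // stand-in for the real disk cache

function speechCacheKey(text: string, voice: string, model: string): string {
  return createHash("sha256").update(JSON.stringify([text, voice, model])).digest("hex");
}

async function generateSpeechCached(
  text: string,
  voice: string,
  model: string,
  generate: () => Promise<SpeechResult>, // wraps the ElevenLabs /with-timestamps fetch
): Promise<SpeechResult> {
  const key = speechCacheKey(text, voice, model);
  const hit = speechCache.get(key);
  if (hit) return hit; // cache hit: bit-identical bytes, stable downstream keys
  const result = await generate();
  speechCache.set(key, result);
  return result;
}
```

On a hit this returns the exact bytes of the first run, so every downstream value derived from them (alignment timings, segment boundaries, sliced audio) is stable as well.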

Fix B — Exclude audio bytes from Video cache key. When serializeValue encounters a ResolvedElement inside a prop (e.g., prompt.audio), it should use the element's semantic identity (type + props + children text, via computeCacheKey) rather than serializing the physical bytes from meta.file._data. This makes the Video cache key depend on what was said, not the exact bytes of the audio.
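A sketch of the Fix B idea. The `ResolvedElement` shape, the `__resolved` marker, and this `computeCacheKey` are illustrative stand-ins for the real types, but the short-circuit is the point: detect a resolved element before recursing and key on its semantic identity instead of its payload:

```typescript
// Illustrative stand-in: a resolved element carries both semantic identity
// (type/props) and a physical payload (meta.file._data plus timing floats).
type ResolvedElement = {
  __resolved: true;
  type: string;
  props: Record<string, unknown>;
  meta: { file: { _data: Uint8Array }; duration: number };
};

function isResolvedElement(v: unknown): v is ResolvedElement {
  return typeof v === "object" && v !== null && (v as { __resolved?: unknown }).__resolved === true;
}

// Semantic identity: type + props, ignoring bytes and per-run floats.
function computeCacheKey(el: ResolvedElement): string {
  return `${el.type}:${JSON.stringify(el.props)}`;
}

function serializeValue(value: unknown): string {
  // Key change: short-circuit on resolved elements rather than walking into
  // meta.file._data and base64-encoding non-deterministic audio bytes.
  if (isResolvedElement(value)) return computeCacheKey(value);
  if (value instanceof Uint8Array) return Buffer.from(value).toString("base64");
  if (value !== null && typeof value === "object") {
    return Object.entries(value as Record<string, unknown>)
      .map(([k, v]) => `${k}:${serializeValue(v)}`)
      .join(",");
  }
  return JSON.stringify(value);
}
```

With this, two segments that say the same thing with the same voice serialize identically even when their audio bytes and `meta.duration` differ between runs.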

Fix A alone is sufficient if the cache produces bit-identical results (which it should — returning the same cached bytes). Fix B is the robust solution for cases where a resolved element is embedded in another element's props.

Affected Files

  • src/react/resolve.ts — resolveSpeechElement() needs to wrap the speech generation in withCache
  • src/react/renderers/utils.ts — serializeValue() should detect ResolvedElement/VargElement and use computeCacheKey instead of the recursive Object.entries() walk
  • src/ai-sdk/cache.ts — may need a speech-specific cache wrapper
  • src/react/renderers/cache.test.ts — missing test for props containing resolved elements

Repro

# Run once — generates speech + 2 VEED videos (~$0.50 + wait time)
bun run src/react/examples/async/speech-segments.tsx

# Run again — speech is re-generated, VEED videos are re-rendered (cache miss)
bun run src/react/examples/async/speech-segments.tsx

Expected: second run should hit cache for both Speech and Video elements.
