fix(voice-button): stabilize speech flow and clarify replace mode by SonyLeo · Pull Request #315 · opentiny/tiny-robot

SonyLeo · 2026-03-18T07:11:23Z

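Background

This change focuses on the VoiceButton speech recognition flow and its documentation. The goal is to unify the "append" and "replace" input experiences and to fix several edge cases in the voice session.

The main issues addressed:

autoReplace behaved unintuitively: in some scenarios it still looked like "append" after being enabled
In continuous mode the session semantics of recognition results were unclear, which made them easy to confuse with replace mode
When recording was stopped manually, the final transcript delivered late by the browser could be lost
After speechConfig was updated dynamically, the underlying speech handler could keep using the old configuration
The demo and docs did not explain replace mode, continuous mode, or the lang option clearly enough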

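Main changes

1. Unify the write-back semantics

With autoReplace: false, recognition results are still appended to the end of the input
With autoReplace: true, the recognition result replaces the entire input content during recording
With continuous: true, the replacement content is the cumulative recognition result of the current recording session, not just the last sentence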
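
The append and replace semantics described here can be sketched as a pure function. This is a minimal illustration, not the component's actual code: `nextInputValue`, `RecordingSession`, and `WriteMode` are hypothetical names, and the real component writes into an editor instance rather than returning a string.

```typescript
// Hypothetical names throughout; the real component writes into an editor instance.
type WriteMode = 'append' | 'replace'

interface RecordingSession {
  // Finalized text accumulated since the current recording started
  committed: string
}

// Returns the next input value after one finalized piece of speech arrives.
function nextInputValue(
  inputValue: string,
  finalPiece: string,
  session: RecordingSession,
  mode: WriteMode,
): string {
  session.committed += finalPiece
  if (mode === 'replace') {
    // Replace mode: the whole input becomes the session's cumulative transcript
    return session.committed
  }
  // Append mode: existing content is kept and the new piece goes at the end
  return inputValue + finalPiece
}
```

Starting from an input of "draft: " and receiving "hello " followed by "world", append mode ends with "draft: hello world", while replace mode ends with "hello world" because the typed draft is overwritten.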

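2. Improve Web Speech result handling

Parse SpeechRecognitionEvent incrementally
Distinguish final and interim results so that historical content is not concatenated repeatedly in continuous mode
Unify transcript accumulation within a session so that replace mode and continuous mode behave consistently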
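
One way to read "incremental parsing" is to start at the event's `resultIndex` instead of re-reading the whole result list on every event. The sketch below assumes that reading; `parseIncremental` and `SessionTranscript` are hypothetical names, and the `*Like` interfaces are local stand-ins for the browser's Web Speech types so the snippet runs outside a browser.

```typescript
// Structural stand-ins for SpeechRecognitionEvent and its result list
interface AlternativeLike {
  transcript: string
}
interface ResultLike {
  isFinal: boolean
  0: AlternativeLike
}
interface ResultEventLike {
  resultIndex: number
  results: ArrayLike<ResultLike>
}

// Reads only the results that changed in this event, split into final and interim text.
function parseIncremental(event: ResultEventLike): { finalText: string; interimText: string } {
  let finalText = ''
  let interimText = ''
  for (let i = event.resultIndex; i < event.results.length; i++) {
    const result = event.results[i]
    if (!result) continue
    if (result.isFinal) finalText += result[0].transcript
    else interimText += result[0].transcript
  }
  return { finalText, interimText }
}

// Accumulates finalized text for one recording session.
class SessionTranscript {
  private finalized = ''

  consume(event: ResultEventLike): { interim: string; finalized: string } {
    const { finalText, interimText } = parseIncremental(event)
    this.finalized += finalText
    // Interim output is "everything finalized so far" plus the current partial text
    return { interim: this.finalized + interimText, finalized: this.finalized }
  }

  reset(): void {
    this.finalized = ''
  }
}
```

Because each finalized result is consumed once, at the event where it turns final, earlier sentences are not concatenated again when later events arrive in continuous mode.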

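3. Fix stop / end lifecycle issues

A manual stop() no longer clears the event handlers early
Cleanup is now done by the native onend, which avoids losing the last transcript segment
Adjust callback order so that recording state, the end event, and the final recognition result stay consistent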
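
The lifecycle fix can be illustrated with a small simulation. Everything here is hypothetical: `FakeRecognizer` only mimics a recognizer that flushes one last result when stopped, and `SpeechSession` is not the project's handler. The point is the ordering: `stop()` only requests the stop, and cleanup happens inside `onend`.

```typescript
interface RecognizerLike {
  onresult: ((text: string) => void) | null
  onend: (() => void) | null
  start(): void
  stop(): void
}

// Test double: buffers speech and flushes it as a late result when stopped.
class FakeRecognizer implements RecognizerLike {
  onresult: ((text: string) => void) | null = null
  onend: (() => void) | null = null
  private pending = ''

  start(): void {}

  hear(text: string): void {
    this.pending += text
  }

  stop(): void {
    const late = this.pending
    this.pending = ''
    if (late && this.onresult) this.onresult(late)
    if (this.onend) this.onend()
  }
}

class SpeechSession {
  private finalized = ''
  private readonly recognizer: RecognizerLike
  private readonly onEnd: (transcript: string) => void

  constructor(recognizer: RecognizerLike, onEnd: (transcript: string) => void) {
    this.recognizer = recognizer
    this.onEnd = onEnd
  }

  start(): void {
    this.finalized = ''
    this.recognizer.onresult = (text) => {
      this.finalized += text
    }
    this.recognizer.onend = () => {
      // Capture the transcript first, then detach handlers, then notify
      const transcript = this.finalized
      this.cleanup()
      this.onEnd(transcript)
    }
    this.recognizer.start()
  }

  stop(): void {
    // Handlers stay attached so a late result can still be recorded
    this.recognizer.stop()
  }

  private cleanup(): void {
    this.recognizer.onresult = null
    this.recognizer.onend = null
    this.finalized = ''
  }
}
```

If `stop()` detached `onresult` before calling the recognizer's `stop()`, the late result in this simulation would be dropped, which is the failure mode this change set out to remove.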

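4. Make speechConfig take effect with the latest values

Avoid taking a one-time snapshot of speechConfig at component initialization
lang, continuous, interimResults, customHandler and other options now take effect with their latest values on the next start()
Fixes the underlying handler still using old parameters after the speech configuration is switched at runtime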
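
A minimal sketch of "read the config at start() instead of snapshotting it", assuming the handler is given a getter rather than a copied object. `LazyConfigHandler` and `SpeechConfigLike` are hypothetical names for illustration; the PR's actual wiring inside the Vue component may differ.

```typescript
interface SpeechConfigLike {
  lang?: string
  continuous?: boolean
  interimResults?: boolean
}

class LazyConfigHandler {
  private readonly getConfig: () => SpeechConfigLike

  constructor(getConfig: () => SpeechConfigLike) {
    // Store the getter, not a copy of the config taken at construction time
    this.getConfig = getConfig
  }

  // Resolves the configuration when recording starts and returns the values used.
  start(): SpeechConfigLike {
    return { ...this.getConfig() }
  }
}

// Usage: `holder` stands in for component props that change at runtime
const holder: { speechConfig: SpeechConfigLike } = {
  speechConfig: { lang: 'zh-CN', continuous: false },
}
const handler = new LazyConfigHandler(() => holder.speechConfig)
const firstRun = handler.start()
holder.speechConfig = { lang: 'en-US', continuous: true }
const secondRun = handler.start()
```

Here `firstRun` carries the original language and `secondRun` carries the switched one, without recreating the handler. A session that is already running keeps the values it started with.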

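5. Update docs and examples

Change the demo presentation from "mixed input / continuous recognition" to "append mode / replace mode"
Document the behavior when autoReplace and continuous are combined
Document speechConfig.lang and add examples of common values
Update the doc comments for VoiceButton / SpeechConfig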

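Impact

After this change, autoReplace: true explicitly means "replace the whole input"
If the input already contains manually typed content, it is overwritten by the current recording result once replace mode is enabled
If continuous: true is also enabled, the input keeps updating with the cumulative recognition result of the current recording session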

Verification

pnpm.cmd -F @opentiny/tiny-robot build
pnpm.cmd -F tiny-robot-test build
pnpm.cmd -F tiny-robot-test test -- src/voice-button/index.spec.ts
pnpm.cmd -F docs build

Summary by CodeRabbit

New Features
- Voice input modes renamed to "Append" and "Replace" with matching UI; Replace updates content live, Append adds to end.
- Interim transcription now reflects accumulated finalized text plus current partial results for smoother live feedback.
Documentation
- Updated voice recognition docs, demo and language guidance; clarified voice-button handler behavior and speech-config options.
Bug Fixes
- Improved placeholder, focus and session handling to avoid stale or duplicated transcription.


coderabbitai · 2026-03-18T07:11:31Z

Walkthrough

Renames voice modes from mixed|continuous to append|replace, changes demo/docs and VoiceButton props/types, adds session-based transcript accumulation and auto-replace behavior, and updates Web Speech handling to parse/accumulate finals and emit interim/final with updated callbacks.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Documentation & Demo `docs/demos/sender/voice-input.vue`, `docs/src/components/sender.md` | Replaced mode values/labels with `append`/`replace`, updated UI text, placeholders, demo descriptions, added lang tips, and tightened `VoiceButton` prop/type docs. |
| Voice Button Implementation `packages/components/src/sender-actions/voice-button/index.vue`, `packages/components/src/sender-actions/voice-button/speech.types.ts` | Added session state (`committedTranscript` etc.), helpers to reset/append/replace transcripts, conditional `autoReplace` flows for onStart/onInterim/onFinal/onEnd/onError, and removed `onVoiceButtonClick` from `SpeechConfig`. |
| Web Speech Handler `packages/components/src/sender-actions/voice-button/webSpeechHandler.ts` | Added `parseSpeechRecognitionResult`, tracking of `finalizedTranscript`, accumulate final pieces, emit interim as accumulated finals+partial, invoke `onEnd` with finalized transcript, and reset session state across start/stop/error. |

Sequence Diagram(s)

sequenceDiagram
    actor User
    participant Button as VoiceButton
    participant Speech as SpeechRecognition
    participant Editor as Editor/Document

    rect rgba(255, 200, 100, 0.5)
    Note over User,Editor: Legacy / append-on-final flow
    User->>Button: Press record
    Button->>Speech: start()
    Speech-->>Button: onresult (interim)
    Button->>Button: emit interim event
    Speech-->>Button: onresult (final)
    Button->>Editor: append final transcript (if autoInsert)
    Button->>Editor: focus
    end

    rect rgba(100, 200, 255, 0.5)
    Note over User,Editor: New autoReplace / accumulated flow
    User->>Button: Press record
    Button->>Speech: start() (autoReplace config)
    Speech-->>Button: onresult (finals + interim)
    Button->>Button: parseSpeechRecognitionResult()
    Button->>Button: update finalizedTranscript
    Button->>Editor: replace text at speechRange with interim (finals+partial)
    Speech-->>Button: onresult (final)
    Button->>Editor: replace text at speechRange with merged final transcript
    Button->>Button: reset session
    Button->>Editor: focus
    end

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Poem

🐰 I hopped and listened, soft and snappy,
I learned to append or swap — how happy!
Finals gather, whispers fill the line,
Replace or add — each chunk aligns.
A tiny rabbit taps and sends on time.

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (2 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The PR title 'fix(voice-button): stabilize speech flow and clarify replace mode' accurately reflects the main changes, which involve stabilizing speech event handling and clarifying the replace mode behavior in the voice button component. |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |


github-actions · 2026-03-18T07:15:05Z

✅ Preview build completed successfully!

Click the image above to preview.
Preview will be automatically removed when this PR is closed.

github-actions · 2026-03-18T07:15:06Z

📦 Package Preview

pnpm add https://pkg.pr.new/@opentiny/tiny-robot@dc9998c

pnpm add https://pkg.pr.new/@opentiny/tiny-robot-kit@dc9998c

pnpm add https://pkg.pr.new/@opentiny/tiny-robot-svgs@dc9998c

commit: dc9998c


coderabbitai

Actionable comments posted: 2

🧹 Nitpick comments (1)

packages/components/src/sender-actions/voice-button/index.vue (1)

25-57: Consider simplifying speechRange state.

The speechRange.to field is tracked (line 54) but never read—the replacement always uses docSize as the end position. Since from is always 0 for autoReplace mode, you could simplify to just a boolean flag indicating whether a recording session has started.

However, if this tracking is intentional for future features (e.g., partial replacement from cursor position), feel free to keep it.

♻️ Optional simplification

-const speechRange = ref<{ from: number; to: number } | null>(null)
+const hasStartedRecording = ref(false)

 const resetSpeechRange = () => {
-  speechRange.value = null
+  hasStartedRecording.value = false
 }

 const insertTranscript = (transcript: string) => {
   // ... early returns ...
   
   // autoReplace 模式：替换整个输入框内容
-  if (speechRange.value === null) {
-    speechRange.value = {
-      from: 0,
-      to: 0,
-    }
-  }
+  hasStartedRecording.value = true

   const docSize = editorInstance.state.doc.content.size
-  const tr = editorInstance.state.tr.insertText(transcript, speechRange.value.from, docSize)
+  const tr = editorInstance.state.tr.insertText(transcript, 0, docSize)
   editorInstance.view.dispatch(tr)

-  speechRange.value = {
-    from: speechRange.value.from,
-    to: speechRange.value.from + transcript.length,
-  }
   editorInstance.commands.focus('end')
 }

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@packages/components/src/sender-actions/voice-button/index.vue` around lines
25 - 57, The speechRange.to field is never read in insertTranscript (autoReplace
uses docSize), so simplify state by removing speechRange.to and either replace
speechRange with a boolean like speechStarted or keep only speechRange.from;
update insertTranscript to initialize and check that flag (e.g., speechStarted
or speechRange.from === 0) and use speechRange.from (or 0) as the insert start,
and update the single state field accordingly; ensure references to
speechRange.value.to are removed and editorInstance, autoReplace, and
insertTranscript behavior remain unchanged.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@docs/src/components/sender.md`:
- Line 1: The file docs/src/components/sender.md contains a UTF-8 BOM (U+FEFF)
at the start; remove the BOM so the file begins with the first markdown
character and re-save the file as UTF-8 without BOM (e.g., strip the leading
U+FEFF or re-encode), ensuring consistency with speech.types.ts which was
corrected earlier.

In `@packages/components/src/sender-actions/voice-button/speech.types.ts`:
- Line 1: The file speech.types.ts contains a UTF‑8 BOM at the very start;
remove the leading BOM character so the file begins with the first TypeScript
token, re-save the file as UTF‑8 without BOM, and re-commit; if your editor adds
BOMs automatically, update its save settings (or run a one-time clean-up
command) to ensure speech.types.ts stays UTF‑8 without BOM.


ℹ️ Review info

⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: cc096f6e-f1c4-4f6d-b278-1a831a223417

📥 Commits

Reviewing files that changed from the base of the PR and between 16fd9bf and 855ae9e.

📒 Files selected for processing (4)

docs/demos/sender/voice-input.vue
docs/src/components/sender.md
packages/components/src/sender-actions/voice-button/index.vue
packages/components/src/sender-actions/voice-button/speech.types.ts


coderabbitai

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@docs/src/components/sender.md`:
- Around line 167-173: The markdown table for the `lang` examples is missing the
required blank line before the admonition terminator (`:::`), causing the
terminator to be parsed as a table row; edit the `sender.md` section containing
the `| 值 | 说明 |` table and insert a single empty line after the last table row
(after the `| `en-US` | 美式英语 |` row) before the `:::` terminator so the
admonition closes correctly.

In `@packages/components/src/sender-actions/voice-button/webSpeechHandler.ts`:
- Around line 151-156: The manual stop is clearing event handlers before the
SpeechRecognition final onresult can arrive; change the stop flow so you call
this.recognition.stop() first and allow the native onresult→onend sequence to
run (letting onend invoke cleanup())—do not call cleanup() or
resetSessionTranscript() before recognition.stop(); also remove the eager
callbacks.onEnd() invocation at useSpeechHandler.ts line where callbacks.onEnd()
is called (around useSpeechHandler.ts:99) so the final transcript
(finalizedTranscript) is accumulated by the native events and delivered before
cleanup resets handlers/state.


ℹ️ Review info

⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: 00f621fb-799e-4808-bf33-16701a4df123

📥 Commits

Reviewing files that changed from the base of the PR and between 855ae9e and 986c63c.

📒 Files selected for processing (5)

docs/demos/sender/voice-input.vue
docs/src/components/sender.md
packages/components/src/sender-actions/voice-button/index.vue
packages/components/src/sender-actions/voice-button/speech.types.ts
packages/components/src/sender-actions/voice-button/webSpeechHandler.ts


coderabbitai

Actionable comments posted: 1

♻️ Duplicate comments (1)

docs/src/components/sender.md (1)

162-173: ⚠️ Potential issue | 🟡 Minor

Close the lang tip block.

The :::tip lang 语言说明 block never terminates, so everything after the table can be parsed inside the admonition. Add the missing closing ::: after the last row.

📝 Suggested markdown fix

 :::tip lang 语言说明
 `lang` 用于指定语音识别语言，建议显式传入，并与页面的 `html lang` 保持一致，避免页面语言和浏览器环境语言不一致时出现识别偏差。
 
 常见取值示例：
 
 | 值 | 说明 |
 | --- | --- |
 | `en` | 英语 |
 | `zh` | 中文 |
 | `zh-CN` | 简体中文 |
 | `en-US` | 美式英语 |
+
+:::
 
 #### 自定义语音服务

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@docs/src/components/sender.md` around lines 162 - 173, The tip admonition
starting with ":::tip lang 语言说明" is not closed, causing the remainder of the
document to be included inside the block; locate the ":::tip lang 语言说明" block in
docs/src/components/sender.md (the block containing the language table) and add
the missing closing marker ":::" immediately after the table's last row so the
admonition terminates properly.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@packages/components/src/sender-actions/voice-button/index.vue`:
- Around line 62-64: The current speechOptions constant snapshots
props.speechConfig once and becomes stale; make it reactive by replacing the
plain object with a computed or a reactive wrapper that derives from
props.speechConfig (e.g., replace const speechOptions = {...props.speechConfig}
with a computed(() => ({...props.speechConfig})) or reactive copy), and add a
watcher on props.speechConfig to recreate the speech recognition handler (the
customHandler / recognition handler creation logic) whenever speechConfig
changes so the handler uses updated lang, continuous, interimResults, and
customHandler settings.



ℹ️ Review info

⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: 6c69696d-a33a-4695-96e5-6e20d44d5b43

📥 Commits

Reviewing files that changed from the base of the PR and between 986c63c and f0a4ca2.

📒 Files selected for processing (4)

docs/demos/sender/voice-input.vue
docs/src/components/sender.md
packages/components/src/sender-actions/voice-button/index.vue
packages/components/src/sender-actions/voice-button/speech.types.ts

🚧 Files skipped from review as they are similar to previous changes (1)

docs/demos/sender/voice-input.vue

SonyLeo added 2 commits March 17, 2026 06:22

feat(voice-button): add autoReplace support for continuous speech rep…

71b7042

…lacement

docs(sender): update voice button documentation and type definitions

39ca9cb

SonyLeo linked an issue Mar 18, 2026 that may be closed by this pull request

【docs enhancement】speech config default lang #302

Open

feat(voice-button): refactor speech input modes from mixed/continuous…

855ae9e

… to append/replace

SonyLeo force-pushed the docs/update-speech-config branch from 4c60cae to 855ae9e Compare March 18, 2026 09:25

SonyLeo marked this pull request as ready for review March 20, 2026 06:50

coderabbitai Bot reviewed Mar 20, 2026

View reviewed changes

Comment thread docs/src/components/sender.md

Comment thread packages/components/src/sender-actions/voice-button/speech.types.ts Outdated

feat(voice-button): improve speech handling with autoReplace and cont…

986c63c

…inuous recognition support

coderabbitai Bot reviewed Apr 16, 2026

View reviewed changes

Comment thread docs/src/components/sender.md Outdated

Comment thread packages/components/src/sender-actions/voice-button/webSpeechHandler.ts Outdated

docs(sender): update voice input descriptions for clarity on speech m…

f0a4ca2

…odes

coderabbitai Bot reviewed Apr 16, 2026

View reviewed changes

Comment thread packages/components/src/sender-actions/voice-button/index.vue Outdated

fix: review suggestion

dc9998c

SonyLeo changed the title ~~feat: add autoReplace support for sender and update voice button document~~ fix(voice-button): stabilize speech flow and clarify replace mode Apr 16, 2026

SonyLeo linked an issue Apr 16, 2026 that may be closed by this pull request

🐛 [Bug]: Voice input tends to produce duplicated text #323

Open

fix(voice-button): stabilize speech flow and clarify replace mode #315
SonyLeo wants to merge 6 commits into opentiny:develop from SonyLeo:docs/update-speech-config
