Agent dispatch reports JS_RUNNING but job is never delivered to registered worker #1211

@VictorZ-29

Describe the bug

When using explicit agent dispatch via createRoom({ agents: [{ agentName: "stt-agent", metadata: ... }] }), the dispatch is silently dropped — the agent worker never receives the job request. However, calling AgentDispatchClient.createDispatch() for the same room and agent succeeds immediately (agent connects within 0.5s). This suggests the issue is specifically with the agents field in CreateRoomRequest, not with the dispatch system in general.

This occurs intermittently and is NOT limited to idle periods. In one test, session 1 succeeded via createRoom.agents, but sessions 2, 3, and 4 (created seconds apart) all required a createDispatch() fallback. The agent container was running, registered, and had idle prewarmed workers the entire time.

The issue has three distinct failure modes:

  1. Dispatch silently dropped: createRoom.agents creates the room but the agent never receives the job. Calling createDispatch() as a fallback delivers the job within 0.5s.
  2. Dispatch state is wrong: listDispatch() reports JS_RUNNING for dispatches that were never delivered to the agent. We observed 4 consecutive sessions where listDispatch() showed JS_RUNNING but the agent container logged zero job requests.
  3. Dispatch delivered with significant delay: in some cases the job arrives 6-16 seconds after room creation instead of the normal <2 seconds.
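
To make the three modes easier to tell apart when triaging logs, here is a minimal sketch of a classifier. The DispatchObservation shape, the outcome names, and the 2-second default threshold are assumptions made for illustration (the threshold is taken from the "normal" latency quoted above); none of this is part of the LiveKit SDK.

```typescript
// Hypothetical record of what was observed for one room; field names are illustrative.
interface DispatchObservation {
  // State string reported by listDispatch() for the room's dispatch, or null if none was listed.
  reportedState: string | null;
  // Seconds between room creation and the agent's first contact; null if it never arrived.
  secondsToAgentContact: number | null;
}

type DispatchOutcome = "ok" | "delayed" | "dropped" | "dropped_but_reported_running";

// Assumption: contact within this many seconds counts as normal.
const NORMAL_CONTACT_SECONDS = 2;

function classifyDispatch(
  obs: DispatchObservation,
  normalSeconds: number = NORMAL_CONTACT_SECONDS,
): DispatchOutcome {
  if (obs.secondsToAgentContact === null) {
    // The job never reached the agent. Distinguish the case where the
    // dispatch record still claims to be running (failure mode 2).
    return obs.reportedState === "JS_RUNNING" ? "dropped_but_reported_running" : "dropped";
  }
  // The job arrived; anything slower than the threshold is failure mode 3.
  return obs.secondsToAgentContact <= normalSeconds ? "ok" : "delayed";
}
```

For example, an observation with reportedState "JS_RUNNING" and no agent contact falls into the second failure mode, while a contact after 6 seconds falls into the third.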

We verified the agent container was running (Azure Container Apps, min-replicas: 1, no restarts) and had a freshly prewarmed worker process available. The agent was registered with LiveKit Cloud (Frankfurt/Germany 2 region, server v1.10.1).

Relevant log output

# Agent container — registered and idle with prewarmed worker
{"level":30,"version":"1.2.3","msg":"starting worker"}
{"level":30,"version":"1.2.3","id":"AW_pppn92uksXfd","server_info":{"edition":"Cloud","version":"1.10.1","region":"Germany 2","nodeId":"NC_OFRANKFURT1B_zQeKyzpij3KH"},"msg":"registered worker"}
[STT-AGENT] Worker prewarmed pid=23

# Session 1 — createRoom.agents dispatch works, agent connects in ~2s
[LIVEKIT createRoom] Room created name=myapp-<redacted-1> metadata=set
[agent-transcript-token] Agent transcript JWT issued (2s after room creation)
[ensureAgentDispatched] Agent stt-agent connected to backend (check 1)
Transcript finalised segmentCount=2 ✅

# Session 2 — createRoom.agents dispatch FAILS, createDispatch() retry works
[LIVEKIT createRoom] Room created name=myapp-<redacted-2> metadata=set
# 5s later — agent never contacted backend:
[ensureAgentDispatched] Agent stt-agent not connected — re-dispatching (attempt 1/3)
[ensureAgentDispatched] Dispatched stt-agent to room
# 0.5s after re-dispatch — agent connects immediately:
[agent-transcript-token] Agent transcript JWT issued
[ensureAgentDispatched] Agent stt-agent connected to backend (check 2)
Transcript finalised segmentCount=2 ✅

# Sessions 3 and 4 — same pattern: createRoom.agents fails, createDispatch() works
[LIVEKIT createRoom] Room created name=myapp-<redacted-3> metadata=set
[ensureAgentDispatched] Agent stt-agent not connected — re-dispatching (attempt 1/3)
# 0.6s later — agent connects:
[agent-transcript-token] Agent transcript JWT issued
[ensureAgentDispatched] Agent stt-agent connected to backend (check 2)
Transcript finalised segmentCount=2 ✅

# Earlier test — createRoom.agents fails AND dispatch state is wrong:
# listDispatch() reported JS_RUNNING but agent container received nothing
# 4 consecutive sessions all failed with segmentCount=0
# Agent container was running with idle prewarmed worker the entire time

Describe your environment

  • @livekit/agents: 1.2.3
  • @livekit/rtc-node: 0.13.24
  • livekit-server-sdk: 2.15.0
  • Node.js: >=22 (agent container), >=24 (backend)
  • LiveKit Cloud: v1.10.1, Germany 2 (Frankfurt) region, nodeId: NC_OFRANKFURT1B
  • Agent container: Azure Container Apps (min-replicas: 1, always running)
  • Dispatch method: Explicit via createRoom({ agents: [...] }) and AgentDispatchClient.createDispatch() as retry

Minimal reproducible example

  1. Register an agent: cli.runApp(new ServerOptions({ agentName: "stt-agent", numIdleProcesses: 2 }))
  2. Create rooms with explicit dispatch:
    const room = await roomService.createRoom({
      name: roomName,
      metadata: sessionMetadata,
      agents: [{ agentName: "stt-agent", metadata: sessionMetadata }],
    });
  3. Create multiple rooms in sequence (each after the previous one ends)
  4. Some rooms: agent receives job and connects normally
  5. Other rooms: agent never receives job — no pattern to which ones fail
  6. Workaround — calling createDispatch() delivers the job immediately:
    await dispatchClient.createDispatch(roomName, "stt-agent", { metadata: sessionMetadata });
    // Agent connects within 0.5s

Additional information

  • The agent container never crashes or restarts — verified via Azure Container Apps status and logs.
  • listDispatch() returns false positives: dispatch records with JS_RUNNING status for jobs that were never delivered to the agent. This makes dispatch state unreliable for verification.
  • createDispatch() works reliably as a fallback — suggesting the issue is specifically with the agents field in CreateRoomRequest, not with the dispatch system in general.
  • We upgraded from @livekit/agents@1.0.50 to 1.2.3 (which includes the reconnection fixes from PR #1188, "fix: address 5 Detail scan bugs from March 11 (reconnect, mutex leak, playout, ordering, retryability)") but the issue persists.
  • Our workaround: after createRoom, we verify end-to-end that the agent contacted our backend. If not, we call createDispatch() which succeeds immediately.
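
For reference, the verify-then-re-dispatch workaround looks roughly like the sketch below. It is not our production code: the EnsureDeps and EnsureOptions shapes and the default delay are assumptions, and the SDK calls are injected as callbacks so that the retry logic stands alone. In practice redispatch would wrap dispatchClient.createDispatch(roomName, "stt-agent", { metadata }) as shown in the repro steps.

```typescript
interface EnsureOptions {
  maxAttempts: number;  // maximum number of re-dispatches (the logs show "attempt 1/3")
  checkDelayMs: number; // wait before each connectivity check (assumed value)
}

interface EnsureDeps {
  // Resolves to true once the agent has contacted the backend for this room.
  isAgentConnected: (roomName: string) => Promise<boolean>;
  // Expected to wrap AgentDispatchClient.createDispatch() for this room.
  redispatch: (roomName: string) => Promise<void>;
  sleep: (ms: number) => Promise<void>;
}

// Timer-based sleep for real use; tests can inject a no-op instead.
const realSleep = (ms: number): Promise<void> =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

// Returns how many re-dispatches were needed (0 means createRoom.agents worked).
// Throws if the agent is still absent after maxAttempts re-dispatches.
async function ensureAgentDispatched(
  roomName: string,
  deps: EnsureDeps,
  opts: EnsureOptions = { maxAttempts: 3, checkDelayMs: 5000 },
): Promise<number> {
  for (let attempt = 0; attempt <= opts.maxAttempts; attempt++) {
    await deps.sleep(opts.checkDelayMs);
    if (await deps.isAgentConnected(roomName)) {
      return attempt;
    }
    if (attempt < opts.maxAttempts) {
      // Do not trust listDispatch() state here; dispatch again explicitly.
      await deps.redispatch(roomName);
    }
  }
  throw new Error(`agent never connected to ${roomName} after ${opts.maxAttempts} re-dispatches`);
}
```

The check deliberately relies on the agent contacting the backend rather than on listDispatch(), since the reported JS_RUNNING state was not a reliable signal in our tests.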

Labels: bug
