Before submitting
I searched existing issues and did not find a duplicate.
I included enough detail to reproduce or investigate the problem.
Area
apps/web
Steps to reproduce
Summary
The browser↔server UI WebSocket (/ws) drops and reconnects every few seconds when the client is on a link with mild packet reordering and/or loss — in our case a WireGuard "road-warrior" tunnel (OPNsense if_wg). The underlying TCP connection stays alive the whole time, and other long-lived apps over the same link (IMAP, plain HTTPS) are unaffected. The WS layer declares "disconnected" on brief stalls that TCP recovers from on its own.
This also intermittently leaves a thread stuck on "awaiting input" with no rendered content — the orphaned-thread behavior in #313 — which appears to be a downstream symptom of this same reconnect churn.
Environment
Clients: Android Chrome and desktop Chrome, both reaching the server over a WireGuard tunnel terminating on an OPNsense firewall.
Evidence it is not the server, proxy, or network config
Server/proxy are stable. An authenticated WS client run from the server host — both directly to the t3 process (loopback) and through Caddy — held 70 s with zero drops, while the user's browser was dropping during that same window.
It's specific to the lossy path, measured live. ss -ti on the host, comparing all established connections at one instant:
| Path | reord_seen | retransmits |
|---|---|---|
| WireGuard client (the browser) | 25 (peaked at 206) | 0/336 |
| non-WG connections (incl. public internet) | 0 | 0/4 – 0/15 |
| loopback | 0 | minimal |
The WG connection also showed cwnd collapsed to ~7 and ~27 ms jitter — but the TCP socket stayed established and transferred 60+ MB. Jittery, not dead.
Not MTU / offload. MSS is correctly clamped (mss:1360, pmtu:1500); NIC hardware offload (TSO/LRO/CRC) is disabled. Oversized-packet black-holing is ruled out.
Conclusion: TCP survives the reordering/loss; it's the WebSocket keepalive/heartbeat that tears the connection down.
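For anyone repeating the comparison above on another host, the counters can be pulled out of ss -ti output with a small script. The following is a minimal sketch, not part of t3code: parseSsInfo and TcpInfo are hypothetical names, the sample line is synthetic (assembled from the figures quoted in this report, not a captured log), and it assumes ss leaves out reord_seen and retrans when they are zero, so missing fields are read as 0.

```typescript
// Hypothetical helper: extracts a few counters from one info line of `ss -ti`.
interface TcpInfo {
  cwnd?: number;
  mss?: number;
  pmtu?: number;
  reordSeen: number;       // treated as 0 when the field is absent
  retransInFlight: number; // first number of retrans:X/Y
  retransTotal: number;    // second number of retrans:X/Y
}

function parseSsInfo(line: string): TcpInfo {
  // Match "key:<digits>" only at a token boundary, so "mss" does not match "advmss".
  const num = (key: string): number | undefined => {
    const m = line.match(new RegExp(`(?:^|\\s)${key}:(\\d+)`));
    return m ? Number(m[1]) : undefined;
  };
  const retrans = line.match(/(?:^|\s)retrans:(\d+)\/(\d+)/);
  return {
    cwnd: num("cwnd"),
    mss: num("mss"),
    pmtu: num("pmtu"),
    reordSeen: num("reord_seen") ?? 0,
    retransInFlight: retrans ? Number(retrans[1]) : 0,
    retransTotal: retrans ? Number(retrans[2]) : 0,
  };
}

// Synthetic sample built from the values in the table above.
const sample = "mss:1360 pmtu:1500 cwnd:7 retrans:0/336 reord_seen:25";
const info = parseSsInfo(sample);
console.log(info.reordSeen, info.retransTotal, info.cwnd); // prints: 25 336 7
```

A connection that shows a non-zero reord_seen and a growing retransmit total while it stays established matches the "jittery, not dead" pattern described here.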
Suspected root cause
The WS keepalive is too aggressive for imperfect links — a single brief stall (reordering or a retransmit) trips a disconnect instead of being ridden out. Apps without an aggressive heartbeat over the identical tunnel are fine.
Requested change
Make the WS keepalive tolerant of brief stalls/reordering before tearing down — a longer ping timeout, several missed beats before declaring dead, and ideally a configurable timeout for users on high-latency/VPN/mobile links.
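To make the request concrete, here is a minimal sketch of a keepalive that declares the connection dead only after several consecutive missed beats. It is an illustration under stated assumptions, not t3code's actual keepalive code: HeartbeatMonitor, HeartbeatOptions, attachHeartbeat and the {"type":"ping"} message are all hypothetical, and an application-level ping is assumed because the browser WebSocket API does not expose protocol-level ping/pong frames to scripts.

```typescript
// Hypothetical names throughout; a sketch of the requested behavior only.
interface HeartbeatOptions {
  intervalMs: number;     // how often a ping is sent
  maxMissedBeats: number; // consecutive unanswered pings tolerated before giving up
}

class HeartbeatMonitor {
  private readonly opts: HeartbeatOptions;
  private missed = 0;
  private awaitingPong = false;

  constructor(opts: HeartbeatOptions) {
    this.opts = opts;
  }

  // Call whenever any frame (pong or data) arrives: a stalled link that
  // recovers resets the counter instead of tripping a disconnect.
  onActivity(): void {
    this.missed = 0;
    this.awaitingPong = false;
  }

  // Call once per interval. Returns "dead" only after maxMissedBeats
  // consecutive intervals with no activity from the peer.
  onTick(sendPing: () => void): "alive" | "dead" {
    if (this.awaitingPong) {
      this.missed += 1;
      if (this.missed >= this.opts.maxMissedBeats) return "dead";
    }
    this.awaitingPong = true;
    sendPing();
    return "alive";
  }
}

// Wiring sketch: works with any object that can send and close.
function attachHeartbeat(
  socket: { send(data: string): void; close(): void },
  opts: HeartbeatOptions,
): { onMessage(): void; stop(): void } {
  const monitor = new HeartbeatMonitor(opts);
  const timer = setInterval(() => {
    const state = monitor.onTick(() => socket.send('{"type":"ping"}'));
    if (state === "dead") {
      clearInterval(timer);
      socket.close();
    }
  }, opts.intervalMs);
  return {
    onMessage: () => monitor.onActivity(),
    stop: () => clearInterval(timer),
  };
}
```

With, for example, intervalMs: 5000 and maxMissedBeats: 3, a stall shorter than about 15 s would be ridden out. Exposing both values as settings would cover the high-latency, VPN and mobile cases mentioned above. The numbers are illustrative only.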
Related
[Bug]: Opencode server frequently disconnects / connection lost #2579: frequent disconnects on the OpenCode provider SSE connection. Different layer (provider, not the UI WebSocket), but the same theme: connection handling that doesn't tolerate imperfect links, no heartbeat, opaque "disconnected" messaging. A shared resilience philosophy would help both.
Repro
Use t3code over any link with mild loss + packet reordering. On a Linux box you can emulate it on the client (or a gateway):
sudo tc qdisc add dev <iface> root netem delay 30ms 20ms reorder 5% loss 0.2%
Open a thread and watch the UI cycle "disconnected → reconnected" every few seconds while the page is otherwise reachable. Remove with sudo tc qdisc del dev <iface> root.
Expected behavior
Requested change
Make the WS keepalive tolerant of brief stalls/reordering before tearing down — a longer ping timeout, several missed beats before declaring dead, and ideally a configurable timeout for users on high-latency/VPN/mobile links.
Actual behavior
The UI WebSocket (/ws) drops and reconnects every few seconds on the lossy link while the underlying TCP connection stays established, and threads intermittently get stuck on "awaiting input" (#313). Details are in the Summary and evidence sections above.
Impact
Minor bug or occasional failure
Version or commit
No response
Environment
No response
Logs or stack traces
Screenshots, recordings, or supporting files
No response
Workaround
No response