Fix SSH tunnel not being closed after remote server reboot (#406) #407

Open
debba wants to merge 2 commits into main from fix/ssh-tunnel-reconnect

@debba debba commented Jul 1, 2026

Collaborator

Closes #406

What was happening

When a MySQL connection went through an SSH tunnel and the remote server rebooted, Tabularis correctly detected the drop and cleared the connection indicator. But trying to reconnect failed with a "port already in use" error, and the only way out was to restart the app.

The reason is that SSH tunnels are cached in a global map (keyed by user@host:port:remote->port) and reused across connects, but nothing ever tore them down. A tunnel could outlive the connection it belonged to. Once the remote host rebooted, the tunnel died while its local forward port was still held by the (now defunct) ssh process, so the next reconnect happily picked up the stale map entry, pointed at that dead port, and blew up. Restarting the app "fixed" it only because that wiped the map.
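
To make the failure mode concrete, here is a minimal sketch of a key-addressed cache that reuses entries without any liveness check. `CachedTunnel`, `tunnel_key`, and `get_or_create_naive` are hypothetical names for illustration, not the actual Tabularis code, and reading the key as user@host:ssh_port:remote_host->remote_port is an assumption about the format described above.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for a cached tunnel: just the local forward port
/// and a flag saying whether the backing ssh process is still running.
#[derive(Debug)]
struct CachedTunnel {
    local_port: u16,
    process_running: bool,
}

/// Builds a cache key in the shape the description mentions
/// (assumed layout: user@host:ssh_port:remote_host->remote_port).
fn tunnel_key(
    user: &str,
    host: &str,
    ssh_port: u16,
    remote_host: &str,
    remote_port: u16,
) -> String {
    format!("{user}@{host}:{ssh_port}:{remote_host}->{remote_port}")
}

/// Pre-fix behaviour, as described above: whatever sits in the map is
/// reused, and `process_running` is never consulted.
fn get_or_create_naive(
    map: &mut HashMap<String, CachedTunnel>,
    key: &str,
    fresh_port: u16,
) -> u16 {
    map.entry(key.to_string())
        .or_insert(CachedTunnel {
            local_port: fresh_port,
            process_running: true,
        })
        .local_port
}
```

If the entry's `process_running` flag flips to `false` (the reboot case), a second call with the same key still returns the old port, which is the stale reuse that led to the "port already in use" error.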

What I changed

  • ssh_tunnel.rs
    • stop() now also reaps the killed system-ssh child (wait()), so it doesn't linger as a zombie still holding the forwarded port.
    • Added is_alive() to tell whether a cached tunnel is still usable (detects a system-ssh child that has already exited).
    • Added remove_tunnel(key) to stop a tunnel and drop it from the map (no-op if it isn't there).
  • commands.rs
    • Before reusing a cached tunnel we now check is_alive(). A dead one is removed and a fresh tunnel is created instead of reusing the broken port.
    • disconnect_connection tears the tunnel down so it no longer survives the connection.
  • health_check.rs
    • When a connection crosses the failure threshold (exactly the reboot case), the tunnel is torn down as part of closing it, instead of being left behind for the next reconnect to trip over.
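
The changes listed above can be sketched roughly as follows. This is a simplified model under stated assumptions, not the code in this PR: `Backend`, `Tunnel`, and `get_or_create` are hypothetical, the map is passed in explicitly instead of being a global, and `is_alive` takes `&mut self` only because `Child::try_wait` needs mutable access.

```rust
use std::collections::HashMap;
use std::process::Child;
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;

/// Hypothetical model of the two backends mentioned in the PR.
enum Backend {
    /// russh-style backend: liveness is tracked by a shared flag.
    Russh { running: Arc<AtomicBool> },
    /// system-ssh backend: liveness is tracked by the child process.
    SystemSsh { child: Child },
}

struct Tunnel {
    local_port: u16,
    backend: Backend,
}

impl Tunnel {
    /// True while the tunnel still looks usable.
    fn is_alive(&mut self) -> bool {
        match &mut self.backend {
            Backend::Russh { running } => running.load(Ordering::SeqCst),
            // try_wait() yields Ok(None) only while the child is still running.
            Backend::SystemSsh { child } => matches!(child.try_wait(), Ok(None)),
        }
    }

    /// Stops the tunnel. For system ssh, kill the child and then reap it
    /// with wait() so it does not linger as a zombie.
    fn stop(&mut self) {
        match &mut self.backend {
            Backend::Russh { running } => running.store(false, Ordering::SeqCst),
            Backend::SystemSsh { child } => {
                let _ = child.kill(); // errors if already exited; ignored here
                let _ = child.wait();
            }
        }
    }
}

/// Stops the tunnel under `key` and drops it from the map.
/// Returns false (and does nothing) if the key is not present.
fn remove_tunnel(map: &mut HashMap<String, Tunnel>, key: &str) -> bool {
    match map.remove(key) {
        Some(mut tunnel) => {
            tunnel.stop();
            true
        }
        None => false,
    }
}

/// Reuses a cached tunnel only if it is alive; otherwise discards it and
/// stores a fresh one built by `create`. Returns the local port to use.
fn get_or_create(
    map: &mut HashMap<String, Tunnel>,
    key: &str,
    create: impl FnOnce() -> Tunnel,
) -> u16 {
    let alive = map.get_mut(key).map(|t| t.is_alive()).unwrap_or(false);
    if !alive {
        remove_tunnel(map, key);
        map.insert(key.to_string(), create());
    }
    map[key].local_port
}
```

In this model, a disconnect or a health-check failure would simply call `remove_tunnel` with the connection's key, so the next connect goes through `create` rather than picking up a dead entry.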

Tests

Added unit tests in ssh_tunnel.rs for is_alive() on both backends (russh flag + system-ssh exited child), for stop() reaping the child, and for remove_tunnel (both the missing-key no-op and real removal). Full ssh_tunnel suite passes and the backend builds clean.

Notes

Tunnels are shared by key, so two connections to the same DB over the same SSH host share one tunnel. Tearing it down on disconnect therefore affects both, which is the intended behaviour here since a reboot takes them all down anyway.

…oot (#406)

SSH tunnels were cached in the global TUNNELS map but never stopped or
removed, so a tunnel outlived the connection that owned it. When the
remote host rebooted the tunnel died while still holding its local
forward port; the next reconnect reused the stale map entry and failed
with "port already in use". Only restarting the app cleared the map.

- ssh_tunnel: reap the killed system-ssh child in stop(); add
  is_alive() and remove_tunnel() helpers.
- commands: skip and discard dead tunnels on reuse; tear down the
  tunnel in disconnect_connection.
- health_check: tear down the tunnel when a connection exceeds the
  failure threshold (the path triggered by a server reboot).


Development

Successfully merging this pull request may close these issues.

[Bug]: SSH Tunnel does not get closed if server is rebooted
