fix(core): fall back to podman CLI when no API socket is found #1858
russellb wants to merge 5 commits into
Conversation
…A#1834) The Podman API socket symlink is not always present — it varies by version, machine provider, and platform. When none of the candidate socket paths respond, try `podman info` as a fallback so auto-detection succeeds on macOS setups where Podman is functional but the socket is not at a well-known path. Closes NVIDIA#1834 Signed-off-by: Russell Bryant <rbryant@redhat.com>
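The fallback described in this commit can be sketched roughly as follows. This is a minimal illustration, not the code from the PR: the function signatures, the `probe` closure, and the `program` parameter are assumptions made so the sketch is self-contained and can be exercised without Podman installed.

```rust
use std::path::Path;
use std::process::{Command, Stdio};

/// Returns true if `<program> info` can be spawned and exits successfully.
/// `program` is a hypothetical parameter; the PR presumably hard-codes "podman".
fn cli_responds(program: &str) -> bool {
    Command::new(program)
        .arg("info")
        .stdout(Stdio::null())
        .stderr(Stdio::null())
        .status()
        .map(|status| status.success())
        // A missing binary or spawn failure counts as "not available".
        .unwrap_or(false)
}

/// Probe the well-known socket candidates first; only if none responds,
/// fall back to asking the CLI. `probe` is a hypothetical stand-in for
/// whatever socket check the real detection code performs.
fn is_podman_available(candidates: &[&Path], probe: impl Fn(&Path) -> bool) -> bool {
    candidates.iter().any(|path| probe(*path)) || cli_responds("podman")
}
```

Because `||` short-circuits, the CLI is only invoked when every socket candidate fails, so setups with a well-known socket pay no extra process-spawn cost.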
/ok to test e068662
elezar left a comment
One question I have: This seems to fix detection, but doesn't change how the gateway interacts with the podman driver? Why are no changes needed there?
It's a fair question. I have podman on mac, but I don't have this problem. I was hoping the reporter would test this and see if it was enough. It seems like a reasonable change on the detection side. We go from "podman not detected at all" to either:
In either case, it's more consistent with docker in this part of the code. Docker does the same thing, falling back to a CLI check at this stage. A next improvement could be to discover the socket using
cc @r3v5, reporter of the issue
Hey @russellb ! Sure, I will test your fix, no worries.
@r3v5 thanks. The output of that podman info command would be helpful too
I have a couple of concerns / questions here. The first is the one that I've already mentioned. This change checks that
Then, although this seems to align Podman functionality with Docker, there are subtle differences between the two paths. Although it is a slightly larger change than originally proposed, I think there is some benefit in trying to better align the detection paths for Podman and Docker. Ideally these would return a usable driver config (including, for example, socket information) and not just a boolean. This config could then be used directly when instantiating the driver(s) instead of rediscovering the relevant config (as is done in the Docker case).
Thanks, @elezar. I'm happy to work on the changes you described.
I made a change to podman auto-detection recently (#1536), to avoid using just the existence of the CLI to determine that podman was available. In that change I made sure to align the auto-detection with the actual client socket choice mechanism. Podman unfortunately has different behavior on macOS depending on how you install it, which I talked about in #1690 (comment). This PR could supersede #1690 (which is scoped to documentation), but it needs to keep the auto-detection mechanism aligned with what the client uses to make the actual connection, as @elezar has raised.
Great feedback, thanks @krishicks. I'll iterate on this. |
Hey @russellb ! I am coming back with results from local testing on my machine.
Tested on macOS (Apple Silicon, M3 Pro, 36 GB RAM), Podman 5.7.1 via Homebrew.
Detection fix works: the gateway now finds podman (`Using compute driver driver=podman`). Without this PR, it crashes with:
I ran
Gateway output with the fix:
Socket mismatch: after detection, driver construction fails because default_socket_path() returns
podman info output
Perfect, thanks. This confirms the non-standard socket location and that discovery needs to include determining socket location to use. |
Auto-detection now queries `podman info --format json` to find the actual API socket when no well-known socket path responds. On macOS (serviceIsRemote=true) it follows up with `podman machine inspect` to get the host-side forwarded socket; on native Linux it uses remoteSocket.path directly. The discovered path is threaded into PodmanComputeConfig so the driver connects to the right socket instead of falling back to a default that may not exist. Closes NVIDIA#1834 Signed-off-by: Russell Bryant <rbryant@redhat.com>
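The selection logic in this commit message might look like the sketch below. It assumes the JSON has already been parsed into a small struct; `PodmanInfo`, `resolve_socket`, and the `machine_socket` closure are hypothetical names, and stripping a `unix://` prefix is a defensive assumption rather than something the PR states.

```rust
use std::path::PathBuf;

/// Hypothetical, already-parsed subset of `podman info --format json`.
struct PodmanInfo {
    /// Mirrors the `serviceIsRemote` field named in the commit message.
    service_is_remote: bool,
    /// Mirrors `remoteSocket.path`; may be absent.
    remote_socket_path: Option<String>,
}

/// Pick the socket the driver should connect to.
/// `machine_socket` stands in for the `podman machine inspect` follow-up and
/// is only called when the service is remote (the macOS case).
fn resolve_socket(
    info: &PodmanInfo,
    machine_socket: impl FnOnce() -> Option<String>,
) -> Option<PathBuf> {
    let raw = if info.service_is_remote {
        // The path reported by `podman info` is not usable from the host
        // here, so ask the machine for its forwarded socket.
        machine_socket()?
    } else {
        // Native Linux: use remoteSocket.path directly.
        info.remote_socket_path.clone()?
    };
    // Assumption: the value may carry a `unix://` scheme prefix.
    let path = raw.strip_prefix("unix://").unwrap_or(raw.as_str());
    if path.is_empty() {
        None
    } else {
        Some(PathBuf::from(path))
    }
}
```

Returning `Option<PathBuf>` rather than a boolean lets the caller pass the discovered path on to the driver configuration instead of rediscovering it later.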
Unlike `podman info`, `podman machine inspect` outputs JSON by default. Passing `--format json` is interpreted as a Go template literal, causing it to output the string "json" instead of the JSON payload. Signed-off-by: Russell Bryant <rbryant@redhat.com>
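To make the asymmetry concrete, here is a small sketch of how the two invocations could be built. The helper names are hypothetical; only the argument lists come from the commit messages above.

```rust
use std::process::Command;

/// `podman info` needs an explicit `--format json` to emit JSON.
fn podman_info_cmd() -> Command {
    let mut cmd = Command::new("podman");
    cmd.args(["info", "--format", "json"]);
    cmd
}

/// `podman machine inspect` already emits JSON by default, so no
/// `--format` flag is passed (per the commit message, `--format json`
/// would be treated as a Go template and print the literal "json").
fn podman_machine_inspect_cmd() -> Command {
    let mut cmd = Command::new("podman");
    cmd.args(["machine", "inspect"]);
    cmd
}
```

Keeping each argument list in one helper makes the difference between the two subcommands visible in a single place.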
When the socket probe succeeds against a well-known candidate, return that path so the driver uses the exact socket that was verified rather than rediscovering it via default_socket_path(). This ensures detection and driver connection always agree on the socket, regardless of whether it was found via probe or CLI discovery. Signed-off-by: Russell Bryant <rbryant@redhat.com>
Each variant carries only its own connection metadata — Podman gets a socket_path, other drivers carry nothing. Eliminates the generic Optional field and makes the match arms self-documenting. Signed-off-by: Russell Bryant <rbryant@redhat.com>
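The shape described in the last two commits could be sketched like this. The enum name, the `Docker` variant, and the `socket_for` helper are illustrative assumptions; the commit message only states that the Podman variant carries a `socket_path` and the other drivers carry nothing.

```rust
use std::path::{Path, PathBuf};

/// Hypothetical detection result: each variant holds only the connection
/// metadata that makes sense for that driver.
#[derive(Debug, Clone, PartialEq)]
enum DetectedDriver {
    /// The socket that was actually verified (by probe or CLI discovery).
    Podman { socket_path: PathBuf },
    /// Carries nothing in this sketch.
    Docker,
}

/// The match arms document which drivers have a socket to hand on.
fn socket_for(driver: &DetectedDriver) -> Option<&Path> {
    match driver {
        DetectedDriver::Podman { socket_path } => Some(socket_path.as_path()),
        DetectedDriver::Docker => None,
    }
}
```

Compared with a shared optional field, this makes it impossible to construct a Podman result without a socket path, so detection and driver connection cannot disagree on which socket was verified.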
@r3v5 I've pushed follow-up commits that address the socket mismatch you confirmed. Changes:
Precedence for the socket path is:
Could you re-test on your Homebrew Podman setup? The retry loop against the missing
Summary
Auto-detection fails to discover Podman on macOS because the API socket symlink is not always present; it varies by Podman version, machine provider, and platform. This adds a `podman info` CLI fallback so auto-detection succeeds when Podman is functional but the socket isn't at a well-known path.
Related Issue
Closes #1834
Changes
- `podman_cli_responds()` fallback to `is_podman_available()` in `openshell-core` config
- `podman info`, which uses Podman's own internal discovery
Testing
`cargo test -p openshell-core`: 164 tests pass
Checklist