Summary
pidfile.t fails against any spec-compliant runtime because it calls Kill("KILL") on a container that may have already stopped — and the OCI spec requires that kill on a stopped container MUST return an error.
Reproduction
Run validation/pidfile/pidfile.t against any OCI-compliant runtime (runc, remora, youki). The test container runs true, which exits immediately after start. By the time PreDelete runs, the container is in stopped state. Kill("KILL") returns a non-zero exit code, and PreDelete returns that error, failing the TAP assertion.
Root cause
In validation/pidfile/pidfile.go, the PreDelete callback:
PreDelete: func(r *util.Runtime) error {
util.WaitingForStatus(*r, util.LifecycleStatusRunning, time.Second*10, time.Second*1)
err = r.Kill("KILL")
util.WaitingForStatus(*r, util.LifecycleStatusStopped, time.Second*10, time.Second*1)
return err // Kill error propagated as test failure
},
WaitingForStatus(..., LifecycleStatusRunning, ...) polls for up to 10s but the container has already exited — the timeout error is silently discarded. Kill("KILL") is then called on a stopped container. Per OCI Runtime Spec:
If the state of the container is not created or running, the runtime MUST generate an error.
Compliant runtimes return an error. PreDelete returns that error and the test fails.
Impact
This is a test harness bug. The actual pid-file assertion is not even what's failing — it's the cleanup logic. Any runtime that correctly implements the kill error for stopped containers will fail this test.
Confirmed with: remora (Rust), reproduced logically against runc (runc also returns container not running on kill of stopped).
Fix
Ignore the Kill error when the container may already be stopped:
PreDelete: func(r *util.Runtime) error {
// Container may have already exited (e.g. process is "true").
// Kill on a stopped container is an error per spec; don't propagate it.
r.Kill("KILL")
util.WaitingForStatus(*r, util.LifecycleStatusStopped, time.Second*10, time.Second*1)
return nil
},
Or: check whether WaitingForStatus(Running) succeeded before calling Kill.
Summary
pidfile.tfails against any spec-compliant runtime because it callsKill("KILL")on a container that may have already stopped — and the OCI spec requires thatkillon a stopped container MUST return an error.Reproduction
Run
validation/pidfile/pidfile.tagainst any OCI-compliant runtime (runc, remora, youki). The test container runstrue, which exits immediately afterstart. By the timePreDeleteruns, the container is instoppedstate.Kill("KILL")returns a non-zero exit code, andPreDeletereturns that error, failing the TAP assertion.Root cause
In
validation/pidfile/pidfile.go, thePreDeletecallback:WaitingForStatus(..., LifecycleStatusRunning, ...)polls for up to 10s but the container has already exited — the timeout error is silently discarded.Kill("KILL")is then called on a stopped container. Per OCI Runtime Spec:Compliant runtimes return an error.
PreDeletereturns that error and the test fails.Impact
This is a test harness bug. The actual pid-file assertion is not even what's failing — it's the cleanup logic. Any runtime that correctly implements the kill error for stopped containers will fail this test.
Confirmed with: remora (Rust), reproduced logically against runc (runc also returns
container not runningon kill of stopped).Fix
Ignore the Kill error when the container may already be stopped:
Or: check whether
WaitingForStatus(Running)succeeded before calling Kill.