Defer TCP socket connect() until DO output gate clears.#6533
Defer TCP socket connect() until DO output gate clears.#6533vaibhavSharma-cf wants to merge 1 commit intomainfrom
Conversation
|
/bonk review |
There was a problem hiding this comment.
Review Summary
- [HIGH] Use-after-free:
addressStrcaptured as non-owningkj::StringPtrin coroutine, then the owningkj::Stringis moved away — the coroutine will dereference freed memory when it resumes afterwaitForOutputLocks(). - [MEDIUM]
ConnectionDatamember ordering:connectTaskis destroyed aftertlsStarter— while currently safe due to coroutine cancellation semantics, this contradicts the stated intent and is fragile against future changes. MovingconnectTaskaftertlsStarterwould make the safety explicit. - [LOW] Stale comment on
connectImpl— the original TODO was removed but the replacement comment is misleading about where the gate wait actually occurs.
This review was generated by an AI assistant and may contain errors. Please verify all findings independently.
src/workerd/api/sockets.h
Outdated
| kj::Maybe<kj::Promise<void>> connectTask; | ||
| // tlsStarter must be declared after connectionStream so that it is destroyed first, | ||
| // since it holds a reference that keeps the connection alive. | ||
| kj::Own<kj::TlsStarterCallback> tlsStarter; |
There was a problem hiding this comment.
[MEDIUM] Destruction order concern: C++ destroys members in reverse declaration order, so the current order is:
tlsStarterdestroyed first (declared last)connectTaskdestroyed secondwatchForDisconnectTaskdestroyed thirdconnectionStreamdestroyed last
This means tlsStarter is freed before connectTask is cancelled. The PR description says the purpose of storing connectTask here is to "cancel the connect task before tlsStarter is destroyed" — but the current declaration order achieves the opposite.
In practice this is currently safe because cancelling a KJ coroutine just destroys the coroutine frame without running more code, so the dangling httpConnectSettings.tlsStarter pointer inside the frame is never dereferenced during destruction. However, this is a fragile invariant — if anyone later adds cleanup logic or attaches destructors that touch tlsStarter, this becomes a real UAF.
Suggestion: move connectTask after tlsStarter to match the stated intent:
| kj::Maybe<kj::Promise<void>> connectTask; | |
| // tlsStarter must be declared after connectionStream so that it is destroyed first, | |
| // since it holds a reference that keeps the connection alive. | |
| kj::Own<kj::TlsStarterCallback> tlsStarter; | |
| kj::Maybe<kj::Promise<void>> connectTask; | |
| // tlsStarter must be declared after connectionStream so that it is destroyed first, | |
| // since it holds a reference that keeps the connection alive. | |
| kj::Own<kj::TlsStarterCallback> tlsStarter; |
should become:
kj::Own<kj::TlsStarterCallback> tlsStarter;
// connectTask must be declared after tlsStarter so that it is destroyed first,
// since the coroutine frame holds a raw pointer to tlsStarter via httpConnectSettings.
kj::Maybe<kj::Promise<void>> connectTask;
Note: this would also require reordering the constructor initializer list accordingly.
|
Review posted on PR #6533 with 3 inline comments. Here's a summary of the findings: [HIGH] Use-after-free on [MEDIUM] [LOW] Misleading function name ( |
| name = "main", | ||
| worker = ( | ||
| modules = [ | ||
| (name = "worker", esModule = embed "socket-output-gate-test.js"), |
There was a problem hiding this comment.
Can we also get a no-autogate variant of this test, just to show the behavior change when the autogate is not enabled?
There was a problem hiding this comment.
I have removed this test case as it can be the case that putCompleted is set to true immediately after storage put, making this test flaky. Instead I have added test case specific to connectImplNoOutputLock with blocking/unblocking output gate.
dom96
left a comment
There was a problem hiding this comment.
What is the rationale for doing this? Why do we need to prevent network outputs while storage writes are pending?
It also doesn't seem like that is what is happening in this PR. It is actually blocking socket connections completely, shouldn't only writes be blocked instead? Focusing on blocking writes may also simplify the implementation, since connect itself is synchronous, so the code to block it is quite involved and requires these fulfillers (which I don't think are ideal).
afc70fa to
2093ffb
Compare
Adds TCP_SOCKET_CONNECT_OUTPUT_GATE autogate that defers httpClient->connect() until IoContext::waitForOutputLocks() resolves. Socket is returned immediately backed by kj::newPromisedStream so reads/writes queue until the real connection arrives.
2093ffb to
828423e
Compare
That's how output gates work. We don't want applications to have to wait for writes to be confirmed durable before we allow them to continue executing. Instead, we make it impossible to observe that the app has continued executing until the write has become durable. If the write ends up failing, then we prevent the rest of the world from ever knowing that the app kept running -- so from the PoV of the rest of the world, the app stopped exactly where the failed write was performed. This is a huge optimization because it allows an app to perform several writes in rapid succession with only one round trip to durable storage (SRS followers). https://blog.cloudflare.com/durable-objects-easy-fast-correct-choose-three/
Anything that can be observed by the outside world needs to be blocked. That means we have to block both writes and connection initiation -- the remote end can observe that it has received a connection, and in theory you could imagine a protocol where merely connecting to a remote service serves to notify it of something, thus having side effects. |
| return result; | ||
|
|
||
| if (util::Autogate::isEnabled(util::AutogateKey::TCP_SOCKET_CONNECT_OUTPUT_GATE)) { | ||
| // Deferred-connect path: return a Socket backed by a promised stream immediately, deferring |
There was a problem hiding this comment.
I think you can make this a lot simpler. Use newPromisedWorkerInterface() to wrap the underlying output channel that connect() is being called on, then fulfill the promise when the output gate resolves.
Adds TCP_SOCKET_CONNECT_OUTPUT_GATE autogate that defers httpClient->connect() until IoContext::waitForOutputLocks() resolves. Socket is returned immediately backed by kj::newPromisedStream so reads/writes queue until the real connection arrives. Connect task lives in ConnectionData alongside tlsStarter to prevent
TlsStarterCallback UAF on Socket GC.