adapter: use occ for read-then write, with incremental retries#35192
Draft
aljoscha wants to merge 8 commits intoMaterializeInc:mainfrom
Draft
adapter: use occ for read-then write, with incremental retries#35192aljoscha wants to merge 8 commits intoMaterializeInc:mainfrom
aljoscha wants to merge 8 commits intoMaterializeInc:mainfrom
Conversation
Contributor
|
Thanks for opening this PR! Here are a few tips to help make the review process smooth for everyone. PR title guidelines
Pre-merge checklist
|
1675464 to
2cb8e6d
Compare
865091f to
22a54a2
Compare
179deb6 to
8b8d81c
Compare
01bcd4d to
aff3f2e
Compare
The cancel scenario periodically fires `pg_cancel_backend` against a random worker. If that signal arrives during the worker's ROLLBACK, psycopg's `connection.rollback()` raises a "canceling statement due to user request" error. The rollback-error handler didn't recognize that string, silently cleared `rollback_next`, and moved on — but psycopg was still in `InFailedSqlTransaction`, so the next psycopg-path query (COPY TO, SET, PREPARE, ...) died client-side with "current transaction is aborted", which no action tolerates. Surfaced by nightly 16160 (Parallel Workload (cancel)) on the frontend-RTW-OCC branch: the RTW path's wider per-statement window widens the race, but the bug is in the harness. Fix it by forcing a reconnect on cancelled or aborted-transaction rollback failures — reconnection is a clean reset, and psycopg doesn't guarantee the connection is usable after a failed rollback anyway. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
We will later use this for the frontend read-then-write implementation.
We now use SUBSCRIBE instead of PEEK to maintain the desired set of updates that need to be written. We also don't acquire locks on tables but instead optimistically try and write our updates at the timestamp right at our current subscribe frontier. Additionally, we take the opportunity this provides and move the sequencing code from the coordinator main loop to the frontend, similar to how we have done that for peeks in frontend_peek.rs. Work towards MaterializeInc/database-issues#6686
The original `if active_subscribe.internal { ... as BuiltinTableAppendNotify }`
branch needed an explicit cast (with a `clippy::as_conversions` allow) for
the if/else types to unify. Splitting into a guarded match arm lets the
match's coercion site unsize both `Box::pin`s to the trait-object type.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
aff3f2e to
f444d6d
Compare
Commit 21bbdde unified the error returned when a peek or subscribe sees an underlying relation dropped mid-flight to "query could not complete because relation X was dropped" (via DroppedDependency::query_terminated_error). The COPY path keeps a distinct "copy has been terminated because underlying ..." message via DroppedDependency::copy_terminated_error. Two assertions were missed in that change: - test/sqllogictest/ct_various.slt:190 still expected the old subscribe wording, so the FETCH-after-DROP-CT case fails. - src/environmentd/tests/sql.rs::test_dont_drop_sinks_twice was updated to expect "query could not complete", but the test runs COPY (SUBSCRIBE ...), which goes through copy_terminated_error and therefore returns "copy has been terminated". Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
`Coordinator::sequence_read_then_write` is the legacy path for DELETE/UPDATE/INSERT. When `frontend_read_then_write_enabled` is on, the OCC frontend path is the sole intended executor for these statements: the two paths take different locks (OCC takes none; this path takes per-table write locks) and therefore do not synchronize against each other, so running them concurrently double-retracts rows that both observed. Reaching this function while the flag is on indicates a routing bug where something bypassed the frontend gate (`client.rs::try_frontend_read_then_write`). Concretely, SQL `EXECUTE <prepared-dml>` is re-dispatched via `Command::Execute` directly from `coord/sequencer.rs`'s `Plan::Execute` handler and never crosses the session-client boundary where the gate lives. That routing fix is in a follow-up commit; this one ensures we don't silently corrupt a persist shard when a future change adds a similar bypass. Returns `AdapterError::Internal` with a self-descriptive message, which surfaces as a server error to the user. The check happens before any work (lock acquisition, planning, peek dispatch), so it's cheap and side-effect free. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The session-client side now unrolls `Statement::Execute` before
running `try_frontend_peek` / `try_frontend_read_then_write`. The
inner statement (the prepared one being executed) goes through both
gates exactly like a directly-issued statement of the same kind,
which means a prepared DML executed via SQL `EXECUTE` now correctly
takes the OCC RTW path instead of the legacy
`Coordinator::sequence_read_then_write`.
Mechanics: `unroll_sql_execute` looks at the named portal; if its
statement is `Execute`, it plans it (yielding `ExecutePlan { name,
params }`), looks up the prepared statement, and installs a fresh
portal for the inner statement using the EXECUTE's parameter values
— mirroring `Coordinator::sequence_execute`. It loops so a chain of
`EXECUTE`s resolves to the underlying statement, with a depth bound
to defend against pathological nesting.
The coordinator's `Plan::Execute` handler in
`coord/sequencer.rs:620-637` stays as defensive fallback. With this
change in place, `EXECUTE` statements no longer reach it via the
normal session-client → coordinator path; combined with the failsafe
in `sequence_read_then_write`, any future code path that bypasses
the frontend gate for a DML will surface as an explicit internal
error rather than silently corrupting persist.
Confirmed by the parallel-workload retraction repro (see
`debug-notes/occ-rtw-handoff.md`'s root-cause section): the bug
required `DeleteAction` to use `PREPARE … AS DELETE …` followed by
`EXECUTE delete<N>`, which previously skipped OCC. With this routing
in place, every DELETE goes through OCC RTW regardless of how the
client invoked it.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
aec8a1a to
9ad320f
Compare
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
We now use SUBSCRIBE instead of PEEK to maintain the desired set of
updates that need to be written. We also don't acquire locks on tables
but instead optimistically try and write our updates at the timestamp
right at our current subscribe frontier.
Additionally, we take the opportunity this provides and move the
sequencing code from the coordinator main loop to the frontend, similar
to how we have done that for peeks in frontend_peek.rs.
Work towards https://github.com/MaterializeInc/database-issues/issues/6686
Implementation of https://github.com/MaterializeInc/materialize/blob/63645b72e83ee26d2cfa99d25d773a06e6accb74/doc/developer/design/20260210_incremental_occ_read_then_write.md