adapter: use occ for read-then write, with incremental retries by aljoscha · Pull Request #35192 · MaterializeInc/materialize

aljoscha · 2026-02-24T12:51:06Z

We now use SUBSCRIBE instead of PEEK to maintain the desired set of
updates that need to be written. We also don't acquire locks on tables
but instead optimistically try and write our updates at the timestamp
right at our current subscribe frontier.

Additionally, we take the opportunity this provides and move the
sequencing code from the coordinator main loop to the frontend, similar
to how we have done that for peeks in frontend_peek.rs.

Work towards https://github.com/MaterializeInc/database-issues/issues/6686

Implementation of https://github.com/MaterializeInc/materialize/blob/63645b72e83ee26d2cfa99d25d773a06e6accb74/doc/developer/design/20260210_incremental_occ_read_then_write.md

github-actions · 2026-02-24T12:51:15Z

Thanks for opening this PR! Here are a few tips to help make the review process smooth for everyone.

PR title guidelines

Use imperative mood: "Fix X" not "Fixed X" or "Fixes X"
Be specific: "Fix panic in catalog sync when controller restarts" not "Fix bug" or "Update catalog code"
Prefix with area if helpful: compute: , storage: , adapter: , sql:

Pre-merge checklist

The PR title is descriptive and will make sense in the git log.
This PR has adequate test coverage / QA involvement has been duly considered. (trigger-ci for additional test/nightly runs)
If this PR includes major user-facing behavior changes, I have pinged the relevant PM to schedule a changelog post.
This PR has an associated up-to-date design doc, is a design doc (template), or is sufficiently small to not require a design.
If this PR evolves an existing $T ⇔ Proto$T mapping (possibly in a backwards-incompatible way), then it is tagged with a T-proto label.
If this PR will require changes to cloud orchestration or tests, there is a companion cloud PR to account for those changes that is tagged with the release-blocker label (example).

The cancel scenario periodically fires `pg_cancel_backend` against a random worker. If that signal arrives during the worker's ROLLBACK, psycopg's `connection.rollback()` raises a "canceling statement due to user request" error. The rollback-error handler didn't recognize that string, silently cleared `rollback_next`, and moved on — but psycopg was still in `InFailedSqlTransaction`, so the next psycopg-path query (COPY TO, SET, PREPARE, ...) died client-side with "current transaction is aborted", which no action tolerates. Surfaced by nightly 16160 (Parallel Workload (cancel)) on the frontend-RTW-OCC branch: the RTW path's wider per-statement window widens the race, but the bug is in the harness. Fix it by forcing a reconnect on cancelled or aborted-transaction rollback failures — reconnection is a clean reset, and psycopg doesn't guarantee the connection is usable after a failed rollback anyway. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

We will later use this for the frontend read-then-write implementation.

We now use SUBSCRIBE instead of PEEK to maintain the desired set of updates that need to be written. We also don't acquire locks on tables but instead optimistically try and write our updates at the timestamp right at our current subscribe frontier. Additionally, we take the opportunity this provides and move the sequencing code from the coordinator main loop to the frontend, similar to how we have done that for peeks in frontend_peek.rs. Work towards MaterializeInc/database-issues#6686

The original `if active_subscribe.internal { ... as BuiltinTableAppendNotify }` branch needed an explicit cast (with a `clippy::as_conversions` allow) for the if/else types to unify. Splitting into a guarded match arm lets the match's coercion site unsize both `Box::pin`s to the trait-object type. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Commit 21bbdde unified the error returned when a peek or subscribe sees an underlying relation dropped mid-flight to "query could not complete because relation X was dropped" (via DroppedDependency::query_terminated_error). The COPY path keeps a distinct "copy has been terminated because underlying ..." message via DroppedDependency::copy_terminated_error. Two assertions were missed in that change: - test/sqllogictest/ct_various.slt:190 still expected the old subscribe wording, so the FETCH-after-DROP-CT case fails. - src/environmentd/tests/sql.rs::test_dont_drop_sinks_twice was updated to expect "query could not complete", but the test runs COPY (SUBSCRIBE ...), which goes through copy_terminated_error and therefore returns "copy has been terminated". Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

`Coordinator::sequence_read_then_write` is the legacy path for DELETE/UPDATE/INSERT. When `frontend_read_then_write_enabled` is on, the OCC frontend path is the sole intended executor for these statements: the two paths take different locks (OCC takes none; this path takes per-table write locks) and therefore do not synchronize against each other, so running them concurrently double-retracts rows that both observed. Reaching this function while the flag is on indicates a routing bug where something bypassed the frontend gate (`client.rs::try_frontend_read_then_write`). Concretely, SQL `EXECUTE <prepared-dml>` is re-dispatched via `Command::Execute` directly from `coord/sequencer.rs`'s `Plan::Execute` handler and never crosses the session-client boundary where the gate lives. That routing fix is in a follow-up commit; this one ensures we don't silently corrupt a persist shard when a future change adds a similar bypass. Returns `AdapterError::Internal` with a self-descriptive message, which surfaces as a server error to the user. The check happens before any work (lock acquisition, planning, peek dispatch), so it's cheap and side-effect free. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

The session-client side now unrolls `Statement::Execute` before running `try_frontend_peek` / `try_frontend_read_then_write`. The inner statement (the prepared one being executed) goes through both gates exactly like a directly-issued statement of the same kind, which means a prepared DML executed via SQL `EXECUTE` now correctly takes the OCC RTW path instead of the legacy `Coordinator::sequence_read_then_write`. Mechanics: `unroll_sql_execute` looks at the named portal; if its statement is `Execute`, it plans it (yielding `ExecutePlan { name, params }`), looks up the prepared statement, and installs a fresh portal for the inner statement using the EXECUTE's parameter values — mirroring `Coordinator::sequence_execute`. It loops so a chain of `EXECUTE`s resolves to the underlying statement, with a depth bound to defend against pathological nesting. The coordinator's `Plan::Execute` handler in `coord/sequencer.rs:620-637` stays as defensive fallback. With this change in place, `EXECUTE` statements no longer reach it via the normal session-client → coordinator path; combined with the failsafe in `sequence_read_then_write`, any future code path that bypasses the frontend gate for a DML will surface as an explicit internal error rather than silently corrupting persist. Confirmed by the parallel-workload retraction repro (see `debug-notes/occ-rtw-handoff.md`'s root-cause section): the bug required `DeleteAction` to use `PREPARE … AS DELETE …` followed by `EXECUTE delete<N>`, which previously skipped OCC. With this routing in place, every DELETE goes through OCC RTW regardless of how the client invoked it. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

aljoscha force-pushed the adapter-read-then-write-occ-incremental branch 4 times, most recently from 1675464 to 2cb8e6d Compare February 27, 2026 13:59

aljoscha force-pushed the adapter-read-then-write-occ-incremental branch from 865091f to 22a54a2 Compare March 16, 2026 15:30

ggevay mentioned this pull request Mar 30, 2026

adapter: Subscribe on the frontend peek path #35756

Merged

aljoscha force-pushed the adapter-read-then-write-occ-incremental branch 11 times, most recently from 179deb6 to 8b8d81c Compare April 23, 2026 10:33

aljoscha force-pushed the adapter-read-then-write-occ-incremental branch 9 times, most recently from 01bcd4d to aff3f2e Compare April 28, 2026 13:36

aljoscha and others added 3 commits April 28, 2026 16:38

adapter: extract validate_read_then_write_dependencies for reuse

d9cb2bd

We will later use this for the frontend read-then-write implementation.

adapter: introduce PeekResponseUnary::DependencyDropped variant

21bbdde

aljoscha and others added 2 commits April 28, 2026 16:39

aljoscha force-pushed the adapter-read-then-write-occ-incremental branch from aff3f2e to f444d6d Compare April 28, 2026 14:39

aljoscha and others added 3 commits April 28, 2026 16:10

aljoscha force-pushed the adapter-read-then-write-occ-incremental branch from aec8a1a to 9ad320f Compare May 1, 2026 19:48

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

adapter: use occ for read-then write, with incremental retries#35192

adapter: use occ for read-then write, with incremental retries#35192
aljoscha wants to merge 8 commits intoMaterializeInc:mainfrom
aljoscha:adapter-read-then-write-occ-incremental

aljoscha commented Feb 24, 2026

Uh oh!

github-actions Bot commented Feb 24, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

aljoscha commented Feb 24, 2026

Uh oh!

github-actions Bot commented Feb 24, 2026

PR title guidelines

Pre-merge checklist

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant