fix(restore): use 3-section sequential pg_restore to prevent OID race conditions#4170

Open
mixelburg wants to merge 1 commit into Dokploy:canary from mixelburg:fix/postgres-restore-3section

Conversation


mixelburg (Contributor) commented Apr 6, 2026

Description

Fixes #4127

PostgreSQL restore was using a single pg_restore pass with -Fc (custom format), which causes ERROR: could not open relation with OID errors when restoring complex schemas — particularly databases that use extensions such as TimescaleDB, or schemas with many foreign key constraints.

Root Cause

The single-pass restore processes schema creation, data insertion, and constraint creation in a mixed order determined by pg_restore's dependency graph. With extensions like TimescaleDB that register custom types and catalogs, some tables get their OIDs cached during schema creation, then become stale by the time data COPY statements run against them.

Fix

Replace the single pg_restore call with 3 sequential section passes:

  1. pre-data (--section=pre-data): Creates schema objects (tables, types, extensions) with --clean --if-exists to drop existing objects first
  2. data (--section=data): Inserts all data after the schema is fully established
  3. post-data (--section=post-data): Creates indexes, constraints, and triggers after all data is loaded

The dump is saved to a temp file inside the container via stdin (cat > /tmp/dokploy_restore.dump), then restored in three passes. The temp file cleanup uses ; instead of && so it always runs even if a restore step fails.
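A minimal sketch of the resulting command builder, with inline comments added; this approximates the PR's one-liner (shown verbatim in the inline review comment below), where `databaseUser` and `database` come from the surrounding restore code:

```typescript
// Sketch of the three-pass restore command described above. An approximation
// of the patch, not the exact change; identifiers mirror the PR's snippet.
function buildRestoreCommand(databaseUser: string, database: string): string {
  const tmpFile = "/tmp/dokploy_restore.dump";
  const pgArgs = `-U '${databaseUser}' -d ${database} -O`;
  return (
    `docker exec -i $CONTAINER_ID sh -c "` +
    // 1. Stream the dump from stdin into a temp file inside the container.
    `cat > ${tmpFile} && ` +
    // 2. pre-data: drop and recreate schema objects (tables, types, extensions).
    `pg_restore ${pgArgs} --clean --if-exists --section=pre-data ${tmpFile} && ` +
    // 3. data: load rows only once the schema is fully in place.
    `pg_restore ${pgArgs} --section=data ${tmpFile} && ` +
    // 4. post-data: build indexes, constraints, and triggers last.
    `pg_restore ${pgArgs} --section=post-data ${tmpFile}` +
    // `;` (not `&&`) so the temp file is removed even if a restore pass fails.
    `; rm -f ${tmpFile}"`
  );
}

// Example invocation (hypothetical credentials):
const cmd = buildRestoreCommand("postgres", "mydb");
```

Because each pass is a separate pg_restore invocation chained with `&&`, a failure in any section aborts the later sections, while the trailing `;` still triggers the cleanup.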

Tested Against

  • Regular PostgreSQL schemas
  • The web-server restore path (web-server.ts) is unaffected — it already uses docker cp to place the file inside the container before restoring

Checklist

  • I've read the Contributing Guide
  • My changes follow the code style of this project
  • This change is a bug fix

Greptile Summary

Replaces the single-pass pg_restore with three sequential section passes (pre-data → data → post-data) to eliminate OID-staleness errors caused by extensions like TimescaleDB. The dump is written to a temp file inside the container via cat, restored in three passes, and the cleanup correctly uses ; so rm -f always runs regardless of any restore step failing. One minor suggestion: using a PID-based suffix ($$) on the temp filename would prevent conflicts if two restores ever target the same container concurrently.

Confidence Score: 5/5

Safe to merge — the 3-section restore logic is correct and the cleanup is sound.

All remaining findings are P2 (defensive hardening). The core logic, shell operator precedence, and section ordering are all correct.

No files require special attention.

Reviews (1): Last reviewed commit: "fix(restore): use 3-section sequential p..."

Greptile also left 1 inline comment on this PR.

… conditions

Replaces the single-pass pg_restore with a sequential 3-section approach:
1. pre-data: creates schema objects with --clean --if-exists
2. data: inserts records after schema is fully established
3. post-data: creates indexes, constraints, and triggers after all data

This prevents 'could not open relation with OID' errors that occur with
complex schemas (TimescaleDB, foreign keys) where the previous single-pass
restore could race against itself during schema creation and data insertion.

The dump is saved to a temp file inside the container via stdin, then
restored in three passes. The temp file is always cleaned up (using ;
not && before rm -f) even if a restore step fails.

Fixes Dokploy#4127
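The `;`-versus-`&&` behavior the commit message relies on can be checked directly; a small sketch using Node's child_process, assuming a POSIX sh is available (the `false`/`echo` commands stand in for a failing restore pass and the cleanup):

```typescript
import { execSync } from "node:child_process";

// In `A && B; C`, C runs whether or not the A-&&-B chain succeeded: `;` is a
// sequence point, not conditional on the previous exit status. `false` stands
// in for a failing pg_restore pass; `echo cleanup` stands in for `rm -f`.
const out = execSync(`sh -c 'false && echo restore_ok; echo cleanup'`).toString();
console.log(out.trim()); // prints "cleanup": the step before failed, cleanup still ran

// With && before the cleanup instead, a failure suppresses it entirely
// (the trailing `; true` just keeps the overall exit status zero).
const skipped = execSync(`sh -c 'false && echo restore_ok && echo cleanup; true'`).toString();
console.log(JSON.stringify(skipped)); // prints "" (empty): cleanup never ran
```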
@mixelburg mixelburg requested a review from Siumauricio as a code owner April 6, 2026 22:21
@dosubot dosubot bot added size:XS This PR changes 0-9 lines, ignoring generated files. bug Something isn't working labels Apr 6, 2026
Comment on lines +10 to +12
const tmpFile = "/tmp/dokploy_restore.dump";
const pgArgs = `-U '${databaseUser}' -d ${database} -O`;
return `docker exec -i $CONTAINER_ID sh -c "cat > ${tmpFile} && pg_restore ${pgArgs} --clean --if-exists --section=pre-data ${tmpFile} && pg_restore ${pgArgs} --section=data ${tmpFile} && pg_restore ${pgArgs} --section=post-data ${tmpFile}; rm -f ${tmpFile}"`;

P2 Hardcoded temp file risks concurrent-restore collision

The fixed path /tmp/dokploy_restore.dump lives inside the container (due to docker exec), so concurrent restores to different containers are safe. However, if two restore jobs target the same container simultaneously, both cat > writes race to the same file and the second restore could read a corrupted dump. Using the shell's $$ (PID) in the filename eliminates this risk at no cost.

Suggested change — before:

const tmpFile = "/tmp/dokploy_restore.dump";
const pgArgs = `-U '${databaseUser}' -d ${database} -O`;
return `docker exec -i $CONTAINER_ID sh -c "cat > ${tmpFile} && pg_restore ${pgArgs} --clean --if-exists --section=pre-data ${tmpFile} && pg_restore ${pgArgs} --section=data ${tmpFile} && pg_restore ${pgArgs} --section=post-data ${tmpFile}; rm -f ${tmpFile}"`;

Suggested change — after:

const tmpFile = "/tmp/dokploy_restore_$$.dump";
const pgArgs = `-U '${databaseUser}' -d ${database} -O`;
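One subtlety about the suggested `$$` suffix (my note, not part of the review): in a JavaScript/TypeScript template literal, `$` is only special when immediately followed by `{`, so `$$` survives interpolation and reaches the shell unchanged. Which shell then expands it depends on how Dokploy executes the command string — because the sh -c payload is double-quoted, it is plausibly the shell running the outer docker exec line rather than the container's sh — but either way the suffix is unique per invocation. A quick check:

```typescript
// `$` in a template literal only begins an interpolation when followed by `{`,
// so `$$` is emitted as two literal characters, left for the shell to expand.
const tmpFile = `/tmp/dokploy_restore_$$.dump`;
console.log(tmpFile); // prints "/tmp/dokploy_restore_$$.dump"
```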


Labels

bug Something isn't working size:XS This PR changes 0-9 lines, ignoring generated files.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

PostgreSQL restore fails with "could not open relation with OID" due to parallel restore race condition

1 participant