fix: clear handled OFFSET before child recursion in LimitPushdown#22525

Open
kumarUjjawal wants to merge 2 commits into
apache:mainfrom
kumarUjjawal:fix/clear-handled-offset


@kumarUjjawal
Contributor

Which issue does this PR close?

Rationale for this change

LimitPushdown carries GlobalRequirements while walking the physical plan.
In the bad plan shape from #22489, an outer OFFSET was already handled above
a sort barrier, but its skip still remained in the state when recursion
continued into the child subtree. That stale skip then merged with an inner
LIMIT and reduced its fetch incorrectly, which caused a grouped row to be
dropped.

The fix is to clear skip once the limit requirement has already been handled,
while keeping fetch so valid limit pushdown into child sorts still happens.
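The effect of the stale skip can be sketched in a few lines. This is a minimal illustration, not the DataFusion implementation: `Requirements`, `merge_with_child_limit`, and `clear_handled_skip` are hypothetical names, the merge rule is a simplified assumption about how a parent skip eats into a child fetch, and the row counts are made up.

```rust
/// Hypothetical, simplified stand-in for the state carried while walking the plan.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Requirements {
    fetch: Option<usize>,
    skip: usize,
    satisfied: bool,
}

/// Simplified merge (an assumption for illustration): the parent's skip is
/// subtracted from the child's fetch, and the tighter of the two fetches wins.
/// Returns the combined `(skip, fetch)`.
fn merge_with_child_limit(
    parent: Requirements,
    child_skip: usize,
    child_fetch: Option<usize>,
) -> (usize, Option<usize>) {
    let skip = child_skip + parent.skip;
    let after_skip = child_fetch.map(|f| f.saturating_sub(parent.skip));
    let fetch = match (after_skip, parent.fetch) {
        (Some(c), Some(p)) => Some(c.min(p)),
        (Some(c), None) => Some(c),
        (None, p) => p,
    };
    (skip, fetch)
}

/// Mirrors the idea of the fix: once the limit has been handled above,
/// drop `skip` but keep `fetch` before descending into children.
fn clear_handled_skip(mut state: Requirements) -> Requirements {
    if state.satisfied {
        state.skip = 0;
    }
    state
}
```

With an already-handled `OFFSET 1` left in the state (`skip: 1, satisfied: true`) and an inner `LIMIT 3`, the simplified merge yields a fetch of 2, so one row goes missing. After `clear_handled_skip`, the same merge leaves the inner fetch at 3.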

What changes are included in this PR?

Are these changes tested?

Yes

Are there any user-facing changes?

No API Change

@github-actions github-actions Bot added optimizer Optimizer rules core Core DataFusion crate sqllogictest SQL Logic Tests (.slt) labels May 26, 2026
@alamb alamb added the regression Something that used to work no longer does label May 26, 2026
@@ -989,3 +989,34 @@ c-4

statement ok
DROP TABLE t21176;

# Regression test for https://github.com/apache/datafusion/issues/22489
Contributor

I don't think this test guards against the issue.

I reverted the code change locally:

diff --git a/datafusion/physical-optimizer/src/limit_pushdown.rs b/datafusion/physical-optimizer/src/limit_pushdown.rs
index 63c4f21bd9..6164d86e53 100644
--- a/datafusion/physical-optimizer/src/limit_pushdown.rs
+++ b/datafusion/physical-optimizer/src/limit_pushdown.rs
@@ -375,14 +375,6 @@ pub(crate) fn pushdown_limits(
         (new_node, global_state) = pushdown_limit_helper(new_node.data, global_state)?;
     }

-    // Once a limit has been materialized above the current node, child
-    // subtrees should not inherit its `skip`. Keep `fetch`, but clear
-    // `skip` before recursing so child-local limits are not merged with
-    // an `OFFSET` that has already been applied.
-    if global_state.satisfied {
-        global_state.skip = 0;
-    }
-
     // Apply pushdown limits in children
     let children = new_node.data.children();
     let mut changed = false;

And then I ran the tests:

cargo test --profile=ci --test sqllogictests
...
Running with 16 test threads (available parallelism: 16)
Completed 472 test files in 9 seconds

Contributor Author

Thank you for this

Contributor

@alamb alamb left a comment

Thank you @kumarUjjawal -- I don't fully understand the fix, but it does seem to fix the bug and I think the test coverage looks good

expr: col("c1", &schema)?,
options: SortOptions {
descending: true,
nulls_first: false,
Contributor

maybe worth calling out that this is a different sort order

Contributor

I think we should also add a test (or confirm one already exists) where the two sorts DO have the same sort order, and ensure the limit is still pushed down

// Once a limit has been materialized above the current node, child
// subtrees should not inherit its `skip`. Keep `fetch`, but clear
// `skip` before recursing so child-local limits are not merged with
// an `OFFSET` that has already been applied.
Contributor

I am surprised this doesn't need to check for the actual sort keys too 🤔

Development

Successfully merging this pull request may close these issues.

DataFusion drops grouped row after inner ORDER BY/LIMIT and outer ORDER BY/OFFSET
