refactor(fuse): abstract runtime filter into PartitionRuntimeFilter, IndexRuntimeFilter, and RowRuntimeFilter traits by zhang2014 · Pull Request #19728 · databendlabs/databend

zhang2014 · 2026-04-16T15:20:58Z

I hereby agree to the terms of the CLA available at: https://docs.databend.com/dev/policies/cla/

Summary

refactor(fuse): abstract runtime filter into PartitionRuntimeFilter, IndexRuntimeFilter, and RowRuntimeFilter traits

Tests

Unit Test
Logic Test
Benchmark Test
No Test - Explain why

Type of change

Bug Fix (non-breaking change which fixes an issue)
New Feature (non-breaking change which adds functionality)
Breaking Change (fix or feature that could cause existing functionality not to work as expected)
Documentation Update
Refactoring
Performance Improvement
Other (please describe):


chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: ac50ce6726


chatgpt-codex-connector · 2026-04-16T15:26:15Z

+        // This filter requires async IO for bloom index loading.
+        // The sync prune path cannot be used alone — use prune_async instead.
+        // Return false (don't prune) as a safe fallback.
+        let _ = part;
+        Ok(false)


Wire inlist bloom pruning into index filter

ReadDataTransform::read_parts now relies on IndexRuntimeFilter::prune, but InlistBloomIndexFilter::prune unconditionally returns Ok(false) and the real pruning logic lives in the unused prune_async helper. That means runtime IN-list bloom-index pruning is effectively disabled after this refactor, so partitions that were previously skipped by bloom index checks are always read.
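The concern above can be illustrated with a minimal sketch of a read loop in which the in-list check actually runs inside `prune`, rather than `prune` returning `Ok(false)` unconditionally. Everything here is an assumption made for illustration: `Part`, `BloomIndex`, `IndexFilter`, `InlistFilter`, and this `read_parts` are simplified stand-ins, not Databend's real types or signatures, and an exact set replaces the bloom filter so the result is deterministic.

```rust
use std::collections::{HashMap, HashSet};

/// Hypothetical stand-in for a partition descriptor (not Databend's real type).
pub struct Part {
    pub id: u32,
}

/// Hypothetical stand-in for a loaded bloom index. An exact set keeps the
/// sketch deterministic; a real bloom filter can report false positives.
pub struct BloomIndex {
    pub values: HashSet<i64>,
}

/// Assumed two-step shape: load the index, then decide from it.
pub trait IndexFilter {
    fn load_index(&self, part: &Part) -> Option<BloomIndex>;
    /// Returns Ok(true) when the partition can be skipped.
    fn prune(&self, part: &Part, index: Option<&BloomIndex>) -> Result<bool, String>;
}

pub struct InlistFilter {
    pub inlist: Vec<i64>,
    /// Simulated index storage keyed by partition id.
    pub stored: HashMap<u32, Vec<i64>>,
}

impl IndexFilter for InlistFilter {
    fn load_index(&self, part: &Part) -> Option<BloomIndex> {
        self.stored.get(&part.id).map(|v| BloomIndex {
            values: v.iter().copied().collect(),
        })
    }

    fn prune(&self, _part: &Part, index: Option<&BloomIndex>) -> Result<bool, String> {
        match index {
            // No index available: keep the partition as the safe fallback.
            None => Ok(false),
            // Skip only when no IN-list value can be present in the partition.
            Some(ix) => Ok(!self.inlist.iter().any(|v| ix.values.contains(v))),
        }
    }
}

/// Returns the ids of the partitions that still have to be read.
pub fn read_parts(filter: &dyn IndexFilter, parts: &[Part]) -> Result<Vec<u32>, String> {
    let mut kept = Vec::new();
    for part in parts {
        let index = filter.load_index(part);
        if !filter.prune(part, index.as_ref())? {
            kept.push(part.id);
        }
    }
    Ok(kept)
}
```

With an IN-list of `[10]`, a partition whose index holds `{1, 2}` is skipped, while a partition holding `{10, 20}` and a partition with no index at all are both kept.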


chatgpt-codex-connector · 2026-04-16T15:26:15Z

-            })
-            .collect();
+        // Use new trait-based API for row-level filtering
+        let row_runtime_filters = ctx.get_row_runtime_filters(scan_id);


Keep bloom row filters available for new hash join

ReadState now pulls runtime bloom filters only from ctx.get_row_runtime_filters(scan_id), but this commit only populates that side-channel in the legacy hash-join runtime-filter path; the new_hash_join globalization path still only calls set_runtime_filter. For plans using new hash join, row_runtime_filters stays empty and row-level bloom runtime filtering is skipped entirely.
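One way to picture the fix suggested here is to route both join implementations through a single publishing helper, so the reader-side lookup never comes back empty for one of them. The sketch below is hypothetical: `Ctx`, `RowFilter`, `KeySetFilter`, `add_row_runtime_filter`, and the two `*_publish` functions are illustrative names, and only `get_row_runtime_filters(scan_id)` mirrors a name that appears in the diff.

```rust
use std::collections::{HashMap, HashSet};
use std::sync::Arc;

/// Hypothetical row-level filter; a key set stands in for a bloom filter.
pub trait RowFilter {
    fn keep(&self, value: i64) -> bool;
}

pub struct KeySetFilter {
    keys: HashSet<i64>,
}

impl RowFilter for KeySetFilter {
    fn keep(&self, value: i64) -> bool {
        self.keys.contains(&value)
    }
}

/// Hypothetical query context holding row filters per scan id.
#[derive(Default)]
pub struct Ctx {
    row_filters: HashMap<usize, Vec<Arc<dyn RowFilter>>>,
}

impl Ctx {
    /// Assumed registration hook; the real context API may differ.
    pub fn add_row_runtime_filter(&mut self, scan_id: usize, f: Arc<dyn RowFilter>) {
        self.row_filters.entry(scan_id).or_default().push(f);
    }

    pub fn get_row_runtime_filters(&self, scan_id: usize) -> Vec<Arc<dyn RowFilter>> {
        self.row_filters.get(&scan_id).cloned().unwrap_or_default()
    }
}

/// Shared helper so that no join path can forget the row-filter side channel.
fn publish(ctx: &mut Ctx, scan_id: usize, build_keys: &[i64]) {
    let filter = KeySetFilter {
        keys: build_keys.iter().copied().collect(),
    };
    ctx.add_row_runtime_filter(scan_id, Arc::new(filter));
}

/// Stand-in for the legacy hash-join runtime-filter path.
pub fn legacy_hash_join_publish(ctx: &mut Ctx, scan_id: usize, build_keys: &[i64]) {
    publish(ctx, scan_id, build_keys);
}

/// Stand-in for the new hash-join globalization path.
pub fn new_hash_join_publish(ctx: &mut Ctx, scan_id: usize, build_keys: &[i64]) {
    publish(ctx, scan_id, build_keys);
}
```

The design point is only that the two paths share one registration step; how the real globalization path builds its filters is not shown in this PR excerpt.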


chatgpt-codex-connector · 2026-04-16T15:26:15Z

+                let index = filter.load_index(&part, &operator).await?;
+                let index_ref = index.as_ref().map(|b| b.as_ref() as &dyn std::any::Any);
+                if filter.prune(&part, index_ref)? {


Avoid loading spatial index before stats pruning

The new index-filter loop loads index data before calling prune, so SpatialIndexFilter::load_index runs even when the fast bounding-box stats check would immediately reject a partition. Previously the spatial pruner checked stats first and only read index files when needed; this change adds avoidable index IO per partition and can significantly regress spatial runtime-filter performance on large scans.
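The ordering described here (cheap stats check first, index IO only when the stats cannot reject the partition) can be sketched as follows. The types are hypothetical simplifications: `BBox`, `Part`, and this `SpatialFilter` are not the real Databend structures, and the "index" is just a list of finer bounding boxes stored on the partition, with a counter standing in for index IO.

```rust
use std::cell::Cell;

#[derive(Clone, Copy, Debug, PartialEq)]
pub struct BBox {
    pub min_x: f64,
    pub min_y: f64,
    pub max_x: f64,
    pub max_y: f64,
}

impl BBox {
    /// Axis-aligned rectangles intersect when they overlap on both axes.
    pub fn intersects(&self, o: &BBox) -> bool {
        self.min_x <= o.max_x
            && o.min_x <= self.max_x
            && self.min_y <= o.max_y
            && o.min_y <= self.max_y
    }
}

/// Hypothetical partition: coarse stats plus a finer "index" of boxes.
pub struct Part {
    pub stats: BBox,
    pub index_boxes: Vec<BBox>,
}

pub struct SpatialFilter {
    pub query: BBox,
    /// Counts simulated index loads, standing in for index IO.
    pub index_loads: Cell<usize>,
}

impl SpatialFilter {
    pub fn new(query: BBox) -> Self {
        SpatialFilter { query, index_loads: Cell::new(0) }
    }

    /// Cheap check using only partition-level statistics.
    fn prune_by_stats(&self, part: &Part) -> bool {
        !self.query.intersects(&part.stats)
    }

    /// Simulated expensive step; a real implementation would read index files.
    fn load_index<'a>(&self, part: &'a Part) -> &'a [BBox] {
        self.index_loads.set(self.index_loads.get() + 1);
        &part.index_boxes
    }

    /// Returns true when the partition can be skipped.
    pub fn should_skip(&self, part: &Part) -> bool {
        if self.prune_by_stats(part) {
            return true; // rejected without touching the index
        }
        let index = self.load_index(part);
        !index.iter().any(|b| self.query.intersects(b))
    }
}
```

A partition whose stats box is disjoint from the query is rejected with zero index loads; only a partition that passes the stats check pays for the load.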


github-actions · 2026-04-16T16:10:32Z

🤖 CI Job Analysis (Retry 1)

Workflow: 24814057526

📊 Summary

Total Jobs: 87
Failed Jobs: 4
Retryable: 0
Code Issues: 4

❌ NO RETRY NEEDED

All failures appear to be code/test issues requiring manual fixes.

🔍 Job Details

❌ linux / sqllogic / standalone (standalone, 2c, hybrid): Not retryable (Code/Test)
❌ linux / sqllogic / standalone (standalone, 2c, http): Not retryable (Code/Test)
❌ linux / sqllogic / cluster (cluster, 2c, http): Not retryable (Code/Test)
❌ linux / sqllogic / cluster (cluster, 2c, hybrid): Not retryable (Code/Test)

🤖 About

Automated analysis using job annotations to distinguish infrastructure issues (auto-retried) from code/test issues (manual fixes needed).

- Add PartitionRuntimeFilters/IndexRuntimeFilters/RowRuntimeFilters type aliases
- Change RowRuntimeFilter::apply to take Column directly instead of &DataBlock
- Add column_name() to RowRuntimeFilter trait for schema-independent resolution
- BloomRowFilter::create returns Arc<dyn RowRuntimeFilter> directly
- ReadState resolves column indices via column_name() at init time

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
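The trait shape described in this commit message might look roughly like the sketch below. It reuses the names from the message (`RowRuntimeFilter`, `RowRuntimeFilters`, `BloomRowFilter::create`, `ReadState`, `column_name()`, `apply`), but every signature is an assumption: `Column` is reduced to a `Vec<i64>`, the bloom filter is replaced by an exact key set, and `resolved_indices` and `selection` are hypothetical helpers added for illustration.

```rust
use std::collections::HashSet;
use std::sync::Arc;

/// Hypothetical simplification: a column is just a vector of integers.
pub type Column = Vec<i64>;

/// Assumed trait shape; the real signatures in the PR may differ.
pub trait RowRuntimeFilter {
    /// Name used to find the column in whatever schema the reader has.
    fn column_name(&self) -> &str;
    /// One boolean per row: true means the row may match.
    fn apply(&self, column: &Column) -> Vec<bool>;
}

pub type RowRuntimeFilters = Vec<Arc<dyn RowRuntimeFilter>>;

pub struct BloomRowFilter {
    column: String,
    keys: HashSet<i64>, // exact set standing in for a bloom filter
}

impl BloomRowFilter {
    /// Returns the trait object directly, as the commit message describes.
    pub fn create(column: &str, keys: &[i64]) -> Arc<dyn RowRuntimeFilter> {
        Arc::new(BloomRowFilter {
            column: column.to_string(),
            keys: keys.iter().copied().collect(),
        })
    }
}

impl RowRuntimeFilter for BloomRowFilter {
    fn column_name(&self) -> &str {
        &self.column
    }

    fn apply(&self, column: &Column) -> Vec<bool> {
        column.iter().map(|v| self.keys.contains(v)).collect()
    }
}

pub struct ReadState {
    /// (column index in the read schema, filter), resolved once at init time.
    resolved: Vec<(usize, Arc<dyn RowRuntimeFilter>)>,
}

impl ReadState {
    pub fn new(schema: &[&str], filters: RowRuntimeFilters) -> Self {
        let resolved = filters
            .into_iter()
            .filter_map(|f| {
                // Filters on columns missing from this schema are dropped.
                let idx = schema.iter().position(|name| *name == f.column_name())?;
                Some((idx, f))
            })
            .collect();
        ReadState { resolved }
    }

    pub fn resolved_indices(&self) -> Vec<usize> {
        self.resolved.iter().map(|(i, _)| *i).collect()
    }

    /// ANDs the masks of all resolved filters over a block of columns.
    pub fn selection(&self, block: &[Column]) -> Vec<bool> {
        let rows = block.first().map_or(0, |c| c.len());
        let mut keep = vec![true; rows];
        for (idx, filter) in &self.resolved {
            let mask = filter.apply(&block[*idx]);
            for (k, m) in keep.iter_mut().zip(mask) {
                *k = *k && m;
            }
        }
        keep
    }
}
```

Resolving by name at init time means a filter carries no schema-specific index, so the same filter object can be handed to readers whose projections order columns differently.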

…nsform Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

refactor(fuse): abstract runtime filter into PartitionRuntimeFilter, …

ac50ce6

…IndexRuntimeFilter, and RowRuntimeFilter traits

github-actions Bot added the pr-refactor this PR changes the code base without new features or bugfix label Apr 16, 2026

chatgpt-codex-connector Bot reviewed Apr 16, 2026


z

43e04f2

zhang2014 and others added 5 commits April 19, 2026 12:57

z

ced7fd7

fix(fuse): increment RuntimeFilterPruneParts counter in read_data_tra…

f987c38

…nsform Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

z

9f4361e

merge upstream/main: resolve conflicts with sub-trait refactor

ecc59d9

zhang2014 force-pushed the refactor/runtime_filter_read_state branch from 89a48f7 to ecc59d9 Compare April 23, 2026 02:51

zhang2014 wants to merge 7 commits into databendlabs:main from zhang2014:refactor/runtime_filter_read_state

zhang2014 commented Apr 16, 2026 · edited by drmingdrmer
