Skip to content

perf: skip concurrency overhead in BaseSingleBlockCombineOperator when numTasks=1#18146

Open
deeppatel710 wants to merge 1 commit intoapache:masterfrom
deeppatel710:perf/single-thread-combine-operator
Open

perf: skip concurrency overhead in BaseSingleBlockCombineOperator when numTasks=1#18146
deeppatel710 wants to merge 1 commit intoapache:masterfrom
deeppatel710:perf/single-thread-combine-operator

Conversation

@deeppatel710
Copy link
Copy Markdown
Contributor

Summary

Fixes #14617

When a query runs with a single execution task (one segment, or maxExecutionThreads=1), BaseSingleBlockCombineOperator still incurred the full multi-thread overhead:

  • ExecutorService.submit() — thread pool task submission
  • Phaser — register/deregister synchronization
  • BlockingQueue.poll() — with timeout waiting
  • AtomicInteger / AtomicReference — volatile reads/writes on the hot path

None of this is necessary when _numTasks == 1.

Change

Added a getNextBlockSingleThread() fast path in BaseSingleBlockCombineOperator.getNextBlock(): when _numTasks == 1 and _resultsBlockMerger is non-null, all segments are processed sequentially on the
calling thread with no synchronization overhead. CPU time and memory allocation are still tracked via ThreadResourceSnapshot.

Subclasses that override mergeResults() with custom logic (e.g. SequentialSortedGroupByCombineOperator, which passes null for _resultsBlockMerger) are unaffected and fall through to the standard
multi-thread path.

Test plan

  • All existing combine operator tests pass (125 tests)
  • CombineSlowOperatorsTest, CombineErrorOperatorsTest, SelectionCombineOperatorTest, SortedGroupByCombineOperatorsTest, CombinePlanNodeTest — all green

…n numTasks=1

When a query runs with a single execution task (e.g. one segment or
maxExecutionThreads=1), BaseSingleBlockCombineOperator still incurred the
full multi-thread overhead: ExecutorService.submit(), Phaser registration/
deregistration, BlockingQueue.poll() with timeout, AtomicInteger, and
AtomicReference.

This adds a single-thread fast path in getNextBlock(): when _numTasks==1
and _resultsBlockMerger is non-null (i.e. the subclass uses the default
merge strategy), segments are processed sequentially on the calling thread
with none of that synchronization overhead. CPU time and memory are still
tracked via ThreadResourceSnapshot.

Subclasses that override mergeResults() with custom logic (e.g.
SequentialSortedGroupByCombineOperator, which passes null for
_resultsBlockMerger) are unaffected and continue using the standard path.

Fixes apache#14617

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Copy link
Copy Markdown
Contributor

@xiangfu0 xiangfu0 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I found a critical correctness issue with the fast path implementation:

Timeout Handling Missing: The new getNextBlockSingleThread() method lacks timeout protection that exists in the original mergeResults() method. The original method checks _queryContext.getEndTimeMs() and returns a timeout results block if the deadline is exceeded. The fast path has no such protection, which means:

  1. A hanging segment operator could block indefinitely
  2. No respect for query timeout deadline
  3. Potential resource exhaustion if a segment operator stalls

The original mergeResults() path handles this via the _blockingQueue.poll(waitTimeMs, TimeUnit.MILLISECONDS) with explicit timeout checking. The fast path should implement similar timeout protection.

}
}

/// Processes all segments sequentially on the calling thread when only one task is needed.
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Missing timeout protection. This fast path should respect _queryContext.getEndTimeMs() like the original mergeResults() method does. A hanging segment operator could block indefinitely here without timeout checking.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Add special combine operator for single thread case

2 participants