HBASE-30255 Async archiving wal file causes TestLogRolling flaky #8406

Merged
Apache9 merged 1 commit into apache:master from Apache9:HBASE-30255
Jun 25, 2026
Conversation

Apache9 commented Jun 24, 2026

Contributor

No description provided.

Copilot AI left a comment

Pull request overview

This PR addresses flakiness in AbstractTestLogRolling#testLogRolling caused by asynchronous WAL archiving by waiting until both the rolled WAL file count and tracked WAL size reach zero before asserting completion.

Changes:

  • Replace HBaseTestingUtil.waitFor + immediate size assertion with an Awaitility-based untilAsserted block that waits for both conditions.
  • Increase the maximum wait time to 15 seconds to accommodate async archival completion.
  • Add Awaitility and Duration imports to support the new wait logic.
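
The wait pattern described in the list above can be sketched roughly as follows. This is a self-contained illustration, not the code from the PR: the `untilAsserted` helper is a hand-written stand-in for Awaitility's `await().atMost(...).untilAsserted(...)`, and `rolledFiles`, `walSize`, and `archiveAsync` are hypothetical names that simulate a WAL whose files are archived on a background thread. The point it tries to show is that both conditions are re-checked together until they hold, instead of waiting on one and asserting the other immediately.

```java
import java.time.Duration;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.AtomicLong;

public class AsyncArchiveWaitSketch {
  // Hypothetical stand-ins for the WAL's rolled-file count and tracked size.
  static final AtomicInteger rolledFiles = new AtomicInteger();
  static final AtomicLong walSize = new AtomicLong();

  /** Re-runs the assertion until it passes or the timeout elapses (Awaitility stand-in). */
  static void untilAsserted(Duration timeout, Runnable assertion) throws InterruptedException {
    long deadline = System.nanoTime() + timeout.toNanos();
    while (true) {
      try {
        assertion.run();
        return; // every check inside the block passed in the same attempt
      } catch (AssertionError e) {
        if (System.nanoTime() >= deadline) {
          throw e; // surface the last failure once the time budget is spent
        }
        Thread.sleep(50);
      }
    }
  }

  static void check(boolean condition, String message) {
    if (!condition) {
      throw new AssertionError(message);
    }
  }

  /** Simulates asynchronous archiving: the size is updated slightly after the count. */
  static CompletableFuture<Void> archiveAsync() {
    return CompletableFuture.runAsync(() -> {
      try {
        while (rolledFiles.get() > 0) {
          TimeUnit.MILLISECONDS.sleep(100);
          rolledFiles.decrementAndGet();
          TimeUnit.MILLISECONDS.sleep(20); // window in which count and size disagree
          walSize.addAndGet(-1024L);
        }
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
      }
    });
  }

  static boolean runScenario() throws InterruptedException {
    rolledFiles.set(3);
    walSize.set(3 * 1024L);
    archiveAsync();
    // Wait for BOTH conditions; asserting the size right after the count
    // reaches zero would race with the background thread.
    untilAsserted(Duration.ofSeconds(15), () -> {
      check(rolledFiles.get() == 0, "rolled WAL files remain");
      check(walSize.get() == 0L, "WAL size not yet zero");
    });
    return rolledFiles.get() == 0 && walSize.get() == 0L;
  }

  public static void main(String[] args) throws InterruptedException {
    System.out.println("archived: " + runScenario());
  }
}
```

With Awaitility on the classpath, the helper would presumably be replaced by `Awaitility.await().atMost(Duration.ofSeconds(15)).untilAsserted(...)`, matching the 15-second limit mentioned above.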

PDavid left a comment

Contributor

Many thanks for fixing this. 👍

@guluo2016

Member

There is the same issue in TestMasterRegionWALCleaner.test()

// no archived wal files yet
assertFalse(fs.exists(globalWALArchiveDir));
region.requestRollAll();
region.waitUntilWalRollFinished();
// should have one
FileStatus[] files = fs.listStatus(globalWALArchiveDir);

Maybe we can also update the code at the same time @Apache9
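
One possible shape for that follow-up, sketched under assumptions: the snippet below uses `java.nio.file` as a local stand-in for the Hadoop `FileSystem` calls in the test, and `waitForArchivedFiles`, `runScenario`, and the directory layout are hypothetical names for illustration only. It polls until the archive directory exists and holds the expected number of files, so a listing is never attempted on a directory the background archiver has not created yet.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.time.Duration;
import java.util.concurrent.CompletableFuture;
import java.util.stream.Stream;

public class ArchiveDirWaitSketch {

  static int countFiles(Path dir) throws IOException {
    try (Stream<Path> entries = Files.list(dir)) {
      return (int) entries.count();
    }
  }

  /** Polls until archiveDir exists and contains at least `expected` files. */
  static int waitForArchivedFiles(Path archiveDir, int expected, Duration timeout)
      throws IOException, InterruptedException {
    long deadline = System.nanoTime() + timeout.toNanos();
    while (true) {
      // Only list the directory once it exists; listing too early is what
      // produces a "does not exist" failure.
      if (Files.isDirectory(archiveDir)) {
        int found = countFiles(archiveDir);
        if (found >= expected) {
          return found;
        }
      }
      if (System.nanoTime() >= deadline) {
        throw new AssertionError("no archived files in " + archiveDir + " within " + timeout);
      }
      Thread.sleep(50);
    }
  }

  static int runScenario() throws IOException, InterruptedException {
    Path root = Files.createTempDirectory("wal-sketch");
    Path walDir = Files.createDirectories(root.resolve("WALs"));
    Path wal = Files.createFile(walDir.resolve("wal.1"));
    Path archiveDir = root.resolve("oldWALs");

    // Simulated asynchronous archiver: creates the archive directory and
    // moves the rolled file some time after the roll itself has "finished".
    CompletableFuture.runAsync(() -> {
      try {
        Thread.sleep(200);
        Files.createDirectories(archiveDir);
        Files.move(wal, archiveDir.resolve(wal.getFileName()));
      } catch (IOException e) {
        throw new UncheckedIOException(e);
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
      }
    });

    return waitForArchivedFiles(archiveDir, 1, Duration.ofSeconds(15));
  }

  public static void main(String[] args) throws Exception {
    System.out.println("archived files: " + runScenario());
  }
}
```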

@guluo2016

Member

Sorry, forgot to attach the error logs. Here they are:

-------------------------------------------------------------------------------
Test set: org.apache.hadoop.hbase.master.region.TestMasterRegionWALCleaner
-------------------------------------------------------------------------------
Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 0.907 s <<< FAILURE! -- in org.apache.hadoop.hbase.master.region.TestMasterRegionWALCleaner
org.apache.hadoop.hbase.master.region.TestMasterRegionWALCleaner.test -- Time elapsed: 0.876 s <<< ERROR!
java.io.FileNotFoundException: File /home/runner/work/hbase/hbase/src/hbase-server/target/test-data/579f9f7e-bfa0-7ac2-3a7b-eacb6f71a7ce/oldWALs does not exist
	at org.apache.hadoop.fs.RawLocalFileSystem.listStatus(RawLocalFileSystem.java:798)
	at org.apache.hadoop.fs.FileSystem.listStatus(FileSystem.java:2078)
	at org.apache.hadoop.fs.FileSystem.listStatus(FileSystem.java:2122)
	at org.apache.hadoop.fs.ChecksumFileSystem.listStatus(ChecksumFileSystem.java:1020)
	at org.apache.hadoop.hbase.master.region.TestMasterRegionWALCleaner.test(TestMasterRegionWALCleaner.java:86)
	at java.base/java.lang.reflect.Method.invoke(Method.java:568)
	at org.apache.hadoop.hbase.HBaseJupiterExtension.lambda$runWithTimeout$0(HBaseJupiterExtension.java:159)
	at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
	at java.base/java.lang.Thread.run(Thread.java:840)

Apache9 commented Jun 25, 2026

Contributor Author

> There is the same issue in TestMasterRegionWALCleaner.test()
>
> // no archived wal files yet
> assertFalse(fs.exists(globalWALArchiveDir));
> region.requestRollAll();
> region.waitUntilWalRollFinished();
> // should have one
> FileStatus[] files = fs.listStatus(globalWALArchiveDir);
>
> Maybe we can also update the code at the same time @Apache9

I think you can file another issue for this one :)

@Apache9 Apache9 merged commit 481332e into apache:master Jun 25, 2026
8 checks passed
@guluo2016

Member

> I think you can file another issue for this one :)

OK, I'll create a separate PR later.

Apache9 added a commit that referenced this pull request Jun 25, 2026
Signed-off-by: Xiao Liu <liuxiaocs@apache.org>
Signed-off-by: Dávid Paksy <paksyd@apache.org>
(cherry picked from commit 481332e)

5 participants