[#10741] fix(hive): Fix keytab symlink TOCTOU race in FetchFileUtils#10742
[#10741] fix(hive): Fix keytab symlink TOCTOU race in FetchFileUtils#10742yuqi1129 merged 3 commits intoapache:mainfrom
Conversation
There was a problem hiding this comment.
Pull request overview
This PR addresses a concurrency bug in Hive Kerberos keytab fetching by making file: URI handling in FetchFileUtils resilient to concurrent symlink creation, and by removing duplicated Hive fetch utilities/tests in catalog-hive while adding coverage in hive-metastore-common.
Changes:
- Synchronize
file:symlink creation incatalogs/hive-metastore-commonFetchFileUtilsto avoid concurrentFileAlreadyExistsException. - Replace
exists() + delete()withFiles.deleteIfExists()inKerberosClient. - Remove the duplicate
FetchFileUtilsand its tests fromcatalog-hive, and add new unit tests underhive-metastore-common.
Reviewed changes
Copilot reviewed 5 out of 5 changed files in this pull request and generated 4 comments.
Show a summary per file
| File | Description |
|---|---|
catalogs/hive-metastore-common/src/main/java/org/apache/gravitino/hive/kerberos/FetchFileUtils.java |
Adds synchronization + delete/recreate behavior for file: symlink creation under concurrency. |
catalogs/hive-metastore-common/src/main/java/org/apache/gravitino/hive/kerberos/KerberosClient.java |
Switches to Files.deleteIfExists() prior to fetching keytab. |
catalogs/hive-metastore-common/src/test/java/org/apache/gravitino/hive/kerberos/TestFetchFileUtils.java |
Adds tests for local symlink behavior + concurrency + replacement scenarios (and a disabled HTTP test). |
catalogs/catalog-hive/src/main/java/org/apache/gravitino/catalog/hive/FetchFileUtils.java |
Removes duplicated implementation in catalog-hive. |
catalogs/catalog-hive/src/test/java/org/apache/gravitino/catalog/hive/TestFetchFileUtils.java |
Removes duplicated tests in catalog-hive. |
Code Coverage Report
Files
|
|
What's the meaning of |
|
TOCTOU is |
- Replace String.intern() locking with a ConcurrentHashMap lock map keyed by normalized absolute destination path to avoid races from different path spellings of the same file - Replace delete-then-create with an atomic tmp-symlink + rename (REPLACE_EXISTING) to eliminate the window where the keytab path is absent - Skip symlink creation when the existing link already points to the correct target (idempotent fast path) - Wrap ExecutorService in try/finally with shutdownNow in the concurrent test - Remove permanently-disabled testDownloadFromHTTP (external network dependency)
23a4407 to
adf6fe8
Compare
| * file. Keyed by the normalized absolute destination path string to avoid races caused by | ||
| * different path spellings referring to the same file. | ||
| */ | ||
| private static final ConcurrentHashMap<String, Object> SYMLINK_LOCKS = new ConcurrentHashMap<>(); |
There was a problem hiding this comment.
Close at the Kerberos client closed
There was a problem hiding this comment.
I did not see related close logic on it. As it's a static field, I believe it will only be clear when the classloader is closed.
There was a problem hiding this comment.
I added a function removeLock, which is called when the Kerberos client closes.
- Add removeLock() to FetchFileUtils and call it in KerberosClient.close() so SYMLINK_LOCKS entries are cleared when the catalog is closed - Remove pre-delete in saveKeyTabFileFromUri; FetchFileUtils handles idempotent replacement atomically for all schemes - Clarify comment that same-directory rename is effectively atomic on common local filesystems - Add 30s timeouts to barrier.await() and future.get() in the concurrency test
…10742) ### What changes were proposed in this pull request? Fix the `file:` scheme handler in `FetchFileUtils` to synchronize symlink creation and avoid concurrent races. Replace the non-atomic `exists() + delete()` in `KerberosClient` with `Files.deleteIfExists()`. Remove the unused duplicate `FetchFileUtils` in `catalog-hive` and add comprehensive tests in `hive-metastore-common`. ### Why are the changes needed? `Files.createSymbolicLink()` throws `FileAlreadyExistsException` when multiple threads concurrently fetch the same keytab path, such as when the search event listener triggers cascade sync after catalog creation. Fix: #10741 ### Does this PR introduce _any_ user-facing change? No. ### How was this patch tested? - added unit coverage in `TestFetchFileUtils` including concurrent, idempotent, and target-replacement scenarios
What changes were proposed in this pull request?
Fix the
file:scheme handler inFetchFileUtilsto synchronize symlink creation and avoid concurrent races. Replace the non-atomicexists() + delete()inKerberosClientwithFiles.deleteIfExists(). Remove the unused duplicateFetchFileUtilsincatalog-hiveand add comprehensive tests inhive-metastore-common.Why are the changes needed?
Files.createSymbolicLink()throwsFileAlreadyExistsExceptionwhen multiple threads concurrently fetch the same keytab path, such as when the search event listener triggers cascade sync after catalog creation.Fix: #10741
Does this PR introduce any user-facing change?
No.
How was this patch tested?
TestFetchFileUtilsincluding concurrent, idempotent, and target-replacement scenarios