fix(msfs): optimistic-lock copy to remove warm-read serialization#69
fix(msfs): optimistic-lock copy to remove warm-read serialization#69ankitmaster08 wants to merge 1 commit into
Conversation
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: Path: .coderabbit.yaml Review profile: CHILL Plan: Enterprise Run ID: 📒 Files selected for processing (10)
✅ Files skipped from review due to trivial changes (2)
🚧 Files skipped from review as they are similar to previous changes (7)
📝 WalkthroughWalkthroughAdds optional AIStore ChangesMulti-Storage File System: Manifest Routing & Cache Resilience
Sequence Diagram(s)sequenceDiagram
participant Client
participant DoRead
participant CacheLine
participant InodeCache
Client->>DoRead: read request (offset N)
DoRead->>CacheLine: snapshot contentGeneration
DoRead->>InodeCache: unlock global lock
DoRead->>CacheLine: memcpy bytes into reply buffer
DoRead->>InodeCache: re-acquire global lock
DoRead->>CacheLine: compare generation -> unchanged or changed
alt unchanged
DoRead->>Client: return bytes
else changed
DoRead->>DoRead: truncate reply, retry same offset
end
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Warning There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure. 🔧 golangci-lint (2.12.2)level=error msg="[linters_context] typechecking error: pattern ./...: directory prefix . does not contain main module or its selected dependencies" Comment |
There was a problem hiding this comment.
Actionable comments posted: 2
🧹 Nitpick comments (1)
multi-storage-file-system/config_test.go (1)
708-809: ⚡ Quick winAdd reload-path coverage for
manifest_gen_backend.These tests cover first-load behavior only. Please add a SIGHUP-style second
checkConfigFile()scenario that mutatesmanifest_gen_backendand asserts the intended behavior (reject or remount/apply), so routing drift can’t regress silently.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@multi-storage-file-system/config_test.go` around lines 708 - 809, Add a second "reload" step in both TestManifestGenBackendReference and TestManifestGenBackendMissing that simulates a SIGHUP-style config reload: after the first successful checkConfigFile(), write a modified config to globals.configFilePath changing the AIStore.manifest_gen_backend value (for TestManifestGenBackendReference change it to another existing backend or remove it) and call checkConfigFile() again; then assert the new expected behavior by re-reading globals.backendsToMount["ais"], checking aisCfg.manifestGenBackendName and aisCfg.manifestGenBackend (for successful remount/update) or asserting checkConfigFile() returns an error when the new manifest_gen_backend is missing (for TestManifestGenBackendMissing). Ensure you use the existing symbols checkConfigFile, globals.backendsToMount, aisCfg.manifestGenBackendName and aisCfg.manifestGenBackend to locate and validate the updated state.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@multi-storage-file-system/config.go`:
- Around line 1799-1830: During SIGHUP reload immutability checks, ensure
AIStore backends' manifest_gen_backend is handled: compare the new
aisCfg.manifestGenBackendName against the old config's manifestGenBackendName
for the same backend (aisBackend.dirName) and reject the reload with an error if
it changed; alternatively, if you choose to allow changes, re-resolve
aisCfg.manifestGenBackend from config.backends (the same resolution logic that
sets aisCfg.manifestGenBackend) so the runtime pointer matches the new name.
Update the immutability check block that inspects AIStore backends to reference
manifestGenBackendName and either fail the reload on mismatch or refresh the
resolved manifestGenBackend pointer accordingly.
In `@multi-storage-file-system/README.md`:
- Line 149: Update the README entry for manifest_gen_backend to soften the
requirement from "must" to "should" to match runtime behavior: change the phrase
"must match bucket/prefix" to "should match bucket/prefix" (or equivalent
wording), and optionally note that checkConfigFile() currently only warns on
mismatches rather than failing; reference the manifest_gen_backend field in the
README and the checkConfigFile() function so reviewers can see the doc now
reflects the code behavior (or choose to instead make checkConfigFile() enforce
an error if you prefer strict enforcement).
---
Nitpick comments:
In `@multi-storage-file-system/config_test.go`:
- Around line 708-809: Add a second "reload" step in both
TestManifestGenBackendReference and TestManifestGenBackendMissing that simulates
a SIGHUP-style config reload: after the first successful checkConfigFile(),
write a modified config to globals.configFilePath changing the
AIStore.manifest_gen_backend value (for TestManifestGenBackendReference change
it to another existing backend or remove it) and call checkConfigFile() again;
then assert the new expected behavior by re-reading
globals.backendsToMount["ais"], checking aisCfg.manifestGenBackendName and
aisCfg.manifestGenBackend (for successful remount/update) or asserting
checkConfigFile() returns an error when the new manifest_gen_backend is missing
(for TestManifestGenBackendMissing). Ensure you use the existing symbols
checkConfigFile, globals.backendsToMount, aisCfg.manifestGenBackendName and
aisCfg.manifestGenBackend to locate and validate the updated state.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Enterprise
Run ID: 4137d3b2-a9f3-4360-ae4e-ad86939eeeb8
📒 Files selected for processing (10)
agent-learnings.mdmulti-storage-file-system/README.mdmulti-storage-file-system/backend_aistore.gomulti-storage-file-system/cache.gomulti-storage-file-system/config.gomulti-storage-file-system/config_test.gomulti-storage-file-system/fission.gomulti-storage-file-system/fission_test.gomulti-storage-file-system/fs.gomulti-storage-file-system/globals.go
| // Resolve AIStore `manifest_gen_backend` references now that all backends are | ||
| // parsed. Each reference must name another, non-AIStore backend in this config; | ||
| // that backend serves LIST/STAT-DIR while reads continue via AIS. | ||
| for _, aisBackend := range config.backends { | ||
| if aisBackend.backendType != "AIStore" { | ||
| continue | ||
| } | ||
| aisCfg := aisBackend.backendTypeSpecifics.(*backendConfigAIStoreStruct) | ||
| if aisCfg.manifestGenBackendName == "" { | ||
| continue | ||
| } | ||
| if aisCfg.manifestGenBackendName == aisBackend.dirName { | ||
| err = fmt.Errorf("AIStore backend %q manifest_gen_backend cannot reference itself", aisBackend.dirName) | ||
| return | ||
| } | ||
| srcBackend, found := config.backends[aisCfg.manifestGenBackendName] | ||
| if !found { | ||
| err = fmt.Errorf("AIStore backend %q manifest_gen_backend %q not found among configured backends", aisBackend.dirName, aisCfg.manifestGenBackendName) | ||
| return | ||
| } | ||
| if srcBackend.backendType == "AIStore" { | ||
| err = fmt.Errorf("AIStore backend %q manifest_gen_backend %q must not be an AIStore backend", aisBackend.dirName, aisCfg.manifestGenBackendName) | ||
| return | ||
| } | ||
| if srcBackend.bucketContainerName != aisBackend.bucketContainerName || srcBackend.prefix != aisBackend.prefix { | ||
| globals.logger.Printf("[WARN] AIStore backend %q manifest_gen_backend %q targets bucket/prefix %q/%q which differs from %q/%q; listing and reads may not align", | ||
| aisBackend.dirName, aisCfg.manifestGenBackendName, | ||
| srcBackend.bucketContainerName, srcBackend.prefix, | ||
| aisBackend.bucketContainerName, aisBackend.prefix) | ||
| } | ||
| aisCfg.manifestGenBackend = srcBackend | ||
| } |
There was a problem hiding this comment.
Handle manifest_gen_backend changes on reload (or reject them explicitly).
Line 1799 introduces a new resolved routing field, but the SIGHUP immutability checks for AIStore backends (around Line 2173+) do not compare manifestGenBackendName. A reload that changes manifest_gen_backend can succeed yet keep old routing/context behavior, causing silent config drift.
🔧 Suggested fix
case "AIStore":
+ if backendAsStructOld.backendTypeSpecifics.(*backendConfigAIStoreStruct).manifestGenBackendName !=
+ backendAsStructNew.backendTypeSpecifics.(*backendConfigAIStoreStruct).manifestGenBackendName {
+ err = fmt.Errorf("cannot change AIStore.manifest_gen_backend in backends[\"%s\"]", dirName)
+ return
+ }
+
if backendAsStructOld.backendTypeSpecifics.(*backendConfigAIStoreStruct).endpoint != backendAsStructNew.backendTypeSpecifics.(*backendConfigAIStoreStruct).endpoint {
err = fmt.Errorf("cannot change AIStore.endpoint in backends[\"%s\"]", dirName)
return
}🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@multi-storage-file-system/config.go` around lines 1799 - 1830, During SIGHUP
reload immutability checks, ensure AIStore backends' manifest_gen_backend is
handled: compare the new aisCfg.manifestGenBackendName against the old config's
manifestGenBackendName for the same backend (aisBackend.dirName) and reject the
reload with an error if it changed; alternatively, if you choose to allow
changes, re-resolve aisCfg.manifestGenBackend from config.backends (the same
resolution logic that sets aisCfg.manifestGenBackend) so the runtime pointer
matches the new name. Update the immutability check block that inspects AIStore
backends to reference manifestGenBackendName and either fail the reload on
mismatch or refresh the resolved manifestGenBackend pointer accordingly.
| | authnTokenFile | string | "${AIS_AUTHN_TOKEN_FILE:=~/.config/ais/cli/auth.token}" | If != "", specifies location of AUTHN Token file | | ||
| | provider | string | "s3" | IF != "ais", specifies the backend of which bucket contents are cached | | ||
| | timeout | decimal milliseconds | 30000 | Limit on allowed duration of requests (including retries) | | ||
| | manifest_gen_backend | string | "" | IF != "", `dir_name` of another (non-AIStore) backend used for LIST/STAT-DIR (manifest generation, readdir); object reads still go via AIS. Lets listing hit the underlying store (e.g. S3) directly while reads benefit from AIS caching. The referenced backend must target the same bucket/prefix. | |
There was a problem hiding this comment.
Documentation overstates bucket/prefix enforcement.
Line 149 says the referenced backend must match bucket/prefix, but checkConfigFile() currently only warns on mismatch and still proceeds. Please change wording to “should” (or enforce as an error in code).
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@multi-storage-file-system/README.md` at line 149, Update the README entry for
manifest_gen_backend to soften the requirement from "must" to "should" to match
runtime behavior: change the phrase "must match bucket/prefix" to "should match
bucket/prefix" (or equivalent wording), and optionally note that
checkConfigFile() currently only warns on mismatches rather than failing;
reference the manifest_gen_backend field in the README and the checkConfigFile()
function so reviewers can see the doc now reflects the code behavior (or choose
to instead make checkConfigFile() enforce an error if you prefer strict
enforcement).
c85ce61 to
e0f8e71
Compare
Moves the data-cache warm-read copy (cache line -> FUSE reply) out of the global lock using the data cache's contentGeneration counter: snapshot the generation + range under the lock, copy unlocked, then re-acquire and re-check the generation, rolling back the appended bytes and retrying on mismatch. This removes the single-lock serialization that capped warm-read throughput as application threads scaled. Result (OCI ws-fits, P4.8 warm): 1,861 -> 2,866 MiB/s (+54%); mean warm-read latency ~210 -> ~117 us. Cold path and default behavior unchanged; warm backend fetches remain 0. - cache.go/fission.go: optimistic copy-outside-lock serve path - backend_aistore.go/config.go/globals.go: supporting plumbing - Do not log backend-derived values in the backend setup-failure and manifest_gen mismatch warnings (CodeQL go/clear-text-logging): the secret taints those structures, so the warnings log a fixed message only. - tests: config_test.go, fission_test.go
e0f8e71 to
f3f507e
Compare
Summary
Removes the warm-read lock serialization in the MSFS data cache by moving the cache-line → FUSE reply copy out of the global lock, using the data cache's existing
contentGenerationcounter:touch()the LRU, clamp the range, and snapshotcontentGeneration(+ source offset/length, reply length).dataCacheLinesContentinto the reply (safe — that buffer/tracker are allocated once for the mount's life and never moved).contentGeneration; everyfree()/fetch()bumps it, so on a mid-copy evict/refetch the generation mismatches and we roll back the appended bytes and retry. Common path matches and commits.Only the cheap lookup/
touchstays under the lock; byte movement now runs concurrently.Results (OCI ws-fits, P4.8 warm — all cache-resident, 0 backend fetches)
Win concentrated on the 64 KiB families (P2/P4/P6) at 4–8 threads; flat on the FUSE-op-rate-bound 4 KiB families.
Files
cache.go,fission.go— optimistic copy-outside-lock serve pathbackend_aistore.go,config.go,globals.go— supporting plumbingconfig_test.go,fission_test.goTest plan
go build ./...,go test .greengofmt/go vetcleanSummary by CodeRabbit
New Features
Bug Fixes
Documentation
Tests