feat(shrex): add resource limiting to shrex server #4898
Open
Note for the release: this is a config-breaking change and requires operators to call …
The defaults now autoscale with the available memory. The entire configuration is derived from protocol heuristics and parameters, and nothing is exposed as configuration. The server aims to use as much RAM as is provisioned to the process: the more RAM, the more peers can be served concurrently.
Introduce principled libp2p resource manager limits and exact per-request memory reservation for the shrex server.

Memory reservation:
- Add ResponseSize(edsSize int) to each shwap ID type so the server can compute the exact bytes it will read before touching the accessor.
- Add Size(ctx) to shwap.Accessor to look up the actual EDS square size from the open file.
- The handler now calls file.Size() → ResponseSize(edsSize) → ReserveMemory() before ResponseReader(), replacing a flat per-stream constant with an exact per-request budget.

Resource limits (new limits.go):
- All service and protocol limits are derived from a single source of truth: maxResponseSize at MaxSquareSize (e.g. 32 MiB for a 512-wide square).
- streamIncrease = 1 GiB / maxResponseSize keeps stream and memory budgets self-consistent and auto-adjusts if response sizes change.
- Service per-peer cap = streamIncrease / minSimultaneousPeers ensures at least N peers can be fully served in parallel.
- Protocol per-peer caps are DAS-workload heuristics (256 for SampleID, 16 for namespace/row, 8 for EDS) and fire at stream creation before the handler runs and before any memory is reserved.
- Memory is tracked only at the service scope to avoid double-counting with protocol-scope limits.

Other:
- Fix double-reset: statusResourceExhausted no longer calls s.Reset() after handleDataRequest already called ResetWithError(StreamResourceLimitExceeded).
- Per-peer rate limiting is deferred to a follow-up; rcmgr stream limits provide primary backpressure for this change.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
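The limit derivation in this commit message can be illustrated with a few lines of arithmetic. This is only a sketch, not the contents of limits.go: the `Limits` struct, the `DeriveLimits` helper, and the value 4 for minSimultaneousPeers are assumptions made for the example. The 32 MiB worst-case response size is the figure quoted above.

```go
package main

import "fmt"

const (
	MiB = 1 << 20
	GiB = 1 << 30
)

// Limits groups the values derived from the worst-case response size.
// The struct and its field names are hypothetical, chosen for this sketch.
type Limits struct {
	StreamIncrease     int // extra streams granted per 1 GiB of memory
	ServicePeerStreams int // service-scope stream cap for a single peer
}

// DeriveLimits applies the two formulas from the commit message:
//
//	streamIncrease       = 1 GiB / maxResponseSize
//	service per-peer cap = streamIncrease / minSimultaneousPeers
func DeriveLimits(maxResponseSize, minSimultaneousPeers int) Limits {
	streamIncrease := GiB / maxResponseSize
	return Limits{
		StreamIncrease:     streamIncrease,
		ServicePeerStreams: streamIncrease / minSimultaneousPeers,
	}
}

func main() {
	// 32 MiB is the quoted worst case; 4 peers is an assumed value.
	l := DeriveLimits(32*MiB, 4)
	fmt.Printf("streams per GiB: %d, per-peer cap: %d\n",
		l.StreamIncrease, l.ServicePeerStreams)
	// prints "streams per GiB: 32, per-peer cap: 8"
}
```

Because both values come from maxResponseSize, a change in the worst-case response size moves the stream and per-peer limits together, which is the self-consistency the commit message describes.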

When the remote peer's resource manager rejects a stream because a limit was hit (stream count or memory budget), the libp2p stream reset carries the error code StreamResourceLimitExceeded. Surface this as ErrResourceExhausted in the client and treat it as a temporary overload: cool down the peer and retry on a different one rather than blacklisting it.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Adds libp2p resource manager integration to the shrex server: exact per-request memory reservation and principled, hardware-scaled stream limits.

Exact memory reservation: `ResponseSize(edsSize int)` is added to each shwap ID type and `Size(ctx)` to `shwap.Accessor`. The handler now calls `file.Size()` → `ResponseSize(edsSize)` → `ReserveMemory()` before `ResponseReader()`, so the resource manager budget reflects real allocations per request type rather than a generic estimate.

Resource limits (`limits.go`): all service and protocol limits derive from a single value, the worst-case response size across registered request types at `MaxSquareSize` (~32 MiB for a 512-wide square). Stream and memory budgets are kept self-consistent via `streamIncrease = 1 GiB / maxResponseSize` and scale with hardware through `BaseLimitIncrease`. Protocol per-peer caps (256 for `SampleID`, 16 for namespace/row, 8 for EDS) are DAS-workload heuristics that fire at stream creation, before the handler runs. Memory is tracked at the service scope only, since `ReserveMemory` is called inside the handler after `SetService`.

Other: fix a double-reset where `s.Reset()` was called after `handleDataRequest` had already called `ResetWithError(StreamResourceLimitExceeded)`, silently overwriting the error code. Per-peer rate limiting is deferred to a follow-up. Add client-side handling for resource-limit errors so that peers are put on cooldown rather than blocklisted.

Closes https://linear.app/celestia/issue/PROTOCO-1326/handle-celestia-246