Skip to content

feat(shrex): add resource limitting to shrex server#4898

Open
Wondertan wants to merge 2 commits intomainfrom
feat/shrex-rate-limiting
Open

feat(shrex): add resource limitting to shrex server#4898
Wondertan wants to merge 2 commits intomainfrom
feat/shrex-rate-limiting

Conversation

@Wondertan
Copy link
Copy Markdown
Member

@Wondertan Wondertan commented Apr 2, 2026

Adds libp2p resource manager integration to the shrex server: exact per-request memory reservation and principled, hardware-scaled stream limits.

Exact memory reservationResponseSize(edsSize int) is added to each shwap ID type and Size(ctx) to shwap.Accessor. The handler now calls file.Size()ResponseSize(edsSize)ReserveMemory() before
ResponseReader(), so the resource manager budget reflects real allocations per request type rather than a generic estimate.

Resource limits (limits.go) — all service and protocol limits derive from a single value: the worst-case response
size across registered request types at MaxSquareSize (~32 MiB for a 512-wide square). Stream and memory budgets are kept self-consistent via streamIncrease = 1 GiB / maxResponseSize and scale with hardware through BaseLimitIncrease. Protocol per-peer caps (256 for SampleID, 16 for namespace/row, 8 for EDS) are DAS-workload heuristics that fire at stream creation before the handler runs. Memory is tracked at the service scope only, since ReserveMemory is called inside the handler after SetService.

Other — fix a double-reset where s.Reset() was called after handleDataRequest already called ResetWithError(StreamResourceLimitExceeded), silently overwriting the error code. Per-peer rate limiting is deferred to a follow-up. Add client-side handling for limits, ensuring peers are set on cooldown, rather than blocklisted.

Closes https://linear.app/celestia/issue/PROTOCO-1326/handle-celestia-246

@Wondertan Wondertan requested a review from a team as a code owner April 2, 2026 14:47
@Wondertan Wondertan requested a review from ninabarbakadze April 2, 2026 14:47
@Wondertan Wondertan self-assigned this Apr 2, 2026
@Wondertan Wondertan force-pushed the feat/shrex-rate-limiting branch from 90d6dba to d10fac7 Compare April 2, 2026 14:51
@Wondertan Wondertan changed the title feat(shrex): add rate limiting and resource management to shrex server feat(shrex)!: add rate limiting and resource management to shrex server Apr 2, 2026
@Wondertan
Copy link
Copy Markdown
Member Author

Note for the release. This is a config breaking and requires operators to call config-update manually.

@Wondertan Wondertan force-pushed the feat/shrex-rate-limiting branch 2 times, most recently from 3a0d1c1 to d0fac9a Compare April 3, 2026 18:25
@Wondertan
Copy link
Copy Markdown
Member Author

  • Does defaults meet the product expectations?
  • Do we want to make all, subset, or none of the rate tunables with the config? Currently, all are exposed.

The defaults autoscale now with the available memory. The entire configuration is now derived from protocol heuristics and params, and nothing is exposed as configuration. It aims to saturate as much RAM as is provisioned to the process. More RAM, more concurrent peers can be processed.

@Wondertan Wondertan changed the title feat(shrex)!: add rate limiting and resource management to shrex server feat(shrex): add rate limiting and resource management to shrex server Apr 3, 2026
@Wondertan Wondertan changed the title feat(shrex): add rate limiting and resource management to shrex server feat(shrex): add resource limitting to shrex server Apr 3, 2026
@Wondertan Wondertan force-pushed the feat/shrex-rate-limiting branch from 8ffc50e to 554dc92 Compare April 3, 2026 19:45
devin-ai-integration[bot]

This comment was marked as resolved.

@Wondertan Wondertan force-pushed the feat/shrex-rate-limiting branch 4 times, most recently from 64a0f19 to 45af3ae Compare April 4, 2026 13:52
Copy link
Copy Markdown
Contributor

@devin-ai-integration devin-ai-integration bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Devin Review found 1 new potential issue.

View 16 additional findings in Devin Review.

Open in Devin Review

@Wondertan Wondertan force-pushed the feat/shrex-rate-limiting branch from 45af3ae to 4a80adb Compare April 4, 2026 14:22
Wondertan and others added 2 commits April 4, 2026 18:23
Introduce principled libp2p resource manager limits and exact per-request
memory reservation for the shrex server.

Memory reservation:
- Add ResponseSize(edsSize int) to each shwap ID type so the server can
  compute the exact bytes it will read before touching the accessor.
- Add Size(ctx) to shwap.Accessor to look up the actual EDS square size
  from the open file.
- The handler now calls file.Size() → ResponseSize(edsSize) → ReserveMemory()
  before ResponseReader(), replacing a flat per-stream constant with an
  exact per-request budget.

Resource limits (new limits.go):
- All service and protocol limits are derived from a single source of
  truth: maxResponseSize at MaxSquareSize (e.g. 32 MiB for a 512-wide square).
- streamIncrease = 1 GiB / maxResponseSize keeps stream and memory budgets
  self-consistent and auto-adjusts if response sizes change.
- Service per-peer cap = streamIncrease / minSimultaneousPeers ensures at
  least N peers can be fully served in parallel.
- Protocol per-peer caps are DAS-workload heuristics (256 for SampleID,
  16 for namespace/row, 8 for EDS) and fire at stream creation before the
  handler runs and before any memory is reserved.
- Memory is tracked only at the service scope to avoid double-counting
  with protocol-scope limits.

Other:
- Fix double-reset: statusResourceExhausted no longer calls s.Reset() after
  handleDataRequest already called ResetWithError(StreamResourceLimitExceeded).
- Per-peer rate limiting is deferred to a follow-up; rcmgr stream limits
  provide primary backpressure for this change.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
When the remote peer's resource manager rejects a stream due to hitting
a limit (stream count or memory budget), the libp2p stream reset carries
ErrorCode StreamResourceLimitExceeded. Surface this as ErrResourceExhausted
in the client and treat it as a temporary overload — cooldown the peer and
retry on a different one rather than blacklisting.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@Wondertan Wondertan force-pushed the feat/shrex-rate-limiting branch from 4a80adb to 86ac4d3 Compare April 4, 2026 14:24
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant