feat(ai-rate-limiting): add expression-based limit strategy #13191

nic-6443 merged 7 commits into apache:master
Conversation
Add a new 'expression' option for the limit_strategy field in the ai-rate-limiting plugin, allowing users to define custom Lua arithmetic expressions for dynamic token cost calculation. When limit_strategy is set to 'expression', the plugin evaluates the user-defined cost_expr against the raw LLM API usage response fields (e.g., input_tokens, cache_creation_input_tokens, output_tokens). Missing variables default to 0, and safe math functions (abs, ceil, floor, max, min) are available.

This enables use cases like:
- Cache-aware billing: input_tokens + cache_creation_input_tokens
- Weighted costs: input_tokens + cache_read_input_tokens * 0.1 + output_tokens
- Provider-specific fields: any numeric field from the raw usage response
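The evaluation semantics described above (missing variables default to 0, only a small set of safe math functions is exposed) can be sketched outside the plugin. This is an illustrative Python model only; the plugin itself compiles a sandboxed Lua expression, and `eval_cost` is a hypothetical name, not the plugin's API:

```python
import math
from collections import defaultdict

# Safe math functions exposed to cost expressions, per the description above.
SAFE_FUNCS = {"abs": abs, "ceil": math.ceil, "floor": math.floor,
              "max": max, "min": min}

def eval_cost(expr, raw_usage):
    """Evaluate a cost expression against raw LLM usage fields.

    Missing variables default to 0; only numeric usage fields are exposed.
    """
    # defaultdict(int) makes any unknown variable resolve to 0.
    env = defaultdict(int, SAFE_FUNCS)
    env.update({k: v for k, v in raw_usage.items()
                if isinstance(v, (int, float))})
    # An empty __builtins__ approximates the plugin's sandboxed environment.
    return eval(expr, {"__builtins__": {}}, env)
```

For example, evaluating the weighted-cost expression above against a usage payload that lacks cache_read_input_tokens simply treats that field as 0.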
…alculation

The open-source limit-count module includes the peek cost (1) in the remaining header during the dry_run access phase, unlike the enterprise limit-count-advanced module. Adjust all expected remaining values by -1 to match this behavior.
Pull request overview
Note
Copilot was unable to run its full agentic suite in this review.
Adds an expression-based token cost strategy to the ai-rate-limiting plugin so users can compute rate-limit cost from provider-specific usage fields via a Lua arithmetic expression.
Changes:
- Extends `limit_strategy` with `"expression"` and adds `cost_expr` to the plugin schema.
- Introduces sandboxed compilation/evaluation of expressions against `ctx.llm_raw_usage`.
- Adds a dedicated test suite covering schema validation and (non-)streaming Anthropic scenarios.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 13 comments.
| File | Description |
|---|---|
| apisix/plugins/ai-rate-limiting.lua | Adds expression strategy, schema field, and runtime expression evaluation for token-cost calculation. |
| t/plugin/ai-rate-limiting-expression.t | Adds integration tests validating expression config and Anthropic streaming/non-streaming behavior. |
- Prevent raw usage fields from shadowing safe math functions (e.g., a field named 'math' or 'abs' from the LLM response)
- Reject non-finite values (NaN/inf) from expression results
- Clamp negative expression results to 0 instead of crediting tokens
- Add a test for a negative expression result (cache_read > input)
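A minimal sketch of the result-validation rules listed above (illustrative Python; the hardening itself lives in the Lua plugin, and `sanitize_cost` is a hypothetical name):

```python
import math

def sanitize_cost(value):
    # Reject anything that is not a finite number (filters NaN and +/-inf).
    if not isinstance(value, (int, float)) or not math.isfinite(value):
        return None
    # Clamp negative results to 0 so a cache-heavy request never credits
    # tokens back to the consumer.
    return max(0, value)
```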
When expression evaluates to a negative value that gets clamped to 0, calling rate_limit() with cost=0 triggers an assertion failure in resty.limit.count's incoming function. Skip the call entirely when used_tokens is 0 since there's nothing to deduct.
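The fix described above amounts to a simple guard around the limiter call. A sketch in illustrative Python (names are hypothetical; the real code is Lua):

```python
def charge_tokens(used_tokens, rate_limit):
    # Skip the limiter entirely when there is nothing to deduct;
    # calling it with cost=0 would trip the assertion in
    # resty.limit.count's incoming().
    if used_tokens <= 0:
        return None
    return rate_limit(used_tokens)
```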
Fixed the CI failure in TEST 13: when the expression evaluates to a negative value that gets clamped to 0, calling rate_limit() with cost=0 triggers an assertion failure in resty.limit.count's incoming function, so the call is now skipped entirely when used_tokens is 0.
…tics

The lua-resty-limit-traffic library is being upgraded from v1.0.0 to v1.2.0 in the apisix-runtime build. Key library change: incoming_new() now counts UP (returns consumed) instead of DOWN (returns remaining).

Changes:
- limit-count-local.lua: Convert the consumed return value to remaining (remaining = limit - consumed), matching the enterprise limit-count-advanced module. When commit=false (dry_run), pass cost=0 to the library so it reads the current state without deducting, eliminating the off-by-1 in the remaining header.
- limit-count/init.lua: Add the dry_run rejection check inside the local-policy branch only (not redis, which always commits and has no dry_run support).
- ai-rate-limiting-expression.t: Revert remaining-header expectations to match the enterprise values now that dry_run shows an accurate remaining count.
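A toy model of the count-UP semantics and the dry-run peek described above (illustrative Python only; this is not the lua-resty-limit-traffic API):

```python
def incoming(counters, key, limit, cost):
    # v1.2.0-style semantics: the counter counts UP and returns the
    # consumed total, rejecting when the limit would be exceeded.
    consumed = counters.get(key, 0) + cost
    if consumed > limit:
        return None
    counters[key] = consumed
    return consumed

def remaining(counters, key, limit, cost, commit):
    # Convert consumed -> remaining. With commit=False (dry_run), pass
    # cost=0 so the current state is read without deducting, avoiding
    # the off-by-1 in the remaining header.
    consumed = incoming(counters, key, limit, cost if commit else 0)
    if consumed is None:
        return None
    return limit - consumed
```

With this model, a dry-run peek before any traffic reports the full limit, and subsequent peeks report exactly what a committed request left behind.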
….0 semantics" This reverts commit 98ce8f3.
Description
This PR adds the `expression` limit strategy to the `ai-rate-limiting` plugin.

Expression strategy

The `expression` limit strategy allows defining rate limit groups using lua-resty-expr expressions. Each group can have its own `count`, `time_window`, and matching `expression`. When a request matches multiple groups, the first matching group is used. If no group matches, the request is passed through without rate limiting.

This enables fine-grained AI token rate limiting based on request attributes (headers, query params, variables, etc.).
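The first-match group selection described above can be sketched as follows (illustrative Python; real matching uses lua-resty-expr, and only the `==`/`~=` operators are modeled here):

```python
def match_rule(request_vars, rule):
    # A lua-resty-expr rule is a [variable, operator, value] triple.
    var, op, value = rule
    actual = request_vars.get(var)
    if op == "==":
        return actual == value
    if op == "~=":
        return actual != value
    raise ValueError("operator not modeled: " + op)

def pick_group(limit_groups, request_vars):
    # First matching group wins; all rules in a group must match.
    # Returning None means the request passes through without limiting.
    for group in limit_groups:
        if all(match_rule(request_vars, r) for r in group["expression"]):
            return group
    return None
```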
Example config

```json
{
  "limit_strategy": "expression",
  "cost_expr": "input_tokens + completion_tokens",
  "limit_groups": [
    { "expression": [["http_x_model", "==", "gpt-4"]], "count": 500, "time_window": 60 },
    { "expression": [["http_x_model", "==", "gpt-3.5"]], "count": 1000, "time_window": 60 }
  ]
}
```

Checklist
Note on remaining header accuracy
The `X-AI-RateLimit-Remaining` header currently shows a value that is off by 1 (e.g., 499 instead of 500) because the `limit-count` module deducts cost during the access-phase dry-run peek. This will be fixed in a follow-up PR after apisix-build-tools#455 merges and a new `apisix-runtime` is released with lua-resty-limit-traffic v1.2.0, which supports cost=0 for non-deducting peeks.