fix: add null_as_nil option to core.json.decode for AI plugins#13231

Merged
nic-6443 merged 13 commits into apache:master from nic-6443:fix/cjson-null-safety-ai-protocols on Apr 18, 2026

Conversation

@nic-6443 (Member) commented Apr 16, 2026

Problem

When LLM providers (e.g., vLLM) return JSON null for nullable fields like prompt_tokens_details, usage, message, etc., cjson.decode converts them to cjson.null (a Lua userdata value that is truthy). This causes crashes when code does if x then x.field end because cjson.null passes the truthiness check but can't be indexed.
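A minimal Lua sketch of the failure mode (field names follow the ones above; this is a generic illustration, not code from the PR):

```lua
local cjson = require("cjson.safe")

-- simulate a provider response where "usage" is JSON null
local body = cjson.decode('{"usage": null}')

-- cjson.null is a userdata sentinel, so this truthiness guard passes
if body.usage then
    -- runtime error: attempt to index a userdata value
    local tokens = body.usage.prompt_tokens
end
```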

Solution

Instead of adding scattered type(x) == "table" guards at every field access (error-prone, easy to forget for new fields), this PR takes a centralized approach:

  1. Extend core.json.decode(str, opts) with a null_as_nil option that recursively replaces cjson.null with nil after decode
  2. Use it at all JSON decode points in AI plugins (both non-streaming parse_response and SSE event parsers)

This makes standard Lua idioms (if x then, x or 0, x or {}) work correctly on decoded JSON fields that may be null, without any per-field modifications.
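A sketch of the centralized approach, assuming the helper and option names introduced by this PR (`strip_nulls`, `null_as_nil`); the actual implementation lives in apisix/core/json.lua and may differ in detail:

```lua
local cjson = require("cjson.safe")

-- recursively replace cjson.null with nil after decode
-- (sketch of the strip_nulls() helper described above)
local function strip_nulls(v)
    if v == cjson.null then
        return nil
    end
    if type(v) == "table" then
        for k, item in pairs(v) do
            v[k] = strip_nulls(item)
        end
    end
    return v
end

local function decode(str, opts)
    local obj, err = cjson.decode(str)
    if err then
        return nil, err
    end
    if opts and opts.null_as_nil then
        return strip_nulls(obj)
    end
    return obj
end
```

With the option enabled, a null field is simply absent, so `if x then`, `x or 0`, and `x or {}` all behave as they would for a missing key.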

Why this approach is better

| Aspect | Scattered type checks | Centralized decode option |
| --- | --- | --- |
| New nullable fields | Must remember to add guard | Automatically safe |
| Code readability | `type(x) == "table"` everywhere | Normal Lua idioms |
| Risk of missing a spot | High | Low (only decode call sites) |

Changes

  • apisix/core/json.lua: Add strip_nulls() helper and null_as_nil option to decode()
  • apisix/plugins/ai-providers/base.lua: Use null_as_nil in parse_response()
  • apisix/plugins/ai-protocols/*.lua: Use null_as_nil in all SSE event parsers
  • t/core/json.t: Test for the new null_as_nil option
  • t/plugin/ai-proxy-anthropic.t: Tests for null field handling in Anthropic protocol conversion

Test-file adjustments in t/plugin/ai-proxy-anthropic.t

Two small changes scoped to this file are required to make the new null-* cases run:

  • BEGIN { $ENV{TEST_ENABLE_CONTROL_API_V1} = "0"; } — the default Control API v1 location (/v1/*) matches the /v1/chat/completions paths used by the new tests and intercepts them before the proxy/converter runs.
  • Dropping - prometheus from extra_yaml_config — the default Test::Nginx setup already loads prometheus; keeping it here caused a duplicate-plugin error when the new tests were added.

When LLM providers return JSON null for nullable object fields (e.g.,
prompt_tokens_details, completion_tokens_details), cjson.decode converts
them to cjson.null (a userdata value that is truthy in Lua). Code using
'if x then x.field end' patterns passes the guard but crashes when
trying to index userdata.

Replace all unsafe truthiness-based guards with explicit
type(x) == "table" checks across AI protocol adapters and converters:

- anthropic-messages-to-openai-chat.lua: fix guards for choice.message,
  tc.function, res_body.usage, prompt_tokens_details (6 fixes)
- openai-embeddings.lua: fix usage guard (1 fix)
- openai-embeddings-to-vertex-predict.lua: fix pred.embeddings and
  emb.statistics guards (2 fixes)
- anthropic-messages.lua: fix err_data.error guard (1 fix)

Add test cases covering JSON null usage fields in the Anthropic
protocol conversion path.
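For contrast, the per-field guard pattern this earlier commit applied looks roughly like the following (field names such as `res_body.usage` follow the commit message; this is an illustrative sketch, not a verbatim diff):

```lua
-- before: unsafe, because cjson.null is truthy,
-- so the guard passes and the index on the next line crashes
if res_body.usage then
    tokens = res_body.usage.prompt_tokens or 0
end

-- after: only real tables pass the guard,
-- so cjson.null values are skipped safely
if type(res_body.usage) == "table" then
    tokens = res_body.usage.prompt_tokens or 0
end
```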
Copilot AI review requested due to automatic review settings April 16, 2026 04:18
@dosubot dosubot Bot added size:L This PR changes 100-499 lines, ignoring generated files. bug Something isn't working labels Apr 16, 2026

Copilot AI left a comment


Pull request overview

This PR hardens AI protocol adapters/converters against cjson.null (truthy userdata) for nullable JSON fields to prevent crashes when indexing nested fields, and adds regression tests for the Anthropic conversion path.

Changes:

  • Replace truthiness guards with type(x) == "table" checks in multiple AI protocol adapters/converters.
  • Add test coverage for OpenAI-compatible responses containing JSON null in usage/message fields during Anthropic conversion.
  • Simplify the test YAML plugin list (removes prometheus from this suite’s config).

Reviewed changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 3 comments.

| File | Description |
| --- | --- |
| t/plugin/ai-proxy-anthropic.t | Adds regression tests for JSON null in usage/message fields during Anthropic ↔ OpenAI-compatible conversion. |
| apisix/plugins/ai-protocols/openai-embeddings.lua | Tightens usage extraction guard to avoid treating cjson.null as a valid object. |
| apisix/plugins/ai-protocols/converters/openai-embeddings-to-vertex-predict.lua | Adds null-safety around pred.embeddings and emb.statistics access in Vertex response conversion. |
| apisix/plugins/ai-protocols/converters/anthropic-messages-to-openai-chat.lua | Adds table-type checks before accessing choice.message, tool call function fields, and nullable usage sub-objects. |
| apisix/plugins/ai-protocols/anthropic-messages.lua | Adds table-type checks before reading err_data.error fields when parsing Anthropic SSE error events. |


Comment thread apisix/plugins/ai-protocols/anthropic-messages.lua Outdated
Comment thread apisix/plugins/ai-protocols/openai-embeddings.lua Outdated
- Guard err_data itself as table before accessing .error
- Validate numeric fields (prompt_tokens, completion_tokens, etc.) with
  type(x) == 'number' to prevent cjson.null in arithmetic
- Guard pred as table before accessing .embeddings in vertex converter
- Use type(res_body) ~= 'table' instead of 'not res_body' in embeddings
Use test-type header to select null-field responses from the
existing mock server on port 6724, instead of defining a separate
server on port 6725 which triggers nginx restarts and breaks
route availability.
The /v1/messages URI is needed for Anthropic protocol detection
(matches() checks URI suffix). Use resty.http subrequests to
hit the route within the same test block, avoiding cross-block
route availability issues.
Cover the tool_calls[].function == cjson.null path that was
not exercised by the existing null-message test.
Move route creation into the same test block as the request,
with a brief sleep for etcd propagation, instead of relying
on route persistence across separate test blocks.
Add a centralized approach to handle cjson.null in AI plugin responses.
Instead of adding type() guards at every field access, extend
core.json.decode() with an opts parameter supporting null_as_nil
which recursively replaces cjson.null with nil after decode.

This makes standard Lua idioms (if x then, x or 0, x or {}) work
correctly on decoded JSON fields that may be null, preventing crashes
when LLM providers return null for nullable fields like
prompt_tokens_details, usage, message, etc.
@nic-6443 nic-6443 changed the title from "fix(ai-protocols): add cjson.null safety checks for nullable JSON fields" to "fix: add null_as_nil option to core.json.decode for AI plugins" on Apr 16, 2026
Replace subrequest-based tests (404 in CI) with the standard pattern:
- TEST 5: create route via admin API
- TESTs 6-9: send requests directly via --- request directive

This matches the pattern used by TESTs 1-4 in the same file.
When the entire JSON string is 'null', cjson.decode returns cjson.null
(not a table). The null_as_nil option now returns nil for this case,
preventing crashes when callers try to index the result.
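A sketch of that edge case, assuming `core.json.decode` with the `null_as_nil` option behaves as described in this commit:

```lua
local core = require("apisix.core")

-- the entire payload is the JSON literal null
local obj = core.json.decode("null", { null_as_nil = true })

-- without the option, obj would be cjson.null (truthy userdata);
-- with it, obj is nil, so a plain truthiness guard works
if not obj then
    return nil, "empty response body"
end
```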
The test framework generates a 'location /v1/' block for the control
API, which intercepts POST /v1/messages before APISIX's router. Set
TEST_ENABLE_CONTROL_API_V1=0 to disable it, matching the pattern used
by ai-proxy-protocol-conversion.t. Also use route ID 1 and add
Content-Type header.

Copilot AI left a comment


Pull request overview

Copilot reviewed 7 out of 7 changed files in this pull request and generated 2 comments.



Comment thread apisix/core/json.lua
Comment thread apisix/plugins/ai-providers/base.lua
JSON key order is not guaranteed. Match input_tokens:0 only instead
of requiring a specific key order.
Contributor

@shreemaan-abhishek shreemaan-abhishek left a comment


noice!

Baoyuantop previously approved these changes Apr 17, 2026
Member

@moonming moonming left a comment


Two small questions about what look like unrelated changes — see inline comments. Otherwise the approach (null_as_nil in core decode + applying it at AI-plugin decode sites) looks good to me.

Comment thread t/core/json.t
Comment thread t/plugin/ai-proxy-anthropic.t
@nic-6443 nic-6443 dismissed stale reviews from Baoyuantop and shreemaan-abhishek via 5284500 April 17, 2026 08:13
@nic-6443 nic-6443 merged commit 4223a07 into apache:master Apr 18, 2026
26 checks passed
@nic-6443 nic-6443 deleted the fix/cjson-null-safety-ai-protocols branch April 18, 2026 01:43
