fix: add null_as_nil option to core.json.decode for AI plugins #13231
nic-6443 merged 13 commits into apache:master from
Conversation
When LLM providers return JSON null for nullable object fields (e.g., prompt_tokens_details, completion_tokens_details), cjson.decode converts them to cjson.null (a userdata value that is truthy in Lua). Code using `if x then x.field end` patterns passes the guard but crashes when trying to index the userdata. Replace all unsafe truthiness-based guards with explicit `type(x) == "table"` checks across AI protocol adapters and converters:
- anthropic-messages-to-openai-chat.lua: fix guards for choice.message, tc.function, res_body.usage, prompt_tokens_details (6 fixes)
- openai-embeddings.lua: fix usage guard (1 fix)
- openai-embeddings-to-vertex-predict.lua: fix pred.embeddings and emb.statistics guards (2 fixes)
- anthropic-messages.lua: fix err_data.error guard (1 fix)

Add test cases covering JSON null usage fields in the Anthropic protocol conversion path.
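The guard change above can be sketched in pure Lua. Here `io.stdout` stands in for `cjson.null`, since both are truthy userdata values; the function names are illustrative, not the plugin's actual helpers:

```lua
-- io.stdout stands in for cjson.null: both are truthy userdata,
-- so the old `if x then` guard lets them through.
local null_like = io.stdout
assert(type(null_like) == "userdata")   -- truthy, but not a table

local function unsafe_tokens(usage)
    -- old pattern: cjson.null passes this guard; indexing the real
    -- sentinel userdata then raises a runtime error
    if usage then
        return usage.prompt_tokens or 0
    end
    return 0
end

local function safe_tokens(usage)
    -- fixed pattern: explicit table check rejects the sentinel
    if type(usage) == "table" then
        return usage.prompt_tokens or 0
    end
    return 0
end

print(safe_tokens(null_like))              -- 0
print(safe_tokens({ prompt_tokens = 7 }))  -- 7
```

The type check is the whole fix: it treats "field is null" exactly like "field is absent", which is what every caller of these adapters expects.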
Pull request overview
This PR hardens AI protocol adapters/converters against cjson.null (truthy userdata) for nullable JSON fields to prevent crashes when indexing nested fields, and adds regression tests for the Anthropic conversion path.
Changes:
- Replace truthiness guards with `type(x) == "table"` checks in multiple AI protocol adapters/converters.
- Add test coverage for OpenAI-compatible responses containing JSON `null` in usage/message fields during Anthropic conversion.
- Simplify the test YAML plugin list (removes `prometheus` from this suite's config).
Reviewed changes
Copilot reviewed 5 out of 5 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| t/plugin/ai-proxy-anthropic.t | Adds regression tests for JSON null in usage/message fields during Anthropic ↔ OpenAI-compatible conversion. |
| apisix/plugins/ai-protocols/openai-embeddings.lua | Tightens usage extraction guard to avoid treating cjson.null as a valid object. |
| apisix/plugins/ai-protocols/converters/openai-embeddings-to-vertex-predict.lua | Adds null-safety around pred.embeddings and emb.statistics access in Vertex response conversion. |
| apisix/plugins/ai-protocols/converters/anthropic-messages-to-openai-chat.lua | Adds table-type checks before accessing choice.message, tool call function fields, and nullable usage sub-objects. |
| apisix/plugins/ai-protocols/anthropic-messages.lua | Adds table-type checks before reading err_data.error fields when parsing Anthropic SSE error events. |
- Guard err_data itself as a table before accessing .error
- Validate numeric fields (prompt_tokens, completion_tokens, etc.) with `type(x) == "number"` to prevent cjson.null in arithmetic
- Guard pred as a table before accessing .embeddings in the Vertex converter
- Use `type(res_body) ~= "table"` instead of `not res_body` in embeddings
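The numeric-field validation follows the same idea as the table guards: anything that is not a number must not reach arithmetic. A hedged pure-Lua sketch (function names are illustrative, not the PR's actual code):

```lua
-- cjson.null (or any non-number) would error in arithmetic, e.g. in
-- `usage.prompt_tokens + usage.completion_tokens`; validate type first.
local function tonum_or_zero(v)
    return type(v) == "number" and v or 0
end

local function total_tokens(usage)
    if type(usage) ~= "table" then
        return 0
    end
    return tonum_or_zero(usage.prompt_tokens)
         + tonum_or_zero(usage.completion_tokens)
end

print(total_tokens({ prompt_tokens = 3, completion_tokens = 4 }))  -- 7
print(total_tokens({ prompt_tokens = 3 }))                         -- 3
```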
Use test-type header to select null-field responses from the existing mock server on port 6724, instead of defining a separate server on port 6725 which triggers nginx restarts and breaks route availability.
The /v1/messages URI is needed for Anthropic protocol detection (matches() checks URI suffix). Use resty.http subrequests to hit the route within the same test block, avoiding cross-block route availability issues.
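The suffix check mentioned above can be sketched in plain Lua (the plugin's real matches() logic may differ; this only illustrates why the /v1/messages URI is required for detection):

```lua
-- Illustrative suffix check: Anthropic protocol detection keys off
-- the request URI ending in /v1/messages.
local SUFFIX = "/v1/messages"

local function has_messages_suffix(uri)
    return uri:sub(-#SUFFIX) == SUFFIX
end

print(has_messages_suffix("/anything/v1/messages"))  -- true
print(has_messages_suffix("/v1/chat/completions"))   -- false
```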
Cover the tool_calls[].function == cjson.null path that was not exercised by the existing null-message test.
Move route creation into the same test block as the request, with a brief sleep for etcd propagation, instead of relying on route persistence across separate test blocks.
Add a centralized approach to handle cjson.null in AI plugin responses.
Instead of adding type() guards at every field access, extend
core.json.decode() with an opts parameter supporting null_as_nil
which recursively replaces cjson.null with nil after decode.
This makes standard Lua idioms (if x then, x or 0, x or {}) work
correctly on decoded JSON fields that may be null, preventing crashes
when LLM providers return null for nullable fields like
prompt_tokens_details, usage, message, etc.
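A hedged sketch of the recursive replacement described above (not the actual core.json implementation; `NULL` here is a table standing in for the cjson.null sentinel, which is userdata in lua-cjson):

```lua
-- Post-decode pass: replace every occurrence of the null sentinel with
-- nil, so `x or 0` / `x or {}` idioms behave as if the field were absent.
local NULL = {}  -- stand-in for cjson.null in this sketch

local function strip_nulls(v)
    if v == NULL then
        return nil
    end
    if type(v) ~= "table" then
        return v
    end
    for k, item in pairs(v) do
        v[k] = strip_nulls(item)  -- clearing the current key is safe in pairs
    end
    return v
end

-- nested nulls become absent fields
local decoded = strip_nulls({
    model = "m",
    usage = NULL,
    choices = { { message = NULL, index = 0 } },
})
assert(decoded.usage == nil)
local usage = decoded.usage or {}           -- standard idiom now works
assert((usage.prompt_tokens or 0) == 0)

-- whole-document null: decoding the string "null" yields the sentinel
-- itself, so the helper must return nil rather than a table
assert(strip_nulls(NULL) == nil)
```

Note the last assertion: handling the top-level-null case in the same helper is what makes the option safe for callers that immediately index the decode result.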
Replace subrequest-based tests (404 in CI) with the standard pattern:
- TEST 5: create route via admin API
- TESTs 6-9: send requests directly via the `--- request` directive

This matches the pattern used by TESTs 1-4 in the same file.
When the entire JSON string is 'null', cjson.decode returns cjson.null (not a table). The null_as_nil option now returns nil for this case, preventing crashes when callers try to index the result.
The test framework generates a 'location /v1/' block for the control API, which intercepts POST /v1/messages before APISIX's router. Set TEST_ENABLE_CONTROL_API_V1=0 to disable it, matching the pattern used by ai-proxy-protocol-conversion.t. Also use route ID 1 and add Content-Type header.
Pull request overview
Copilot reviewed 7 out of 7 changed files in this pull request and generated 2 comments.
JSON key order is not guaranteed. Match input_tokens:0 only instead of requiring a specific key order.
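The order-independence point can be illustrated with a plain substring match (a sketch only; the actual test uses Test::Nginx response matching):

```lua
-- JSON encoders don't guarantee key order, so assert on a single
-- key/value pair rather than the full serialized object.
local body_a = '{"usage":{"input_tokens":0,"output_tokens":3}}'
local body_b = '{"usage":{"output_tokens":3,"input_tokens":0}}'

local function has_zero_input_tokens(body)
    -- plain find (fourth arg true) keeps the check literal, so it
    -- matches regardless of where the key lands in the object
    return body:find('"input_tokens":0', 1, true) ~= nil
end

print(has_zero_input_tokens(body_a))  -- true
print(has_zero_input_tokens(body_b))  -- true
```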
moonming left a comment
Two small questions about what look like unrelated changes — see inline comments. Otherwise the approach (null_as_nil in core decode + applying it at AI-plugin decode sites) looks good to me.
Problem

When LLM providers (e.g., vLLM) return JSON `null` for nullable fields like `prompt_tokens_details`, `usage`, `message`, etc., `cjson.decode` converts them to `cjson.null` (a Lua userdata value that is truthy). This causes crashes when code does `if x then x.field end`, because `cjson.null` passes the truthiness check but can't be indexed.

Solution

Instead of adding scattered `type(x) == "table"` guards at every field access (error-prone, easy to forget for new fields), this PR takes a centralized approach:
- extend `core.json.decode(str, opts)` with a `null_as_nil` option that recursively replaces `cjson.null` with `nil` after decode
- apply it at the AI-plugin decode sites (`parse_response` and SSE event parsers)

This makes standard Lua idioms (`if x then`, `x or 0`, `x or {}`) work correctly on decoded JSON fields that may be null, without any per-field modifications.

Why this approach is better

- No need to scatter `type(x) == "table"` everywhere

Changes

- `apisix/core/json.lua`: Add `strip_nulls()` helper and `null_as_nil` option to `decode()`
- `apisix/plugins/ai-providers/base.lua`: Use `null_as_nil` in `parse_response()`
- `apisix/plugins/ai-protocols/*.lua`: Use `null_as_nil` in all SSE event parsers
- `t/core/json.t`: Test for the new `null_as_nil` option
- `t/plugin/ai-proxy-anthropic.t`: Tests for null field handling in Anthropic protocol conversion

Test-file adjustments in `t/plugin/ai-proxy-anthropic.t`

Two small changes scoped to this file are required to make the new `null-*` cases run:
- `BEGIN { $ENV{TEST_ENABLE_CONTROL_API_V1} = "0"; }` — the default Control API v1 location (`/v1/*`) matches the `/v1/chat/completions` paths used by the new tests and intercepts them before the proxy/converter runs.
- Remove `prometheus` from `extra_yaml_config` — the default Test::Nginx setup already loads `prometheus`; keeping it here caused a duplicate-plugin error when the new tests were added.