Skip to content

Avoid string_view underflow in ParseCodeMetadataAnnotation#2761

Merged
sbc100 merged 1 commit into
WebAssembly:mainfrom
aizu-m:code-metadata-annotation-prefix
Jun 18, 2026
Merged

Avoid string_view underflow in ParseCodeMetadataAnnotation#2761
sbc100 merged 1 commit into
WebAssembly:mainfrom
aizu-m:code-metadata-annotation-prefix

Conversation

@aizu-m

@aizu-m aizu-m commented Jun 18, 2026

Copy link
Copy Markdown
Contributor

ASAN, wat2wasm --enable-annotations on a (@custom ...) sat inside a function body:

ERROR: AddressSanitizer: heap-buffer-overflow READ of size 8
  #5 std::__string_view_hash::operator()(std::string_view)
  #9 wabt::BinaryWriter::WriteExpr() binary-writer.cc:1189
  #11 wabt::BinaryWriter::WriteFunc() binary-writer.cc:1229

ParseCodeMetadataAnnotation strips the metadata.code. prefix with remove_prefix(14) and assumes it is always there. ParseInstrList hands it every (@...) annotation in an instruction list, and Peek admits the bare (@custom ...) token because that spelling is meaningful at module scope. On the 6-byte custom text remove_prefix(14) underflows the view length to ~2^64 and walks the data pointer past the token. The CodeMetadataExpr then keeps a name whose pointer and size are both rubbish. Nothing trips until the binary writer hashes that name as a map key, which is where ASAN fires.

The binary reader already guards the matching strip with find("metadata.code.") == 0 in binary-reader.cc; the parser never got it. Reject annotations without the prefix.

Repro:

(module
  (func
    (@custom "x")))

Turned up fuzzing wat2wasm with annotations enabled.

(@custom "x")))
(;; STDERR ;;;
out/test/parse/bad-code-metadata-annotation.txt:6:5: error: unexpected annotation: custom
(@custom "x")))

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this be an error, or something we just skip over in ParseCodeMetadataAnnotation?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Switched it to skip. Peek already discards every annotation it does not recognise; the only reason a non-metadata.code one lands here is the cur.text() == "custom" branch admitting custom for module-scope custom sections. So erroring would single out custom while (@foo ...) and the rest get dropped silently, and the binary reader already skips unknown custom sections too. Skipping keeps it consistent. It now eats the annotation’s balanced parens and carries on, and a real metadata.code annotation in the same body is still emitted. Reworked the test into a dump that shows both.

@sbc100

sbc100 commented Jun 18, 2026

Copy link
Copy Markdown
Member

(Nit: Please start the PR titles with a capital letter).

ParseInstrList hands every (@...) annotation in an instruction list to
ParseCodeMetadataAnnotation, which strips the "metadata.code." prefix
unconditionally. Peek admits a bare (@Custom ...), only meaningful at
module scope, into the stream, so it reaches the strip with 6-byte text;
remove_prefix(14) then underflows the string_view length and the stored
name is read off the heap when the binary writer hashes it.

Every other unrecognised annotation in an instruction list is already
discarded by Peek, so skip a non-metadata.code annotation here too
instead of misparsing it.
@aizu-m aizu-m force-pushed the code-metadata-annotation-prefix branch from f0137d2 to 61b94ff Compare June 18, 2026 17:18
@aizu-m aizu-m changed the title avoid string_view underflow in ParseCodeMetadataAnnotation Avoid string_view underflow in ParseCodeMetadataAnnotation Jun 18, 2026
@aizu-m

aizu-m commented Jun 18, 2026

Copy link
Copy Markdown
Contributor Author

Title capitalised, thanks. Also flipped the handling from a parse error to skipping the annotation per the inline note.

Comment thread src/wast-parser.cc
WABT_TRACE(ParseCodeMetadataAnnotation);
Token tk = Consume();
std::string_view name = tk.text();
if (name.find("metadata.code.") != 0) {

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we not require C++20 so don't have str.starts_with

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, we're on C++17 here (CMAKE_CXX_STANDARD 17 in CMakeLists), so no starts_with. Went with find(...) != 0 to match the existing prefix check in Peek at wast-parser.cc:598, which tests the same string the same way.

@sbc100 sbc100 enabled auto-merge (squash) June 18, 2026 17:23
@sbc100 sbc100 merged commit f42ead6 into WebAssembly:main Jun 18, 2026
17 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants