[libcudacxx] Add support for wider types in fill_bytes by pciolkosz · Pull Request #8333 · NVIDIA/cccl

pciolkosz · 2026-04-08T22:36:14Z

Summary

Generalize fill_bytes to accept uint16_t and uint32_t fill values in addition to uint8_t, matching the CUDA driver's cuMemsetD8Async/cuMemsetD16Async/cuMemsetD32Async capabilities.
Template __fill_bytes_impl on the fill value type, with a static_assert restricting to 1, 2, or 4 byte values.
For wider fill values, validate that the destination size in bytes is a multiple of the fill value size.
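
The two checks described above can be illustrated with a minimal host-only sketch. This is not the PR's implementation: `fill_bytes_sketch` is a hypothetical name, the fill is emulated with `std::memcpy` instead of a driver call, and throwing `std::invalid_argument` on a size mismatch is an assumption, since the PR does not say how the error is reported.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <stdexcept>
#include <type_traits>
#include <vector>

// Hypothetical host-only stand-in for a fill templated on the fill value type.
template <class ValueTy>
void fill_bytes_sketch(void* dst, std::size_t size_bytes, ValueTy value)
{
  static_assert(std::is_trivially_copyable_v<ValueTy>, "Fill value must be trivially copyable");
  // Compile-time restriction to the widths the driver memset family supports.
  static_assert(sizeof(ValueTy) == 1 || sizeof(ValueTy) == 2 || sizeof(ValueTy) == 4,
                "Fill value must be 1, 2, or 4 bytes");

  // Run-time check: the destination must hold a whole number of fill values.
  // (Assumption: the real code may report this differently.)
  if (size_bytes % sizeof(ValueTy) != 0)
  {
    throw std::invalid_argument("destination size must be a multiple of the fill value size");
  }

  // Emulate the pattern fill by copying the value into each slot.
  auto* out = static_cast<unsigned char*>(dst);
  for (std::size_t offset = 0; offset < size_bytes; offset += sizeof(ValueTy))
  {
    std::memcpy(out + offset, &value, sizeof(ValueTy));
  }
}
```

With this sketch, filling an 8-byte buffer with a `std::uint16_t` pattern succeeds, while asking for 7 bytes of a `std::uint32_t` pattern is rejected before any byte is written.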

Test plan

Existing fill_bytes tests continue to pass (uint8_t path unchanged)
Header tests build across all compiler configurations
Add tests for uint16_t and uint32_t fill patterns

Test fill_bytes with uint16_t and uint32_t patterns on device and pinned memory, plus an mdspan test with uint32_t.

github-actions · 2026-04-09T01:13:07Z

🥳 CI Workflow Results

🟩 Finished in 2h 25m: Pass: 100%/108 | Total: 1d 23h | Max: 2h 15m | Hits: 99%/281468

Jacobfaib · 2026-04-10T16:53:19Z

libcudacxx/include/cuda/__algorithm/fill.h

+  static_assert(sizeof(_ValueTy) == 1 || sizeof(_ValueTy) == 2 || sizeof(_ValueTy) == 4,
+                "Fill value must be 1, 2, or 4 bytes (matching CUDA driver memset support)");


This check is already done in __memsetAsync() (though the message there could be improved).
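
A minimal sketch of the reviewer's point, under stated assumptions: `memset_async_sketch` and `fill_bytes_wrapper_sketch` are hypothetical stand-ins, not the library's `__memsetAsync()` or `__fill_bytes_impl`. The helper owns the single `static_assert` and picks a driver entry point by width; instead of calling the driver, it returns the entry point's name and the element count (the `cuMemsetD*Async` functions take a count of elements, not bytes). The wrapper needs no assert of its own because instantiating it instantiates the helper.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <utility>

// Hypothetical low-level helper: the one place where the width constraint lives.
template <class ValueTy>
std::pair<std::string, std::size_t> memset_async_sketch(std::size_t size_bytes, ValueTy)
{
  static_assert(sizeof(ValueTy) == 1 || sizeof(ValueTy) == 2 || sizeof(ValueTy) == 4,
                "Fill value must be 1, 2, or 4 bytes (matching CUDA driver memset support)");

  // The driver memset functions are given a number of elements, so convert from bytes.
  const std::size_t count = size_bytes / sizeof(ValueTy);

  if constexpr (sizeof(ValueTy) == 1)
  {
    return {"cuMemsetD8Async", count};
  }
  else if constexpr (sizeof(ValueTy) == 2)
  {
    return {"cuMemsetD16Async", count};
  }
  else
  {
    return {"cuMemsetD32Async", count};
  }
}

// Hypothetical caller: no second static_assert is needed here, because an
// unsupported ValueTy already fails to compile inside the helper.
template <class ValueTy>
std::pair<std::string, std::size_t> fill_bytes_wrapper_sketch(std::size_t size_bytes, ValueTy value)
{
  return memset_async_sketch(size_bytes, value);
}
```

The trade-off is in diagnostics: with a single assert, the compiler error for an unsupported type points into the helper, which is why the quality of the message there matters.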

Add support for wider types in fill_bytes

38884b3

pciolkosz requested a review from a team as a code owner April 8, 2026 22:36

pciolkosz requested a review from wmaxey April 8, 2026 22:36

github-project-automation bot added this to CCCL Apr 8, 2026

github-project-automation bot moved this to Todo in CCCL Apr 8, 2026

cccl-authenticator-app bot moved this from Todo to In Review in CCCL Apr 8, 2026

Add tests for uint16 and uint32 fill_bytes

3320b55

Test fill_bytes with uint16_t and uint32_t patterns on device and pinned memory, plus an mdspan test with uint32_t.

Jacobfaib reviewed Apr 10, 2026

pciolkosz wants to merge 2 commits into NVIDIA:main from pciolkosz:wider_types_in_fill_bytes
