Cache structs.fields() result for non-generic structs by Siyet · Pull Request #1005 · jcrist/msgspec

Siyet · 2026-04-09T19:16:21Z

Reopening #997 (auto-closed when fork was deleted).

Summary

Cache the fields() result tuple on the struct class as __struct_field_info__
Avoids expensive get_class_annotations() call and FieldInfo object recreation on every invocation
Generic aliases (e.g. MyStruct[int]) are not cached since type parameters affect annotations

Closes #963

Benchmark (1M calls, 3 fields)

	Before	After
`dataclasses.fields()`	0.85s	0.85s
`msgspec.structs.fields()`	~7.5s (20x slower)	0.44s (2x faster)

Store the computed tuple of FieldInfo on the class as __struct_field_info__ to avoid expensive get_class_annotations() and FieldInfo recreation on every call. Generic aliases are not cached since type parameters affect annotations. Closes #963

ofek · 2026-04-09T21:27:15Z

+
+    # Cache for non-generic structs
+    if not is_generic:
+        cls.__struct_field_info__ = result


Mutating the type as a result of inspection seems wrong to me. Is there no either way we can cache or otherwise close the gap in performance here? Also, does this work if the Struct is configured as frozen?

Thanks for the review!

Let me address your points in order.

Frozen structs. Yes, this works. frozen=True sets tp_setattro = Struct_setattro_frozen on the struct class itself, which only blocks attribute assignment on instances. The cache write cls.__struct_field_info__ = result goes through type(cls).tp_setattro (i.e. StructMetaType's slot, inherited from PyType_Type), which isn't overridden and isn't affected by frozen. That said, you're right that it's not obvious from the diff, I just pushed an explicit test for frozen structs to make the guarantee visible.

Why mutation, and what alternatives I considered. The dominant cost in fields() is _get_class_annotations() (typing.get_type_hints()), which is what makes the difference between 0.44s and 7.5s on the benchmark. I looked at a few alternatives before landing on __struct_field_info__:

Module-level WeakKeyDictionary: keeps fields() non-mutating, but adds an extra lookup on every call and another global to reason about. Felt strictly worse.

Eager compute in StructMeta_new: fastest in theory, but breaks on forward references, which is exactly why the current implementation is lazy.

Caching only hints instead of the full tuple: still requires mutating the class, and we'd still rebuild FieldInfo objects on every call (a meaningful chunk of the remaining cost).

In the end I picked the option with the smallest surface and the biggest win, treating __struct_field_info__ as a conventional cache slot (similar to how CPython uses dunder names for lazily-populated metadata).

On "mutation feels wrong". I hear you and I agree it's not ideal aesthetically. I see two ways forward:

Keep the current approach, document __struct_field_info__ as an internal cache slot, and add the frozen test. Pragmatic, minimal diff.

Port fields() to C as a follow-up PR. StructMetaObject already holds everything we need (struct_encode_fields, struct_offsets, struct_types, struct_defaults), so for non-generic structs we can skip typing.get_type_hints entirely and build the FieldInfo tuple directly. This would likely beat even the cached Python version and fully sidesteps the mutation question.

Personally I lean toward option 2 (port to C): it gives the best perf, removes the class mutation, and fits naturally into the existing StructMetaObject C infrastructure. If you're on board, I'm happy to close this PR and open a new one with the C port. @jcrist, would love to hear your take as well, since this touches StructMeta design.

Siyet · 2026-04-10T12:08:24Z

The failing build job here is unrelated to this PR: it's the link checker tripping on the Pydantic docs redirect (docs.pydantic.dev/latest/ → pydantic.dev/docs/validation/...), which is fixed in #1008. Once #1008 lands and this branch is rebased, CI should go green.

ofek requested changes Apr 9, 2026

View reviewed changes

Siyet requested a review from jcrist April 10, 2026 07:32

Add test for fields() caching on frozen structs

cae504f

Merge branch 'main' into 963-cache-fields

db5c990

Siyet requested a review from ofek April 11, 2026 18:04

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Cache structs.fields() result for non-generic structs#1005

Cache structs.fields() result for non-generic structs#1005
Siyet wants to merge 3 commits intomainfrom
963-cache-fields

Siyet commented Apr 9, 2026

Uh oh!

ofek Apr 9, 2026

Uh oh!

Siyet Apr 10, 2026

Uh oh!

Siyet commented Apr 10, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

Siyet commented Apr 9, 2026

Summary

Benchmark (1M calls, 3 fields)

Uh oh!

ofek Apr 9, 2026

Choose a reason for hiding this comment

Uh oh!

Siyet Apr 10, 2026

Choose a reason for hiding this comment

Uh oh!

Siyet commented Apr 10, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants