Cache structs.fields() result for non-generic structs#1005
Cache structs.fields() result for non-generic structs#1005
Conversation
Store the computed tuple of FieldInfo on the class as __struct_field_info__ to avoid expensive get_class_annotations() and FieldInfo recreation on every call. Generic aliases are not cached since type parameters affect annotations. Closes #963
|
|
||
| # Cache for non-generic structs | ||
| if not is_generic: | ||
| cls.__struct_field_info__ = result |
There was a problem hiding this comment.
Mutating the type as a result of inspection seems wrong to me. Is there no either way we can cache or otherwise close the gap in performance here? Also, does this work if the Struct is configured as frozen?
There was a problem hiding this comment.
Thanks for the review!
Let me address your points in order.
Frozen structs. Yes, this works. frozen=True sets tp_setattro = Struct_setattro_frozen on the struct class itself, which only blocks attribute assignment on instances. The cache write cls.__struct_field_info__ = result goes through type(cls).tp_setattro (i.e. StructMetaType's slot, inherited from PyType_Type), which isn't overridden and isn't affected by frozen. That said, you're right that it's not obvious from the diff, I just pushed an explicit test for frozen structs to make the guarantee visible.
Why mutation, and what alternatives I considered. The dominant cost in fields() is _get_class_annotations() (typing.get_type_hints()), which is what makes the difference between 0.44s and 7.5s on the benchmark. I looked at a few alternatives before landing on __struct_field_info__:
- Module-level
WeakKeyDictionary: keepsfields()non-mutating, but adds an extra lookup on every call and another global to reason about. Felt strictly worse. - Eager compute in
StructMeta_new: fastest in theory, but breaks on forward references, which is exactly why the current implementation is lazy. - Caching only
hintsinstead of the full tuple: still requires mutating the class, and we'd still rebuildFieldInfoobjects on every call (a meaningful chunk of the remaining cost).
In the end I picked the option with the smallest surface and the biggest win, treating __struct_field_info__ as a conventional cache slot (similar to how CPython uses dunder names for lazily-populated metadata).
On "mutation feels wrong". I hear you and I agree it's not ideal aesthetically. I see two ways forward:
- Keep the current approach, document
__struct_field_info__as an internal cache slot, and add the frozen test. Pragmatic, minimal diff. - Port
fields()to C as a follow-up PR.StructMetaObjectalready holds everything we need (struct_encode_fields,struct_offsets,struct_types,struct_defaults), so for non-generic structs we can skiptyping.get_type_hintsentirely and build theFieldInfotuple directly. This would likely beat even the cached Python version and fully sidesteps the mutation question.
Personally I lean toward option 2 (port to C): it gives the best perf, removes the class mutation, and fits naturally into the existing StructMetaObject C infrastructure. If you're on board, I'm happy to close this PR and open a new one with the C port. @jcrist, would love to hear your take as well, since this touches StructMeta design.
Reopening #997 (auto-closed when fork was deleted).
Summary
fields()result tuple on the struct class as__struct_field_info__get_class_annotations()call andFieldInfoobject recreation on every invocationMyStruct[int]) are not cached since type parameters affect annotationsCloses #963
Benchmark (1M calls, 3 fields)
dataclasses.fields()msgspec.structs.fields()