[BUG] Generic Collections fields - not respecting default_factory field specifications #505

@cagantomer

Description

At times, a dataclass field is type-annotated with an abstract generic type (e.g. MutableMapping) and defined with a default_factory. In such cases, deserialization tries to create an instance of the annotated type itself, which is not possible for an abstract type and raises an error.

Code snippet that reproduces the issue

from typing import MutableMapping

from dataclasses import dataclass, field
from dataclasses_json import DataClassJsonMixin


@dataclass
class MyClass(DataClassJsonMixin):
    field1: MutableMapping[str, str] = field(default_factory=dict)

if __name__ == '__main__':
    c = MyClass()
    data = c.to_dict()
    MyClass.from_dict(data)

This will produce the following error:

Traceback (most recent call last):
  File "/Users/tomercagan/dev/gen2projection/./bin/error_json.py", line 14, in <module>
    MyClass.from_dict(data)
  File "/Users/tomercagan/dev/venvs/json-bug/lib/python3.10/site-packages/dataclasses_json/api.py", line 70, in from_dict
    return _decode_dataclass(cls, kvs, infer_missing)
  File "/Users/tomercagan/dev/venvs/json-bug/lib/python3.10/site-packages/dataclasses_json/core.py", line 220, in _decode_dataclass
    init_kwargs[field.name] = _decode_generic(field_type,
  File "/Users/tomercagan/dev/venvs/json-bug/lib/python3.10/site-packages/dataclasses_json/core.py", line 300, in _decode_generic
    res = materialize_type(xs)
TypeError: MutableMapping() takes no arguments
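
The same kind of TypeError can be triggered without dataclasses_json, which is consistent with the traceback above: the runtime origin of the annotation is an abstract base class, and calling it with the decoded items fails. The following is a minimal sketch, assuming the decoder ends up calling the annotation's origin with the decoded value. The helper name try_materialize is hypothetical and is not part of the library.

```python
import collections.abc
from typing import MutableMapping, get_origin

# The runtime origin of the annotation is the abstract base class, not dict.
origin = get_origin(MutableMapping[str, str])


def try_materialize(container_type, items):
    """Hypothetical helper: call the type with the items, return any TypeError."""
    try:
        return container_type(items)
    except TypeError as exc:
        return exc


# Calling the abstract type fails; calling a concrete type such as dict works.
abstract_result = try_materialize(origin, {"a": "b"})
concrete_result = try_materialize(dict, {"a": "b"})

print(type(abstract_result).__name__)
print(concrete_result)
```

The exact error message may vary between Python versions, but the abstract type cannot be instantiated in either case.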

P.S. There is a workaround: define a decoder that returns the actual expected type (dict in the example above). This circumvents the problem, but it is kind of a drag:

from typing import MutableMapping

from dataclasses import dataclass, field
from dataclasses_json import DataClassJsonMixin, config


@dataclass
class MyClass(DataClassJsonMixin):
    field1: MutableMapping[str, str] = field(
        default_factory=dict, 
        metadata=config(decoder=lambda x: x),  # a lambda that doesn't do anything because the underlying value is already a dict
    )

if __name__ == '__main__':
    c = MyClass()
    data = c.to_dict()
    MyClass.from_dict(data)

Describe the results you expected

Ideally, deserialization should create an instance using the specified default_factory rather than trying to instantiate the generic type from the annotation.

From some playing around with the code, it seems that core.py:220 (inside _decode_dataclass) offers an opportunity to check whether the field has a default_factory and either pass it into _decode_generic or call an alternative function that respects the default_factory.

I am not familiar with the codebase, its standards, etc., but following some discussion and a decision on the approach, I can possibly contribute a fix.
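
To make the proposal concrete, here is a rough, standalone sketch of the idea and not the library's actual code. It assumes the decoder could consult the dataclass field before choosing which type to build. The helper pick_container_type is hypothetical, and it assumes the field annotations are real type objects rather than strings (i.e. no `from __future__ import annotations`).

```python
import dataclasses
import inspect
from dataclasses import dataclass, field
from typing import Any, MutableMapping, get_origin


def pick_container_type(f: dataclasses.Field) -> Any:
    """Hypothetical helper: choose a concrete type to build for a field."""
    origin = get_origin(f.type) or f.type
    # A concrete annotation such as Dict[str, str] (origin: dict) is used as is.
    if not inspect.isabstract(origin):
        return origin
    # For an abstract annotation, fall back to whatever default_factory builds.
    if f.default_factory is not dataclasses.MISSING:
        return type(f.default_factory())
    raise TypeError(
        f"cannot instantiate abstract type {origin!r} for field {f.name!r}"
    )


@dataclass
class MyClass:
    field1: MutableMapping[str, str] = field(default_factory=dict)


field1 = dataclasses.fields(MyClass)[0]
container_type = pick_container_type(field1)
rebuilt = container_type({"a": "b"})
print(container_type, rebuilt)
```

One trade-off of this approach is that it calls default_factory once just to learn the concrete type, which may be undesirable if the factory is expensive or has side effects.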

Python version you are using

Python 3.10.9

Environment description

dataclasses-json==0.6.3
marshmallow==3.20.1
mypy-extensions==1.0.0
packaging==23.2
typing-inspect==0.9.0
typing_extensions==4.9.0

Labels: bug
