
improvements of subclasses and jsonization time by add two options in global_config #521

Open
democrazyx wants to merge 2 commits into lidatong:master from democrazyx:zyx


@democrazyx

This PR makes two contributions:

  1. Add class info to the JSON output so that subclasses can be restored from JSON. Enable it by setting global_config.include_class_info = True.
  2. Save time by caching the results of type checking. Enable it by setting global_config.enable_cache = True.
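
To illustrate the idea behind the first option, here is a minimal standard-library sketch. It is not the code in this PR: the `__class__` key, the `REGISTRY` mapping, and the `encode`/`decode` helpers are hypothetical names chosen for the example, and the format the PR actually emits may differ.

```python
from dataclasses import dataclass, asdict, fields


@dataclass
class Animal:
    id: int = 0
    health: int = 100


@dataclass
class Cat(Animal):
    age: int = 1


# Hypothetical registry mapping a class tag back to the class itself.
REGISTRY = {cls.__name__: cls for cls in (Animal, Cat)}


def encode(obj):
    """Serialize a dataclass instance and tag it with its concrete class name."""
    data = asdict(obj)
    data["__class__"] = type(obj).__name__  # assumed key name, for illustration only
    return data


def decode(data, default_cls):
    """Rebuild an instance, preferring the tagged class over the declared one."""
    data = dict(data)  # copy so the caller's dict is not mutated
    cls = REGISTRY.get(data.pop("__class__", None), default_cls)
    known = {f.name for f in fields(cls)}
    # Fields the target class does not declare are dropped.
    return cls(**{k: v for k, v in data.items() if k in known})


# With the tag, the subclass and its extra field survive the round trip.
restored = decode(encode(Cat(age=3)), Animal)

# Without the tag, decoding falls back to the declared type and `age` is lost.
plain = decode({"id": 1, "health": 100, "age": 3}, Animal)
```

The trade-off in a scheme like this is that the serialized output carries an extra key per object, which is presumably why the PR makes the behavior opt-in.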

For detailed usage and a comparison, open the Jupyter notebook file.

The following code is derived from the ipynb file:

# %% [markdown]
# # 1. Include class info in the JSON result

# %%
from dataclasses import dataclass, field

from dataclasses_json import dataclass_json, global_config


@dataclass_json
@dataclass
class Animal:
    id: int = 0
    health: int = 100


@dataclass_json
@dataclass
class Cat(Animal):
    age: int = 1

@dataclass_json
@dataclass
class Dog(Animal):
    age: int = 1

@dataclass_json
@dataclass
class PetCat(Cat):
    name: str = ''

@dataclass_json
@dataclass
class Person:
    name: str = 'zyx'
    animals: list[Animal] = field(default_factory=list)


# %%
p1 = Person(animals=[Animal(), Cat(), PetCat()])
p1.to_dict()

# %%
p2 = Person.from_dict(p1.to_dict())
p2.to_dict()

# %% [markdown]
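# Some fields are missing! `animals` is declared as `list[Animal]`, so the subclass fields (`age`, `name`) are lost on the round trip.
# 
# To solve this, we need to include class info in the result.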

# %%
global_config.include_class_info = True
p1.to_dict()

# %%
p2 = Person.from_dict(p1.to_dict())
global_config.include_class_info = False
p2.to_dict()

# %% [markdown]
# Now all the fields are restored!

# %% [markdown]
# # 2. Use a cache to save time

# %% [markdown]
# When there are thousands of objects to serialize, the code spends much of its time retrieving dataclass info, even though that info does not change during serialization.

# %%
import cProfile
import pstats
global_config.enable_cache = False
p3 = Person(animals=[Animal() for _ in range(100000)])

pr = cProfile.Profile()
pr.enable()
result_without_cache = p3.to_json()
pr.disable()
pr.dump_stats('profile_stats1')
stats = pstats.Stats('profile_stats1')
stats.sort_stats('cumulative')
stats.print_stats()

# %%
import cProfile
import pstats
global_config.enable_cache = True
p3 = Person(animals=[Animal() for _ in range(100000)])

pr = cProfile.Profile()
pr.enable()
result_with_cache = p3.to_json()
pr.disable()
pr.dump_stats('profile_stats2')
stats = pstats.Stats('profile_stats2')
stats.sort_stats('cumulative')
stats.print_stats()

# %% [markdown]
# The speedup is substantial: from 6.6s to 2.5s on my laptop.
# 
# Now let's check that the results are the same.

# %%
result_with_cache == result_without_cache
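
The caching idea from section 2 can also be sketched independently of the library. The example below is only an illustration under stated assumptions: `field_names` and `to_plain_dict` are hypothetical helpers, and `functools.lru_cache` is swapped in here as a convenient cache; the PR's `enable_cache` option may be implemented differently.

```python
import dataclasses
from dataclasses import dataclass
from functools import lru_cache


@dataclass
class Animal:
    id: int = 0
    health: int = 100


@lru_cache(maxsize=None)
def field_names(cls):
    """Hypothetical helper: inspect a dataclass once and reuse the result."""
    return tuple(f.name for f in dataclasses.fields(cls))


def to_plain_dict(obj):
    """Convert one instance using the cached per-class field list."""
    return {name: getattr(obj, name) for name in field_names(type(obj))}


animals = [Animal(id=i) for i in range(1000)]
dicts = [to_plain_dict(a) for a in animals]

# The class is inspected once; every later call is served from the cache.
info = field_names.cache_info()
```

Because the field layout of a class is fixed once it is defined, keying the cache on the class is safe in this sketch; the output is the same as it would be without caching, only the repeated introspection is skipped.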

zyx added 2 commits February 28, 2024 21:28
2. save time by cache the result of type checking