Optimization. #32

@sirherrbatka

Hello,

I noticed that cl-csv is slow compared to the standard CSV parser in Python, and since I have some very large CSV files to parse, it hurts. I decided to profile it and locate the issues that have a sizeable impact on runtime performance.

I discovered that the problems stem from read-dispatch-table-entry being defined with defclass. Since parsing involves calling generic accessors multiple times for every single character, I decided to replace the defclass with a defstruct, and consequently to replace all calls to the generic slot accessors of this class with struct accessors. I also added a few declaim inline declarations for internal functions and eliminated a few repeated calls to other accessors. These changes yielded a ~3× runtime improvement when benchmarked on my machine. I don't see any flexibility drawbacks to doing this. Furthermore, I believe that even more improvement is possible by applying the same treatment to the read-dispatch-table defclass.
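To illustrate the kind of change I mean, here is a minimal sketch. It is not the actual cl-csv code: the names entry-class, entry-struct, entry-matches-p and the slots delimiter and didx are hypothetical stand-ins, and the real read-dispatch-table-entry may look different.

```lisp
;; Before (hypothetical): a CLOS class. Its accessors are generic
;; functions, so every slot read goes through generic dispatch.
(defclass entry-class ()
  ((delimiter :initarg :delimiter :accessor entry-class-delimiter)
   (didx :initarg :didx :initform 0 :accessor entry-class-didx)))

;; After (hypothetical): a structure. Its accessors are ordinary
;; functions that a compiler can typically open-code.
(defstruct (entry-struct (:conc-name entry-struct-))
  (delimiter "" :type string)
  (didx 0 :type fixnum))

;; Ask the compiler to inline a small internal helper that is called
;; for every character.
(declaim (inline entry-matches-p))
(defun entry-matches-p (entry ch)
  "True when CH equals the delimiter character at ENTRY's current index."
  (declare (type entry-struct entry) (type character ch))
  ;; Read each slot once instead of calling the accessor repeatedly.
  (let ((delimiter (entry-struct-delimiter entry))
        (didx (entry-struct-didx entry)))
    (and (< didx (length delimiter))
         (char= ch (char delimiter didx)))))
```

With this sketch, (entry-matches-p (make-entry-struct :delimiter ",") #\,) is true. How much any given implementation gains from the struct accessors and the inline declaration will vary; the ~3× figure above comes from my benchmark of the real changes, not from this sketch.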

Are these changes acceptable design-wise? If so, I will open a pull request with them.
