Perf: repeated tree updates need a batch-aware apply engine #6038

@zbeyens


Summary

While benchmarking Plate against Slate on huge documents, I ran into a large transform cliff in Slate for repeated exact-path node updates.

The original trigger on the Plate side was fixed separately, but the Slate benchmark showed a more general problem:

  • repeated exact-path updates on wide sibling arrays are dominated by repeated immutable branch rewrites
  • normalize is not the main cost in that workload
  • path/range ref transforms are not the main cost either
  • the main cost is redoing shared ancestor work over and over in the per-op apply path

That turned this from a narrow setNodes observation into a broader batch-engine problem.
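To make the cost model concrete, here is a minimal sketch (not Slate's actual code) of why per-op immutable updates on a wide sibling array degenerate: each apply rewrites the whole sibling array to replace one child, so N updates under one parent of width W copy O(N·W) elements, while a single batched pass copies the array once and edits the draft in place. The `TreeNode` type and both functions are illustrative only.

```typescript
// Illustrative cost model: per-op immutable rewrites vs. one batched draft.
type TreeNode = { props: Record<string, unknown>; children?: TreeNode[] };

let elementsCopied = 0;

// Per-op apply: replacing one child forces a fresh copy of the sibling array.
function setNodePerOp(
  root: TreeNode,
  index: number,
  props: Record<string, unknown>
): TreeNode {
  const children = root.children!;
  elementsCopied += children.length; // cost of the fresh sibling array
  const next = children.slice();
  next[index] = { ...next[index], props: { ...next[index].props, ...props } };
  return { ...root, children: next };
}

// Batched apply: one shared copy, then in-place edits on the private draft.
function setNodesBatched(
  root: TreeNode,
  updates: Array<[number, Record<string, unknown>]>
): TreeNode {
  const next = root.children!.slice();
  elementsCopied += next.length; // the sibling array is copied exactly once
  for (const [index, props] of updates) {
    next[index] = { ...next[index], props: { ...next[index].props, ...props } };
  }
  return { ...root, children: next };
}

const WIDTH = 5000;
const base: TreeNode = {
  props: {},
  children: Array.from({ length: WIDTH }, () => ({ props: {} })),
};

elementsCopied = 0;
let perOp = base;
for (let i = 0; i < WIDTH; i++) perOp = setNodePerOp(perOp, i, { bold: true });
const perOpCost = elementsCopied; // WIDTH * WIDTH = 25,000,000 element copies

elementsCopied = 0;
setNodesBatched(
  base,
  Array.from({ length: WIDTH }, (_, i): [number, Record<string, unknown>] => [
    i,
    { bold: true },
  ])
);
const batchedCost = elementsCopied; // WIDTH = 5,000 element copies
```

The same asymmetry compounds with tree depth, since every ancestor on the path gets the same repeated-rewrite treatment.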

Requirements

Any upstream solution needs to preserve a few important semantics:

  • editor.apply(op) remains the single public/plugin seam
  • plugin authors should not need to learn a second override path just to stay correct in batches
  • code that calls apply(op) and then immediately inspects editor.children must still see the updated state
  • previously published node references should remain immutable
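The first two requirements amount to this: a plugin that wraps editor.apply once must keep seeing every op, whether ops arrive singly or inside a batch. A hypothetical sketch (none of these identifiers are Slate's real implementation) of that invariant:

```typescript
// Hypothetical sketch: editor.apply stays the single plugin seam, so a
// wrapper installed once observes every op, batched or not.
type Op = { type: string; [key: string]: unknown };

interface Editor {
  apply: (op: Op) => void;
  withBatch: (fn: () => void) => void;
}

function createEditor(): Editor {
  return {
    apply(_op) {
      // engine work elided in this sketch
    },
    withBatch(fn) {
      fn(); // batching changes the execution strategy, not the seam
    },
  };
}

function withLogging(editor: Editor, log: Op[]): Editor {
  const { apply } = editor;
  editor.apply = (op) => {
    log.push(op); // plugin logic keeps running per-op, even inside batches
    apply(op);
  };
  return editor;
}

const seen: Op[] = [];
const editor = withLogging(createEditor(), seen);
editor.apply({ type: 'set_node', path: [0] });
editor.withBatch(() => {
  editor.apply({ type: 'set_node', path: [1] });
  editor.apply({ type: 'set_node', path: [2] });
});
```

If batching instead required a second override point, every existing apply-wrapping plugin would silently miss batched ops.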

Current direction

The direction implemented in PR #6039 is:

  • Editor.withBatch(editor, fn) as an explicit transaction boundary
  • Transforms.applyBatch(editor, ops) as sugar over batched execution
  • a private batch draft for tree changes
  • accessor-backed editor.children, so committed state and staged state can be separated cleanly
  • an optimized exact-path set_node path inside the batch executor
  • generic batched semantics for other tree operations, even where they are not yet highly optimized

This keeps one public seam (editor.apply(op)) while moving batching into the engine underneath it.
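A minimal model of that direction, with all names and internals illustrative rather than taken from PR #6039: an explicit batch boundary, a private draft for staged tree changes, accessor-backed children, and per-node copy-on-write so previously published references stay immutable.

```typescript
// Hypothetical minimal model of the batch engine direction (not the PR code).
type Elem = { type: string; props: Record<string, unknown> };
type SetNodeOp = { index: number; props: Record<string, unknown> };

class MiniEditor {
  private committed: Elem[];
  private draft: Elem[] | null = null; // private batch draft for staged state

  constructor(initial: Elem[]) {
    this.committed = initial;
  }

  // Accessor-backed children: mid-batch reads see the staged draft, so code
  // that applies an op and immediately inspects children stays correct.
  get children(): Elem[] {
    return this.draft ?? this.committed;
  }

  apply(op: SetNodeOp): void {
    // Batched path edits the draft; per-op path pays a fresh rewrite per call.
    const target = this.draft ?? (this.committed = this.committed.slice());
    // Per-node copy-on-write keeps previously published references immutable.
    const prev = target[op.index];
    target[op.index] = { ...prev, props: { ...prev.props, ...op.props } };
  }

  withBatch(fn: () => void): void {
    this.draft = this.committed.slice(); // one shared sibling-array copy per batch
    try {
      fn();
    } finally {
      this.committed = this.draft!; // commit staged state at the boundary
      this.draft = null;
    }
  }
}

const editor = new MiniEditor([
  { type: 'paragraph', props: {} },
  { type: 'paragraph', props: {} },
]);
const published = editor.children[0];

let sawStagedState = false;
editor.withBatch(() => {
  editor.apply({ index: 0, props: { bold: true } });
  // Read-after-apply inside the batch already reflects the staged change:
  sawStagedState = editor.children[0].props.bold === true;
});
```

The exact-path set_node fast path and the other batched op families from the list above would live inside `apply`'s batched branch; this sketch only shows the state-management shape that makes them possible.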

Benchmarks

Using:

yarn node -r ./config/babel/register.cjs ./packages/slate/test/perf/set-nodes-bench.js --blocks=5000 --group-size=50 --repeats=3

Current branch results:

  • flat exact set_node
    • Transforms.applyBatch(...): 6.03 ms
    • editor.apply(set_node) loop inside Editor.withoutNormalizing(...): 95.86 ms
    • Transforms.setNodes(...) inside Editor.withoutNormalizing(...): 92.84 ms
    • Transforms.setNodes(...) with full per-call normalize: 1311.22 ms
  • grouped exact set_node
    • Transforms.applyBatch(...): 6.88 ms
    • editor.apply(set_node) loop inside Editor.withoutNormalizing(...): 13.26 ms
    • Transforms.setNodes(...): 42.21 ms
  • mixed exact set_node batch plus one tail insert_node: 9.91 ms
  • pure insert_node batch on an empty document
    • Transforms.applyBatch(...): 3152.11 ms
    • replay-ish editor.apply(insert_node) loop inside Editor.withoutNormalizing(...): 3709.12 ms

So the batch engine now solves the original repeated exact-path set_node cliff and provides the right execution model for broader batching, but further op-family optimizations still need to be justified by benchmarks.

Why I’m rewriting this issue

The original issue body focused on a set_node microbenchmark and possible exact-path bulk-update helpers.

At this point the underlying conclusion is clearer:

  • the root problem is the per-op apply engine on repeated tree updates
  • the durable solution is a batch-aware engine, not a public one-off setNodesBatch API
  • exact-path set_node is the first optimized family, not the entire end state

Tracking

  • Implementation PR: #6039
