reduce CGO call overhead for exec and bind paths #1386

Merged: mattn merged 2 commits into master from perf/reduce-cgo-overhead on Apr 6, 2026

mattn (Owner) commented on Apr 6, 2026

Combine multiple CGO crossings into single C calls for hot paths.

  • _sqlite3_exec_no_args(): merges prepare+step+finalize into one CGO call for parameterless exec (the most common case)
  • _sqlite3_reset_clear(): merges sqlite3_reset + sqlite3_clear_bindings into one CGO call
  • Use semaphore channel instead of result-struct channel in context-aware exec/Next
  • Use time.AppendFormat with stack buffer to avoid heap allocation when binding time.Time
  • Reuse single buffer in bindNamedIndices instead of 3 C.CString allocations
  • Remove intermediate bindIndices slice in named parameter binding
  • Pass explicit length to sqlite3_prepare_v2 to skip C-side strlen
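The semaphore-channel item in the list above can be sketched in plain Go, without cgo. This is only an illustration of the pattern, not the code from this PR: `execWithContext`, `work`, and `interrupt` are hypothetical names, and `interrupt` is a placeholder for whatever the driver does to abort a running statement. The goroutine writes its results into captured variables and closes a `chan struct{}`, instead of sending a result struct over a channel.

```go
package main

import (
	"context"
	"fmt"
)

// execWithContext runs work in a goroutine and waits for either completion
// or context cancellation. All names here are hypothetical.
func execWithContext(ctx context.Context, work func() (int64, error), interrupt func()) (int64, error) {
	if ctx.Done() == nil {
		// Context can never be cancelled: run synchronously, no goroutine.
		return work()
	}
	var (
		n   int64
		err error
	)
	done := make(chan struct{}) // carries no data, only signals completion
	go func() {
		n, err = work() // results land in the captured variables
		close(done)     // the close happens after the writes above
	}()
	select {
	case <-done:
		return n, err
	case <-ctx.Done():
		interrupt() // ask the running operation to stop
		<-done      // wait for the goroutine before returning
		return 0, ctx.Err()
	}
}

// demoCompleted shows the normal path: work finishes before any cancellation.
func demoCompleted() (int64, error) {
	ctx, cancel := context.WithCancel(context.Background())
	defer cancel()
	return execWithContext(ctx, func() (int64, error) { return 3, nil }, func() {})
}

// demoCancelled shows the cancelled path: work blocks until interrupted.
func demoCancelled() error {
	ctx, cancel := context.WithCancel(context.Background())
	cancel()
	stop := make(chan struct{})
	work := func() (int64, error) {
		<-stop // simulates a long-running statement
		return 0, nil
	}
	_, err := execWithContext(ctx, work, func() { close(stop) })
	return err
}

func main() {
	fmt.Println(demoCompleted())
	fmt.Println(demoCancelled())
}
```

Reading `n` and `err` after `<-done` is safe because the goroutine's writes happen before `close(done)`. The sketch discards the result on cancellation for brevity; a real driver may instead inspect the error returned by the interrupted call.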
                                │   before    │    after     │                     │
                                │   sec/op    │   sec/op     │                     │
Suite/BenchmarkExec-16             1.913µ        1.349µ        -29.44% (p=0.000)
Suite/BenchmarkQuery-16            4.977µ        4.488µ         -9.83% (p=0.028)
Suite/BenchmarkParams-16           5.074µ        4.750µ         -6.38% (p=0.001)
Suite/BenchmarkStmt-16             3.077µ        3.005µ              ~ (p=0.169)
Suite/BenchmarkRows-16             130.9µ        130.9µ              ~ (p=0.878)
Suite/BenchmarkStmtRows-16         128.8µ        127.7µ              ~ (p=0.234)
Suite/BenchmarkQueryParallel-16    2.310µ        2.291µ              ~ (p=0.959)
geomean                            8.459µ        7.891µ        -6.72%

                                │    B/op    │    B/op     │
Suite/BenchmarkExec-16            144          72           -50.00% (p=0.000)

                                │ allocs/op  │ allocs/op  │
Suite/BenchmarkExec-16            6            4           -33.33% (p=0.000)
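As a rough way to look at one of the listed changes in isolation, the sketch below compares `time.Time.Format` with `time.Time.AppendFormat` writing into a stack-allocated array, using `testing.AllocsPerRun` to count allocations. It does not reproduce the benchmark suite above. The layout string, the 64-byte buffer size, and the helper names (`formatBoth`, `measure`, `sample`) are assumptions made for illustration, not values taken from this PR.

```go
package main

import (
	"fmt"
	"testing"
	"time"
)

// layout is an assumed timestamp layout, used here for illustration only.
const layout = "2006-01-02 15:04:05.999999999-07:00"

// sample is a fixed timestamp so both code paths format the same value.
var sample = time.Date(2026, 4, 6, 13, 36, 0, 0, time.UTC)

// Package-level sinks keep the formatted results observable.
var (
	sinkString string
	sinkLen    int
)

// formatBoth returns the same timestamp rendered through both APIs.
func formatBoth(t time.Time) (string, string) {
	var buf [64]byte // fixed-size array; size is an assumption
	return t.Format(layout), string(t.AppendFormat(buf[:0], layout))
}

// measure reports average allocations per call for each approach.
func measure() (formatAllocs, appendAllocs float64) {
	formatAllocs = testing.AllocsPerRun(100, func() {
		// Format returns a new string, stored in a global here.
		sinkString = sample.Format(layout)
	})
	appendAllocs = testing.AllocsPerRun(100, func() {
		// AppendFormat appends into the caller-provided buffer.
		var buf [64]byte
		sinkLen = len(sample.AppendFormat(buf[:0], layout))
	})
	return formatAllocs, appendAllocs
}

func main() {
	a, b := formatBoth(sample)
	fmt.Println("same output:", a == b)
	f, ap := measure()
	fmt.Println("Format allocs/op:", f, "AppendFormat allocs/op:", ap)
}
```

Exact counts depend on the Go version and escape analysis, so the sketch prints them rather than assuming particular values.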

mattn added 2 commits April 6, 2026 22:22
- Add _sqlite3_exec_no_args() C function that combines prepare+step+finalize
  into a single CGO crossing for parameterless exec (most common case)
- Add _sqlite3_reset_clear() C function that combines sqlite3_reset and
  sqlite3_clear_bindings into a single CGO crossing
- Use semaphore channel instead of result struct channel in context-aware
  exec/Next paths to reduce allocations
- Use time.AppendFormat with stack buffer to avoid heap allocation in
  time.Time binding
- Optimize bindNamedIndices to reuse a single buffer instead of 3
  separate C.CString allocations
- Remove intermediate bindIndices slice allocation in named parameter
  binding path
- Pass explicit query length to sqlite3_prepare_v2 to avoid C-side strlen
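The single-buffer idea for named parameters can be illustrated without cgo. SQLite accepts named parameters with the prefixes `:`, `@`, and `$`, so a lookup that tries all three only needs to change the first byte of one NUL-terminated buffer between attempts. Whether this PR builds its buffer exactly this way is an assumption; `candidateNames` and `collect` are hypothetical helpers.

```go
package main

import "fmt"

// prefixes lists the named-parameter prefixes SQLite accepts.
const prefixes = ":@$"

// candidateNames calls visit once per prefix with a NUL-terminated name.
// The same backing buffer is reused for every call, so visit must not
// retain the slice after it returns.
func candidateNames(name string, visit func(cname []byte)) {
	// Layout: one prefix byte, the name, and a trailing NUL for C.
	buf := make([]byte, 0, len(name)+2)
	buf = append(buf, prefixes[0])
	buf = append(buf, name...)
	buf = append(buf, 0)
	for i := 0; i < len(prefixes); i++ {
		buf[0] = prefixes[i] // only the first byte changes per attempt
		visit(buf)
	}
}

// collect copies each candidate (without the trailing NUL) into a string,
// purely so the result can be inspected.
func collect(name string) []string {
	var out []string
	candidateNames(name, func(cname []byte) {
		out = append(out, string(cname[:len(cname)-1]))
	})
	return out
}

func main() {
	fmt.Println(collect("id"))
}
```

In a cgo setting, the address of the first byte of such a buffer could be passed to a C function like `sqlite3_bind_parameter_index` for the duration of the call, which replaces three separate `C.CString` allocations and their matching frees with one Go-side buffer.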

benchstat (n=8):

  BenchmarkExec:   -29.44% sec/op, -50% B/op, -33% allocs/op
  BenchmarkQuery:  -9.83% sec/op
  BenchmarkParams: -6.38% sec/op
  geomean:         -6.72% sec/op

Move extern declarations for _sqlite3_*_blocking functions before
_sqlite3_exec_no_args which references them. Remove unused
_sqlite3_prepare_v2_nolen function.
mattn merged commit 8d12439 into master on Apr 6, 2026
20 checks passed
mattn deleted the perf/reduce-cgo-overhead branch on April 6, 2026 13:36