
evict least-recently-used stmt when cache is full #1388

Open

mattn wants to merge 4 commits into master from stmt-cache-lru

Conversation

@mattn
Owner

@mattn mattn commented Apr 11, 2026

Follow-up to #1387, addressing rittneje's comment: #1387 (comment)

When the cache was full, putCachedStmt rejected the new entry instead of evicting anything. This meant the first N cached statements squatted on every slot forever, and a hot query prepared later would never benefit from caching. Now we evict the least-recently-used entry to make room.

The cache is a single preallocated []*SQLiteStmt of length _stmt_cache_size, ordered LRU-first (buf[0] is next to be evicted, buf[count-1] is the most recently put). Put at the tail is O(1) when not full; eviction shifts the remaining entries left by one. Take does a backward linear scan (MRU-end first) and shifts the right side left. For the small cache sizes users typically configure (a handful up to a few dozen), these O(N) operations are cache-friendly pointer moves and outperform the previous map + linked list design on every hit-path microbenchmark.

No per-operation allocation, no map, no linked list, no extra fields on SQLiteStmt.
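The slice ordering described above can be sketched in isolation. This is a minimal, self-contained illustration with a stand-in `stmt` type; the type and method names here are illustrative, not the PR's actual fields (the real code uses `stmtCacheBuf`/`stmtCacheCount` on `SQLiteConn`, as the diff below shows):

```go
package main

import "fmt"

// stmt stands in for *SQLiteStmt; only the cache key matters here.
type stmt struct{ key string }

// lruCache mirrors the PR's layout: a preallocated slice ordered
// LRU-first (buf[0] is next to evict, buf[count-1] is most recent).
type lruCache struct {
	buf   []*stmt
	count int
}

func newLRUCache(size int) *lruCache {
	return &lruCache{buf: make([]*stmt, size)}
}

// put appends at the MRU end; when full, it evicts buf[0] and shifts
// the remaining entries left by one. Returns the evicted entry, if any.
func (c *lruCache) put(s *stmt) *stmt {
	var evicted *stmt
	if c.count == len(c.buf) {
		evicted = c.buf[0]
		copy(c.buf, c.buf[1:]) // O(N) pointer shift, no allocation
		c.count--
	}
	c.buf[c.count] = s
	c.count++
	return evicted
}

// take scans backward from the MRU end and shifts the right side left,
// closing the gap, so relative LRU order is preserved.
func (c *lruCache) take(key string) *stmt {
	for i := c.count - 1; i >= 0; i-- {
		if c.buf[i].key == key {
			s := c.buf[i]
			copy(c.buf[i:], c.buf[i+1:c.count])
			c.count--
			c.buf[c.count] = nil // drop the stale reference
			return s
		}
	}
	return nil
}

func main() {
	c := newLRUCache(2)
	c.put(&stmt{"a"})
	c.put(&stmt{"b"})
	ev := c.put(&stmt{"c"})      // full: "a" is the LRU victim
	fmt.Println(ev.key)          // a
	fmt.Println(c.take("b").key) // b
	fmt.Println(c.take("a"))     // <nil> (miss)
}
```

In the driver itself the evicted statement would be finalized rather than returned, but the shift-based bookkeeping is the same.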

Benchmarks

BenchmarkStmtCache is added in this PR. Measured on linux/amd64, AMD Ryzen 7 7735HS, benchstat over 10 runs.

vs master (before this PR)

|                               | master (sec/op) | this PR (sec/op) | vs base          |
|-------------------------------|-----------------|------------------|------------------|
| StmtCache/off                 | 2.194µ ± 3%     | 2.234µ ± 2%      | +1.85% (p=0.005) |
| StmtCache/size4_keys1_hit     | 1.163µ ± 2%     | 1.127µ ± 3%      | -3.10% (p=0.005) |
| StmtCache/size4_keys4_hit     | 1.173µ ± 7%     | 1.158µ ± 6%      | ~ (p=0.956)      |
| StmtCache/size16_keys8_hit    | 1.183µ ± 2%     | 1.146µ ± 4%      | ~ (p=0.165)      |
| StmtCache/size4_keys8_evict   | 1.762µ ± 2%     | 2.459µ ± 4%      | +39.53% (p=0.000)|
| StmtCache/size16_keys32_evict | 1.798µ ± 6%     | 2.663µ ± 4%      | +48.11% (p=0.000)|

|                               | master (allocs/op) | this PR (allocs/op) | vs base |
|-------------------------------|--------------------|---------------------|---------|
| StmtCache/size4_keys1_hit     | 6.000 ± 0%         | 5.000 ± 0%          | -16.67% |
| StmtCache/size4_keys4_hit     | 6.000 ± 0%         | 5.000 ± 0%          | -16.67% |
| StmtCache/size16_keys8_hit    | 6.000 ± 0%         | 5.000 ± 0%          | -16.67% |

Interpretation

Hit path (working set fits in cache). All three hit cases come out faster than master or within noise, with one fewer allocation per operation. Replacing the previous map[string][]*SQLiteStmt + linked list design with a flat preallocated slice removes the map lookup, the per-put slice reallocation, and all list pointer bookkeeping.

Evict path. The _evict cases cycle through N distinct queries with N > cache size. This is the worst case for LRU under uniform round-robin access:

|                         | hit rate | why                                                                      |
|-------------------------|----------|--------------------------------------------------------------------------|
| master (reject-on-full) | 50%      | First M queries fill the cache and stay forever; every other query misses |
| this PR (LRU)           | 0%       | Every cached entry is evicted just before its next use                   |

Master's 50% hit rate is not the result of a smart policy — it comes from the "freeze the cache once full" behavior accidentally suiting a uniform cyclic pattern. On realistic workloads (some hot queries, some cold, not all arriving at startup) the old policy leaves hot queries permanently uncached, which is exactly the problem rittneje raised. The +40-48% here reflects LRU correctly evicting old entries; it is not a regression in the fix.
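The two hit rates can be reproduced with a small policy simulation. This is illustrative only: it models take-then-put-back semantics over a cyclic access pattern, not the real driver's code paths.

```go
package main

import "fmt"

// simulate replays a round-robin pattern over `keys` distinct queries
// against a cache of `cacheSize` slots, returning the hit rate.
// lru=false models master's reject-on-full; lru=true models this PR.
func simulate(lru bool, cacheSize, keys, accesses int) float64 {
	cache := []int{} // LRU-first order, like the PR's slice
	hits := 0
	for i := 0; i < accesses; i++ {
		k := i % keys
		// take: backward scan from the MRU end
		for j := len(cache) - 1; j >= 0; j-- {
			if cache[j] == k {
				cache = append(cache[:j], cache[j+1:]...)
				hits++
				break
			}
		}
		// put back after use
		if len(cache) < cacheSize {
			cache = append(cache, k)
		} else if lru {
			cache = append(cache[1:], k) // evict the LRU head
		} // reject-on-full: statement is simply dropped
	}
	return float64(hits) / float64(accesses)
}

func main() {
	// cache size 4, 8 distinct queries, uniform round-robin
	fmt.Printf("reject-on-full: %.0f%%\n", 100*simulate(false, 4, 8, 8000))
	fmt.Printf("LRU:            %.0f%%\n", 100*simulate(true, 4, 8, 8000))
}
```

The first line converges to 50% (the four resident queries always hit, the other four always miss) and the second is exactly 0% (each entry is evicted four accesses before its reuse), matching the table above.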

Users whose real working set is larger than _stmt_cache_size should increase the cache size or disable the cache (_stmt_cache_size=0, the default).

Impact on ad-hoc db.QueryRow

Additional numbers from lirlia/go-sqlite-performance: `SELECT id, name, hp, attack FROM monster WHERE id = ?` against a 1000-row table, linux/amd64, AMD Ryzen 7 7735HS, single connection, 5 runs. Compares repeated `db.QueryRow(query, args)` with and without `_stmt_cache_size=32`:

| Parallelism | Fair (no cache) | FairCached (cache=32) | Δ    |
|-------------|-----------------|-----------------------|------|
| Seq         | 10.80 µs        | 7.96 µs               | -26% |
| P1          | 43.86 µs        | 26.81 µs              | -39% |
| P10         | 43.60 µs        | 27.43 µs              | -37% |
| P100        | 45.44 µs        | 28.15 µs              | -38% |

On this realistic hot-path workload the cache (introduced in #1387 and correctly LRU-managed in this PR) cuts per-query cost by roughly 37-39% under parallelism and 26% serially. The BenchmarkStmtCache microbenchmark above isolates the cache data-structure change; this table shows what it means end-to-end for a user calling db.QueryRow in a loop.

@mattn mattn force-pushed the stmt-cache-lru branch 2 times, most recently from a81a7ee to c70622e Compare April 11, 2026 11:06
Comment thread sqlite3.go Outdated
C.sqlite3_finalize(victim.s)
victim.s = nil
}
victim.c = nil
Collaborator

I'm a little confused why we are doing this. Isn't it falling out of scope?

Comment thread sqlite3.go
c.stmtCacheCount--
c.stmtCacheBuf[c.stmtCacheCount] = nil
s.closed = false
s.cls = false
Collaborator

Why are we setting this to false?
Likewise, why are we setting s.t to the empty string?

Owner Author

Extracted a finalizeCachedStmt helper shared by the eviction path and closeCachedStmtsLocked; dropped the redundant s.t = "" reset in takeCachedStmt (cached stmts always have t == "") and left a comment explaining why closed/cls still need to be reset.

Comment thread sqlite3.go Outdated

func (c *SQLiteConn) putCachedStmt(s *SQLiteStmt) bool {
if c == nil || s == nil || s.s == nil || s.cacheKey == "" {
if c == nil || s == nil || s.s == nil || s.cacheKey == "" || c.stmtCacheSize <= 0 {
Collaborator

How would s.cacheKey be set if there is no stmt cache?

Owner Author

prepareWithCache now only sets cacheKey when the cache is enabled, so the redundant size check in putCachedStmt is gone.
