perf: keep chunk-K residency engaged with runtime LoRA by fszontagh · Pull Request #1598 · leejet/stable-diffusion.cpp

fszontagh · 2026-06-02T17:26:01Z

Summary

Re-enable chunk-K residency for the runtime LoRA path. Two related fixes in compute_streaming_segments / resolve_graph_cut_plan:

- Drop the weight_adapter != nullptr bypass. Runtime LoRA composes weight + diff in the compute graph via ggml_add; the resident weight is never mutated, so the cached GPU copy stays valid across sampling steps.
- Add the resident buffer size back to free_vram before clamping the streaming budget. Otherwise chunk-K's own allocation is counted as memory taken by another consumer, the budget shrinks from step to step, and the resident set is rebuilt every step instead of once per generation.
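The two fixes can be illustrated with a small standalone sketch. It is not the code in compute_streaming_segments / resolve_graph_cut_plan: the function names, parameters, and the explicit scale factor are hypothetical, and plain std::vector stands in for ggml tensors.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical names throughout; this is not the stable-diffusion.cpp API.

// Fix 1: runtime LoRA forms the effective weight as weight + scale * diff.
// The resident weight is passed by const reference and never written, so a
// cached copy of it remains valid for every sampling step.
std::vector<float> compose_lora(const std::vector<float>& resident_weight,
                                const std::vector<float>& diff,
                                float scale) {
    assert(resident_weight.size() == diff.size());
    std::vector<float> effective(resident_weight.size());
    for (std::size_t i = 0; i < effective.size(); ++i) {
        effective[i] = resident_weight[i] + scale * diff[i];
    }
    return effective;
}

// Fix 2: the reported free VRAM already excludes the resident buffer that
// chunk-K itself allocated. Adding that size back before clamping keeps the
// streaming budget the same from one step to the next.
std::uint64_t streaming_budget(std::uint64_t free_vram,
                               std::uint64_t resident_bytes,
                               std::uint64_t max_vram,
                               bool add_back_resident) {
    const std::uint64_t available =
        free_vram + (add_back_resident ? resident_bytes : 0);
    return std::min(available, max_vram);
}
```

With 10 GiB free, an 8 GiB cap, and a 6 GiB resident set allocated during the first step, the next step sees only 4 GiB free. Without the add-back the budget drops from 8 GiB to 4 GiB, which would invalidate the resident set; with it the budget stays at 8 GiB.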

Related Issue / Discussion

Follow-up to #1576 (--stream-layers). Closes the LoRA-path perf gap left there.

Additional Information

Z-Image bf16, 512x512, 8 steps, --offload-to-cpu --stream-layers --max-vram 8 on RTX 3060:

| Config | Before | After | Delta |
|---|---|---|---|
| LoRA 0.8 | 46.78 s | 37.75 s | -19% |
| LoRA 1.5 | 45.83 s | 37.14 s | -19% |
| no LoRA | 44.64 s | 34.72 s | -22% |
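The Delta column follows from the Before and After timings as a relative change rounded to the nearest percent. A minimal sketch of that arithmetic, with percent_delta as a hypothetical helper name:

```cpp
#include <cassert>
#include <cmath>

// Relative change in percent, rounded to the nearest integer.
// A negative result means the "after" run was faster.
long percent_delta(double before_s, double after_s) {
    return std::lround((after_s - before_s) / before_s * 100.0);
}
```

For the three rows this gives -19, -19, and -22, matching the table.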

LoRA multiplier scaling is unchanged: the mean pixel diff between the 0.8 and 1.5 outputs is 26.17 before this change and 26.05 after.
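The PR does not say how the mean pixel diff was computed. One plausible reading, shown here purely as an assumption, is the mean absolute difference over all 8-bit channel values of two same-sized images; mean_abs_pixel_diff is a hypothetical helper, not part of the project.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <vector>

// Mean absolute difference between two 8-bit images of identical size.
// Assumption: this is only one possible definition of "mean pixel diff".
double mean_abs_pixel_diff(const std::vector<std::uint8_t>& a,
                           const std::vector<std::uint8_t>& b) {
    assert(a.size() == b.size() && !a.empty());
    std::uint64_t total = 0;
    for (std::size_t i = 0; i < a.size(); ++i) {
        // Widen to int before subtracting so the difference cannot wrap.
        total += static_cast<std::uint64_t>(
            std::abs(static_cast<int>(a[i]) - static_cast<int>(b[i])));
    }
    return static_cast<double>(total) / static_cast<double>(a.size());
}
```

Comparing the LoRA 0.8 and LoRA 1.5 outputs with such a metric before and after the change is a cheap way to confirm that the multiplier still affects the image to the same degree.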

Checklist

I have read and confirmed this PR follows the contribution guidelines.

Keep chunk-K residency engaged with runtime LoRA

5401fb1

fszontagh changed the title ~~Keep chunk-K residency engaged with runtime LoRA~~ perf: keep chunk-K residency engaged with runtime LoRA Jun 2, 2026

fszontagh mentioned this pull request Jun 3, 2026

perf: allocate CPU-offloaded params from runtime device pinned host buffer #1601

Open

1 task

fix: reserve worst merged segment from chunk-K residency budget

7bc4b71

leejet merged commit a7f2e03 into leejet:master Jun 3, 2026
15 checks passed

fszontagh mentioned this pull request Jun 3, 2026

[Bug] CUDA error (Steam-layers) #1600

Open

leejet merged 2 commits into leejet:master from fszontagh:feature/streaming-lora-chunk-k
