feat(Model Support): add Krea-2-Turbo model + LoRA support (WIP) #9304
Draft
Pfannkuchensack wants to merge 3 commits into
Integrate Krea-2-Turbo (krea/Krea-2-Turbo) text-to-image per
NEW_MODEL_INTEGRATION.md: Krea2Transformer2DModel (single-stream MMDiT)
+ Qwen3-VL text encoder (12-layer hidden-state tap, 4D prompt_embeds)
+ reused Qwen-Image VAE + FlowMatchEulerDiscrete scheduler.
Backend:
- taxonomy: BaseModelType.Krea2, ModelType/ModelFormat.Qwen3VLEncoder,
Krea2VariantType (Turbo = "krea2_turbo" to avoid Z-Image collision)
- config probes: Main_Diffusers/Checkpoint_Krea2, Qwen3VLEncoder,
LoRA_LyCORIS_Krea2 (text_fusion/time_mod_proj signature; excluded
from the Qwen-Image probe to avoid double-match)
- loaders for the diffusers pipeline + standalone Qwen3-VL encoder,
with runtime workarounds for the HF model's version mismatches
(AutoTokenizer, extra_special_tokens={}, rope_parameters->rope_scaling)
- native sampling (pack/unpack, position_ids, linear-mu shift) and
hand-written Euler denoise loop; reuses qwen_image l2i/i2l
- invocations: model_loader, text_encoder, denoise, lora_loader, plus
two ecosystem enhancers (conditioning rebalance, seed variance)
- LoRA conversion for diffusers PEFT (lora_transformer- prefix)
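The "linear-mu shift" and "hand-written Euler denoise loop" items above can be sketched roughly as follows. This is a toy illustration, not the PR's code: the function names are hypothetical, the default constants (256, 4096, 0.5, 1.15) are borrowed from the Flux-style `calculate_shift` helper in diffusers and are only assumed to apply here, and the velocity function is a stand-in for the transformer.

```python
import math

def linear_mu(image_seq_len, base_seq_len=256, max_seq_len=4096,
              base_shift=0.5, max_shift=1.15):
    # Linear interpolation of mu over the packed image sequence length.
    # Constants are assumptions taken from Flux-style pipelines.
    slope = (max_shift - base_shift) / (max_seq_len - base_seq_len)
    return base_shift + slope * (image_seq_len - base_seq_len)

def shifted_sigmas(num_steps, mu):
    # Sigmas run from 1.0 down to 1/num_steps, are pushed through the
    # exponential time shift, and end with a terminal 0.0.
    sigmas = []
    for i in range(num_steps):
        s = 1.0 - i / num_steps
        sigmas.append(math.exp(mu) / (math.exp(mu) + (1.0 / s - 1.0)))
    sigmas.append(0.0)
    return sigmas

def euler_denoise(latents, sigmas, velocity_fn):
    # Plain Euler integration of dx/dsigma = v(x, sigma).
    x = list(latents)
    for sigma, sigma_next in zip(sigmas[:-1], sigmas[1:]):
        v = velocity_fn(x, sigma)
        x = [xi + (sigma_next - sigma) * vi for xi, vi in zip(x, v)]
    return x

# Toy check: an exact flow-matching velocity (noise - clean) recovers the clean sample.
clean = [0.25, -1.0, 2.0]
noise = [1.5, 0.5, -0.75]
exact_velocity = lambda x, sigma: [n - c for n, c in zip(noise, clean)]
# A distilled model would use a fixed mu (1.15 per the PR summary) with 8 steps.
result = euler_denoise(noise, shifted_sigmas(8, 1.15), exact_velocity)
```

In a real loop the latents would be packed tensors and `velocity_fn` a call into the transformer with prompt embeddings and position ids.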
Frontend:
- 'krea-2' base + qwen3_vl_encoder type/format across model maps,
buildKrea2Graph, addKrea2LoRAs, graph-builder denoise/base lists,
optimal dimension 1024, regenerated schema.ts
Fixes:
- estimate transformer working memory in krea2_denoise so the cache
reserves activation headroom and offloads more model under partial
loading; fixes fp8 + LoRA OOM at 1024 (model was placed before LoRA
patches were applied, leaving no room for their activations)
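The arithmetic behind this fix can be illustrated with a small sketch. Everything here is hypothetical (the function name, the byte figures, and the simple subtraction); it only shows why reserving working memory up front forces more of the model to be offloaded, and it does not reproduce the actual cache logic.

```python
GIB = 1024 ** 3

def plan_partial_load(model_bytes, working_bytes, vram_bytes):
    # Reserve activation headroom first, then fit as much of the model as possible.
    # Returns (bytes kept resident on the GPU, bytes offloaded to the CPU).
    budget = max(0, vram_bytes - working_bytes)
    resident = min(model_bytes, budget)
    return resident, model_bytes - resident

# Without an estimate (working_bytes=0) the model fills VRAM and leaves no headroom.
no_headroom = plan_partial_load(12 * GIB, 0, 12 * GIB)
# With an estimate, part of the model is offloaded so activations still fit.
with_headroom = plan_partial_load(12 * GIB, 3 * GIB, 12 * GIB)
```

The sizes are made up for illustration; the real estimate would depend on resolution, dtype, and the LoRA patches in play.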
WIP: requires diffusers main (>=0.39 dev) for Krea2Transformer2DModel;
pyproject.toml temporarily pins diffusers to git main.
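Because the classes only exist on diffusers main for now, a small guard can report which ones are missing before anything else fails. A minimal sketch using only the standard library; the helper name is hypothetical, and the two class names are the ones this PR depends on.

```python
import importlib

REQUIRED = ("Krea2Transformer2DModel", "Krea2Pipeline")

def missing_krea2_classes(module_name="diffusers", required=REQUIRED):
    # Return the names from `required` that cannot be resolved on the module.
    try:
        module = importlib.import_module(module_name)
    except Exception:
        # Module absent or broken: treat every class as missing.
        return list(required)
    missing = []
    for name in required:
        try:
            getattr(module, name)
        except Exception:
            missing.append(name)
    return missing

if __name__ == "__main__":
    missing = missing_krea2_classes()
    print("Krea-2 support available" if not missing else f"missing: {missing}")
```

An empty list means the installed diffusers exposes both classes.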
Collaborator
Amazing! I was just thinking of working on this myself and you did it for me!
Allow non-diffusers Krea-2 transformers (GGUF/fp8) to run with standalone single-file VAE + Qwen3-VL encoder, fixing several blockers found in testing.
- buildKrea2Graph: drop the hard "requires Diffusers-format" assert; instead require both a VAE and a Qwen3-VL encoder to be selected when the transformer is not diffusers (mirrors readiness.ts).
- Qwen3-VL encoder remap: handle both single-file key conventions: implicit (model.layers.*) and explicit (model.language_model.*). The old blind model.* -> language_model.* turned the bf16 file's keys into language_model.language_model.* (398 meta tensors -> "Cannot copy out of meta tensor" crash). Both files now load 0 missing / 0 unexpected / 0 meta.
- Qwen3-VL tokenizer/config: broaden the offline-cache fallback from OSError to Exception so a partial HF cache (config present, vocab missing) re-fetches instead of dying with TypeError.
- Qwen3-VL encoder fp8: keep an fp8 source checkpoint fp8-resident with per-layer upcast (storage float8_e4m3fn, compute bf16) instead of dequantizing to bf16. Halves resident VRAM (~8.9GB -> ~4.4GB), avoiding partial-load thrashing alongside a large transformer. Auto-enabled for fp8 sources on CUDA; bf16 files stay bf16.
- Qwen-Image VAE: a native-layout qwen_image_vae single file is classified with the Anima base and loaded as AutoencoderKLWan, but the qwen l2i/i2l nodes need AutoencoderKLQwenImage. Add backend/krea2/vae_compat.py::as_qwen_image_vae to reinterpret a Wan VAE as AutoencoderKLQwenImage (state dicts are identical, 194/194 keys); both qwen VAE nodes use it. Idempotent for real QwenImage VAEs.
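The encoder key remap described in this commit can be sketched as below. This is an assumption-laden illustration rather than the PR's code: the function name is hypothetical, and it assumes the loader expects `language_model.*` keys and that a file's convention can be detected from the presence of a `model.language_model.` prefix.

```python
def remap_qwen3_vl_keys(state_dict):
    # Explicit convention: keys already carry "language_model" under "model.".
    explicit = any(k.startswith("model.language_model.") for k in state_dict)
    remapped = {}
    for key, value in state_dict.items():
        if not key.startswith("model."):
            remapped[key] = value  # leave unrelated keys untouched
        elif explicit:
            # model.language_model.X -> language_model.X
            remapped[key[len("model."):]] = value
        else:
            # model.layers.X -> language_model.layers.X
            remapped["language_model." + key[len("model."):]] = value
    return remapped

implicit_file = {"model.layers.0.mlp.weight": 1}
explicit_file = {"model.language_model.layers.0.mlp.weight": 1}
implicit_out = remap_qwen3_vl_keys(implicit_file)
explicit_out = remap_qwen3_vl_keys(explicit_file)
```

Both conventions end up with the same key, and the doubled `language_model.language_model.` prefix produced by a blind rewrite cannot occur.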
Summary
Integrate Krea-2-Turbo (krea/Krea-2-Turbo) text-to-image per NEW_MODEL_INTEGRATION.md: Krea2Transformer2DModel (single-stream MMDiT, ~12B) + Qwen3-VL text encoder (12-layer hidden-state tap → 4D prompt_embeds) + reused Qwen-Image VAE + FlowMatchEulerDiscreteScheduler. Turbo is distilled (is_distilled=true → fixed mu=1.15, 8 steps, CFG off by default).
WIP: requires diffusers main (>= 0.39 dev) for Krea2Transformer2DModel; pyproject.toml temporarily pins diffusers to git main. Flip to the stable release containing Krea-2 once it ships, then un-draft.
Related Issues / Discussions
Depends on Krea2Transformer2DModel / Krea2Pipeline landing in a stable diffusers release (currently diffusers main only).
QA Instructions
- uv pip install "git+https://github.com/huggingface/diffusers.git"; confirm python -c "from diffusers import Krea2Transformer2DModel, Krea2Pipeline".
- [...] Krea-2-Turbo diffusers folder. Confirm it probes as main / diffusers / krea-2 / krea2_turbo and the Qwen3-VL encoder as qwen3_vl_encoder.
- [...] lora.lycoris.krea-2 (not qwen-image) and applies; confirm 1024² + LoRA + fp8 no longer OOMs (partial loading enabled).
Merge Plan
Draft until diffusers ships Krea-2 in a stable release. Before merging: flip the diffusers pin in pyproject.toml from git main to that release and update uv.lock. Isolate the pyproject.toml / uv.lock change so it's easy to review/revert. No DB schema changes.
Checklist
- What's New copy (if doing a release after this PR)