[Bugfix] Fix shape mismatch in LongCatAudioDiTTransformer conversion #13494

Merged: dg845 merged 1 commit into huggingface:main from RuixiangMa:fixlongcataudio on Apr 16, 2026

Conversation

@RuixiangMa (Contributor) commented Apr 16, 2026

Fixes # (issue)

RuntimeError: Error(s) in loading state_dict for LongCatAudioDiTTransformer:
        size mismatch for blocks.0.ffn.ff.0.weight: copying a param with shape torch.Size([9216, 2560]) from checkpoint, the shape in current model is torch.Size([10240, 2560]).

Root cause:
The ff_mult parameter was hardcoded to 4.0 in the transformer, but the LongCat-AudioDiT-3.5B checkpoint uses ff_mult=3.6. With a hidden size of 2560, the model expects an FFN width of 10240 while the checkpoint provides 9216, which causes the shape mismatch.
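
The two widths in the error message can be reproduced with a short calculation. This is a minimal sketch, assuming the FFN inner dimension is the hidden size multiplied by ff_mult; the helper name ffn_inner_dim is hypothetical and is not taken from the diffusers source.

```python
def ffn_inner_dim(dim: int, ff_mult: float) -> int:
    """Hypothetical helper: assumed width of the FFN's first linear layer."""
    # round() guards against floating-point error when ff_mult is not an integer.
    return round(dim * ff_mult)


# Hidden size taken from the second dimension of blocks.0.ffn.ff.0.weight
# in the error message above.
DIM = 2560

model_width = ffn_inner_dim(DIM, 4.0)       # previously hardcoded multiplier
checkpoint_width = ffn_inner_dim(DIM, 3.6)  # multiplier used by the 3.5B checkpoint

print(model_width, checkpoint_width)
```

Under that assumption the two values match the shapes reported in the RuntimeError (10240 in the model, 9216 in the checkpoint).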

The conversion now reads dit_ff_mult from config.json, so models with a non-default ff_mult can be converted (e.g., the 3.5B model uses 3.6).
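
A rough illustration of that lookup follows. It is a sketch only: the function name read_ff_mult and the fallback to 4.0 when the key is absent are assumptions made for this example, and the actual conversion code may handle a missing key differently.

```python
import json

# The previously hardcoded value, per the PR description. Using it as a
# fallback here is an assumption of this sketch.
DEFAULT_FF_MULT = 4.0


def read_ff_mult(config_text: str) -> float:
    """Hypothetical helper: extract dit_ff_mult from the contents of config.json."""
    config = json.loads(config_text)
    return float(config.get("dit_ff_mult", DEFAULT_FF_MULT))


# A config that sets the multiplier, as described for the 3.5B checkpoint.
print(read_ff_mult('{"dit_ff_mult": 3.6}'))
# A config without the key falls back to the assumed default.
print(read_ff_mult("{}"))
```

Passing the resulting value to the transformer instead of a fixed constant is what lets the FFN weights match the checkpoint shapes.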

Before submitting

Who can review?

Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.

Signed-off-by: Lancer <maruixiang6688@gmail.com>
@github-actions bot added the size/S (PR with diff < 50 LOC) and models labels and removed the size/S (PR with diff < 50 LOC) label on Apr 16, 2026
@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@dg845 (Collaborator) left a comment

Thanks for the PR!

@dg845 (Collaborator) commented Apr 16, 2026

Merging as the CI failures are unrelated.

@dg845 merged commit c507097 into huggingface:main on Apr 16, 2026
12 of 14 checks passed
3 participants