[chore] Add diffusers-format example to LongCatAudioDiTPipeline#13483

Merged
dg845 merged 5 commits into huggingface:main from RuixiangMa:longcatdiffusersmodel on Apr 16, 2026

@RuixiangMa
Contributor

@RuixiangMa RuixiangMa commented Apr 15, 2026

What does this PR do?

  • Add a Diffusers-format example (repo_id: `ruixiangma/LongCat-AudioDiT-1B-Diffusers`)
  • Support a seed parameter

```python
import soundfile as sf
import torch
from diffusers import LongCatAudioDiTPipeline

pipeline = LongCatAudioDiTPipeline.from_pretrained(
    "ruixiangma/LongCat-AudioDiT-1B-Diffusers",
    torch_dtype=torch.bfloat16,
)
pipeline = pipeline.to("cuda")

prompt = "A calm ocean wave ambience with soft wind in the background."
audio = pipeline(
    prompt,
    audio_duration_s=5.0,
    num_inference_steps=20,
    guidance_scale=4.0,
    generator=torch.Generator("cuda").manual_seed(42),
).audios[0, 0]

sf.write("output.wav", audio, pipeline.sample_rate)
```
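
As a rough sanity check on the example above, the length of the generated clip should be close to the requested duration times the sample rate. The sketch below is only an illustration: `expected_num_samples` is a hypothetical helper, and the 24 kHz value is an assumed placeholder rather than the model's confirmed rate (read `pipeline.sample_rate` at runtime instead). The pipeline may also round the length to its internal frame size, so treat the result as approximate.

```python
def expected_num_samples(audio_duration_s: float, sample_rate: int) -> int:
    """Approximate number of samples for a clip of the given duration."""
    if audio_duration_s <= 0 or sample_rate <= 0:
        raise ValueError("duration and sample rate must be positive")
    return round(audio_duration_s * sample_rate)


# Assumed placeholder; in real code use `pipeline.sample_rate`.
ASSUMED_SAMPLE_RATE = 24_000

n = expected_num_samples(5.0, ASSUMED_SAMPLE_RATE)
print(n)
```

Comparing this number with `len(audio)` after generation is a quick way to confirm that `audio_duration_s` had the intended effect.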

Before submitting

Who can review?

Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.

@github-actions github-actions Bot added the `documentation` (Improvements or additions to documentation), `pipelines`, and `size/S` (PR with diff < 50 LOC) labels on Apr 15, 2026
…ioDiTPipeline

Signed-off-by: Lancer <maruixiang6688@gmail.com>
@RuixiangMa RuixiangMa force-pushed the longcatdiffusersmodel branch from f25c3a7 to 974c829 on April 15, 2026 16:36
@github-actions github-actions Bot added the `size/M` (PR with diff < 200 LOC) label and removed the `size/S` (PR with diff < 50 LOC) label on Apr 15, 2026
@RuixiangMa
Contributor Author

@dg845 I uploaded a Diffusers-format repository and updated the usage docs.

Comment thread docs/source/en/api/pipelines/longcat_audio_dit.md Outdated
Comment thread src/diffusers/pipelines/longcat_audio_dit/pipeline_longcat_audio_dit.py Outdated
@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

- `output_type="pt"` returns a PyTorch tensor shaped `(batch, channels, samples)`.
- `audio_duration_s` is the most direct way to control output duration.
- `seed` makes generation reproducible (optional, defaults to None).
- Output shape is `(batch, channels, samples)` - use `.audios[0, 0]` to get a single audio sample.

Collaborator

nit: I think it would be clearer to explain here how the pipeline handles mono and stereo outputs.

Contributor Author

Added mono/stereo clarification in Tips
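
For readers following this thread, the `(batch, channels, samples)` layout discussed above can be illustrated without running the pipeline. This is a dependency-free sketch, not the pipeline's code: nested Python lists stand in for the real output array, the sample values are made up, and the helpers `select_mono` and `to_frames` are hypothetical names. It shows that `[0, 0]`-style indexing yields a single mono channel, while keeping both stereo channels requires transposing to `(samples, channels)`, which is the layout `soundfile.write` expects for multi-channel data.

```python
# Stand-in for pipeline output: nested lists shaped (batch, channels, samples).
# Real outputs are arrays or tensors; these values are made up for illustration.
audios = [
    [  # batch item 0
        [0.0, 0.1, 0.2, 0.3],     # channel 0
        [0.0, -0.1, -0.2, -0.3],  # channel 1
    ],
]


def select_mono(audios, batch_index=0, channel_index=0):
    """Mirror `audios[batch_index, channel_index]`: one 1-D list of samples."""
    return audios[batch_index][channel_index]


def to_frames(audios, batch_index=0):
    """Transpose one batch item from (channels, samples) to (samples, channels)."""
    return [list(frame) for frame in zip(*audios[batch_index])]


mono = select_mono(audios)
frames = to_frames(audios)

print(len(mono))                    # number of samples in the mono clip
print(len(frames), len(frames[0]))  # samples, channels
```

With a NumPy array the same transpose would be `audios[0].T`; the list version is used here only to keep the sketch self-contained.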

@dg845
Collaborator

dg845 commented Apr 16, 2026

@bot /style

@github-actions
Contributor

github-actions Bot commented Apr 16, 2026

Style bot fixed some files and pushed the changes.

@github-actions github-actions Bot added the `size/M` (PR with diff < 200 LOC) label and removed the `size/M` (PR with diff < 200 LOC) label on Apr 16, 2026

Collaborator

@dg845 dg845 left a comment

Thanks for the follow-up PR! Left a few small comments/suggestions :).

Co-authored-by: dg845 <58458699+dg845@users.noreply.github.com>
@github-actions github-actions Bot added the `size/M` (PR with diff < 200 LOC) label and removed the `size/M` (PR with diff < 200 LOC) label on Apr 16, 2026
Signed-off-by: Lancer <maruixiang6688@gmail.com>
@github-actions github-actions Bot added the `size/S` (PR with diff < 50 LOC) label and removed the `size/M` (PR with diff < 200 LOC) label on Apr 16, 2026
@RuixiangMa
Contributor Author

RuixiangMa commented Apr 16, 2026

> Thanks for the follow-up PR! Left a few small comments/suggestions :).

Fixed, PTAL

@RuixiangMa RuixiangMa changed the title from "[chore] Add diffusers-format example and seed parameter to LongCatAudioDiTPipeline" to "[chore] Add diffusers-format example to LongCatAudioDiTPipeline" on Apr 16, 2026
@dg845
Collaborator

dg845 commented Apr 16, 2026

@bot /style

@github-actions
Contributor

github-actions Bot commented Apr 16, 2026

Style bot fixed some files and pushed the changes.

@github-actions github-actions Bot removed the `size/S` (PR with diff < 50 LOC) label on Apr 16, 2026
@github-actions github-actions Bot added the `size/M` (PR with diff < 200 LOC) label on Apr 16, 2026

Collaborator

@dg845 dg845 left a comment

Thanks! (BTW, you can fix the code style with `make style` and `make quality`.)

@dg845
Collaborator

dg845 commented Apr 16, 2026

Merging as the CI failures are unrelated.

@dg845 dg845 merged commit 947bc23 into huggingface:main Apr 16, 2026
13 of 14 checks passed