Mamba-3 code release #858

Merged: berlinchen7 merged 1 commit into main from mamba3-release on Mar 17, 2026

@aakashlahoti
Collaborator

Official code for "Mamba-3: Improved Sequence Modeling using State Space Principles".
Aakash Lahoti*, Kevin Y. Li*, Berlin Chen*, Caitlin Wang*, Aviv Bick, J. Zico Kolter, Tri Dao†, Albert Gu†

Paper: https://arxiv.org/abs/2603.15569

Co-authored-by: Kevin Li <kevin.li5505@gmail.com>
Co-authored-by: Berlin Chen <berlinchen7@gmail.com>
Co-authored-by: Caitlin Wang <caitlinwang@princeton.edu>
@berlinchen7 berlinchen7 merged commit 37f9d02 into main Mar 17, 2026
3 of 5 checks passed
@skaae

skaae commented Mar 18, 2026

Hi,
Thank you for releasing the code! I wonder if you plan to implement varlen for the MIMO variant either with cu_seqlen or seq_idx?

Do you know if it would be difficult to implement varlen for MIMO if we want to give it a shot?
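For readers unfamiliar with the two varlen conventions named in this question, here is a minimal sketch of how they relate for a packed batch. It assumes the common meaning of the terms: cumulative sequence boundaries (often spelled `cu_seqlens`) versus a per-token sequence index (`seq_idx`). The helper names are hypothetical and not taken from this repository, and the thread does not confirm the exact argument names, dtypes, or shapes the Mamba-3 kernels would use.

```python
from itertools import accumulate


def lengths_to_cu_seqlens(lengths):
    """Return cumulative boundaries [0, l0, l0+l1, ...] for packed sequences."""
    return [0] + list(accumulate(lengths))


def cu_seqlens_to_seq_idx(cu_seqlens):
    """Return, for every packed token, the index of the sequence it belongs to."""
    seq_idx = []
    for i, (start, end) in enumerate(zip(cu_seqlens[:-1], cu_seqlens[1:])):
        seq_idx.extend([i] * (end - start))
    return seq_idx


# Three sequences of lengths 3, 1, and 2 packed into one row of 6 tokens.
lengths = [3, 1, 2]
cu_seqlens = lengths_to_cu_seqlens(lengths)
seq_idx = cu_seqlens_to_seq_idx(cu_seqlens)
```

Both forms carry the same information. The boundary form is compact (one entry per sequence plus one), while the per-token form lets a kernel check at each position whether the recurrent state should be reset.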

@tridao
Collaborator

tridao commented Mar 18, 2026

I feel it's not too difficult, but @berlinchen7 should know better

@berlinchen7
Collaborator

Thanks for your interest! Indeed varlen MIMO shouldn’t be too difficult to implement. We just need to be careful not to introduce unnecessary overhead.

Support for varlen MIMO is currently in progress, and we plan to release it once the implementation is ready.

@skaae

skaae commented Mar 19, 2026

Sounds excellent. We'll start experimenting with the SISO variant. Hopefully it works even better than the excellent Mamba-2 :) 🤞

Thank you!

4 participants