Pull requests: bigscience-workshop/Megatron-DeepSpeed

Pull requests list

Bump black from 21.4b0 to 24.3.0 (label: dependencies)
#402 opened Mar 20, 2024 by dependabot bot
Add xPos embeddings
#370 opened Mar 7, 2023 by janEbert Collaborator
Fix various small problems
#367 opened Feb 28, 2023 by janEbert Collaborator
Bloom model training with AML
#365 opened Feb 21, 2023 by savitamittal1
Add UL2 data sampling and pretraining
#358 opened Dec 13, 2022 by janEbert Collaborator
Add FlashAttention
#357 opened Dec 12, 2022 by NouamaneTazi
Enable rocm-support
#353 opened Oct 7, 2022 by luukkonenr
Encoding checkpoint reshaping guide
#349 opened Sep 20, 2022 by tjruwase Collaborator Draft
Add multiple evaluation compat
#336 opened Aug 30, 2022 by Muennighoff Collaborator
[checkpoints] replace bf16 with fp32 checkpoint weights
#327 opened Aug 10, 2022 by stas00 Contributor
Prefix LM Eval
#313 opened Jul 16, 2022 by Muennighoff Collaborator
Add Bitfit
#311 opened Jul 10, 2022 by Muennighoff Collaborator
Tool for CKPT averaging
#310 opened Jul 10, 2022 by Muennighoff Collaborator
Enable loading ckpt for t0 finetuning
#309 opened Jul 10, 2022 by Muennighoff Collaborator
[WIP] Hack my way to get OPT running
#301 opened Jul 4, 2022 by thomasw21 Member Draft
[MLM] Train script for non causal decoder
#300 opened Jul 4, 2022 by thomasw21 Member Draft
a branch combining layer-norm-auto-sync and ds_ckpt_reshape
#292 opened Jun 29, 2022 by stas00 Contributor
BigScience Eval Harness
#291 opened Jun 29, 2022 by Muennighoff Collaborator
No-ZeRO reshaping
#289 opened Jun 23, 2022 by Muennighoff Collaborator
WIP: Shared t5 code
#286 opened Jun 21, 2022 by thomasw21 Member
2 of 4 tasks
[WIP] add debug utils
#275 opened Mar 28, 2022 by stas00 Contributor
Sync 4 layer norms - bf16, fp32, optimizer states on restart
#274 opened Mar 28, 2022 by tjruwase Collaborator