
Support for collections of HF datasets, --top-k, update to input masking training, etc. #4

Open
chimezie wants to merge 46 commits into main from mlx_tuning_fork_completion_learning_and_hf_ds


@chimezie
Owner

No description provided.

Added the `--mask-input`/`--no-mask-input` CLI options and integrated handling for input masking and learning rate schedules.
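For readers unfamiliar with paired on/off flags, here is a minimal sketch of how a `--mask-input`/`--no-mask-input` pair can be declared. It uses the standard library's `argparse` purely for illustration; the CLI framework this project uses, the `train-sketch` program name, and the default of `False` are all assumptions, not taken from the PR.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    # "train-sketch" is a hypothetical program name used only for this example.
    parser = argparse.ArgumentParser(prog="train-sketch")
    parser.add_argument(
        "--mask-input",
        # BooleanOptionalAction (Python 3.9+) registers both --mask-input
        # and --no-mask-input for the same destination, args.mask_input.
        action=argparse.BooleanOptionalAction,
        default=False,  # assumed default; the PR does not state one
        help="Exclude prompt tokens from the training loss.",
    )
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(f"mask_input={args.mask_input}")
```

Passing `--mask-input` sets `args.mask_input` to `True`, `--no-mask-input` sets it to `False`, and omitting both falls back to the assumed default.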
…_sweep)

This commit adds a new script entry for 'mlx_tuning_fork_wandb_sweep' in the pyproject.toml file. This allows running the 'wandb_sweep' module directly as a script, simplifying the workflow.
Replaced an outdated dataset import with config imports for more accurate settings. Also, incorporated default validation parameters to enhance configuration consistency.
Renamed CONFIG_DEFAULTS to TF_CONFIG_DEFAULTS to clarify its origin from the tuning fork module. This change ensures better readability and maintainability of the configuration parameters being used in the project.
Updated length calculations to use a fixed number of iterations rather than the dataset length, and updated the training steps and validation intervals.
Added `--mask-inputs` option to enable input masking during training. Integrates a new batch iterator and loss function when input masking is activated, and replaces the learning rate schedule building logic with mlx_lm's.
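To illustrate the general idea of input masking, the sketch below averages per-token losses over completion tokens only, so prompt tokens contribute nothing to the training signal. It is plain Python with made-up loss values, and the helper names `completion_mask` and `masked_mean_loss` are hypothetical; it is not the batch iterator or loss function this commit integrates.

```python
def completion_mask(seq_len: int, prompt_len: int) -> list[float]:
    """Return 0.0 for prompt positions and 1.0 for completion positions."""
    return [0.0 if i < prompt_len else 1.0 for i in range(seq_len)]


def masked_mean_loss(token_losses: list[float], mask: list[float]) -> float:
    """Average the per-token losses over unmasked (completion) tokens only."""
    total = sum(loss * m for loss, m in zip(token_losses, mask))
    count = sum(mask)
    # Guard against a sequence with no completion tokens at all.
    return total / count if count else 0.0


# Made-up per-token losses: the first two tokens belong to the prompt.
token_losses = [2.0, 1.5, 0.5, 1.0, 0.3]
mask = completion_mask(len(token_losses), prompt_len=2)
loss = masked_mean_loss(token_losses, mask)
print(mask, loss)
```

Dividing by the number of unmasked tokens rather than the full sequence length keeps the loss scale comparable across examples with different prompt lengths.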
Updated `train_set` to `train_dataset` and `valid_set` to `val_dataset` for consistency with the rest of the codebase.
Eliminated the colorize option from CLI arguments and its associated print statements to clean up the code and simplify the `generate.py` logic. The `wandb_sweep.py` file had a minor formatting update as well.
Removed unused 'generate' import from training.py and 'save_config' and 'Path' imports from wandb_sweep.py to clean up the code. Moved the YAML loader setup inside the main function in wandb_sweep.py for better encapsulation.