Don't cache reinit_modules #5543
Conversation
epwalsh left a comment
I think one of @dirkgr's points was that we don't want to add a reinitialized transformer to the cache. So maybe only run `_model_cache[spec] = transformer` when `reinit_modules` is `None`.
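A minimal sketch of that guard, using a simplified stand-in for `cached_transformers.get` (the real function takes more arguments and builds a richer cache key; here the cache is keyed by model name only, and the "last N layers" reading of `reinit_modules` is an assumption):

```python
from typing import Optional

from transformers import AutoModel

_model_cache: dict = {}

def get(model_name: str, reinit_modules: Optional[int] = None):
    # Simplified stand-in for cached_transformers.get, keyed by name only.
    if reinit_modules is None and model_name in _model_cache:
        return _model_cache[model_name]
    transformer = AutoModel.from_pretrained(model_name)
    if reinit_modules is not None:
        # Assumed semantics: re-initialize the last `reinit_modules`
        # encoder layers of a BERT-style model.
        for layer in transformer.encoder.layer[-reinit_modules:]:
            layer.apply(transformer._init_weights)
    else:
        # Only cache pristine models: a reinitialized transformer in the
        # cache would leak into later callers expecting pretrained weights.
        _model_cache[model_name] = transformer
    return transformer
```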
> - Removed a spurious error message "'torch.cuda' has no attribute '_check_driver'" that would appear in the logs when a `ConfigurationError` for missing GPU was raised.
> - Load model on CPU post-training to save GPU memory.
> - Don't cache models with `cached_transformers` when `reinit_modules` is not `None`.
Better to omit this, actually, since this feature hasn't been released yet.
Oh right, good catch. I have fixed this.
I would almost say that we should have an entirely separate function that reinits some layers from a given transformer model. It doesn't have to be part of `cached_transformers.get()`.
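A standalone helper along those lines could be quite small. A sketch, assuming a BERT-style encoder and reusing the model's own `_init_weights` from Hugging Face (the name `reinit_last_layers` is hypothetical):

```python
from transformers import PreTrainedModel

def reinit_last_layers(model: PreTrainedModel, num_layers: int) -> PreTrainedModel:
    """Re-initialize the last `num_layers` encoder layers of `model`, in place."""
    for layer in model.encoder.layer[-num_layers:]:
        # _init_weights applies the model-specific initialization scheme
        # to each submodule that `apply` visits.
        layer.apply(model._init_weights)
    return model
```

One wrinkle with doing this outside of `get`: if the model came from the cache, the caller would have to remember to pass `make_copy=True` first, or they would silently mutate the shared cached weights.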
@dirkgr I had originally added this functionality to `cached_transformers.get`.
I'd say for ease of use, just have it in the `get()` function.
Fixes #5505 (comment)
Changes proposed in this pull request:
- Don't cache models with `cached_transformers` when `reinit_modules` is provided (see the usage sketch below).
- Removed `reinit_modules` from the transformer spec.
- Noted in the CHANGELOG that models are not cached when `reinit_modules` is not `None`.
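A hedged usage sketch of the intended behavior (this assumes the `reinit_modules` parameter from #5505, with an int meaning roughly "re-initialize the last N layers"):

```python
from allennlp.common import cached_transformers

# A reinitialized model is loaded fresh and, after this PR, never cached.
reinit_model = cached_transformers.get(
    "bert-base-uncased", make_copy=False, reinit_modules=2
)

# Plain calls still hit the cache, so repeating one returns the same object.
cached = cached_transformers.get("bert-base-uncased", make_copy=False)
assert cached is cached_transformers.get("bert-base-uncased", make_copy=False)

# The reinitialized model never entered the cache.
assert reinit_model is not cached
```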
Before submitting

- I've read and followed all steps in the Making a pull request section of the CONTRIBUTING docs.
- I've updated or added any relevant docstrings following the syntax described in the Writing docstrings section of the CONTRIBUTING docs.

After submitting

- `codecov/patch` reports high test coverage (at least 90%). You can find this under the "Actions" tab of the pull request once the other checks have finished.