This feature originated in GPT4all.
Consider adding a dependency on the nvidia-cuda-runtime package for the CUDA build of xllamacpp.

Once that dependency is in place, a modified version of the code here could automatically set the LD_LIBRARY_PATH (Linux) or PATH (Windows) environment variable so that the runtime libraries are found at inference time. That should let xllamacpp run without a separately configured CUDA runtime package.
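An untested sketch of what that could look like, assuming the nvidia-cuda-runtime-cu12 wheel layout (libraries under `nvidia/cuda_runtime/lib` on Linux and `nvidia/cuda_runtime/bin` on Windows); the function name is hypothetical:

```python
import importlib.util
import os
import sys
from pathlib import Path


def _prepend_cuda_runtime_to_search_path() -> None:
    """Make a pip-installed CUDA runtime visible to the dynamic loader.

    Sketch only: assumes the nvidia-cuda-runtime-cu12 wheel layout, which
    ships libraries under <site-packages>/nvidia/cuda_runtime/{lib,bin}.
    Must run before the native xllamacpp extension is loaded.
    """
    try:
        spec = importlib.util.find_spec("nvidia.cuda_runtime")
    except ModuleNotFoundError:
        spec = None
    if spec is None or not spec.submodule_search_locations:
        return  # wheel not installed; fall back to a system CUDA runtime

    pkg_dir = Path(next(iter(spec.submodule_search_locations)))
    if sys.platform == "win32":
        # Windows resolves DLLs via PATH and per-process DLL directories.
        dll_dir = pkg_dir / "bin"
        if dll_dir.is_dir():
            os.add_dll_directory(str(dll_dir))
            os.environ["PATH"] = f"{dll_dir}{os.pathsep}{os.environ.get('PATH', '')}"
    else:
        # Linux resolves shared objects via LD_LIBRARY_PATH. Note that the
        # glibc loader snapshots this variable at process start, so updating
        # it here mainly helps subprocesses; preloading libcudart with
        # ctypes.CDLL(..., mode=ctypes.RTLD_GLOBAL) is a common fallback.
        lib_dir = pkg_dir / "lib"
        if lib_dir.is_dir():
            os.environ["LD_LIBRARY_PATH"] = (
                f"{lib_dir}{os.pathsep}{os.environ.get('LD_LIBRARY_PATH', '')}"
            )
```

Something like this would presumably be called at the top of the package's `__init__.py`, before the compiled extension is imported, mirroring what GPT4all does.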
AMD is currently building pip-installable ROCm packages for both Windows and Linux, so the same feature should soon be possible for ROCm users, e.g. on Strix Halo machines.