# 🦙 Llama-CPP-Python Pre-built Wheels (Python 3.13)

**The solution for Hugging Face Spaces "Build Timeout" errors.**

If you are trying to run LLMs on the Hugging Face free CPU tier with Python 3.13, you have likely noticed that compiling `llama-cpp-python` from source either takes 20+ minutes or crashes the Space entirely. This repository contains pre-compiled `.whl` files (wheels) built specifically for compatibility with HF Spaces and other low-resource Linux environments.

## 🚀 Why use these wheels?

- **Python 3.13 ready:** Built specifically for the latest Python version.
- **Fast installation:** Installs in seconds instead of minutes.
- **Generic CPU support:** Compiled with `GGML_NATIVE=OFF`, which prevents "Illegal Instruction" or core-dump errors on older cloud CPUs.
- **Manylinux compatible:** Built using GitHub Actions on Ubuntu to ensure broad Linux support.

## 🛠️ How to use in Hugging Face

### 1. Update your `requirements.txt`

Instead of adding `llama-cpp-python` normally, paste the direct link to the wheel:

```
# Pre-built wheel for Python 3.13
https://github.com/Jameson040/my_lama-wheels/releases/download/v0.1/llama_cpp_python-0.3.16-cp313-cp313-linux_x86_64.whl
```
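If pip rejects the wheel, the most common cause is a Python version mismatch: the `cp313` tag in the filename must match your Space's interpreter. A quick sanity check like the following (a hypothetical helper, not part of this repository) parses the tag out of the wheel filename and compares it to the running Python:

```python
import sys

def wheel_matches_interpreter(wheel_name: str) -> bool:
    """Return True if the wheel's CPython tag matches the running interpreter.

    Wheel filenames follow: name-version-pythontag-abitag-platformtag.whl,
    so the Python tag (e.g. 'cp313') is the third field from the end.
    """
    py_tag = wheel_name.split("-")[-3]
    return py_tag == f"cp{sys.version_info.major}{sys.version_info.minor}"

# True only when run under Python 3.13, which this wheel targets:
print(wheel_matches_interpreter(
    "llama_cpp_python-0.3.16-cp313-cp313-linux_x86_64.whl"
))
```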
### 2. Set the Python version

Ensure your Space is actually using Python 3.13. In your `README.md` metadata (or Dockerfile), set:

```
python_version: 3.13
```

## 📦 What's included?

| File | Description |
| --- | --- |
| `llama_cpp_python-....whl` | The main engine. |
| `numpy`, `diskcache`, etc. | Necessary dependencies compiled for Python 3.13. |

## 🛠️ How this was built

This wheel was built using GitHub Actions with the following configuration:

- **OS:** Ubuntu Latest (manylinux)
- **Compiler flags:** `CMAKE_ARGS="-DGGML_NATIVE=OFF -DGGML_BLAS=OFF"`
- **Python:** 3.13

## 🤝 Contributing & Support

If you find this helpful, feel free to star ⭐ this repository so others can find it! If you need a specific version of `llama-cpp-python` built, feel free to open an issue.
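For reference, the build can be reproduced locally with roughly the following command (a sketch using the flags above; `dist/` is just an example output directory, and the compile itself is the slow step these wheels let you skip):

```
# Build a generic-CPU wheel from source (can take 20+ minutes)
CMAKE_ARGS="-DGGML_NATIVE=OFF -DGGML_BLAS=OFF" pip wheel llama-cpp-python --no-deps -w dist/
```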