A desktop application that combines Bolt.new's AI-powered web development capabilities with node-llama-cpp for local LLM inference. Build full-stack web applications using a locally-running language model, without requiring cloud API keys or internet connectivity.
- Local LLM Inference: Run language models locally using node-llama-cpp with hardware acceleration (Metal, CUDA, Vulkan)
- AI Code Generation: Generate HTML, CSS, and JavaScript code from natural language prompts
- Full-Stack Development: Support for React, Vue, Svelte, and vanilla HTML/CSS/JS projects
- Live Preview: Real-time preview of generated code
- Project Management: Create, organize, and manage multiple projects
- Chat Interface: Interactive chat for iterative development
- Code Editor: Professional code editing with Monaco Editor
- Cross-Platform: Works on macOS, Windows, and Linux
- Privacy-First: All processing happens locally on your machine
- Node.js: 18.0 or higher
- RAM: Minimum 8GB (16GB+ recommended for larger models)
- Disk Space: 10GB+ for models and dependencies
- GPU (Optional): NVIDIA (CUDA), AMD (Vulkan), or Apple Silicon (Metal) for faster inference
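As a rough rule of thumb, a quantized model's weight footprint is its parameter count times the quantization's average bits per weight, plus some runtime overhead. The sketch below is an approximation, not part of the app; the bits-per-weight figures and the flat overhead constant are assumed averages for llama.cpp quant formats.

```typescript
// Approximate average bits per weight for common GGUF quantizations
// (assumed figures; real values vary slightly by model architecture).
const BITS_PER_WEIGHT: Record<string, number> = {
  Q4_K_M: 4.8,
  Q5_K_M: 5.7,
  Q8_0: 8.5,
  F16: 16,
};

// Estimate RAM needed in GiB: weights plus a flat ~1.5 GiB allowance
// for the KV cache and runtime buffers (assumption, not a measurement).
function estimateRamGiB(paramsBillions: number, quant: string): number {
  const bits = BITS_PER_WEIGHT[quant] ?? 16;
  const weightsGiB = (paramsBillions * 1e9 * bits) / 8 / 1024 ** 3;
  return weightsGiB + 1.5;
}
```

By this estimate a 7B Q4_K_M model needs roughly 5-6 GiB, which is why 8 GB is the practical floor and 16 GB is recommended for larger models.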
- Install Node.js 18+ from nodejs.org
- Clone or download this repository
```bash
# Navigate to project directory
cd bolt-llama-electron

# Install dependencies
npm install

# Build TypeScript and prepare assets
npm run build:vite

# Start development server with hot reload
npm run dev
```

This will:
- Start the Vite dev server on http://localhost:5173
- Launch the Electron app
- Enable DevTools for debugging
```bash
# Build for production
npm run build

# Run the built application
npm start
```

Before using the app, you need to download a GGUF-format model.
Recommended Models:

- CodeLlama 7B (best for code generation)

  ```bash
  # Download to ~/.bolt-llama/models/
  mkdir -p ~/.bolt-llama/models
  cd ~/.bolt-llama/models
  wget https://huggingface.co/TheBloke/CodeLlama-7B-Instruct-GGUF/resolve/main/codellama-7b-instruct.Q4_K_M.gguf
  ```

- Mistral 7B Instruct (fast and versatile)

  ```bash
  wget https://huggingface.co/TheBloke/Mistral-7B-Instruct-v0.2-GGUF/resolve/main/mistral-7b-instruct-v0.2.Q4_K_M.gguf
  ```

- DeepSeek Coder 6.7B (specialized for coding)

  ```bash
  wget https://huggingface.co/TheBloke/deepseek-coder-6.7B-instruct-GGUF/resolve/main/deepseek-coder-6.7b-instruct.Q4_K_M.gguf
  ```
When you first run the app:
- Go to Settings
- Select your model path (e.g., `~/.bolt-llama/models/codellama-7b-instruct.Q4_K_M.gguf`)
- Click "Load Model"
- Wait for the model to load (this may take a minute)
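The model-path field accepts `~`-prefixed paths, but Node's `fs` APIs do not expand `~` themselves, so a desktop app has to resolve it before handing the path to the LLM engine. A minimal helper (illustrative, not the app's actual code):

```typescript
import { homedir } from "node:os";
import { join } from "node:path";

// Expand a leading "~" to the user's home directory, so a setting like
// "~/.bolt-llama/models/model.gguf" becomes an absolute path.
function expandHome(p: string): string {
  if (p === "~") return homedir();
  if (p.startsWith("~/")) return join(homedir(), p.slice(2));
  return p; // already absolute or relative; leave untouched
}
```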
- Click "New Project" in the File menu
- Enter a project name
- Select a template (React + TypeScript, Vanilla HTML/CSS/JS, etc.)
- Click "Create"
- Open the Chat panel
- Describe what you want to build (e.g., "Create a todo list app with add and delete buttons")
- Press Enter or click "Generate"
- The AI will generate code and display it in the editor
- Review and edit the code as needed
- The preview panel will update in real-time
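Under the hood, a chat request is typically turned into a single instruct-style prompt before being sent to the model. A minimal sketch of that assembly step (the wording of the system instruction is an assumption, not the app's actual prompt):

```typescript
// Combine a system instruction, the project's template, and the user's
// request into one prompt for an instruct-tuned code model.
function buildPrompt(userRequest: string, template: string): string {
  const system =
    "You are a web development assistant. " +
    `Generate complete, runnable ${template} code. ` +
    "Return each file in a fenced code block labeled with its filename.";
  return `${system}\n\nUser request: ${userRequest}`;
}
```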
- Click on a file in the File Explorer to open it
- Edit the code in the Monaco Editor
- Changes are auto-saved (configurable)
- The preview updates automatically
- Right-click on a project in the Project List
- Select "Export"
- Choose export location
- The project will be exported as a ZIP file
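An export like this normally skips dependency and build directories so the ZIP stays small. A sketch of the inclusion filter such a step might use (the exclusion list is an assumption; the app's actual export logic may differ):

```typescript
// Directories conventionally excluded from a project export.
const EXPORT_EXCLUDES = new Set(["node_modules", "dist", ".git"]);

// Decide whether a project-relative path belongs in the exported ZIP:
// reject any path with an excluded directory anywhere in its segments.
function shouldExport(relativePath: string): boolean {
  return !relativePath
    .split("/")
    .some((segment) => EXPORT_EXCLUDES.has(segment));
}
```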
Configuration is stored in `~/.bolt-llama/config.json`:

```json
{
  "modelPath": "~/.bolt-llama/models/codellama-7b-instruct.Q4_K_M.gguf",
  "projectsPath": "~/.bolt-llama/projects",
  "theme": "dark",
  "autoSave": true,
  "autoSaveInterval": 5000,
  "defaultTemplate": "react-ts"
}
```

Adjust LLM behavior in the Settings panel:
- Temperature (0.0-2.0): higher values produce more varied, creative output; lower values are more deterministic
- Top P (0.0-1.0): Nucleus sampling parameter
- Top K (0-100): Number of top tokens to consider
- Max Tokens: Maximum length of generated response
- GPU Layers: Number of layers to offload to GPU (if available)
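A settings panel usually clamps user-entered values to these ranges before passing them to the inference backend. A minimal sketch of that validation (the interface shape is an assumption):

```typescript
interface SamplingParams {
  temperature: number;
  topP: number;
  topK: number;
  maxTokens: number;
}

// Clamp each value to the ranges listed above so out-of-range settings
// never reach the LLM engine.
function clampParams(p: SamplingParams): SamplingParams {
  const clamp = (v: number, lo: number, hi: number) =>
    Math.min(hi, Math.max(lo, v));
  return {
    temperature: clamp(p.temperature, 0, 2),
    topP: clamp(p.topP, 0, 1),
    topK: Math.round(clamp(p.topK, 0, 100)),
    maxTokens: Math.max(1, Math.floor(p.maxTokens)),
  };
}
```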
bolt-llama-electron/
├── src/
│ ├── main/ # Electron main process
│ │ ├── index.ts # Entry point
│ │ ├── ipc-handlers.ts # IPC message handlers
│ │ ├── llm-engine.ts # node-llama-cpp integration
│ │ ├── file-manager.ts # File operations
│ │ └── preload.ts # Security preload script
│ │
│ ├── renderer/ # React UI
│ │ ├── App.tsx # Main component
│ │ ├── components/ # UI components
│ │ ├── stores/ # Zustand state management
│ │ ├── utils/ # Utilities
│ │ └── styles/ # SCSS styles
│ │
│ └── shared/ # Shared types and constants
│
├── public/ # Static assets
├── vite.config.ts # Vite configuration
├── tsconfig.json # TypeScript config
└── package.json # Dependencies
| Component | Technology |
|---|---|
| Desktop Framework | Electron 39+ |
| UI Framework | React 19+ |
| Language | TypeScript 5+ |
| Build Tool | Vite 7+ |
| Code Editor | Monaco Editor |
| State Management | Zustand 5+ |
| LLM Backend | node-llama-cpp 3+ |
| Styling | SCSS |
The app uses Electron's IPC (Inter-Process Communication) for secure communication between the main process (LLM engine, file system) and renderer process (UI):
Main Process:
- Runs node-llama-cpp for LLM inference
- Manages file system operations
- Handles project management
Renderer Process:
- React UI components
- User interactions
- Real-time preview
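In Electron, this split is typically implemented with `ipcRenderer.invoke` in the renderer and `ipcMain.handle` in the main process. The sketch below shows one way to make those calls type-safe by wrapping a raw invoke function; the channel names are illustrative assumptions, not the app's actual channels.

```typescript
// Hypothetical channel map: channel name -> handler signature.
interface IpcApi {
  "llm:generate": (prompt: string) => Promise<string>;
  "project:list": () => Promise<string[]>;
}

type Invoke = (channel: string, ...args: unknown[]) => Promise<unknown>;

// Wrap a raw invoke function (ipcRenderer.invoke in Electron) so channel
// names and argument types are checked at compile time.
function createIpcClient(invoke: Invoke) {
  return {
    call<K extends keyof IpcApi>(
      channel: K,
      ...args: Parameters<IpcApi[K]>
    ): ReturnType<IpcApi[K]> {
      return invoke(channel, ...args) as ReturnType<IpcApi[K]>;
    },
  };
}
```

Because the wrapper takes any `Invoke` function, the same client works against the real `ipcRenderer.invoke` or a stub in tests.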
Error: "Model not found"
- Ensure the model file exists at the specified path
- Check file permissions
- Try downloading the model again
Error: "Insufficient memory"
- Close other applications
- Try a smaller model (Q4 quantization instead of Q5)
- Increase system swap space
Error: "Generation failed"
- Check model is loaded (Settings → Model Status)
- Try a simpler prompt
- Increase max tokens in settings
- Restart the application
Slow Generation
- Use GPU acceleration if available
- Try a smaller model
- Reduce max tokens
- Close other applications
App won't start
- Delete `~/.bolt-llama` and start fresh
- Check Node.js version: `node --version` (should be 18+)
- Try `npm install` and `npm run build`
Preview not updating
- Check browser console for errors (DevTools)
- Ensure file is saved
- Try refreshing the preview
```bash
# Install dependencies
npm install

# Type checking
npm run type-check

# Build for production
npm run build

# Create distributable packages
npm run build:electron
```

- Main Process: Use `console.log()`; output appears in the terminal
- Renderer Process: Use DevTools (Ctrl+Shift+I or Cmd+Option+I)
- IPC Communication: Check the DevTools Console and the terminal
Contributions are welcome! Please:
- Fork the repository
- Create a feature branch
- Make your changes
- Submit a pull request
- Use GPU Acceleration: Set `gpuLayers` to a high value in settings
- Optimize Model Selection: Use smaller quantizations (Q4) for faster inference
- Reduce Context Size: Smaller context = faster generation
- Close Unused Projects: Each open project consumes memory
- Monitor System Resources: Use Activity Monitor (macOS) or Task Manager (Windows)
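One rough way to pick a `gpuLayers` value is to offload as many layers as fit in free VRAM, given an approximate per-layer size. The heuristic and the default per-layer figure below are assumptions for illustration only; actual layer sizes depend on the model and quantization.

```typescript
// Heuristic: offload as many layers as fit into free VRAM.
// perLayerGiB is an assumed average (roughly 0.12 GiB/layer for a 7B Q4 model).
function pickGpuLayers(
  totalLayers: number,
  freeVramGiB: number,
  perLayerGiB = 0.12,
): number {
  const fits = Math.floor(freeVramGiB / perLayerGiB);
  return Math.max(0, Math.min(totalLayers, fits));
}
```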
- Model Size: Limited by available RAM (typically 7B-13B models work best)
- Generation Speed: Depends on hardware (CPU/GPU)
- Context Length: Limited by model training (typically 4K-8K tokens)
- No Cloud Sync: Projects are stored locally only
- No Collaboration: Single-user application
- Model management UI (download/delete models)
- Template library with pre-built components
- Git integration for version control
- Advanced debugging tools
- Performance profiling
- Plugin system for extensions
- Multi-file generation
- Code refactoring suggestions
MIT License - See LICENSE file for details
For issues, questions, or suggestions:
- Check the Troubleshooting section
- Review existing issues on GitHub
- Create a new issue with detailed information
- Include system specs and error messages
- Bolt.new - AI-powered web development platform
- node-llama-cpp - Node.js bindings for llama.cpp
- Electron - Cross-platform desktop framework
- React - UI library
- Monaco Editor - Code editor
This application runs language models locally on your machine. The quality and accuracy of generated code depends on the model used and the prompts provided. Always review and test generated code before using it in production.
Happy coding! 🚀