This repository was archived by the owner on Apr 8, 2026. It is now read-only.

WGPU backend for cross-platform GPU compression (Metal, Vulkan, WebGPU) #2

@noahgift

Description

Summary

Add a WGPU backend alongside the existing cudarc backend to enable GPU-accelerated compression on all major platforms.

Current State

trueno-zram GPU support:
├── cudarc (CUDA) → NVIDIA Linux only
└── No Mac, AMD, Intel, or browser support

Proposed State

trueno-zram GPU backends:
├── cudarc (CUDA) → NVIDIA Linux/Windows (existing)
├── wgpu (Metal) → macOS (Apple Silicon + Intel)
├── wgpu (Vulkan) → Linux AMD/Intel, Android
├── wgpu (DX12) → Windows AMD/Intel
└── wgpu (WebGPU) → Browsers (WASM)

Target Platforms

Platform               GPU API               Status
Linux + NVIDIA         CUDA                  ✅ Implemented
Linux + AMD/Intel      Vulkan via WGPU       🔲 Planned
macOS (Intel)          Metal via WGPU        🔲 Planned
macOS (Apple Silicon)  Metal via WGPU        🔲 Planned
Windows                DX12/Vulkan via WGPU  🔲 Planned
Browser/WASM           WebGPU via WGPU       🔲 Planned

Implementation

Dependencies

[dependencies]
wgpu = "23"
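
Since the acceptance criteria call for a wgpu feature flag, the dependency would likely be optional and gated. A sketch of that Cargo.toml shape (the feature name `gpu-wgpu` is illustrative, not decided):

```toml
[dependencies]
wgpu = { version = "23", optional = true }

[features]
# Hypothetical flag name: enables the portable WGPU backend;
# the existing CUDA path stays behind its own flag.
gpu-wgpu = ["dep:wgpu"]
```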

Backend Selection

pub enum GpuBackend {
    Cuda(CudaContext),      // Existing
    Wgpu(wgpu::Device),     // New
}

impl GpuBackend {
    pub fn auto_detect() -> Option<Self> {
        // 1. Try CUDA first (fastest on NVIDIA)
        if let Ok(cuda) = CudaContext::new() {
            return Some(GpuBackend::Cuda(cuda));
        }
        // 2. Fall back to WGPU (Metal, Vulkan, DX12)
        if let Ok(wgpu) = WgpuContext::new() {
            return Some(GpuBackend::Wgpu(wgpu));
        }
        None
    }
}
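
The fallback chain above can be exercised without GPU hardware. A minimal self-contained sketch, with stand-in constructors for the issue's `CudaContext`/`WgpuContext` (both hypothetical here; CUDA is made to "fail" so detection lands on the WGPU branch):

```rust
// Stand-ins for the real context types; construction results simulate a
// machine with no NVIDIA driver but a working WGPU adapter.
struct CudaContext;
struct WgpuContext;

impl CudaContext {
    fn new() -> Result<Self, &'static str> {
        Err("no NVIDIA driver") // pretend CUDA is unavailable
    }
}

impl WgpuContext {
    fn new() -> Result<Self, &'static str> {
        Ok(WgpuContext) // pretend a WGPU adapter was found
    }
}

enum GpuBackend {
    Cuda(CudaContext),
    Wgpu(WgpuContext),
}

impl GpuBackend {
    fn auto_detect() -> Option<Self> {
        // CUDA first (fastest on NVIDIA), then the portable WGPU path.
        if let Ok(cuda) = CudaContext::new() {
            return Some(GpuBackend::Cuda(cuda));
        }
        if let Ok(wgpu) = WgpuContext::new() {
            return Some(GpuBackend::Wgpu(wgpu));
        }
        None // caller falls back to CPU compression
    }
}

fn main() {
    match GpuBackend::auto_detect() {
        Some(GpuBackend::Cuda(_)) => println!("backend: cuda"),
        Some(GpuBackend::Wgpu(_)) => println!("backend: wgpu"),
        None => println!("backend: cpu"),
    }
}
```

If `auto_detect` returns `None`, compression stays on the existing CPU path.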

Compute Shader (WGSL)

@group(0) @binding(0) var<storage, read> input_pages: array<u32>;
@group(0) @binding(1) var<storage, read_write> output: array<u32>;

@compute @workgroup_size(256)
fn lz4_compress(@builtin(global_invocation_id) id: vec3<u32>) {
    let page_idx = id.x;
    // One 4 KiB page = 1024 u32 words; guard stray invocations in the
    // final workgroup when the page count is not a multiple of 256.
    let num_pages = arrayLength(&input_pages) / 1024u;
    if (page_idx >= num_pages) {
        return;
    }
    // LZ4 compression kernel body goes here (one invocation per page).
}
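
With `@workgroup_size(256)` and one invocation per page, the host side dispatches `ceil(num_pages / 256)` workgroups; extra invocations in a partial final workgroup should return early in the shader. A sketch of that host-side calculation (function name illustrative):

```rust
/// Number of workgroups to dispatch for `num_pages` pages, matching the
/// shader's @workgroup_size(256) (one invocation compresses one page).
fn workgroup_count(num_pages: u32) -> u32 {
    // Ceiling division so a partial final workgroup is still dispatched.
    num_pages.div_ceil(256)
}

fn main() {
    // e.g. 1000 pages -> 4 workgroups (3 full + 1 partial).
    println!("{}", workgroup_count(1000));
}
```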

Use Cases

  1. Intel Mac via SSH - Metal GPU for coverage builds
  2. Apple Silicon Macs - Native Metal acceleration
  3. AMD GPU workstations - Vulkan backend
  4. Browser-based tools - WebGPU for web IDEs
  5. Edge devices - Portable WASM deployment

Acceptance Criteria

  • Add wgpu feature flag
  • Implement WgpuContext with device auto-detection
  • Port LZ4 compress kernel to WGSL
  • Port Zstd compress kernel to WGSL
  • Benchmark Metal vs CUDA performance
  • Test on Intel Mac (ssh mac)
  • WASM compilation works

Performance Targets

Platform           Target Throughput
CUDA (RTX 4090)    60 GB/s
Metal (M1 Max)     30 GB/s
Vulkan (RX 7900)   40 GB/s
WebGPU (browser)   10 GB/s
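
For scale: assuming 4 KiB zram pages and decimal GB (1e9 bytes) — both assumptions, since the issue does not pin these down — the targets translate to page rates as follows (illustrative helper, not project code):

```rust
/// Approximate 4 KiB pages per second for a throughput given in GB/s
/// (decimal GB, i.e. 1e9 bytes; 4096-byte pages assumed).
fn pages_per_second(gb_per_s: f64) -> f64 {
    gb_per_s * 1e9 / 4096.0
}

fn main() {
    // The 60 GB/s CUDA target is roughly 14.6 million pages per second.
    println!("{:.1}", pages_per_second(60.0));
}
```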

Labels

enhancement, gpu, cross-platform, wgpu
