Issues
is:issue state:open
Issue creation is restricted in this repository
Search results
feat: OpenXLA reference backend - export-route spike through 4-bit quantized decode
Labels: area:architecture, priority:medium, status:ready, type:enhancement
Status: Open. #449 in lablup/mlxcel

refactor: redraw ComputeBackend as an inference-session engine contract and move the MLX path behind it (byte-identical)
Labels: area:architecture, priority:medium, status:done, type:refactor
Status: Open. #448 in lablup/mlxcel

feat: distribute the mlxcel binary via pip so `pip install` yields a runnable managed mode
Labels: priority:medium, status:ready, type:enhancement
Status: Open. #416 in lablup/mlxcel

fix(router): emit usage on the disaggregated /v1/chat/completions responses (streaming and non-streaming)
Labels: area:architecture, priority:low, status:backlog, type:bug
Status: Open. #398 in lablup/mlxcel

perf(core): adaptive selector for the native paged-attention decode kernel
Labels: area:core, area:inference, priority:medium, type:performance
Status: Open.

perf(moe): backend-aware fused-MoE Dff cap (CUDA crossover) and dispatch heuristic
Labels: area:core, area:models, priority:medium, type:performance
Status: Open.

perf(nemotron-h): decode gap is MoE-block op-density (routed + shared expert), not SSM/attention
Labels: area:inference, area:models, platform:macos, priority:medium, type:performance
Status: Open.

feat: need a logo
Labels: area:docs, help wanted, priority:medium, type:enhancement
Status: Open.

feat: Native Windows + CUDA build feasibility spike and porting plan (x86_64-pc-windows-msvc)
Labels: area:core, area:inference, platform:windows, priority:medium, status:investigation, type:enhancement
Status: Open.

Support Windows and Linux x86_64 binary builds and release artifacts
Labels: priority:medium, status:ready, type:enhancement
Status: Open.

chore: harden packaging environment to enforce 4-eyes review on signed releases
Labels: priority:low, status:backlog, type:chore
Status: Open.