CoreML companion model build helper #5

@uqio

whispercpp ships coreml as an opt-in feature. whisper.cpp's CoreML encoder needs a .mlmodelc companion generated alongside the ggml-*.bin checkpoint. Currently this happens out-of-band: users have to run whisper.cpp's models/generate-coreml-model.sh themselves.

Goal: close the loop so cargo build --features coreml produces a usable runtime, or at least documents the path clearly.

Options

A. whispercpp-tools companion crate (preferred).

A small CLI: whispercpp-coreml-convert <ggml-*.bin> <out-dir>. Wraps the upstream Python tooling but doesn't drag the dependency into normal builds. Users run it once per checkpoint.

Trade-off: it still requires the Python packages coremltools and ane_transformers, which is a heavy install for a build-tooling crate. Mitigation: detect the missing packages at runtime and print a clear error that points to the upstream script.
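
A minimal sketch of that runtime detection, assuming the CLI shells out to a Python interpreter on PATH. The function names and the error wording are hypothetical, not an existing whispercpp API; the probe is just an `import` of the two packages named above.

```rust
use std::process::{Command, Stdio};

/// Python modules the upstream conversion tooling needs (as named in this issue).
const REQUIRED_MODULES: [&str; 2] = ["coremltools", "ane_transformers"];

/// Builds the one-line Python program used as an import probe.
fn import_probe(modules: &[&str]) -> String {
    format!("import {}", modules.join(", "))
}

/// Returns Ok(()) if `python` can import every required module, otherwise
/// an error message that points the user at the upstream script.
/// (Hypothetical helper: name and messages are assumptions.)
fn check_python_deps(python: &str) -> Result<(), String> {
    let hint = "see whisper.cpp's models/generate-coreml-model.sh for the manual path";
    let status = Command::new(python)
        .arg("-c")
        .arg(import_probe(&REQUIRED_MODULES))
        .stdout(Stdio::null())
        .stderr(Stdio::null())
        .status()
        // The interpreter itself could not be started (e.g. not installed).
        .map_err(|e| format!("could not run `{python}`: {e}; {hint}"))?;
    if status.success() {
        Ok(())
    } else {
        // The interpreter ran, but at least one import failed.
        Err(format!(
            "`{python}` is missing one of: {}; {hint}",
            REQUIRED_MODULES.join(", ")
        ))
    }
}
```

The CLI would call something like `check_python_deps("python3")` before doing any work, so a missing install fails fast with one actionable message instead of a Python traceback.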

B. build.rs helper in whispercpp itself.

Auto-generate the companion at build time when the coreml feature is on and a WHISPER_MODEL_PATH env var points at a checkpoint. This is a bad fit for cargo install-style consumers (no model at install time), and the Python dependency makes cargo build brittle.
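
For reference, the gating condition in option B could look roughly like this. It is a sketch only: the helper names are hypothetical, and it relies on Cargo exposing enabled features to build scripts as `CARGO_FEATURE_<NAME>` environment variables.

```rust
use std::env;
use std::path::PathBuf;

/// Decides whether a build script should attempt generation.
/// Both conditions from option B must hold: the feature is on, and the
/// model path points at an existing file. (Hypothetical helper.)
fn model_to_convert(feature_on: bool, model_path: Option<String>) -> Option<PathBuf> {
    if !feature_on {
        return None;
    }
    let path = PathBuf::from(model_path?);
    // Only proceed when the checkpoint is actually present on disk.
    path.is_file().then_some(path)
}

/// Reads the real environment the way a build.rs would.
fn model_from_env() -> Option<PathBuf> {
    // Cargo sets CARGO_FEATURE_COREML for build scripts when `coreml` is enabled.
    let feature_on = env::var_os("CARGO_FEATURE_COREML").is_some();
    let model_path = env::var("WHISPER_MODEL_PATH").ok();
    model_to_convert(feature_on, model_path)
}
```

A real build.rs would also need to emit `cargo:rerun-if-env-changed=WHISPER_MODEL_PATH`, and it would still inherit the Python dependency, which is the brittleness noted above.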

C. Documentation + a cargo xtask recipe.

Cheapest. Add a README section explaining the upstream script, and write a cargo xtask (or Makefile/justfile) recipe that wraps it. Not automation, but predictable.
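
As a rough idea of what the xtask wrapper might contain: the sketch below only builds the command for the upstream script and leaves running it to the caller. It assumes a local whisper.cpp checkout and that the script takes a model name (e.g. `base.en`) as its argument; the function name and the checkout path are hypothetical.

```rust
use std::path::Path;
use std::process::Command;

/// Builds (but does not run) the command for the upstream script.
/// `whisper_cpp_dir` is a local whisper.cpp checkout; `model` is the
/// model name passed to the script (assumed form, e.g. "base.en").
fn coreml_convert_command(whisper_cpp_dir: &Path, model: &str) -> Command {
    let mut cmd = Command::new("bash");
    cmd.arg("models/generate-coreml-model.sh")
        .arg(model)
        // Run from the checkout root so the relative script path resolves.
        .current_dir(whisper_cpp_dir);
    cmd
}
```

An xtask subcommand would then call `.status()` on the returned command and exit non-zero on failure. Keeping construction separate from execution makes the recipe easy to print in a dry-run mode.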

Recommendation

Start with C for v0.1.x and revisit A once there's user pull. Document the conversion step explicitly in the README and in the coreml doc-comment in whispercpp/src/context.rs.

Notes

  • whisper.cpp expects the .mlmodelc as a sibling of the .bin (e.g. models/ggml-large-v3-turbo-encoder.mlmodelc/). This is codified by whisper_get_coreml_path_encoder upstream.
  • whisper.cpp falls back gracefully when the .mlmodelc is absent (we set WHISPER_COREML_ALLOW_FALLBACK=ON in build.rs). So a missing companion isn't catastrophic, just a slower encode.
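
The sibling-path convention in the notes above can be approximated on the Rust side, e.g. to warn when the companion is missing. This is a sketch with hypothetical helper names: it covers only the plain `<stem>.bin` to `<stem>-encoder.mlmodelc` case from the example, and upstream's whisper_get_coreml_path_encoder remains the source of truth (it may treat some checkpoint names, such as quantized ones, differently).

```rust
use std::path::{Path, PathBuf};

/// Approximates the sibling path described above:
/// `<dir>/<stem>.bin` -> `<dir>/<stem>-encoder.mlmodelc`.
/// (Hypothetical helper; not a binding to the upstream function.)
fn coreml_encoder_path(model_bin: &Path) -> PathBuf {
    let stem = model_bin
        .file_stem()
        .map(|s| s.to_string_lossy().into_owned())
        .unwrap_or_default();
    // Replace the file name, keeping the parent directory unchanged.
    model_bin.with_file_name(format!("{stem}-encoder.mlmodelc"))
}

/// True when the companion directory exists next to the checkpoint.
fn has_coreml_companion(model_bin: &Path) -> bool {
    coreml_encoder_path(model_bin).is_dir()
}
```

Since the fallback is graceful, a check like `has_coreml_companion` would only be used to log a hint that encoding will be slower, not to fail context creation.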

From whispercpp/TODO.md § 3 "Larger work".

Labels: enhancement, help wanted
