Wrap lower-level encode/decode entry points #14

@uqio

Tracked under #6.

Scope

Today the wrapper exposes only state.full(params, samples). The lower-level encode/decode flow (run the encoder once, then call decode token by token with a custom sampler) is useful for research and custom-sampler use cases, but it doesn't fit whispery's pump architecture.
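
To make the "encode once, then decode token by token" flow concrete, here is a minimal sketch of the loop a custom-sampler caller would write. Nothing in it is whispery API: the `Decoder` trait is a hypothetical stand-in for the proposed State::decode, `ScriptedDecoder` is a toy that replays a fixed token script, and the `String` error type is a placeholder. It assumes the encoder has already run and that decode returns the logits for the last token fed.

```rust
/// Hypothetical stand-in for the proposed `State::decode`: feeds `tokens`
/// starting at position `n_past` and returns logits for the last token.
pub trait Decoder {
    fn decode(&mut self, tokens: &[i32], n_past: usize) -> Result<Vec<f32>, String>;
}

/// Greedy sampler: index of the largest non-NaN logit (first one wins ties).
/// Returns `None` for empty or all-NaN input.
pub fn argmax(logits: &[f32]) -> Option<usize> {
    let mut best: Option<(usize, f32)> = None;
    for (i, &v) in logits.iter().enumerate() {
        if v.is_nan() {
            continue;
        }
        let better = match best {
            Some((_, b)) => v > b,
            None => true,
        };
        if better {
            best = Some((i, v));
        }
    }
    best.map(|(i, _)| i)
}

/// Decode loop: feed the prompt once, then one sampled token per step,
/// until `eot` is sampled or `max_new` tokens have been produced.
pub fn generate<D, S>(
    dec: &mut D,
    prompt: &[i32],
    eot: i32,
    max_new: usize,
    mut sample: S,
) -> Result<Vec<i32>, String>
where
    D: Decoder,
    S: FnMut(&[f32]) -> Option<usize>,
{
    if prompt.is_empty() {
        return Err("prompt must contain at least one token".to_string());
    }
    let mut out = Vec::new();
    let mut n_past = 0usize;
    let mut pending: Vec<i32> = prompt.to_vec();
    for _ in 0..max_new {
        let logits = dec.decode(&pending, n_past)?;
        n_past += pending.len();
        let idx = sample(&logits).ok_or_else(|| "sampler returned no token".to_string())?;
        let tok = i32::try_from(idx).map_err(|_| "token id out of range".to_string())?;
        if tok == eot {
            break;
        }
        out.push(tok);
        pending = vec![tok];
    }
    Ok(out)
}

/// Toy decoder for illustration only: ignores its input and puts all the
/// probability mass on the next token of a fixed script.
pub struct ScriptedDecoder {
    script: Vec<usize>,
    n_vocab: usize,
    step: usize,
}

impl ScriptedDecoder {
    pub fn new(script: Vec<usize>, n_vocab: usize) -> Self {
        ScriptedDecoder { script, n_vocab, step: 0 }
    }
}

impl Decoder for ScriptedDecoder {
    fn decode(&mut self, tokens: &[i32], _n_past: usize) -> Result<Vec<f32>, String> {
        if tokens.is_empty() {
            return Err("empty token batch".to_string());
        }
        let target = *self
            .script
            .get(self.step)
            .ok_or_else(|| "script exhausted".to_string())?;
        if target >= self.n_vocab {
            return Err("scripted token outside vocabulary".to_string());
        }
        self.step += 1;
        let mut logits = vec![0.0f32; self.n_vocab];
        logits[target] = 1.0;
        Ok(logits)
    }
}
```

The sampler is passed as a closure so that greedy, temperature, or beam-style strategies can be swapped in without touching the loop; that is the part the one-shot full(params, samples) path does not expose.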

Symbols

  • whisper_encode, whisper_encode_with_state.
  • whisper_decode, whisper_decode_with_state.
  • whisper_get_logits, whisper_get_logits_from_state.
  • whisper_set_mel, whisper_set_mel_with_state.
  • whisper_pcm_to_mel, whisper_pcm_to_mel_with_state.

Why deferred

The full surface is large (≈10 functions plus the logits-reading patterns), and each function adds an unsafe boundary that needs its own audit-axis coverage. The natural caller is someone building a custom sampler or a research harness, which is not yet on the table.
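
As an illustration of why each entry point carries audit cost, here is a sketch of one logits-reading pattern. It assumes the C side hands back a raw float pointer laid out as n_tokens rows of n_vocab values, with the last row belonging to the last token; verify that layout against whisper.h before relying on it. `copy_logits` and `last_row` are hypothetical helper names, not existing whispery functions.

```rust
/// Copies `n_tokens * n_vocab` floats out of a raw logits pointer so the
/// caller never holds a borrow into memory owned by the C library.
/// Returns `None` for a null pointer or if the element count overflows.
///
/// # Safety
/// If non-null, `ptr` must point to at least `n_tokens * n_vocab`
/// initialized `f32` values that stay valid for the duration of the call.
pub unsafe fn copy_logits(ptr: *const f32, n_tokens: usize, n_vocab: usize) -> Option<Vec<f32>> {
    if ptr.is_null() {
        return None;
    }
    let len = n_tokens.checked_mul(n_vocab)?;
    if len == 0 {
        return Some(Vec::new());
    }
    // SAFETY: the caller guarantees `len` readable floats behind `ptr`.
    let view = unsafe { std::slice::from_raw_parts(ptr, len) };
    Some(view.to_vec())
}

/// Returns the final row of a row-major logits buffer, i.e. the logits
/// for the last decoded token under the layout assumed above.
pub fn last_row(logits: &[f32], n_vocab: usize) -> Option<&[f32]> {
    if n_vocab == 0 || logits.is_empty() || logits.len() % n_vocab != 0 {
        return None;
    }
    Some(&logits[logits.len() - n_vocab..])
}
```

Copying rather than borrowing sidesteps the question of how long the pointer stays valid after the next decode call, at the cost of one allocation per step. A borrowing accessor would be cheaper but would need its own lifetime argument in the audit.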

Acceptance

  • State::pcm_to_mel(&mut self, samples) / set_mel for explicit mel preparation.
  • State::encode(&mut self, ...) for one-shot encode.
  • State::decode(&mut self, tokens, ...) returning logits.
  • State::logits() accessor.
  • All entry points covered by the safety-axis matrix in safety_audit.rs.
  • Tests on a poisoned fixture for input validation; integration tests for the round-trip on a small fixture model.
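
A sketch of what the input-validation half of these criteria could look like for set_mel, checked entirely on the Rust side before any unsafe call. The name `validate_mel`, the `MelError` variants, and the choice to reject non-finite values are assumptions for illustration, not decisions made in this issue; the conversion to i32 reflects the C API taking plain int dimensions.

```rust
/// Hypothetical error type for mel input validation.
#[derive(Debug, PartialEq)]
pub enum MelError {
    /// `n_len` or `n_mel` was zero.
    ZeroDimension,
    /// A dimension or the product does not fit the C side's `int`.
    TooLarge,
    /// `data.len()` does not equal `n_len * n_mel`.
    LengthMismatch { expected: usize, actual: usize },
    /// A NaN or infinite value was found at `index`.
    NonFinite { index: usize },
}

/// Checks a caller-supplied mel buffer and returns the dimensions as the
/// `i32` pair a C call would need. No FFI happens here.
pub fn validate_mel(data: &[f32], n_len: usize, n_mel: usize) -> Result<(i32, i32), MelError> {
    if n_len == 0 || n_mel == 0 {
        return Err(MelError::ZeroDimension);
    }
    let n_len_c = i32::try_from(n_len).map_err(|_| MelError::TooLarge)?;
    let n_mel_c = i32::try_from(n_mel).map_err(|_| MelError::TooLarge)?;
    let expected = n_len.checked_mul(n_mel).ok_or(MelError::TooLarge)?;
    if data.len() != expected {
        return Err(MelError::LengthMismatch { expected, actual: data.len() });
    }
    if let Some(index) = data.iter().position(|v| !v.is_finite()) {
        return Err(MelError::NonFinite { index });
    }
    Ok((n_len_c, n_mel_c))
}
```

Keeping validation in a pure function like this means the bad-input tests can run without a model, leaving only the round-trip tests dependent on the small fixture model.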

Probably wants a dedicated PR per major entry point given the audit overhead.

If you're building a custom sampler / research harness on top of whisper.cpp, drop a comment with your design.

Labels: enhancement
