
Wrap grammar-constrained decoding #11

@uqio

Tracked under #6.

Scope

Expose whisper.cpp's grammar-constrained decoding so callers can constrain the decoder's output to a context-free grammar (CFG), e.g. JSON, a specific lexicon, or a finite list of phrases.

Symbols

  • whisper_full_params::grammar_rules, n_grammar_rules, i_start_rule, grammar_penalty.
  • set_grammar, set_grammar_penalty, set_start_rule.
  • whisper_grammar_element, whisper_grammar_* helpers.
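For orientation, here is a minimal sketch of what one rule looks like at the element level, using a finite list such as "yes" | "no". The `GreType` and `Element` types are hypothetical local mirrors of `whisper_gretype` and `whisper_grammar_element`, written from my reading of whisper.h; the discriminant values and layout should be checked against the generated bindings before relying on them. `finite_list_rule` is an illustrative helper, not an existing API.

```rust
/// Hypothetical mirror of whisper.cpp's `whisper_gretype` enum.
/// Discriminants follow whisper.h as I understand it; verify against the bindings.
#[allow(dead_code)]
#[repr(u32)]
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum GreType {
    End = 0,          // end of rule definition
    Alt = 1,          // start of an alternate within the rule
    RuleRef = 2,      // non-terminal: index of another rule
    Char = 3,         // terminal: a single code point
    CharNot = 4,      // inverse character class
    CharRngUpper = 5, // upper bound of a range started by a preceding Char
    CharAlt = 6,      // extra alternative inside a character class
}

/// Hypothetical mirror of `whisper_grammar_element` (a type tag plus a u32 value).
#[repr(C)]
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct Element {
    pub kind: GreType,
    pub value: u32,
}

/// Encode `word_0 | word_1 | ...` as one flat rule:
/// the code points of each word, separated by Alt, terminated by End.
pub fn finite_list_rule(words: &[&str]) -> Vec<Element> {
    let mut rule = Vec::new();
    for (i, word) in words.iter().enumerate() {
        if i > 0 {
            rule.push(Element { kind: GreType::Alt, value: 0 });
        }
        for ch in word.chars() {
            rule.push(Element { kind: GreType::Char, value: ch as u32 });
        }
    }
    rule.push(Element { kind: GreType::End, value: 0 });
    rule
}
```

Calling `finite_list_rule(&["yes", "no"])` yields seven elements: three `Char`s, an `Alt`, two `Char`s, and the terminating `End`.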

Why deferred

Wrapping this pulls in a sizeable struct hierarchy (whisper_grammar_element, rules, stacks) and a non-trivial ownership model. The grammar rules must outlive the whisper_full_with_state call, but the caller owns them, so the wrapper has to choose between borrowing (threading a lifetime through Params) and owning (deep-copying the rule graph).
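To make the trade-off concrete, here is a rough sketch of the two shapes side by side. Everything in it is hypothetical: `Grammar` is a stand-in that stores rules as plain `u32` vectors, and `BorrowingParams` / `OwningParams` are illustrative names rather than the crate's real `Params`. No FFI is involved; the point is only how the lifetime parameter does or does not leak into the public type.

```rust
/// Stand-in for the rule graph; real rules would hold grammar elements.
#[derive(Clone, Debug)]
pub struct Grammar {
    rules: Vec<Vec<u32>>,
}

impl Grammar {
    pub fn new(rules: Vec<Vec<u32>>) -> Self {
        Self { rules }
    }
}

/// Option A: borrow. The `'g` lifetime ties the params to the grammar,
/// so the borrow checker rejects dropping the grammar while params still use it.
/// Cost: every signature that mentions the params type carries `'g`.
#[derive(Default)]
pub struct BorrowingParams<'g> {
    grammar: Option<&'g Grammar>,
}

impl<'g> BorrowingParams<'g> {
    pub fn set_grammar(&mut self, grammar: &'g Grammar) -> &mut Self {
        self.grammar = Some(grammar);
        self
    }

    pub fn rule_count(&self) -> usize {
        self.grammar.map_or(0, |g| g.rules.len())
    }
}

/// Option B: own. The params keep a deep copy, so there is no lifetime
/// parameter, at the price of cloning the rule graph on every `set_grammar`.
#[derive(Default)]
pub struct OwningParams {
    grammar: Option<Grammar>,
}

impl OwningParams {
    pub fn set_grammar(&mut self, grammar: &Grammar) -> &mut Self {
        self.grammar = Some(grammar.clone());
        self
    }

    pub fn rule_count(&self) -> usize {
        self.grammar.as_ref().map_or(0, |g| g.rules.len())
    }
}
```

With option A, letting the grammar go out of scope before the params is a compile error; with option B the same code compiles because the params hold their own copy. A shared-ownership variant (e.g. `Arc<Grammar>`) would sit between the two, but it is not sketched here.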

No caller has asked for this yet. Once a caller materialises with a concrete grammar shape, we can pick the ownership model that matches.

Acceptance

  • A Grammar type wrapping the rule graph with safe construction.
  • Params::set_grammar(&Grammar) -> WhisperResult<&mut Self> + set_grammar_penalty / set_start_rule.
  • Lifetime / ownership story documented in Grammar's doc comment.
  • Tests covering grammar construction, lifetime correctness (compile-fail tests for misuse), and at least one integration test that confirms a constrained decode honours the grammar.
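As a starting point for discussion, "safe construction" could look roughly like the sketch below. It assumes the C side expects an array of pointers to rules, each rule terminated by an END element. All names (`Grammar::new`, `GrammarError`, `as_raw`) are hypothetical, and the element-type constants are my reading of whisper.h, to be verified against the bindings. The constructor validates the invariants up front and owns both the rules and the pointer table, so the raw view stays valid for as long as the `Grammar` is alive.

```rust
/// Hypothetical mirror of `whisper_grammar_element`; `kind` holds the element-type tag.
#[repr(C)]
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct Element {
    pub kind: u32,
    pub value: u32,
}

// Element-type tags, assumed to match whisper.h.
pub const END: u32 = 0;
pub const RULE_REF: u32 = 2;
pub const CHAR: u32 = 3;

#[derive(Debug, PartialEq, Eq)]
pub enum GrammarError {
    Empty,
    StartRuleOutOfRange { start: usize, n_rules: usize },
    UnterminatedRule { rule: usize },
    DanglingRuleRef { rule: usize, target: u32 },
}

pub struct Grammar {
    // Boxed slices: each rule's heap buffer never moves or reallocates.
    rules: Vec<Box<[Element]>>,
    // One pointer per rule, pointing into `rules`.
    rule_ptrs: Vec<*const Element>,
    start_rule: usize,
}

impl Grammar {
    pub fn new(rules: Vec<Vec<Element>>, start_rule: usize) -> Result<Self, GrammarError> {
        if rules.is_empty() {
            return Err(GrammarError::Empty);
        }
        if start_rule >= rules.len() {
            return Err(GrammarError::StartRuleOutOfRange { start: start_rule, n_rules: rules.len() });
        }
        for (i, rule) in rules.iter().enumerate() {
            // Every rule must end with END, or the C side would read past the buffer.
            if rule.last().map(|e| e.kind) != Some(END) {
                return Err(GrammarError::UnterminatedRule { rule: i });
            }
            // Every rule reference must point at an existing rule.
            for e in rule {
                if e.kind == RULE_REF && e.value as usize >= rules.len() {
                    return Err(GrammarError::DanglingRuleRef { rule: i, target: e.value });
                }
            }
        }
        let rules: Vec<Box<[Element]>> = rules.into_iter().map(Vec::into_boxed_slice).collect();
        let rule_ptrs = rules.iter().map(|r| r.as_ptr()).collect();
        Ok(Self { rules, rule_ptrs, start_rule })
    }

    /// Raw view: (pointer table, number of rules, start rule index).
    /// The pointers are only valid while `self` is alive.
    pub fn as_raw(&self) -> (*const *const Element, usize, usize) {
        (self.rule_ptrs.as_ptr(), self.rules.len(), self.start_rule)
    }
}
```

Holding raw pointers makes this `Grammar` neither `Send` nor `Sync` by default; whether to opt back in is part of the ownership story the doc comment would need to cover.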

If you're a caller who needs this, drop a comment with your scenario.

Metadata

Labels: enhancement
