0.1.0 — engine-agnostic Task abstraction, no_std-tiered features, HIR-anchored regex by uqio · Pull Request #1 · Findit-AI/llmtask

uqio · 2026-05-10T01:02:44Z

First publishable cut of the crate. Renamed from vlm-tasks and rebuilt around the principle that the prompt + grammar + parser shape is engine-independent — the same Task should run against lfm (llguidance) and qwen (mistralrs) without translation.

PR base is 0.1.0-base (a snapshot of the real local main, pushed as a fresh branch). The repo's actual main ref is an unrelated template-rs placeholder that has no shared history with this work.

What's in scope

Generic Task trait. Three associated types (Output, Value, ParseError: core::error::Error) replace the prior fixed-JSON-schema, fixed-error-enum shape. Engines bind Value to the schema type they consume directly (serde_json::Value for JSON-schema-only engines, SmolStr for Lark/Regex engines). No thread-safety bounds at the trait level — Send + Sync + 'static is added at engine call sites where it's actually needed, so non-Send Tasks compile without ceremony.
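
The three-associated-type shape can be sketched as follows. This is a minimal illustration, not the crate's actual definition: the method names (`prompt`, `schema`, `parse`), the `YesNo` task, and the `run_on_worker` helper are hypothetical, `String` stands in for `SmolStr`, and `std::error::Error` is used as the re-export of `core::error::Error`.

```rust
use std::error::Error; // re-export of core::error::Error on recent toolchains
use std::fmt;

/// Hypothetical shape of an engine-agnostic task; method names are illustrative.
pub trait Task {
    /// Parsed result handed back to the caller.
    type Output;
    /// Schema representation the engine consumes directly.
    type Value;
    /// Task-specific parse failure.
    type ParseError: Error;

    fn prompt(&self) -> &str;
    fn schema(&self) -> Self::Value;
    fn parse(&self, raw: &str) -> Result<Self::Output, Self::ParseError>;
}

#[derive(Debug, PartialEq)]
pub struct NotYesNo(String);

impl fmt::Display for NotYesNo {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "expected yes or no, got {:?}", self.0)
    }
}

impl Error for NotYesNo {}

/// Toy task whose schema is a grammar string (String stands in for SmolStr).
pub struct YesNo;

impl Task for YesNo {
    type Output = bool;
    type Value = String;
    type ParseError = NotYesNo;

    fn prompt(&self) -> &str {
        "Is there a person in the image? Answer yes or no."
    }

    fn schema(&self) -> String {
        "yes|no".to_string()
    }

    fn parse(&self, raw: &str) -> Result<bool, NotYesNo> {
        match raw.trim() {
            "yes" => Ok(true),
            "no" => Ok(false),
            other => Err(NotYesNo(other.to_string())),
        }
    }
}

/// An engine call site adds the thread-safety bounds it needs;
/// the trait itself carries none.
pub fn run_on_worker<T>(task: T, raw: String) -> Result<T::Output, T::ParseError>
where
    T: Task + Send + 'static,
    T::Output: Send + 'static,
    T::ParseError: Send + 'static,
{
    std::thread::spawn(move || task.parse(&raw))
        .join()
        .expect("worker panicked")
}
```

Because the bounds live on `run_on_worker` rather than on `Task`, a task holding an `Rc` or other non-`Send` state would still implement the trait; it would only be rejected by call sites that move it across threads.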

Grammar enum, #[non_exhaustive]. JsonSchema (behind json), Lark, Regex (behind regex). Engines pattern-match and return UnsupportedGrammar for variants they don't speak, so callers can route to a different backend.
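
A rough sketch of the routing this enables, under stated assumptions: the variant payloads are plain `String`s here, `UnsupportedGrammar` is modelled as a unit struct, and the two engine functions and `route` are hypothetical names, not part of the crate.

```rust
use std::fmt;

/// Stand-in for the crate's grammar enum; payloads are plain strings here.
#[non_exhaustive]
#[derive(Debug, Clone, PartialEq)]
pub enum Grammar {
    JsonSchema(String),
    Lark(String),
    Regex(String),
}

/// Assumed error shape: the engine does not speak this variant.
#[derive(Debug, PartialEq)]
pub struct UnsupportedGrammar;

impl fmt::Display for UnsupportedGrammar {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        f.write_str("grammar variant not supported by this engine")
    }
}

impl std::error::Error for UnsupportedGrammar {}

/// Hypothetical engine that understands Lark and regex grammars only.
pub fn text_engine(grammar: &Grammar) -> Result<String, UnsupportedGrammar> {
    match grammar {
        Grammar::Lark(src) => Ok(format!("lark:{src}")),
        Grammar::Regex(pattern) => Ok(format!("regex:{pattern}")),
        // Downstream crates need this wildcard arm because the enum is
        // #[non_exhaustive]; it also covers JsonSchema here.
        _ => Err(UnsupportedGrammar),
    }
}

/// Hypothetical engine that understands JSON schema only.
pub fn json_engine(grammar: &Grammar) -> Result<String, UnsupportedGrammar> {
    match grammar {
        Grammar::JsonSchema(schema) => Ok(format!("json:{schema}")),
        _ => Err(UnsupportedGrammar),
    }
}

/// Caller-side routing: try one backend, fall back to the other.
pub fn route(grammar: &Grammar) -> Result<String, UnsupportedGrammar> {
    text_engine(grammar).or_else(|_| json_engine(grammar))
}
```

The wildcard arm is what keeps engines compiling when a new variant is added later: an unknown variant becomes an `UnsupportedGrammar` error rather than a build break.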

RegexGrammar private wrapper. A bare Grammar::Regex(regex::Regex) would let callers smuggle in a RegexBuilder::case_insensitive(true) regex whose as_str() returned the plain pattern but whose is_match matched additional case-flipped strings — silently diverging local validation from engine constraint. The wrapper forces construction through Grammar::regex(&str) (default options only).
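
The divergence the wrapper prevents can be shown with a toy model. `ToyRegex` below is a hypothetical stand-in for a compiled regex (a literal pattern plus one option), not the `regex` crate's type, and the `RegexGrammar` here is only a sketch of the private-field idea.

```rust
/// Toy stand-in for a compiled regex: a literal pattern plus one option.
#[derive(Debug, Clone)]
pub struct ToyRegex {
    pattern: String,
    case_insensitive: bool,
}

impl ToyRegex {
    /// Default options only.
    pub fn new(pattern: &str) -> Self {
        Self { pattern: pattern.to_string(), case_insensitive: false }
    }

    /// Models a builder option that changes matching but not `as_str()`.
    pub fn with_case_insensitive(pattern: &str) -> Self {
        Self { pattern: pattern.to_string(), case_insensitive: true }
    }

    pub fn as_str(&self) -> &str {
        &self.pattern
    }

    pub fn is_match(&self, input: &str) -> bool {
        if self.case_insensitive {
            input.eq_ignore_ascii_case(&self.pattern)
        } else {
            input == self.pattern
        }
    }
}

/// Wrapper whose field is private to the defining module, so the only way
/// to obtain one from outside is the default-options constructor.
#[derive(Debug, Clone)]
pub struct RegexGrammar(ToyRegex);

impl RegexGrammar {
    pub fn new(pattern: &str) -> Self {
        Self(ToyRegex::new(pattern))
    }

    /// The pattern string an engine would receive.
    pub fn as_str(&self) -> &str {
        self.0.as_str()
    }

    /// Local validation, guaranteed to use default options.
    pub fn is_match(&self, input: &str) -> bool {
        self.0.is_match(input)
    }
}
```

With the bare type, a case-insensitive value reports the pattern `yes` yet accepts `YES`, which an engine constrained by the pattern string alone would never emit. With the wrapper, `as_str()` and `is_match` cannot disagree in this way.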

HIR-anchored full-match validator (Grammar::is_regex_full_match / RegexGrammar::is_full_match). The Rust regex crate is unanchored leftmost-first, but engines like llguidance treat the supplied pattern as anchor-implicit / full-match. Two simpler approaches don't work and are rejected on purpose:

Regex::new(format!(r"\A(?:{p})\z")) breaks verbose-mode patterns: (?x)[0-9]+ # comment compiles on its own but fails to parse once wrapped, because the comment runs to the end of the line and swallows the injected )\z.
find() + span equality breaks prefix alternations: a|ab against input ab returns the shorter 0..1 match for a, so the span check fails even though ab is in the language.

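The HIR path (Hir::concat([Look::Start, parsed_hir, Look::End])) puts the anchors in the regex itself, so a match through the shorter alternative is rejected at the end anchor and the matcher goes on to the longer alternatives. Because the anchors are added after parsing, a verbose-mode comment cannot swallow them. Both failure modes are pinned by regression tests.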
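
Both failure modes can be reproduced with a std-only toy model. This is a sketch under loud assumptions: alternations are modelled as slices of literal strings rather than real regexes, and none of the function names below come from the crate or from `regex`.

```rust
use std::ops::Range;

/// Toy leftmost-first search over literal alternatives (a stand-in for `a|ab`).
/// At each start position, the first alternative that matches wins.
pub fn find_leftmost_first(alts: &[&str], input: &str) -> Option<Range<usize>> {
    let starts = input
        .char_indices()
        .map(|(i, _)| i)
        .chain(std::iter::once(input.len()));
    for start in starts {
        for alt in alts {
            if input[start..].starts_with(*alt) {
                return Some(start..start + alt.len());
            }
        }
    }
    None
}

/// Rejected approach: unanchored search, then compare the span to the input.
pub fn full_match_by_span(alts: &[&str], input: &str) -> bool {
    find_leftmost_first(alts, input) == Some(0..input.len())
}

/// Anchored approach: an alternative only counts if it ends at end of input,
/// so a shorter alternative failing there does not end the search.
pub fn full_match_anchored(alts: &[&str], input: &str) -> bool {
    alts.iter()
        .any(|alt| input.strip_prefix(*alt).map_or(false, |rest| rest.is_empty()))
}

/// The other rejected approach: wrapping the pattern text in anchors.
pub fn wrap_textually(pattern: &str) -> String {
    format!(r"\A(?:{pattern})\z")
}
```

For `a|ab` against `ab`, the span check sees `0..1` and rejects, while the anchored check accepts. For the verbose-mode case, `wrap_textually("(?x)[0-9]+ # comment")` yields `\A(?:(?x)[0-9]+ # comment)\z`, where everything after `#`, including the closing `)\z`, sits inside the comment.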

ImageAnalysis canonical type. Single-image VLM output shape (scene/description/subjects/objects/actions/mood/lighting/shot-type/tags). Detection-array fields are flat Vec<SmolStr> — VLM self-reported confidence is poorly calibrated, and a flat hardcoded confidence on every entry is a no-op for both UX and search ranking. Per-detection scoring belongs in search-time embedding similarity, not in the VLM output type.

no_std/alloc/std feature tiers. Bare no_std builds against the alloc prelude via extern crate alloc as std;. Tests that need std::sync::OnceLock are gated on feature = "std"; the rest typecheck under --no-default-features --features alloc.

Feature graph hardening:

All deps declared with default-features = false so consumers don't get silent std linkage from --no-default-features --features alloc,….
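std re-enables dep-side std/default features via weak dependency features (the dep?/std syntax), which turn on a dep's std feature only when that dep is already enabled by another feature.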
json no longer transitively pulls the public serde feature — serde_json brings serde-the-crate into the dep tree, but #[cfg(feature = "serde")] derives only fire when the user opts in explicitly.
regex adds regex-syntax as a direct dep (already a transitive dep of regex) for HIR access.

Diff scope

27 commits, +1590/-101 lines across 11 files. Crate rename, README rewrite in mediatime style, then 13 rounds of Codex adversarial review iterating on the feature graph, generic Task trait, error type, regex wrapper, and finally the HIR-anchored validator. Final cleanup commit removes round-number meta-commentary from doc comments.

Test plan

cargo test --all-features — 23 lib tests, 0 failures
cargo check --no-default-features — bare no_std builds clean
cargo check --no-default-features --features alloc — alloc-only API surface compiles
cargo check --no-default-features --features regex — regex without std works (anchored validator + HIR path)
cargo check --no-default-features --features serde — opt-in serde derive on ImageAnalysis works
Codex adversarial review approved at round 13 (21c2127)

🤖 Generated with Claude Code

…-anchored regex

First publishable cut of the crate. Renamed from `vlm-tasks` and rebuilt around the principle that the prompt + grammar + parser shape is engine-independent: the same `Task` should run against `lfm` (llguidance) and `qwen` (mistralrs) without translation.

## Generic `Task` trait

Three associated types (`Output`, `Value`, `ParseError: core::error::Error`) replace the prior fixed-JSON-schema, fixed-error-enum shape. Engines bind `Value` to the schema type they consume directly (`serde_json::Value` for JSON-schema-only engines, `SmolStr` for Lark/Regex engines). No thread-safety bounds at the trait level: `Send + Sync + 'static` is added at engine call sites where it's actually needed, so non-`Send` Tasks compile without ceremony.

## `Grammar` enum, `#[non_exhaustive]`

`JsonSchema` (behind `json`), `Lark`, `Regex` (behind `regex`). Engines pattern-match and return `UnsupportedGrammar` for variants they don't speak, so callers can route to a different backend.

A bare `Grammar::Regex(regex::Regex)` would let callers smuggle in a `RegexBuilder::case_insensitive(true)` regex whose `as_str()` returned the plain pattern but whose `is_match` matched additional case-flipped strings, silently diverging local validation from engine constraint. The `RegexGrammar` private wrapper forces construction through `Grammar::regex(&str)` (default options only).

## HIR-anchored full-match validator

`Grammar::is_regex_full_match` / `RegexGrammar::is_full_match`. The Rust `regex` crate is unanchored leftmost-first, but engines like llguidance treat the supplied pattern as anchor-implicit / full-match. Two simpler approaches don't work and are rejected on purpose:

- `Regex::new(format!(r"\A(?:{p})\z"))` breaks verbose-mode patterns: `(?x)[0-9]+ # comment` compiles bare but explodes wrapped because the comment swallows the injected `\z`.
- `find()` + span equality breaks prefix alternations: `a|ab` against input `ab` returns the shorter `0..1` match for `a`, and the span check fails, even though `ab` is in the language.

The HIR path (`Hir::concat([Look::Start, parsed_hir, Look::End])`) puts the anchors in the regex grammar itself, so backtracking has a reason to retry longer alternatives. Pinned by regression tests for both failure modes.

## `ImageAnalysis` canonical type

Single-image VLM output shape (scene/description/subjects/objects/actions/mood/lighting/shot-type/tags). Detection-array fields are flat `Vec<SmolStr>`: VLM self-reported confidence is poorly calibrated, and a flat hardcoded confidence on every entry is a no-op for both UX and search ranking. Per-detection scoring belongs in search-time embedding similarity, not in the VLM output type.

## no_std/alloc/std feature tiers

Bare no_std builds against the `alloc` prelude via `extern crate alloc as std;`. Tests that need `std::sync::OnceLock` are gated on `feature = "std"` AND `any(feature = "json", feature = "regex")` to avoid unused-import warnings under `--features std` alone (the test module is empty when neither test-bearing feature is on).

## Feature graph hardening

- All deps declared with `default-features = false` so consumers don't get silent std linkage from `--no-default-features --features alloc,…`.
- `std` re-enables dep-side `std`/`default` features via weak `?/std` features.
- `json` no longer transitively pulls the public `serde` feature: `serde_json` brings serde-the-crate into the dep tree, but `#[cfg(feature = "serde")]` derives only fire when the user opts in explicitly.
- `regex` adds `regex-syntax` as a direct dep (already a transitive dep of `regex`) for HIR access.

## Verification

- `cargo test --all-features`: 23 lib tests, 0 failures
- `cargo hack --feature-powerset test`: all 21 feature combinations build and pass
- `cargo check --no-default-features`: bare no_std builds clean
- Codex adversarial review approved (rebuilt from 13 review rounds)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

codecov · 2026-05-10T01:54:43Z

Welcome to Codecov 🎉

Once you merge this PR into your default branch, you're all set! Codecov will compare coverage reports and display results in all future pull requests.

ℹ️ You can also turn on project coverage checks and project coverage reporting on Pull Request comment

Thanks for integrating Codecov - We've got you covered ☂️

uqio force-pushed the 0.1.0 branch from e541158 to ceae01d · May 10, 2026 01:42

uqio merged commit 2a93592 into 0.1.0-base May 10, 2026
28 checks passed

uqio deleted the 0.1.0 branch May 10, 2026 01:56

uqio merged 1 commit into 0.1.0-base from 0.1.0