Skip to content

chore: fix strict clippy violations and apply rustfmt#8

Merged
anaslimem merged 9 commits intomainfrom
features
Mar 2, 2026
Merged

chore: fix strict clippy violations and apply rustfmt#8
anaslimem merged 9 commits intomainfrom
features

Conversation

@anaslimem
Copy link
Copy Markdown
Owner

Summary

  • remove unsupported Clippy config key ()
  • apply rustfmt across workspace
  • fix all strict clippy diagnostics instead of suppressing them
  • refactor combined index APIs to reduce argument count via and

Validation

  • cargo fmt --all -- --check
  • cargo clippy --workspace --all-targets --all-features -- -D warnings

Notes

  • work was done on branch per request

Copilot AI review requested due to automatic review settings March 2, 2026 16:58
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR focuses on bringing the workspace into strict clippy -D warnings compliance and consistent formatting, while also making a few behavior/API refinements (query intent anchors, chunking edge-cases, and combined index call ergonomics).

Changes:

  • Replace ad-hoc stderr printing with log in core runtime paths and adjust dependencies accordingly.
  • Improve/expand chunking logic and test coverage (Rust + Python wrapper edge cases).
  • Refactor/extend query/index APIs (intent anchors in QueryOptions, new builder for DB config, and grouping combined-index parameters).

Reviewed changes

Copilot reviewed 15 out of 19 changed files in this pull request and generated no comments.

Show a summary per file
File Description
crates/cortexadb-py/cortexadb/chunker.py Adds legacy chunk_text() guards for empty/short inputs.
crates/cortexadb-core/tests/integration.rs New end-to-end integration tests covering WAL recovery, checkpointing, deletion, graph edges, namespaces, metadata, and capacity eviction.
crates/cortexadb-core/src/store.rs Switches diagnostics to log and adds intent-anchor-based option adjustments during query planning.
crates/cortexadb-core/src/query/mod.rs Re-exports updated query types, including IntentAnchors.
crates/cortexadb-core/src/query/hybrid.rs Adds IntentAnchors, relaxes ScoreWeights normalization, and expands tests.
crates/cortexadb-core/src/lib.rs Reorders re-exports (rustfmt/clippy cleanup).
crates/cortexadb-core/src/index/vector.rs Adds Default derive, minor API cleanups, and switches to log for HNSW fallback warnings.
crates/cortexadb-core/src/index/temporal.rs Rustfmt-only formatting change.
crates/cortexadb-core/src/index/hnsw.rs Derive defaults + minor type cleanup in result mapping.
crates/cortexadb-core/src/index/graph.rs Small BFS optimization and assertion cleanup.
crates/cortexadb-core/src/index/combined.rs Introduces TimeRange/GraphScope to reduce argument count in combined-index APIs.
crates/cortexadb-core/src/facade.rs Removes config Default, adds CortexaDBBuilder, changes open() signature, and tweaks namespace query over-fetching.
crates/cortexadb-core/src/core/state_machine.rs Minor clippy-driven simplifications (or_default).
crates/cortexadb-core/src/chunker.rs Substantial fixed chunking refactor + significantly expanded tests.
crates/cortexadb-core/src/bin/sync_bench.rs Formatting / minor clippy cleanups.
crates/cortexadb-core/src/bin/startup_bench.rs Import ordering (rustfmt).
crates/cortexadb-core/src/bin/monkey_writer.rs Switches progress output to log.
crates/cortexadb-core/src/bin/monkey_verify.rs Formatting / minor clippy cleanups.
crates/cortexadb-core/src/bin/manual_store.rs Formatting / minor clippy cleanups.
crates/cortexadb-core/benches/storage_bench.rs Import ordering (rustfmt).
crates/cortexadb-core/Cargo.toml Adds log dependency and serial_test dev-dependency.
clippy.toml Removes unsupported clippy config key.
Cargo.lock Updates lockfile for new deps and version bumps.
.gitignore Ignores skills/ directory.
Comments suppressed due to low confidence (7)

crates/cortexadb-core/src/store.rs:586

  • query_with_snapshot applies intent adjustments by calling embedder.embed(query_text) when options.intent_anchors is set, but the query execution path embeds the query again later. For expensive embedders this adds avoidable latency. Consider moving the intent_anchors adjustment to the executor (after the embedding is already computed) or passing a precomputed embedding through, so the query is embedded only once.
        let snapshot = self.snapshot();

        if let Some(anchors) = options.intent_anchors.take() {
            Self::apply_intent_adjustments(&mut options, &anchors, query_text, embedder);
        }

crates/cortexadb-core/src/store.rs:619

  • apply_intent_adjustments unconditionally overwrites options.score_weights when anchors are present. This differs from the existing intent auto-adjust behavior in the executor (which only adjusts when weights are still default) and can unexpectedly override caller-specified weights. Consider applying the same guard (only auto-adjust when weights are still default), or clearly documenting that providing intent_anchors implies weights will be overridden.
            ) {
                options.score_weights = adj.score_weights;
                if let Some(exp) = options.graph_expansion.as_mut() {
                    exp.hops = adj.graph_hops;
                }
            }

crates/cortexadb-core/src/facade.rs:550

  • This test config sets checkpoint_policy: Disabled in the struct literal and then sets config.checkpoint_policy = Disabled again immediately after. The second assignment is redundant and can be removed to keep the test setup minimal.
        let mut config = CortexaDBConfig {
            vector_dimension: 3,
            sync_policy: crate::engine::SyncPolicy::Strict,
            checkpoint_policy: crate::store::CheckpointPolicy::Disabled,
            capacity_policy: crate::engine::CapacityPolicy::new(None, None),
            index_mode: crate::index::IndexMode::Exact,
        };
        config.checkpoint_policy = crate::store::CheckpointPolicy::Disabled;

crates/cortexadb-core/src/facade.rs:366

  • This over-fetch implementation increases QueryOptions.top_k (4×) to get more candidates, which also increases the amount of ranking/sorting work inside the query engine (since results are truncated to options.top_k). If the goal is strictly to increase recall before namespace filtering while still returning top_k, consider over-fetching via candidate_multiplier / a dedicated “candidate_k” knob instead of inflating top_k, so downstream ranking work stays proportional to the requested result count.
        let mut options = QueryOptions::with_top_k(top_k.saturating_mul(4).max(top_k));
        options.namespace = Some(namespace.to_string());
        options.metadata_filter = metadata_filter;
        let execution = self.inner.query("", options, &embedder)?;

crates/cortexadb-core/src/chunker.rs:55

  • chunk_fixed treats overlap as a character budget (see comment “overlap, measured in chars”), but apply_overlap (used by recursive/semantic/markdown) interprets overlap as a word count. This makes overlap semantics inconsistent across strategies and also affects chunk_recursive’s fixed fallback (it passes the same overlap value into chunk_fixed). Consider standardizing on one unit (words or chars) across all strategies, and updating the calculation accordingly to avoid surprising behavior for callers.
fn chunk_fixed(text: &str, chunk_size: usize, overlap: usize) -> Vec<ChunkResult> {
    let text = text.trim();
    if text.is_empty() || chunk_size == 0 {
        return vec![];
    }

    // overlap >= chunk_size → step would be 0 → infinite loop guard
    if overlap >= chunk_size {
        return vec![];
    }

crates/cortexadb-core/src/chunker.rs:216

  • In the !split_done fallback, chunk_recursive calls chunk_fixed(text, chunk_size, overlap) and then later calls apply_overlap(chunks, overlap) on the whole result. This effectively applies overlap twice (once inside chunk_fixed, once in apply_overlap) and with potentially different units. To keep overlap behavior predictable, consider generating non-overlapping fixed chunks here (e.g., pass overlap=0) and relying on apply_overlap, or alternatively skip apply_overlap when the fallback path is used.
        if !split_done {
            // Fall back to fixed chunking when no separator works.
            let fixed_chunks = chunk_fixed(text, chunk_size, overlap);
            for chunk in fixed_chunks {
                chunks.push(ChunkResult { text: chunk.text, index: *index, metadata: None });
                *index += 1;
            }
        }

crates/cortexadb-core/src/query/hybrid.rs:82

  • ScoreWeights::normalized() now allows totals in 98..=102 and normalizes by the actual total, but other execution paths still enforce total == 100 (e.g. query/executor.rs re-derives weights and errors when total != 100). This makes the public ScoreWeights behavior inconsistent depending on which query engine is used. Consider centralizing weight validation/normalization (reuse ScoreWeights::normalized() everywhere) so the tolerance change applies consistently and doesn’t reintroduce errors in the planner/executor path.
    fn normalized(self) -> Result<(f32, f32, f32)> {
        let total =
            self.similarity_pct as u16 + self.importance_pct as u16 + self.recency_pct as u16;
        if !(98..=102).contains(&total) {
            return Err(HybridQueryError::InvalidScoreWeights {
                similarity_pct: self.similarity_pct,
                importance_pct: self.importance_pct,
                recency_pct: self.recency_pct,
            });
        }
        // Divide by actual total so the three floats always sum to exactly 1.0.
        let t = total as f32;
        Ok((
            self.similarity_pct as f32 / t,
            self.importance_pct as f32 / t,
            self.recency_pct as f32 / t,

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

@anaslimem anaslimem merged commit 3b5ed07 into main Mar 2, 2026
1 check passed
@anaslimem anaslimem deleted the features branch March 7, 2026 23:42
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants