
small up #718

Open
holden wants to merge 1 commit into main from 03-29-small_up

Conversation

@holden
Member

@holden holden commented Mar 29, 2026

Summary by CodeRabbit

  • Documentation

    • Added Prediction Integrity guidelines covering hypothesis pre-registration, holdout testing protocols, baseline comparisons, and required reporting standards.
  • New Features

    • Added command-line interface for executing weight sweeps with configurable sample count, top-results filtering, optional persistence, and JSON/table output formatting.

@coderabbitai

coderabbitai bot commented Mar 29, 2026

📝 Walkthrough

Walkthrough

This pull request introduces a new weight optimization sweep feature for the prediction system. It adds documentation guidelines on prediction integrity, a new public API WeightOptimizer.sweep/3 for evaluating weight vectors in-memory, and a corresponding Mix task CLI that exposes this functionality with configurable sampling, optional persistence, and JSON/table output formats.

Changes

  • Documentation (CLAUDE.md, README.md): Added development guidelines requiring authors to review the Prediction Integrity section before modifying prediction/scoring logic, and introduced an anti-self-deception protocol documenting pre-registration, holdout testing, and baseline comparison standards for model evaluation.
  • Weight Optimizer API (lib/cinegraph/predictions/weight_optimizer.ex): Implemented a new public sweep/3 function that loads decade feature data once, evaluates randomly sampled weight vectors and named profiles in-memory, ranks results by accuracy, and optionally persists the best weights via MovieLists.save_trained_weights/2.
  • Mix Task CLI (lib/mix/tasks/predictions.sweep.ex): Added a new Mix task, predictions.sweep, that accepts CLI options (--list-key, --n-samples, --top, --save, --json), delegates to WeightOptimizer.sweep/3, and outputs results as formatted console tables or JSON with error handling and startup banners.
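As a rough sketch of how the flag parsing for such a task could look with Elixir's standard OptionParser (the module name and default values here are illustrative assumptions, not the actual implementation):

```elixir
defmodule SweepCLISketch do
  # Hypothetical parser for the flags listed above (--list-key, --n-samples,
  # --top, --save, --json). Defaults are illustrative assumptions; the real
  # predictions.sweep task may structure this differently.
  def parse_args(argv) do
    {opts, _args, _invalid} =
      OptionParser.parse(argv,
        strict: [
          list_key: :string,
          n_samples: :integer,
          top: :integer,
          save: :boolean,
          json: :boolean
        ]
      )

    # Fall back to defaults when a flag is omitted.
    %{
      list_key: Keyword.get(opts, :list_key, "default"),
      n_samples: Keyword.get(opts, :n_samples, 1000),
      top: Keyword.get(opts, :top, 10),
      save: Keyword.get(opts, :save, false),
      json: Keyword.get(opts, :json, false)
    }
  end
end
```

Note that OptionParser maps dashed flags like `--n-samples` onto underscored atoms like `:n_samples` automatically when `strict:` switches are declared.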

Sequence Diagram(s)

sequenceDiagram
    participant User
    participant MixTask as Mix Task<br/>(predictions.sweep)
    participant WeightOptimizer
    participant MovieLists
    participant Output

    User->>MixTask: Run with options
    MixTask->>MixTask: Parse CLI args
    MixTask->>WeightOptimizer: sweep(source_key, n_samples, opts)
    WeightOptimizer->>WeightOptimizer: Load cached decade features
    WeightOptimizer->>WeightOptimizer: Generate n_samples random weights
    WeightOptimizer->>WeightOptimizer: Evaluate all weights in-memory
    WeightOptimizer->>WeightOptimizer: Rank by accuracy (descending)
    
    alt --save option provided
        WeightOptimizer->>MovieLists: save_trained_weights(best)
        MovieLists-->>WeightOptimizer: OK
    end
    
    WeightOptimizer-->>MixTask: Top results with ranks
    MixTask->>Output: Format as JSON or table
    Output-->>User: Display results

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~22 minutes

Poem

🐰 Weights sweep through the simplex dance,
Each vector gets its fair chance,
Integrity guards our models true,
No self-deception in our view,
Baselines beaten, holdouts blessed! 🎬✨

🚥 Pre-merge checks | ✅ 1 | ❌ 2

❌ Failed checks (1 warning, 1 inconclusive)

  • Docstring Coverage: ⚠️ Warning. Docstring coverage is 66.67%, below the required 80.00% threshold. Resolution: write docstrings for the functions missing them.
  • Title check: ❓ Inconclusive. The title 'small up' is vague and generic, providing no meaningful information about the substantial changes in this PR (new prediction integrity documentation, weight optimizer API, and sweep task). Resolution: revise the title to describe the main change, such as 'Add weight optimizer sweep feature with prediction integrity guidelines' or 'Implement random weight sweep CLI task and integrity protocol'.
✅ Passed checks (1 passed)
  • Description Check: ✅ Passed. Check skipped because CodeRabbit's high-level summary is enabled.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.


Member Author

holden commented Mar 29, 2026

This stack of pull requests is managed by Graphite. Learn more about stacking.

@holden holden marked this pull request as ready for review March 29, 2026 15:52

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

🧹 Nitpick comments (1)
lib/mix/tasks/predictions.sweep.ex (1)

123-129: Consider guarding against empty results.

hd(results) will crash if results is empty. While named profiles should always be present, a defensive check would improve robustness.

♻️ Proposed defensive check
+   if results == [] do
+     Mix.shell().error("No results returned from sweep.")
+     exit({:shutdown, 1})
+   end
+
    best = hd(results)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@lib/mix/tasks/predictions.sweep.ex` around lines 123 - 129, Guard against an
empty results list before calling hd(results): check results (e.g., with a case
or if) and handle the empty branch (log a message via Mix.shell().info/error and
return or skip further processing) so hd(results) is only called when results is
non-empty; update the block that assigns best and calls Mix.shell().info (the
hd(results) usage and subsequent Mix.shell().info with
format_weights(best.weights)) to first verify results is not [].

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: ac76c43b-d6ce-42b8-8210-ee19604fe353

📥 Commits

Reviewing files that changed from the base of the PR and between e376c46 and 94a8a3f.

📒 Files selected for processing (4)
  • CLAUDE.md
  • README.md
  • lib/cinegraph/predictions/weight_optimizer.ex
  • lib/mix/tasks/predictions.sweep.ex

Comment on lines +322 to +327
# Sample a uniform random point on the (n-1)-simplex via the exponential method.
defp random_simplex_point(n) do
  raw = for _ <- 1..n, do: :math.log(1.0 / :rand.uniform())
  total = Enum.sum(raw)
  Enum.map(raw, &(&1 / total))
end

⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🌐 Web query:

Does Erlang :rand.uniform/0 return exactly 0.0 as a possible value, and does :rand.uniform_real/0 exclude 0.0?

💡 Result:

Yes, Erlang's :rand.uniform/0 can return exactly 0.0 as a possible value in the range 0.0 =< X < 1.0. :rand.uniform_real/0 excludes 0.0, returning values in the range 0.0 < X < 1.0.


Use :rand.uniform_real/0 to avoid log(infinity) edge case in simplex sampling.

Erlang's :rand.uniform/0 can return exactly 0.0 (range [0.0, 1.0)), which causes 1.0 / 0.0 to produce :infinity and corrupts the sample vector. Use :rand.uniform_real/0 instead, which excludes 0.0 and returns the mathematically correct negative log directly:

Proposed fix
  defp random_simplex_point(n) do
-   raw = for _ <- 1..n, do: :math.log(1.0 / :rand.uniform())
+   raw = for _ <- 1..n, do: -:math.log(:rand.uniform_real())
    total = Enum.sum(raw)
    Enum.map(raw, &(&1 / total))
  end
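As a standalone sanity check of the fixed sampler (a sketch outside the project; the module name is illustrative), every sampled point is non-negative and sums to 1:

```elixir
defmodule SimplexSketch do
  # Exponential method: normalized i.i.d. Exp(1) variates are uniformly
  # distributed on the (n-1)-simplex. :rand.uniform_real/0 never returns
  # 0.0, so -:math.log(...) is always finite.
  def random_simplex_point(n) do
    raw = for _ <- 1..n, do: -:math.log(:rand.uniform_real())
    total = Enum.sum(raw)
    Enum.map(raw, &(&1 / total))
  end
end
```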
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@lib/cinegraph/predictions/weight_optimizer.ex` around lines 322 - 327, The
random_simplex_point/1 function can produce :infinity when using :rand.uniform()
and taking 1.0 / 0.0; update the sampling to use :rand.uniform_real/0 (which
never returns 0.0) and compute the exponential variates via the negative log
directly (i.e., generate each raw element with -:math.log(:rand.uniform_real())
inside the for comprehension) before normalizing by Enum.sum to produce the
simplex point.

@greptile-apps

greptile-apps bot commented Mar 29, 2026

Greptile Summary

This PR adds a random weight sweep capability to the predictions system — a fast in-memory alternative to the logistic regression trainer (mix predictions.train) that evaluates thousands of simplex-sampled weight vectors without any per-vector DB round-trips. It also adds process discipline documentation (README "Prediction Integrity" section + CLAUDE.md cross-reference) to guard against overfitting and self-deception during model development.

Key changes:

  • WeightOptimizer.sweep/3 — loads decade feature data once via the existing load_decade_data_parallel path, then evaluates n_samples random simplex points plus all named profiles entirely in memory using a new fast_evaluate/4 helper
  • fast_evaluate/4 — computes precision@K (recall) per decade; metric is consistent with HistoricalValidator.validate_decade (both take top-K, count hits)
  • random_simplex_point/1 — uniform sampling on the (n-1)-simplex via the exponential method; mathematically correct
  • mix predictions.sweep — CLI wrapper with --n-samples, --top, --save, and --json flags
  • One crash path: print_results/3 calls hd(results) without guarding against an empty list; passing --top 0 will raise an ArgumentError (hd/1 rejects an empty list)
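A minimal sketch of the precision@K convention described above (the function name, row shape, and weight map are assumptions about fast_evaluate/4, not the project's actual code):

```elixir
defmodule PrecisionAtK do
  # Score each row with the candidate weight vector, take the top K by
  # score, and count how many of those are actual list members (labels).
  # `rows` is assumed to be a list of {features_map, label_boolean} tuples.
  def precision_at_k(rows, weights, k) do
    hits =
      rows
      |> Enum.map(fn {features, label} ->
        score =
          Enum.reduce(weights, 0.0, fn {criterion, w}, acc ->
            acc + w * Map.get(features, criterion, 0.0)
          end)

        {score, label}
      end)
      |> Enum.sort_by(fn {score, _label} -> score end, :desc)
      |> Enum.take(k)
      |> Enum.count(fn {_score, label} -> label end)

    hits / k
  end
end
```

Because the criterion scores are weight-independent, only this cheap dot-product-and-sort step has to rerun per candidate vector, which is what makes the in-memory sweep fast.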

Confidence Score: 4/5

Safe to merge after fixing the hd/1 crash in print_results/3.

One P1 crash path exists: passing --top 0 produces an empty result list, causing print_results/3 to call hd([]) and raise an unhandled ArgumentError. The core logic — simplex sampling, fast_evaluate metric, caching strategy — is sound and consistent with the existing HistoricalValidator accuracy convention. All other findings are P2 or lower.

lib/mix/tasks/predictions.sweep.ex — specifically the hd(results) call in print_results/3

Important Files Changed

Filename Overview
lib/mix/tasks/predictions.sweep.ex New Mix task wrapping WeightOptimizer.sweep/3. Well-structured with JSON output mode, good logging, and rescue around the sweep call. Has a crash path in print_results/3 when --top 0 is passed (hd on empty list).
lib/cinegraph/predictions/weight_optimizer.ex Adds sweep/3, fast_evaluate/4, and random_simplex_point/1. The in-memory evaluation correctly reuses cached criterion scores (weight-independent), and the simplex sampling via the exponential method is mathematically sound. The accuracy metric (precision@K) is consistent with HistoricalValidator.validate_decade.
README.md Adds a Prediction Integrity: Anti-Self-Deception Protocol section documenting pre-registration, sacred holdout, baseline requirements, and reporting standards. Documentation-only change.
CLAUDE.md Single-line addition cross-referencing the new README Prediction Integrity section from the dev conventions guide.

Sequence Diagram

sequenceDiagram
    participant CLI as mix predictions.sweep
    participant WO as WeightOptimizer.sweep/3
    participant HV as HistoricalValidator
    participant DB as Database
    participant CS as CriteriaScoring

    CLI->>WO: sweep(list_key, n_samples, opts)
    WO->>HV: get_all_decades(source_key)
    HV-->>WO: [1930, 1940, ..., 2010]
    WO->>WO: load_decade_data_parallel (Task.async_stream)
    loop per decade (parallel)
        WO->>DB: get_decade_movies_query(decade)
        DB-->>WO: [movies...]
        WO->>CS: batch_score_movies(movies, @default_weights)
        CS-->>WO: [{criteria_scores: %{mob, critics, ...}}]
        WO->>WO: build feature rows (scores / 100), labels
    end
    Note over WO: decade_data cached in memory
    WO->>CS: get_named_profiles()
    CS-->>WO: [default, festival-heavy, audience-first, ...]
    WO->>WO: build named_vectors from profiles
    loop n_samples random vectors
        WO->>WO: random_simplex_point(n_criteria)
    end
    loop (named + random) vectors — in memory only
        WO->>WO: fast_evaluate(vec, decade_data) → precision@K
    end
    WO->>WO: sort by accuracy desc, take top N
    opt --save
        WO->>DB: MovieLists.save_trained_weights(source_key, weights)
    end
    WO-->>CLI: [%{rank, accuracy, label, weights}, ...]
    CLI->>CLI: print_results / JSON output
Loading

Reviews (1): Last reviewed commit: "small up"


Mix.shell().info(" " <> String.duplicate("-", 80))

Enum.each(results, fn r ->

P1 hd/1 on potentially empty list crashes the task

print_results/3 calls hd(results) without guarding against an empty list. Passing --top 0 causes WeightOptimizer.sweep to return [] (via Enum.take(results, 0)), which makes hd([]) raise an ArgumentError and kills the Mix task with an unformatted stack trace.

The JSON output branch handles this correctly (it just maps over an empty list), so the inconsistency is only in the text-rendering path. The guard needs to wrap the entire best-dependent block, not just the hd call.
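One way to wrap the whole best-dependent block is to pattern-match the empty case in a separate clause, so hd/1 is never reached on []. This is a self-contained sketch; the result-map keys and output shape are assumptions based on the review text, and the real function also takes options and uses Mix.shell():

```elixir
defmodule SweepOutputSketch do
  # Empty-list clause first: nothing best-dependent ever runs on [].
  def print_results([], _top), do: IO.puts("No results returned from sweep.")

  def print_results(results, top) do
    best = hd(results)
    IO.puts("Best: #{inspect(best.weights)} (accuracy #{best.accuracy})")

    results
    |> Enum.take(top)
    |> Enum.each(fn r -> IO.puts("  ##{r.rank} accuracy=#{r.accuracy}") end)
  end
end
```

Multi-clause dispatch keeps the guard at the function head, which matches how the JSON branch already degrades gracefully on an empty list.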
