Conversation
📝 WalkthroughWalkthroughThis pull request introduces a new weight optimization sweep feature for the prediction system. It adds documentation guidelines on prediction integrity, a new public API Changes
Sequence Diagram(s)sequenceDiagram
participant User
participant MixTask as Mix Task<br/>(predictions.sweep)
participant WeightOptimizer
participant MovieLists
participant Output
User->>MixTask: Run with options
MixTask->>MixTask: Parse CLI args
MixTask->>WeightOptimizer: sweep(source_key, n_samples, opts)
WeightOptimizer->>WeightOptimizer: Load cached decade features
WeightOptimizer->>WeightOptimizer: Generate n_samples random weights
WeightOptimizer->>WeightOptimizer: Evaluate all weights in-memory
WeightOptimizer->>WeightOptimizer: Rank by accuracy (descending)
alt --save option provided
WeightOptimizer->>MovieLists: save_trained_weights(best)
MovieLists-->>WeightOptimizer: OK
end
WeightOptimizer-->>MixTask: Top results with ranks
MixTask->>Output: Format as JSON or table
Output-->>User: Display results
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~22 minutes Poem
🚥 Pre-merge checks | ✅ 1 | ❌ 2❌ Failed checks (1 warning, 1 inconclusive)
✅ Passed checks (1 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Comment |
This stack of pull requests is managed by Graphite. Learn more about stacking. |
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (1)
lib/mix/tasks/predictions.sweep.ex (1)
123-129: Consider guarding against empty results.
hd(results)will crash ifresultsis empty. While named profiles should always be present, a defensive check would improve robustness.♻️ Proposed defensive check
+ if results == [] do + Mix.shell().error("No results returned from sweep.") + exit({:shutdown, 1}) + end + best = hd(results)🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@lib/mix/tasks/predictions.sweep.ex` around lines 123 - 129, Guard against an empty results list before calling hd(results): check results (e.g., with a case or if) and handle the empty branch (log a message via Mix.shell().info/error and return or skip further processing) so hd(results) is only called when results is non-empty; update the block that assigns best and calls Mix.shell().info (the hd(results) usage and subsequent Mix.shell().info with format_weights(best.weights)) to first verify results is not [].
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@lib/cinegraph/predictions/weight_optimizer.ex`:
- Around line 322-327: The random_simplex_point/1 function can produce :infinity
when using :rand.uniform() and taking 1.0 / 0.0; update the sampling to use
:rand.uniform_real/0 (which never returns 0.0) and compute the exponential
variates via the negative log directly (i.e., generate each raw element with
-:math.log(:rand.uniform_real()) inside the for comprehension) before
normalizing by Enum.sum to produce the simplex point.
---
Nitpick comments:
In `@lib/mix/tasks/predictions.sweep.ex`:
- Around line 123-129: Guard against an empty results list before calling
hd(results): check results (e.g., with a case or if) and handle the empty branch
(log a message via Mix.shell().info/error and return or skip further processing)
so hd(results) is only called when results is non-empty; update the block that
assigns best and calls Mix.shell().info (the hd(results) usage and subsequent
Mix.shell().info with format_weights(best.weights)) to first verify results is
not [].
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: ac76c43b-d6ce-42b8-8210-ee19604fe353
📒 Files selected for processing (4)
CLAUDE.mdREADME.mdlib/cinegraph/predictions/weight_optimizer.exlib/mix/tasks/predictions.sweep.ex
| # Sample a uniform random point on the (n-1)-simplex via the exponential method. | ||
| defp random_simplex_point(n) do | ||
| raw = for _ <- 1..n, do: :math.log(1.0 / :rand.uniform()) | ||
| total = Enum.sum(raw) | ||
| Enum.map(raw, &(&1 / total)) | ||
| end |
There was a problem hiding this comment.
🧩 Analysis chain
🌐 Web query:
Does Erlang :rand.uniform/0 return exactly 0.0 as a possible value, and does :rand.uniform_real/0 exclude 0.0?
💡 Result:
Yes, Erlang's :rand.uniform/0 can return exactly 0.0 as a possible value in the range 0.0 =< X < 1.0. :rand.uniform_real/0 excludes 0.0, returning values in the range 0.0 < X < 1.0.
Citations:
- 1: https://www.erlang.org/doc/apps/stdlib/rand.html
- 2: https://erlang.org/doc/man/rand.html
- 3: https://www.erlang.org/docs/22/man/rand
Use :rand.uniform_real/0 to avoid log(infinity) edge case in simplex sampling.
Erlang's :rand.uniform/0 can return exactly 0.0 (range [0.0, 1.0)), which causes 1.0 / 0.0 to produce :infinity and corrupts the sample vector. Use :rand.uniform_real/0 instead, which excludes 0.0 and returns the mathematically correct negative log directly:
Proposed fix
defp random_simplex_point(n) do
- raw = for _ <- 1..n, do: :math.log(1.0 / :rand.uniform())
+ raw = for _ <- 1..n, do: -:math.log(:rand.uniform_real())
total = Enum.sum(raw)
Enum.map(raw, &(&1 / total))
end🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@lib/cinegraph/predictions/weight_optimizer.ex` around lines 322 - 327, The
random_simplex_point/1 function can produce :infinity when using :rand.uniform()
and taking 1.0 / 0.0; update the sampling to use :rand.uniform_real/0 (which
never returns 0.0) and compute the exponential variates via the negative log
directly (i.e., generate each raw element with -:math.log(:rand.uniform_real())
inside the for comprehension) before normalizing by Enum.sum to produce the
simplex point.
Greptile SummaryThis PR adds a random weight sweep capability to the predictions system — a fast in-memory alternative to the logistic regression trainer ( Key changes:
Confidence Score: 4/5Safe to merge after fixing the hd/1 crash in print_results/3. One P1 crash path exists: passing --top 0 produces an empty result list, causing print_results/3 to call hd([]) and raise an unhandled FunctionClauseError. The core logic — simplex sampling, fast_evaluate metric, caching strategy — is sound and consistent with the existing HistoricalValidator accuracy convention. All other findings are P2 or lower. lib/mix/tasks/predictions.sweep.ex — specifically the hd(results) call in print_results/3 Important Files Changed
Sequence DiagramsequenceDiagram
participant CLI as mix predictions.sweep
participant WO as WeightOptimizer.sweep/3
participant HV as HistoricalValidator
participant DB as Database
participant CS as CriteriaScoring
CLI->>WO: sweep(list_key, n_samples, opts)
WO->>HV: get_all_decades(source_key)
HV-->>WO: [1930, 1940, ..., 2010]
WO->>WO: load_decade_data_parallel (Task.async_stream)
loop per decade (parallel)
WO->>DB: get_decade_movies_query(decade)
DB-->>WO: [movies...]
WO->>CS: batch_score_movies(movies, @default_weights)
CS-->>WO: [{criteria_scores: %{mob, critics, ...}}]
WO->>WO: build feature rows (scores / 100), labels
end
Note over WO: decade_data cached in memory
WO->>CS: get_named_profiles()
CS-->>WO: [default, festival-heavy, audience-first, ...]
WO->>WO: build named_vectors from profiles
loop n_samples random vectors
WO->>WO: random_simplex_point(n_criteria)
end
loop (named + random) vectors — in memory only
WO->>WO: fast_evaluate(vec, decade_data) → precision@K
end
WO->>WO: sort by accuracy desc, take top N
opt --save
WO->>DB: MovieLists.save_trained_weights(source_key, weights)
end
WO-->>CLI: [%{rank, accuracy, label, weights}, ...]
CLI->>CLI: print_results / JSON output
Reviews (1): Last reviewed commit: "small up" | Re-trigger Greptile |
|
|
||
| Mix.shell().info(" " <> String.duplicate("-", 80)) | ||
|
|
||
| Enum.each(results, fn r -> |
There was a problem hiding this comment.
hd/1 on potentially empty list crashes the task
print_results/3 calls hd(results) without guarding against an empty list. Passing --top 0 causes WeightOptimizer.sweep to return [] (via Enum.take(results, 0)), which makes hd([]) raise a FunctionClauseError and kills the Mix task with an unformatted stack trace.
The JSON output branch handles this correctly (it just maps over an empty list), so the inconsistency is only in the text-rendering path. The guard needs to wrap the entire best-dependent block, not just the hd call.

Summary by CodeRabbit
Documentation
New Features