fix: copy size array before reorder in multi-color scatter plots#4040
fix: copy size array before reorder in multi-color scatter plots#4040Ekin-Kahraman wants to merge 3 commits intoscverse:mainfrom
Conversation
When plotting with multiple color keys and a per-point size array, the size variable was reordered in-place within the per-color loop. Each subsequent subplot applied its z-order permutation to the already-reordered array from the previous iteration, causing cumulative corruption of marker sizes. Use a loop-local variable `_size` so each iteration reorders from the original array. Closes scverse#4024 Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
for more information, see https://pre-commit.ci
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## main #4040 +/- ##
==========================================
+ Coverage 78.51% 78.61% +0.10%
==========================================
Files 117 117
Lines 12753 12726 -27
==========================================
- Hits 10013 10005 -8
+ Misses 2740 2721 -19
Flags with carried forward coverage won't be shown. Click here to find out more.
|
JhonatanFelix
left a comment
There was a problem hiding this comment.
Nice fix!
I think that you could strengthen the regression test on test_scatter_size_not_mutated_across_panels because currently it cheacks that the caller's original sizes array is unchanged, but the bug in #4024 is really that later subplots receive the wrong per-point sizes.
Would it make sense to asset the plotted sizes from the matplotliv artists for each axis, so the test proves that all pannels get the same size mapping?? Another thing is that the issue's reproduction used three panels (["louvain", "n_genes", "n_counts"]), which might exercise the cumulative reorder more clearly than the current 2 panels case
tests/test_plotting.py
Outdated
| sizes = rng.uniform(10, 200, size=pbmc.n_obs) | ||
| sizes_original = sizes.copy() | ||
|
|
||
| axes = sc.pl.umap(pbmc, color=["louvain", "n_genes"], size=sizes, show=False) |
There was a problem hiding this comment.
The variable axes is not getting used afterwards and giving problems with the ruff check, you can remove it and just call directly the sc.pl.umap(...) or call it as _ = sc.pl.umap(...).
edit: I made a small PR just to take off the variable.
Strengthen the regression test per review feedback: - Use 3 color keys (categorical + two continuous) matching the original issue reproduction, exercising the cumulative reorder more clearly. - Extract actual plotted sizes from each axis's PathCollection and verify they match the expected size distribution. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
|
Thanks for the review @JhonatanFelix! Both great suggestions — I've pushed an update that:
|
Summary
Fixes #4024.
embedding(), the per-pointsizearray was reordered in-place (size = np.array(size)[order]) inside the per-color loop. Each iteration's z-order permutation was applied to the already-reordered array from the previous iteration, causing cumulative corruption of marker sizes across subplots._sizeso each iteration reorders from the originalsizearray. This matches the behavior ofcolor_source_vector,color_vector, andcoords, which are all freshly derived each iteration.Changes
src/scanpy/plotting/_tools/scatterplots.py: Replace in-place mutation ofsizewith a loop-local_sizevariable; update all downstream references within the loop body.tests/test_plotting.py: Addtest_scatter_size_not_mutated_across_panelsregression test that verifies the input size array is not modified after plotting with multiple color keys.Test plan
test_scatter_size_not_mutated_across_panelspasses — confirms the size array is preserved