feat: add mixscale continuous perturbation scoring #945
stefanm808 wants to merge 2 commits into scverse:main
Implements the Mixscale scoring method (Jiang et al., Nat Cell Biol 2025) as a new method on the Mixscape class. Unlike the binary KO/NP classification in mixscape(), mixscale() computes a continuous perturbation efficiency score per cell via scalar projection onto the estimated perturbation direction vector.
Reuses the existing _get_perturbation_markers() pipeline for DE gene detection and follows the same code patterns as mixscape() for split handling, layer access, and scaling.
Closes scverse#921 (partial - continuous scoring component)
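The scalar projection described above can be illustrated with a minimal sketch. It is not the code from this PR: the function name `projection_scores` and its arguments are hypothetical, and it assumes the perturbation direction is the difference between the mean expression of perturbed cells and the mean expression of control cells over the DE genes.

```python
import numpy as np
from scipy import sparse


def projection_scores(X, is_perturbed, is_control):
    """Project every cell onto an estimated perturbation direction.

    X is a cells x DE-genes matrix (dense ndarray or SciPy sparse matrix);
    is_perturbed and is_control are boolean masks over the cells.
    """
    # Assumption: direction = mean(perturbed cells) - mean(control cells).
    pert_mean = np.asarray(X[is_perturbed].mean(axis=0)).ravel()
    ctrl_mean = np.asarray(X[is_control].mean(axis=0)).ravel()
    vec = pert_mean - ctrl_mean

    vec_norm_sq = float(vec @ vec)
    if vec_norm_sq == 0.0:
        raise ValueError("Perturbed and control means are identical.")

    # Dividing by the squared norm puts the average perturbed cell at 1
    # when the control mean is at the origin, as in this toy example.
    return np.asarray(X @ vec / vec_norm_sq).ravel()


# Toy data: two control cells at the origin, two perturbed cells along gene 0.
X_demo = np.array([[0.0, 0.0], [0.0, 0.0], [1.0, 0.0], [3.0, 0.0]])
ctrl_demo = np.array([True, True, False, False])
pert_demo = ~ctrl_demo

dense_scores = projection_scores(X_demo, pert_demo, ctrl_demo)
sparse_scores = projection_scores(sparse.csr_matrix(X_demo), pert_demo, ctrl_demo)
```

In this toy example the weakly perturbed cell scores 0.5 and the strongly perturbed cell scores 1.5, which is the kind of gradient a binary KO/NP label cannot express.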
Codecov Report
❌ Patch coverage is
Additional details and impacted files
@@ Coverage Diff @@
## main #945 +/- ##
==========================================
- Coverage 73.54% 72.08% -1.46%
==========================================
Files 48 48
Lines 5613 5804 +191
==========================================
+ Hits 4128 4184 +56
- Misses 1485 1620 +135
Zethson left a comment
Hi,
thanks for the contribution! I'd be very happy to get this feature merged eventually.
This looks a bit AI generated. Could you please disclose that? There are a few things outlined in the contributing guide that were missed. It would be awesome if you could have another look, please.
Some initial feedback:
- Should this be in the mixscape code base or should this be its own Tool? Even from a user perspective.
- This also needs to show up in the tutorials. Maybe a general version of mixscape & mixscale? We might need a new term then.
- We should likely also point this out more clearly in our documentation, which pretty much just mentions mixscale for now.
- The ground truth is the R implementation. Could you please compare your version against the R version? We did the same for mixscape. They should be as close as possible.
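One way to approach the comparison against the R implementation is sketched below. It assumes the per-cell scores from both implementations have already been exported and aligned to the same cell order; the helper name `compare_scores` and the tolerance are hypothetical and not part of pertpy.

```python
import numpy as np


def compare_scores(py_scores, r_scores, atol=1e-6):
    """Summarize agreement between two aligned per-cell score vectors."""
    py = np.asarray(py_scores, dtype=float)
    r = np.asarray(r_scores, dtype=float)
    if py.shape != r.shape:
        raise ValueError("Score vectors must cover the same cells in the same order.")

    return {
        # Pearson correlation: do the two implementations rank cells alike?
        "pearson_r": float(np.corrcoef(py, r)[0, 1]),
        # Largest absolute deviation: are the values numerically close?
        "max_abs_diff": float(np.max(np.abs(py - r))),
        "allclose": bool(np.allclose(py, r, atol=atol)),
    }


# Toy vectors standing in for exported Python and R scores.
summary = compare_scores([0.0, 0.5, 1.5], [0.0, 0.5, 1.5])
```

A high correlation with a large maximum difference would point to a scaling or normalization mismatch rather than a different ranking of cells.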
    ):
        """Calculate continuous perturbation scores using the Mixscale method.

        Unlike :meth:`mixscape` which performs binary KO/NP classification via
This sphinx ref doesn't work. Please fix it.
        """Calculate continuous perturbation scores using the Mixscale method.

        Unlike :meth:`mixscape` which performs binary KO/NP classification via
        Gaussian Mixture Models, this method assigns a continuous perturbation
Please format these docstrings like the others are formatted. This is LLM generated.
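As a rough illustration of the two docstring comments above, the sketch below uses a Google-style layout (`Args:` and `Returns:` sections) and a fully qualified cross-reference. Both are assumptions: the style should be checked against the existing pertpy docstrings, the target `pertpy.tools.Mixscape.mixscape` may not be the path Sphinx expects, and the function name and parameters are hypothetical placeholders rather than the signature from this PR.

```python
def mixscale_docstring_sketch(adata, pert_key, control, *, layer=None):
    """Calculate continuous perturbation scores using the Mixscale method.

    Unlike :meth:`~pertpy.tools.Mixscape.mixscape`, which assigns binary KO/NP
    labels, this method assigns a continuous perturbation score to each cell.

    Args:
        adata: The annotated data object.
        pert_key: The column of `.obs` with the perturbation labels.
        control: The label of the control (non-targeting) cells.
        layer: The layer to read from. Uses `.X` if not specified.

    Returns:
        Nothing in this sketch. The body is intentionally left unimplemented.
    """
    raise NotImplementedError("Docstring layout sketch only.")
```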
            test_method=test_method,
        )

        # Get perturbation signature matrix
        categories = split_obs.unique()
        split_masks = [split_obs == category for category in categories]

        # Reuse the existing DE gene detection pipeline
I guess this is LLM noise. Please remove it
                    continue

                de_genes = perturbation_markers[(category, gene)]
                # Limit to max_de_genes
                pvec = dat.dot(vec) / vec_norm_sq if isinstance(dat, spmatrix) else np.dot(dat, vec) / vec_norm_sq
                pvec = np.asarray(pvec).flatten()

                # Extract scores for guide and NT cells
@pytest.fixture
def synthetic_perturbation_adata():
Can you reuse the fixture that we're using to test mixscape?
    return adata


class TestMixscale:
We're not really using test classes in pertpy
        assert gene_a_scores.abs().mean() > 0
        assert gene_b_scores.abs().mean() > 0

    def test_sparse_input(self, synthetic_perturbation_adata):
A bit of a useless test. Ideally this would be parametrized over the different array types instead of being its own test.
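A parametrized test along these lines might look like the sketch below. It is written as a plain test function rather than a test class, and the helper `project` plus the toy data are hypothetical stand-ins for the actual mixscale call and the shared Mixscape fixture.

```python
import numpy as np
import pytest
from scipy import sparse


def project(X, vec):
    """Scalar projection of each row of X onto vec (dense or sparse X)."""
    return np.asarray(X @ vec / float(vec @ vec)).ravel()


DENSE = np.array([[0.0, 0.0], [1.0, 0.0], [3.0, 0.0]])
VEC = np.array([2.0, 0.0])


@pytest.mark.parametrize(
    "to_array",
    [np.asarray, sparse.csr_matrix, sparse.csc_matrix],
    ids=["dense", "csr", "csc"],
)
def test_projection_is_array_type_agnostic(to_array):
    # Every array type should yield the same scores as the dense input.
    scores = project(to_array(DENSE), VEC)
    np.testing.assert_allclose(scores, [0.0, 0.5, 1.5])
```

Adding another array type then only requires one more entry in the parameter list instead of a new test.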
PR Checklist
Description of changes
Added Mixscape.mixscale() for continuous perturbation efficiency scoring, implementing the method from Jiang, Dalgarno et al., Nature Cell Biology (2025). Unlike the binary KO/NP classification in mixscape(), mixscale() computes a continuous perturbation efficiency score per cell via scalar projection onto the estimated perturbation direction vector. This is particularly useful for CRISPRi/CRISPRa screens where cells exhibit a gradient of perturbation responses.
Usage:
Technical details
Algorithm:
_get_perturbation_markers() pipeline for DE gene detection
Follows the same code patterns as mixscape() for split handling, layer access, and scaling. No new dependencies added. 9 new tests added, all 5 existing Mixscape tests still pass.
Additional context
This addresses item 1 (continuous perturbation scoring) from #921. Items 2-4 (weighted DE, decomposition, program signatures) are planned for follow-up PRs.