feat: add mixscale continuous perturbation scoring #945

Open
stefanm808 wants to merge 2 commits into scverse:main from stefanm808:feat/mixscale

Conversation

@stefanm808

Implements the Mixscale scoring method (Jiang et al., Nat Cell Biol 2025) as a new method on the Mixscape class. Unlike the binary KO/NP classification in mixscape(), mixscale() computes a continuous perturbation efficiency score per cell via scalar projection onto the estimated perturbation direction vector.

Reuses existing _get_perturbation_markers() pipeline for DE gene detection and follows the same code patterns as mixscape() for split handling, layer access & scaling.

Closes #921 (partial - continuous scoring component)

PR Checklist

  • Referenced issue is linked
  • If you've fixed a bug or added code that should be tested, add tests!

Description of changes
Added Mixscape.mixscale() for continuous perturbation efficiency scoring, implementing the method from Jiang, Dalgarno et al., Nature Cell Biology (2025). Unlike the binary KO/NP classification in mixscape(), mixscale() computes a continuous perturbation efficiency score per cell via scalar projection onto the estimated perturbation direction vector. This is particularly useful for CRISPRi/CRISPRa screens where cells exhibit a gradient of perturbation responses.

Usage:

import pertpy as pt

# adata: an AnnData object with perturbation labels in .obs
ms = pt.tl.Mixscape()
ms.perturbation_signature(adata, "perturbation", "NT", split_by="replicate")
ms.mixscale(adata, "gene_target", "NT", layer="X_pert")
# Continuous scores are stored in adata.obs["mixscale_score"]

Technical details

Algorithm:

  1. Reuses existing _get_perturbation_markers() pipeline for DE gene detection
  2. Subsets perturbation signatures to DE genes
  3. Computes perturbation direction vector (mean perturbed − mean control)
  4. Scalar-projects each cell's signature onto this direction
  5. Z-score standardizes the projections relative to the NT control distribution
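
Steps 3 to 5 can be sketched in plain NumPy as follows. This is a minimal illustration and not the code from this PR: the function name `mixscale_like_scores` and its arguments are hypothetical, DE gene selection (steps 1 and 2) is assumed to have happened already, and dividing by the squared norm of the direction follows the `vec_norm_sq` line quoted further down in the review.

```python
import numpy as np


def mixscale_like_scores(signature, is_perturbed, is_control):
    """Score cells by projecting onto the perturbation direction (illustrative sketch).

    signature: (n_cells, n_de_genes) perturbation signature, already subset to DE genes.
    is_perturbed, is_control: boolean masks over cells.
    """
    signature = np.asarray(signature, dtype=float)

    # Step 3: direction vector = mean perturbed signature - mean control signature.
    direction = signature[is_perturbed].mean(axis=0) - signature[is_control].mean(axis=0)
    norm_sq = float(direction @ direction)
    if norm_sq == 0.0:
        # No shift between the groups, so there is nothing to project onto.
        return np.zeros(signature.shape[0])

    # Step 4: scalar projection of every cell onto the direction.
    projection = signature @ direction / norm_sq

    # Step 5: standardize against the control (NT) cells' projections.
    control_projection = projection[is_control]
    control_sd = control_projection.std(ddof=1)
    if control_sd == 0.0:
        return projection - control_projection.mean()
    return (projection - control_projection.mean()) / control_sd
```

By construction the control cells end up with mean 0 and unit standard deviation, so a perturbed cell's score reads as "how many control standard deviations along the perturbation direction".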

Follows the same code patterns as mixscape() for split handling, layer access, and scaling. No new dependencies added. 9 new tests added, all 5 existing Mixscape tests still pass.

Additional context

This addresses item 1 (continuous perturbation scoring) from #921. Items 2-4 (weighted DE, decomposition, program signatures) are planned for follow-up PRs.

Stefan and others added 2 commits April 12, 2026 21:31
@codecov-commenter

codecov-commenter commented Apr 12, 2026

Codecov Report

❌ Patch coverage is 86.20690% with 8 lines in your changes missing coverage. Please review.
✅ Project coverage is 72.08%. Comparing base (12897e1) to head (c0d36c8).
⚠️ Report is 51 commits behind head on main.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| pertpy/tools/_mixscape.py | 86.20% | 8 Missing ⚠️ |
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #945      +/-   ##
==========================================
- Coverage   73.54%   72.08%   -1.46%     
==========================================
  Files          48       48              
  Lines        5613     5804     +191     
==========================================
+ Hits         4128     4184      +56     
- Misses       1485     1620     +135     
| Files with missing lines | Coverage Δ |
|---|---|
| pertpy/tools/_mixscape.py | 85.47% <86.20%> (+0.17%) ⬆️ |

... and 8 files with indirect coverage changes

Member

@Zethson Zethson left a comment

Hi,

thanks for the contribution! I'd be very happy to get this feature merged eventually.

This looks a bit AI generated. Could you please disclose that? There are a few things outlined in the contributing guide that were missed. It would be awesome if you could have another look, please.

Some initial feedback:

  1. Should this be in the mixscape code base or should this be its own Tool? Even from a user perspective.
  2. This also needs to show up in the tutorials. Maybe a general version of mixscape & mixscale? We might need a new term then.
  3. We should likely also point this out more clearly in our documentation, which pretty much just mentions mixscale for now.
  4. The ground truth is the R implementation. Could you please compare your version against the R version? We did the same for mixscape. They should be as close as possible.
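
One way to run the comparison in item 4 is to export the per-cell scores from the R implementation and check them against the Python scores. The sketch below assumes both score vectors are already aligned on the same cell barcodes; the helper name `compare_to_reference` and the tolerance are hypothetical and not part of pertpy.

```python
import numpy as np


def compare_to_reference(py_scores, r_scores, atol=1e-6):
    """Summarize agreement between Python scores and reference (R) scores.

    Both inputs must be 1-D and ordered by the same cell barcodes.
    """
    py = np.asarray(py_scores, dtype=float)
    ref = np.asarray(r_scores, dtype=float)
    if py.shape != ref.shape:
        raise ValueError(f"shape mismatch: {py.shape} vs {ref.shape}")

    max_abs_diff = float(np.max(np.abs(py - ref)))
    # Pearson correlation catches a consistent rescaling that an absolute
    # tolerance alone would only report as "different".
    pearson_r = float(np.corrcoef(py, ref)[0, 1])
    return {
        "max_abs_diff": max_abs_diff,
        "pearson_r": pearson_r,
        "within_tolerance": bool(max_abs_diff <= atol),
    }
```

A high correlation together with a large absolute difference would point to a scaling or centering difference rather than a different direction vector, which narrows down where the two implementations diverge.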

):
"""Calculate continuous perturbation scores using the Mixscale method.

Unlike :meth:`mixscape` which performs binary KO/NP classification via
Member

This sphinx ref doesn't work. Please fix it.

"""Calculate continuous perturbation scores using the Mixscale method.

Unlike :meth:`mixscape` which performs binary KO/NP classification via
Gaussian Mixture Models, this method assigns a continuous perturbation
Member

Please format these docstrings like the others are formatted. This is LLM generated.
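
For reference, a docstring in the Google-style layout (`Args:` / `Returns:` sections) might look like the stub below. This is only a sketch of the layout: the function name `mixscale_docstring_example` is hypothetical, the parameter list is taken from the usage snippet in the PR description, and the exact wording expected by the maintainers is an assumption.

```python
def mixscale_docstring_example(adata, labels, control, layer=None):
    """Calculate continuous perturbation scores for each cell.

    Args:
        adata: The annotated data object.
        labels: The column of `.obs` with target gene labels.
        control: Control category from the `labels` column.
        layer: Key from `adata.layers` to use for scoring.

    Returns:
        None. This stub only illustrates the docstring layout.
    """
    return None
```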

test_method=test_method,
)

# Get perturbation signature matrix
Member

A bit of a noisy comment.

categories = split_obs.unique()
split_masks = [split_obs == category for category in categories]

# Reuse the existing DE gene detection pipeline
Member

I guess this is LLM noise. Please remove it.

continue

de_genes = perturbation_markers[(category, gene)]
# Limit to max_de_genes
Member

LLM noise?

pvec = dat.dot(vec) / vec_norm_sq if isinstance(dat, spmatrix) else np.dot(dat, vec) / vec_norm_sq
pvec = np.asarray(pvec).flatten()

# Extract scores for guide and NT cells
Member

LLM noise?



@pytest.fixture
def synthetic_perturbation_adata():
Member

Can you reuse the fixture that we're using to test mixscape?

return adata


class TestMixscale:
Member

We're not really using test classes in pertpy.

assert gene_a_scores.abs().mean() > 0
assert gene_b_scores.abs().mean() > 0

def test_sparse_input(self, synthetic_perturbation_adata):
Member

A bit of a useless test. Ideally this would be parametrized over the different array types instead of being its own test.
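
A parametrized version could look roughly like this. It is a sketch under assumptions: the `project` helper stands in for the projection step quoted earlier in the review and is not the function from the PR, and the test uses plain module-level functions rather than a test class.

```python
import numpy as np
import pytest
from scipy import sparse


def project(data, direction):
    """Scalar-project the rows of a dense or sparse matrix onto `direction`."""
    norm_sq = float(direction @ direction)
    # `@` with a 1-D vector works for ndarrays and SciPy sparse matrices alike.
    projected = data @ direction / norm_sq
    return np.asarray(projected).ravel()


@pytest.mark.parametrize("as_array", [np.asarray, sparse.csr_matrix, sparse.csc_matrix])
def test_projection_is_array_type_agnostic(as_array):
    rng = np.random.default_rng(0)
    dense = rng.normal(size=(20, 4))
    direction = np.array([1.0, -2.0, 0.5, 3.0])
    expected = dense @ direction / float(direction @ direction)
    # Every array type must give the same projection as the dense reference.
    np.testing.assert_allclose(project(as_array(dense), direction), expected)
```

With this shape, adding another array type is a one-item change to the parameter list instead of a new test function.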

Development

Successfully merging this pull request may close these issues.

Add mixscale for continuous perturbation scoring

3 participants