Add SVM classifier based separability scoring by mffrank · Pull Request #10 · czbiohub-sf/grassp

mffrank · 2025-07-18T03:13:04Z

No description provided.

mffrank · 2025-07-18T03:17:22Z

@duopeng we have automatic formatting with pre-commit. You can install it by running make setup-develop in the grassp repo!

mffrank

Thanks for adding this!

mffrank · 2025-07-18T03:21:04Z

+
+    Parameters
+    ----------
+    data : DataFrame or AnnData 


The docstrings should not include the data type (: str, etc.). Put the type on the arguments themselves instead, e.g.:

label_col: str = "consensus_graph_annnotation",
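As a minimal sketch of that convention (the function name separability_auc, its body, and the parameter descriptions are hypothetical, not the code from this PR), the types live in the signature and the numpydoc entries carry only the argument names:

```python
import inspect


def separability_auc(
    data,
    label_col: str = "consensus_graph_annnotation",
    use_rep: str = "X",
):
    """Score how well labels separate (hypothetical stub for illustration).

    Parameters
    ----------
    data
        Input object holding coordinates and labels.
    label_col
        Name of the column that holds the labels.
    use_rep
        Use the indicated representation.
    """
    # Placeholder body: the real calculation is out of scope for this sketch.
    return {"label_col": label_col, "use_rep": use_rep}


# The types are still discoverable from the signature.
signature = inspect.signature(separability_auc)
```

Tools that render documentation from signatures can then pick up the types without the docstring repeating them.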

mffrank · 2025-07-18T03:25:33Z

+    auc_clustermap = sns.clustermap(auc_mat, 
+                                    square=True, 
+                                    annot=True, 
+                                    fmt=".2f",
+                                    cmap="rocket", 
+                                    vmin=0.5, 
+                                    vmax=1,
+                                    cbar_kws=dict(label=f"ROC-AUC ({auc_model.upper()})"),
+                                    figsize=(heatmap_size[0], heatmap_size[1]))
+    auc_clustermap.fig.suptitle("Label separability\nPair-wise classifier-AUC")
+    auc_clustermap.ax_heatmap.set_xticklabels(
+        auc_clustermap.ax_heatmap.get_xticklabels(), rotation=45, ha='right')
+    figures['auc_fig'] = auc_clustermap


Plotting should be handled separately from the calculations, in grassp.pl.
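One way to read this split, as a rough sketch only: the calculation returns a plain matrix and knows nothing about figures, so a plotting function can consume it later. The name pairwise_auc, the linear kernel, and the 3-fold cross-validation are assumptions for illustration, not the implementation in this PR.

```python
from itertools import combinations

import numpy as np
import pandas as pd
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC


def pairwise_auc(X, labels, cv=3):
    """Return a symmetric label-by-label matrix of cross-validated ROC-AUC."""
    X = np.asarray(X)
    labels = np.asarray(labels)
    classes = np.unique(labels)
    # Start at 0.5 everywhere so the diagonal reads as "chance level".
    auc = np.full((len(classes), len(classes)), 0.5)
    for (i, a), (j, b) in combinations(enumerate(classes), 2):
        mask = (labels == a) | (labels == b)
        y = (labels[mask] == a).astype(int)
        scores = cross_val_score(
            SVC(kernel="linear"), X[mask], y, cv=cv, scoring="roc_auc"
        )
        auc[i, j] = auc[j, i] = scores.mean()
    return pd.DataFrame(auc, index=classes, columns=classes)


# Synthetic, well-separated demo data (made up for this sketch).
rng = np.random.default_rng(0)
centers = np.array([[0.0, 0.0], [10.0, 0.0], [0.0, 10.0]])
X_demo = np.vstack([c + rng.normal(size=(20, 2)) for c in centers])
labels_demo = np.repeat(["ER", "Golgi", "Nucleus"], 20)
auc_demo = pairwise_auc(X_demo, labels_demo)
```

A plotting function would then only need auc_demo (or whatever was stored from it) as input, with no classifier code inside it.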

mffrank · 2025-07-18T03:27:20Z

+    # Drop rows with missing coords or labels
+    df = df.dropna(subset=[label_col, *coord_cols])
+
+    X_all = df[list(coord_cols)].values


You're converting a np.array (the .obsm or .X) into a DataFrame and then back to a np.array. This seems inefficient.
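A possible way to avoid the round trip, sketched under the assumption that coordinates arrive as a numeric array and labels as a separate 1-D sequence (drop_missing is a hypothetical helper, not part of this PR): build one boolean mask and index both arrays with it.

```python
import numpy as np
import pandas as pd


def drop_missing(X, labels):
    """Drop rows with NaN coordinates or missing labels, without a DataFrame."""
    X = np.asarray(X, dtype=float)
    labels = np.asarray(labels, dtype=object)
    # Keep a row only if every coordinate is present and the label is not null.
    keep = ~np.isnan(X).any(axis=1) & ~pd.isna(labels)
    return X[keep], labels[keep]


X_raw = np.array([[1.0, 2.0], [np.nan, 3.0], [4.0, 5.0], [6.0, 7.0]])
labels_raw = ["a", "b", None, "b"]
X_clean, labels_clean = drop_missing(X_raw, labels_raw)
```

This mirrors the dropna(subset=...) step in the diff while staying in array form the whole way.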

mffrank · 2025-07-21T23:01:54Z

+    if isinstance(data, ad.AnnData):
+        assert label_col in data.obs.columns, f"label_col {label_col} not in data.obs.columns"
+        X_all = sc.tools._utils._choose_representation(data, use_rep=use_rep, n_pcs=n_pcs)
+        df = pd.DataFrame(X_all)


You're still converting X_all into a DataFrame and then back into an array (line 316).

mffrank · 2025-07-21T23:06:13Z

+        if DataFrame, then use column name as label
+        Defaults to "consensus_graph_annnotation"
+    use_rep : str, optional
+        coordinates (X in the classifier)


"X in the classifier" might not mean much to users. Scanpy's wording is: "Use the indicated representation. 'X' or any key for .obsm is valid."

mffrank · 2025-07-21T23:07:55Z

+        Defaults to "consensus_graph_annnotation"
+    use_rep : str, optional
+        coordinates (X in the classifier)
+        if AnnData, use .obsm[use_rep] if use_rep is a *str*, and .var[use_rep] if *list*


No need to support the .var[use_rep] case. Users can simply subset beforehand!

mffrank · 2025-07-21T23:09:20Z

+    np.fill_diagonal(auc_mat.values, 0.5)
+
+    if inplace:
+        data.uns[f"separability ({label_col})"] = {


Prefer no spaces or special characters in names: separability_{label_col}. Don't forget to fix this in the plotting code as well.
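A small sketch of building such a key in one place so the calculation and plotting code cannot drift apart. The helper separability_key is hypothetical, and replacing non-word characters in label_col with underscores is an extra assumption that goes beyond what the comment asks for.

```python
import re


def separability_key(label_col: str) -> str:
    """Build a .uns key without spaces or special characters."""
    # Collapse any run of non-word characters into a single underscore.
    safe = re.sub(r"\W+", "_", label_col).strip("_")
    return f"separability_{safe}"


# A plain dict stands in for data.uns here.
uns = {}
uns[separability_key("consensus_graph_annnotation")] = {"auc": None}
```

Calling the same helper from both the scoring and the plotting side keeps the stored key and the lookup key identical.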

mffrank · 2025-07-21T23:10:20Z

+
+
+def sep_auc_heatmap(
+    data: np.ndarray | pd.DataFrame,


This should also be able to take an AnnData object and look in .uns for entries with "separability_".
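A rough sketch of how that lookup could work, assuming only that the input either has a dict-like .uns attribute or is already a matrix. The helper names are hypothetical, and a SimpleNamespace stands in for an AnnData object so the example runs without anndata installed.

```python
from types import SimpleNamespace

PREFIX = "separability_"


def find_separability_entries(adata_like):
    """Collect every .uns entry whose key starts with the separability prefix."""
    return {
        key: value
        for key, value in adata_like.uns.items()
        if str(key).startswith(PREFIX)
    }


def resolve_auc_input(data, key=None):
    """Return the stored result for AnnData-like input, else the data itself."""
    if not hasattr(data, "uns"):
        return data  # already an array or DataFrame
    entries = find_separability_entries(data)
    if key is None:
        if len(entries) != 1:
            raise ValueError(
                f"Expected exactly one '{PREFIX}*' entry, found {sorted(entries)}"
            )
        key = next(iter(entries))
    return entries[key]


# Stand-in for an AnnData object with one stored result.
fake_adata = SimpleNamespace(
    uns={"separability_leiden": {"auc": [[0.5, 0.9], [0.9, 0.5]]}, "pca": {}}
)
stored = resolve_auc_input(fake_adata)
```

Raising when several entries match forces the caller to pass key explicitly instead of silently plotting the wrong result.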

duopeng · 2025-07-23T23:13:09Z

I tested the code after Marika's cleanup, and it works nicely! We can merge this branch into main!

Update scoring.py

6f60686

Separating Duo's SVM into independent calculation + plotting functions

7ed74e7

Changing df/array inefficiency, updating docstrings, adding adata versatility to plotting function

34ba9e3

mffrank wants to merge 3 commits into main from separability_AUC

3 participants
