Harmony v2 #279

Open
pati-ni wants to merge 63 commits into immunogenomics:master from pati-ni:harmony-v2

pati-ni (Collaborator) commented Apr 8, 2026

No description provided.

Defined two new typedefs: one for R data structures and one for internal
data structures that use floats.
Performance improvement attempt for huge datasets
- Reduced the number of kmeans iterations to 4
- kmeans++ center seeding now uses efficient weighted sampling

- mt19937 random()

- uniform_real_distribution on (0.01, 0.99): avoids bias towards high
random numbers and biases draws towards the high-acceptance range
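
For illustration, kmeans++ seeding by weighted sampling can be sketched as below. This is a minimal sketch and not the code in this PR: the helper names `sq_dist` and `kmeanspp_seed` are hypothetical, and it uses `std::discrete_distribution` for the weighted draw, which may differ from the "efficient weighted sampling" the commit refers to. It assumes the input contains at least `k` distinct points.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <limits>
#include <random>
#include <vector>

// Squared Euclidean distance between two points of equal dimension.
static double sq_dist(const std::vector<double>& a, const std::vector<double>& b) {
    double s = 0.0;
    for (std::size_t d = 0; d < a.size(); ++d) {
        const double diff = a[d] - b[d];
        s += diff * diff;
    }
    return s;
}

// Hypothetical helper: returns the indices of k seeds chosen kmeans++-style.
// Assumes pts holds at least k distinct points.
std::vector<std::size_t> kmeanspp_seed(const std::vector<std::vector<double>>& pts,
                                       std::size_t k, std::mt19937& rng) {
    assert(k >= 1 && k <= pts.size());
    const std::size_t n = pts.size();
    std::vector<std::size_t> seeds;

    // First seed: uniform over all points.
    std::uniform_int_distribution<std::size_t> first(0, n - 1);
    seeds.push_back(first(rng));

    // d2[i] tracks the squared distance from point i to its nearest seed.
    std::vector<double> d2(n, std::numeric_limits<double>::max());
    while (seeds.size() < k) {
        const std::vector<double>& c = pts[seeds.back()];
        for (std::size_t i = 0; i < n; ++i) {
            d2[i] = std::min(d2[i], sq_dist(pts[i], c));
        }
        // Next seed: probability proportional to d2, so existing seeds
        // (distance 0) carry zero weight.
        std::discrete_distribution<std::size_t> pick(d2.begin(), d2.end());
        seeds.push_back(pick(rng));
    }
    return seeds;
}
```

Passing a seeded `std::mt19937` makes the chosen seeds repeatable for the same input on a given standard library.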

[DEV] kmeans centroids initialization

- Low-memory, batch-by-cluster operation
- Verbose output for logging progress

[FIX] remove existing centroids from kmeans++ init centroids

[BUG] replace arma::fill::randu with std::rand()

RcppArmadillo's rand function does not work with randu

[FIX] add elements to set during centroid initialization

If the element already exists, backtrack and retry
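
The insert-and-retry idea can be sketched as follows. This is a minimal illustration under assumptions, not the PR's implementation: `sample_distinct` is a hypothetical name, and the sketch draws indices uniformly, whereas the actual initialization may weight its draws.

```cpp
#include <cassert>
#include <cstddef>
#include <random>
#include <unordered_set>
#include <vector>

// Hypothetical helper: draw k distinct indices from [0, n) by rejection.
std::vector<std::size_t> sample_distinct(std::size_t n, std::size_t k, std::mt19937& rng) {
    assert(n >= 1 && k <= n);
    std::uniform_int_distribution<std::size_t> dist(0, n - 1);
    std::unordered_set<std::size_t> seen;
    std::vector<std::size_t> out;
    out.reserve(k);
    while (out.size() < k) {
        const std::size_t idx = dist(rng);
        // insert().second is false when idx is already in the set;
        // in that case nothing is appended and the loop draws again.
        if (seen.insert(idx).second) {
            out.push_back(idx);
        }
    }
    return out;
}
```

Rejection of this kind is cheap while `k` is small relative to `n`; as `k` approaches `n` the number of retries grows, and a shuffle-based approach may be preferable.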
Parameterize the batch proportion cutoff
Unshuffle and then return
- Also print some messages
- When a covariate has a single trivial level after subsetting, it is
dropped altogether
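
As an illustration of the "unshuffle and then return" step, the sketch below shuffles data with a stored permutation and restores the original order by applying its inverse. The function names are hypothetical and the PR may track the permutation differently.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <random>
#include <vector>

// Build a random permutation of 0..n-1.
std::vector<std::size_t> make_permutation(std::size_t n, std::mt19937& rng) {
    std::vector<std::size_t> perm(n);
    std::iota(perm.begin(), perm.end(), std::size_t{0});
    std::shuffle(perm.begin(), perm.end(), rng);
    return perm;
}

// Forward step: shuffled[i] = data[perm[i]].
template <typename T>
std::vector<T> apply_shuffle(const std::vector<T>& data, const std::vector<std::size_t>& perm) {
    assert(data.size() == perm.size());
    std::vector<T> out(data.size());
    for (std::size_t i = 0; i < perm.size(); ++i) {
        out[i] = data[perm[i]];
    }
    return out;
}

// Inverse step: restored[perm[i]] = shuffled[i], which undoes apply_shuffle.
template <typename T>
std::vector<T> unshuffle(const std::vector<T>& shuffled, const std::vector<std::size_t>& perm) {
    assert(shuffled.size() == perm.size());
    std::vector<T> out(shuffled.size());
    for (std::size_t i = 0; i < perm.size(); ++i) {
        out[perm[i]] = shuffled[i];
    }
    return out;
}
```

Keeping the permutation alongside the shuffled data lets results computed in shuffled order be returned in the caller's original cell order.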
Flat dense matrix: not the most memory-efficient option, but better performance
This is just for archival purposes
A cell may belong to several different batches when multiple
covariates exist.

The design assumed that every cell MUST have covariates+1 entries
in Phi.

However, if only one of a cell's batches was dropped while the other
covariate still has support, this does not hold.
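
A minimal sketch of why the entry count varies per cell, under an assumed layout that is not confirmed by the PR: column 0 of Phi is taken to be an intercept, each covariate owns a contiguous block of columns, and a dropped level is encoded as -1. The helper `phi_columns_for_cell` and this encoding are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical layout: column 0 is an intercept; offsets[v] is the first
// Phi column of covariate v. levels[v] is the cell's level for covariate v,
// or -1 if that level was dropped.
std::vector<std::size_t> phi_columns_for_cell(const std::vector<int>& levels,
                                              const std::vector<std::size_t>& offsets) {
    assert(levels.size() == offsets.size());
    std::vector<std::size_t> cols;
    cols.push_back(0);  // intercept entry, always present
    for (std::size_t v = 0; v < levels.size(); ++v) {
        if (levels[v] < 0) {
            continue;  // dropped level: this covariate contributes no entry
        }
        cols.push_back(offsets[v] + static_cast<std::size_t>(levels[v]));
    }
    return cols;
}
```

With two covariates, a cell whose levels are both retained gets three entries (covariates+1), while a cell with one dropped level gets only two, so code that indexes a fixed covariates+1 entries per cell would read past the valid ones.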
- Clusters are reproducible via R's set.seed for the same embedding set
- The null hypothesis gives a ratio close to 1
- Added a pseudocount for cases when E is close to 1
pati-ni and others added 30 commits January 2, 2026 12:40
- Fixed vignettes to interface correctly with the new API change
- New getter that returns all lambdas for each cluster
Added times
2 participants