Hi Karel, `create_batches.py` works as expected. Here are just a few minor suggestions.
- In the help message, `clustered_fastas.tsv` is the input metadata file containing the file and species columns, but its name is misleading: I initially thought it was the output. How about `meta_file.tsv`?
- The genome count in the log output is inaccurate. After checking both the input data and the output data, 1932811 should be 1932812. The log line in question:
  `Loaded 1932811 genomes across 10357 species clusters`
- An instruction or warning could be added telling users to delete the output directory before running this script, because the script does not complain if the output directory is not empty, which might lead to unexpected results.
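
For the last point, an alternative to a documentation note would be to have the script refuse to run when the output directory already contains files. The following is only a minimal sketch: `ensure_empty_output_dir` is a hypothetical helper name, and I have not checked how `create_batches.py` currently handles its output path, so the integration point is an assumption.

```python
from pathlib import Path


def ensure_empty_output_dir(output_dir: str) -> Path:
    """Create output_dir if it is missing; raise if it already has content.

    Hypothetical helper, not part of create_batches.py.
    """
    path = Path(output_dir)
    # A regular file at this location cannot be used as an output directory.
    if path.exists() and not path.is_dir():
        raise NotADirectoryError(f"{path} exists and is not a directory")
    # Stop early instead of silently mixing new batches with old files.
    if path.is_dir() and any(path.iterdir()):
        raise FileExistsError(
            f"Output directory {path} is not empty; "
            "delete it or choose another directory before re-running."
        )
    path.mkdir(parents=True, exist_ok=True)
    return path
```

Calling this once, before any batch file is written, would turn the silent case into an explicit error. A softer variant could print a warning instead of raising.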