BioForge ships with a composable command-line interface that mirrors the library pipeline: ingest structures, clean and repair them, add hydrogens, solvate, transform, and emit both coordinates and explicit topologies. This document explains the command surface in detail and provides examples so you can adopt the CLI confidently in scripts or interactive shells.
Download the latest release for your platform from the GitHub Releases page.
Alternatively, install with Cargo:

`cargo install bio-forge`

- Formats – BioForge understands PDB (`.pdb`, `.ent`) and mmCIF (`.cif`, `.mmcif`). Formats are auto-detected from file extensions, but you can override them with `--format` and `--out-format`.
- Streaming – Every subcommand can read from stdin (`-i` omitted) and write to stdout (`-o` omitted). Non-interactive safeguards prevent dumping structured data straight into a terminal; either redirect to a file or pipe into another command.
- Context sharing – Subcommands accept the same IO flags so you can combine them consistently.
| Flag | Description | Default |
|---|---|---|
| `-i, --input <FILE>` | Structure to read. When absent, stdin is used. | stdin |
| `-o, --output <FILE>` | Destination for the resulting structure/topology. When absent, stdout is used. | stdout |
| `--format <pdb\|mmcif>` | Force input parsing format. | Auto (or PDB when stdin) |
| `--out-format <pdb\|mmcif>` | Force output serialization format. | Auto (or PDB when stdout) |
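
The resolution order in the table (explicit override, then file extension, then PDB for streams) can be pictured with a short Python sketch. This is an illustration, not BioForge's implementation: the function name `resolve_format`, the case-insensitive extension match, and the error raised for an unknown extension are all assumptions.

```python
from pathlib import Path

# Extension lists come from the text above; everything else is illustrative.
EXTENSIONS = {".pdb": "pdb", ".ent": "pdb", ".cif": "mmcif", ".mmcif": "mmcif"}


def resolve_format(path=None, override=None):
    """Pick a format: explicit override, then file extension, then PDB for streams."""
    if override is not None:
        return override
    if path is None:  # reading stdin or writing stdout
        return "pdb"
    suffix = Path(path).suffix.lower()  # assumption: matching ignores case
    if suffix not in EXTENSIONS:
        # Assumption: this sketch treats an unknown extension as an error.
        raise ValueError(f"cannot infer format from {path!r}; pass an override")
    return EXTENSIONS[suffix]
```

For example, `resolve_format("model.cif")` yields `"mmcif"`, while `resolve_format(None)` falls back to `"pdb"`, mirroring the stdin/stdout defaults listed above.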
Each subcommand focuses on a single stage. Combine them to produce richer pipelines.
`bioforge info -i prepared.pdb`

- Computes per-chain statistics (residue count, atom count, polymer class).
- Reports unit cell vectors/angles when available.
- Estimates total charge using residue templates and common ions.
`bioforge clean -i raw.pdb -o cleaned.pdb --water --ions --remove NAG --keep LIG`

Options:

| Flag | Purpose |
|---|---|
| `--water` | Drop crystallographic water (HOH). |
| `--ions` | Remove metal and monatomic ions. |
| `--hydrogens` | Strip all hydrogen atoms. |
| `--hetero` | Drop hetero residues. |
| `--keep <RES>` | Protect specific residues from removal (may repeat). |
| `--remove <RES>` | Forcibly remove residues regardless of other filters (may repeat). |
`bioforge repair -i cleaned.pdb -o repaired.pdb`

- Aligns each standard residue to its template and fills in missing heavy atoms, including peptide termini (OXT) and nucleic acid 5'-terminal phosphate (OP3).
- Ideal immediately after `clean` to ensure the structure is chemically complete before protonation.
`bioforge hydro -i repaired.pdb -o protonated.pdb --ph 7.0 --his network`

Adds hydrogen atoms using pH-aware protonation and geometric optimization. The pipeline operates in three phases:

- Non-HIS Protonation (when `--ph` is specified) – Applies pKa rules to ASP, GLU, LYS, ARG, CYS, TYR.
- HIS Protonation – Uses pH thresholds, salt bridge detection, and tautomer strategy to determine HID/HIE/HIP states.
- Hydrogen Placement – Reconstructs hydrogen geometry from templates with tetrahedral terminal handling.

When `--ph` is omitted, the pipeline skips automatic protonation and only adds hydrogens to residues as-named, preserving user-specified protonation states.
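
To make the idea of a pKa rule concrete, here is a minimal Python sketch of a pH-versus-pKa decision for the six non-HIS residues. The pKa values are approximate textbook side-chain values chosen for illustration; they are not BioForge's internal thresholds, and the names `MODEL_PKA`, `protonated_fraction`, and `is_protonated` are hypothetical.

```python
# Approximate textbook side-chain pKa values; NOT BioForge's internal thresholds.
MODEL_PKA = {"ASP": 3.9, "GLU": 4.3, "CYS": 8.3, "TYR": 10.1, "LYS": 10.5, "ARG": 12.5}


def protonated_fraction(pka, ph):
    """Henderson-Hasselbalch: fraction of sites that carry the titratable proton."""
    return 1.0 / (1.0 + 10.0 ** (ph - pka))


def is_protonated(resname, ph):
    """Simple threshold rule: protonated when the pH is below the model pKa."""
    return ph < MODEL_PKA[resname]
```

Under these model values, pH 7.0 leaves ASP and GLU deprotonated (negatively charged) and keeps LYS, ARG, CYS, and TYR protonated, which is the usual outcome near physiological pH.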
Options:

| Flag | Purpose |
|---|---|
| `--ph <value>` | Target pH for protonation decisions. Omit to preserve original residue names. |
| `--no-strip` | Keep existing hydrogens instead of stripping before rebuild. |
| `--his <hid\|hie\|random\|network>` | Histidine tautomer strategy. Defaults to `network` (hydrogen-bond-aware). |
| `--no-his-salt-bridge` | Disable salt bridge detection for HIS → HIP conversion near carboxylate groups (ASP⁻/GLU⁻/COO⁻). |
`bioforge solvate -i protonated.pdb -o solvated.pdb --margin 12 --spacing 3.0 --cation Na --anion Cl --neutralize --seed 42`

Options:

| Flag | Purpose |
|---|---|
| `--margin <Å>` | Padding around the solute before packing waters (default 10 Å). |
| `--spacing <Å>` | Lattice spacing for initial water grid (default 3.1 Å). |
| `--cation <element>` | Cation species swapped into the solvent (Na, K, Mg, Ca, Li, Zn). |
| `--anion <element>` | Anion species (Cl, Br, I, F). |
| `--neutralize` | Target zero net charge by adding/removing ions. |
| `--target-charge <int>` | Explicit charge goal (conflicts with `--neutralize`). |
| `--seed <int>` | RNG seed for deterministic ion placement. |
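
The charge bookkeeping behind `--neutralize` and `--target-charge` can be sketched in a few lines of Python. This shows only the arithmetic, using the formal charges of the listed ion species; how BioForge actually selects and places ions is not described here, and the function name `ion_counts` is hypothetical.

```python
# Formal charges of the ion species listed in the options table.
CATION_CHARGE = {"Na": 1, "K": 1, "Li": 1, "Mg": 2, "Ca": 2, "Zn": 2}
ANION_CHARGE = {"Cl": -1, "Br": -1, "I": -1, "F": -1}


def ion_counts(solute_charge, cation, anion, target_charge=0):
    """Return (n_cations, n_anions): the fewest ions that reach target_charge."""
    z = CATION_CHARGE[cation]
    _ = ANION_CHARGE[anion]  # validates the species; all listed anions are -1
    delta = target_charge - solute_charge
    if delta <= 0:
        return 0, -delta  # each monovalent anion removes one unit of charge
    n_cations = -(-delta // z)  # ceiling division
    n_anions = n_cations * z - delta  # cancels overshoot from a divalent cation
    return n_cations, n_anions
```

A solute with net charge -3 needs three Na⁺ ions to reach zero, whereas with Mg²⁺ the same system needs two cations plus one Cl⁻ to cancel the overshoot.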
`bioforge transform -i solvated.pdb -o boxed.pdb --center --rotate-z 90 --translate 0,0,5`

Options:

| Flag | Purpose |
|---|---|
| `--center` | Move geometric center to the origin. |
| `--center-mass` | Move center of mass to the origin. |
| `--rotate-x`/`--rotate-y`/`--rotate-z <deg>` | Rotate around axes (applied in X → Y → Z order). |
| `--translate <x,y,z>` | Translate by Cartesian vector (Å). |
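
Because rotations do not commute, the fixed X → Y → Z order matters. The Python sketch below applies the three rotations in that order and also contrasts a geometric center with a mass-weighted one. It assumes right-handed, counterclockwise-positive angles, which the text above does not specify; `rotate_xyz` and `centroid` are hypothetical helper names.

```python
import math


def rotate_xyz(point, rx=0.0, ry=0.0, rz=0.0):
    """Rotate a point by rx, ry, rz degrees about X, then Y, then Z."""
    x, y, z = point
    a, b, c = (math.radians(v) for v in (rx, ry, rz))
    y, z = y * math.cos(a) - z * math.sin(a), y * math.sin(a) + z * math.cos(a)
    x, z = x * math.cos(b) + z * math.sin(b), -x * math.sin(b) + z * math.cos(b)
    x, y = x * math.cos(c) - y * math.sin(c), x * math.sin(c) + y * math.cos(c)
    return (x, y, z)


def centroid(points, weights=None):
    """Geometric center (no weights) or center of mass (weights = atomic masses)."""
    weights = weights or [1.0] * len(points)
    total = sum(weights)
    return tuple(
        sum(w * p[i] for w, p in zip(weights, points)) / total for i in range(3)
    )
```

With this convention, rotating (0, 1, 0) by 90° about X and then Z gives approximately (0, 0, 1), while applying Z first and then X gives approximately (-1, 0, 0). For an O atom at the origin and an H atom at (1, 0, 0), the geometric center sits at x = 0.5, but weighting by atomic mass (15.999 and 1.008) pulls the center to roughly x = 0.059.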
`bioforge topology -i boxed.pdb -o boxed-topology.pdb --out-format pdb --ss-cutoff 2.1`

- Builds a `Topology` object using peptide, nucleic, and disulfide heuristics.
- Outputs either CONECT records (PDB) or `_struct_conn` categories (mmCIF).
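
Distance-based disulfide inference, the heuristic that `--ss-cutoff` tunes, can be illustrated with a minimal Python sketch. It assumes the input is a mapping from residue labels to cysteine SG coordinates and that the cutoff is inclusive; both are assumptions of this example, as is the function name `infer_disulfides`.

```python
import math
from itertools import combinations


def infer_disulfides(sg_atoms, cutoff=2.2):
    """Return residue-label pairs whose SG atoms lie within `cutoff` angstroms.

    `sg_atoms` maps a residue label to the (x, y, z) position of its SG atom.
    """
    bonds = []
    for (res_a, pos_a), (res_b, pos_b) in combinations(sorted(sg_atoms.items()), 2):
        if math.dist(pos_a, pos_b) <= cutoff:  # assumption: inclusive cutoff
            bonds.append((res_a, res_b))
    return bonds
```

A typical S–S bond is a little over 2.0 Å long, so a cutoff slightly above that length accepts real disulfides while rejecting cysteines that are merely close in space.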
Options:

| Flag | Purpose |
|---|---|
| `--ss-cutoff <Å>` | Maximum S–S distance used to infer disulfide bonds (default 2.2 Å). |
| `--hetero-template <FILE>` | Include a Tripos MOL2 ligand template for hetero residues (repeatable). |
- Molecule name in `@<TRIPOS>MOLECULE` must match the hetero residue name in the structure. The builder uses this name to locate the correct template.
- Atom labels in the MOL2 file must match the atom names in the structure. Any mismatch will raise a topology atom missing error during bond graph generation.
- Atom names within a single MOL2 file must be unique. The parser enforces this and rejects duplicates to prevent ambiguous bonding.
- Residue-internal atom names must also be unique within the structure for the same residue. Duplicates in the coordinates will be rejected earlier in the pipeline.
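
Two of these requirements, the molecule-name match and atom-name uniqueness, are easy to check before handing a template to `--hetero-template`. The Python sketch below is a hypothetical pre-flight check, not BioForge's parser: it reads only the `@<TRIPOS>MOLECULE` and `@<TRIPOS>ATOM` sections and assumes the name comparison is exact and case-sensitive.

```python
def check_mol2_template(text, residue_name):
    """Return a list of problems found in a MOL2 template (empty means none found)."""
    problems, section, mol_name, names = [], None, None, []
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("@<TRIPOS>"):
            section = line[len("@<TRIPOS>"):]
            continue
        if not line or line.startswith("#"):
            continue
        if section == "MOLECULE" and mol_name is None:
            mol_name = line  # first record line holds the molecule name
        elif section == "ATOM":
            names.append(line.split()[1])  # columns: id, name, x, y, z, type, ...
    if mol_name != residue_name:  # assumption: exact, case-sensitive match
        problems.append(f"molecule name {mol_name!r} != residue name {residue_name!r}")
    dupes = sorted({n for n in names if names.count(n) > 1})
    if dupes:
        problems.append("duplicate atom names: " + ", ".join(dupes))
    return problems
```

Matching the template's atom names against the atom names in the structure would need the parsed coordinates as well, so that check is left out of this sketch.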
Please refer to the examples directory for end-to-end usage scenarios demonstrating both single-command-per-stage and streaming pipeline approaches.
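
For scripted use, a streaming pipeline can also be driven from Python's standard `subprocess` module. The sketch below uses only flags documented above and assumes `bioforge` is on `PATH`; the helper names `build_stages` and `run_pipeline` are hypothetical. Only the first stage reads a file and only the last writes one, so the intermediate stages exchange data over pipes using the PDB default for stdin and stdout.

```python
import subprocess


def build_stages(src, dst, ph=7.0):
    """Return one argv list per stage: clean -> repair -> hydro -> topology."""
    return [
        ["bioforge", "clean", "-i", src, "--water", "--ions"],
        ["bioforge", "repair"],
        ["bioforge", "hydro", "--ph", str(ph), "--his", "network"],
        ["bioforge", "topology", "-o", dst],
    ]


def run_pipeline(stages):
    """Chain the stages with pipes and raise if any stage exits non-zero."""
    procs, upstream = [], None
    for argv in stages:
        proc = subprocess.Popen(argv, stdin=upstream, stdout=subprocess.PIPE)
        if upstream is not None:
            upstream.close()  # the child holds its own copy of the pipe
        upstream = proc.stdout
        procs.append(proc)
    procs[-1].communicate()  # drain the final stage and wait for it
    for argv, proc in zip(stages, procs):
        if proc.wait() != 0:
            raise RuntimeError(f"{' '.join(argv)} exited with {proc.returncode}")
```

Calling `run_pipeline(build_stages("raw.pdb", "prepared.pdb"))` would run the four stages without temporary files. If the source file were mmCIF and you wanted mmCIF throughout, each piped stage would need explicit `--format`/`--out-format` flags.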
- Clarity over mutation – `info` never mutates the structure, so you can insert it anywhere to inspect intermediate states.
- Format overrides – When piping between commands with mismatched defaults, specify `--format`/`--out-format` explicitly to avoid accidental PDB/mmCIF flips.
- Determinism – Provide `--seed` to `solvate` whenever you need reproducible water/ion placement.
- Performance – Piping avoids temporary files, but large systems may benefit from writing intermediate snapshots for debugging.
With these commands, you can automate structure preparation pipelines entirely from the terminal while reusing the same algorithms that power BioForge's Rust API.