Command-Line Interface
This reference documents the installed CLI entry points defined in pyproject.toml.
bsvae-train
Train a GMMModuleVAE on a features x samples matrix.
bsvae-train NAME --dataset PATH [options]
Required:
NAME: experiment name
--dataset: expression matrix path
Common options:
--outdir default results
--seed default 13
--epochs default 100
--batch-size default 128
--lr default 5e-4
--checkpoint-every default 10
--warmup-epochs default 20
--transition-epochs default 10
--freeze-gmm-epochs default 0
--n-modules default 20
--latent-dim default 32
--hidden-dims default [512, 256, 128]
--dropout default 0.1
--use-batch-norm / --no-batch-norm
--sigma-min default 0.3
--normalize-input
--beta default 1.0
--free-bits default 0.0
--kl-warmup-epochs default 0
--kl-anneal-mode linear or cyclical
--kl-cycle-length default 50
--sep-strength default 0.1
--sep-alpha default 2.0
--bal-strength default 0.1
--bal-ema-blend default 0.5
--pi-entropy-strength default 0.0
--hier-strength default 0.0
--corr-strength default 0.0
--latent-corr-strength default 0.0
--masked-recon
--tx2gene
--isoform-stratified
--p-multi default 0.5
--collapse-threshold default 0.5
--collapse-noise-scale default 0.5
--no-eval
--eval-batch-size
--no-cuda
--log-level
--no-progress-bar
Outputs:
model.pt
specs.json
train_losses.csv
model-<epoch>.pt when checkpointing is enabled
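When bsvae-train is launched from a script, it can help to assemble the argument list programmatically. The following is a minimal sketch, not part of the package: build_train_command is a hypothetical helper, demo_run is a made-up experiment name, and the mapping from keyword names to flags (underscores become hyphens) is a convention of this sketch. Only flags listed above are used.

```python
import shlex


def build_train_command(name, dataset, **options):
    """Assemble a bsvae-train argv list from keyword options.

    Hypothetical helper: keyword names map to flags by replacing
    underscores with hyphens (batch_size -> --batch-size). A value of
    True emits a bare switch; False or None skips the option.
    """
    argv = ["bsvae-train", name, "--dataset", str(dataset)]
    for key, value in options.items():
        flag = "--" + key.replace("_", "-")
        if value is True:
            argv.append(flag)
        elif value is False or value is None:
            continue
        else:
            argv.extend([flag, str(value)])
    return argv


cmd = build_train_command(
    "demo_run",               # hypothetical experiment name
    "data/expression.csv",
    outdir="results",
    epochs=100,
    n_modules=20,
    latent_dim=32,
    normalize_input=True,
    no_cuda=True,
)
print(shlex.join(cmd))
# To launch the run: import subprocess; subprocess.run(cmd, check=True)
```

Passing the list to subprocess.run avoids shell quoting issues with paths that contain spaces.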
bsvae-sweep-k
Run a held-out validation sweep over candidate K values and optionally retrain the selected model on the full dataset.
bsvae-sweep-k NAME --dataset PATH [options]
Key options:
--k-grid comma list or start:end:step
--sweep-epochs default 60
--stability-reps default 1
--stability-seed default 13
--val-frac default 0.1
--val-seed default 13
--train-final enabled by default
--no-train-final
--final-epochs
Training-related flags mirror bsvae-train for architecture and optimization.
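The two --k-grid forms can be illustrated with a small parser. This is a sketch of one plausible interpretation, not the tool's own code: parse_k_grid is a hypothetical name, and treating end as inclusive in the start:end:step form is an assumption that should be checked against the actual CLI behavior.

```python
def parse_k_grid(spec):
    """Expand a K grid given as 'a,b,c' or 'start:end:step'.

    ASSUMPTION: in the range form, 'end' is included when the step
    lands on it. The real CLI may treat the endpoint differently.
    """
    spec = spec.strip()
    if ":" in spec:
        parts = spec.split(":")
        if len(parts) != 3:
            raise ValueError(f"expected start:end:step, got {spec!r}")
        start, end, step = (int(p) for p in parts)
        if step <= 0 or end < start:
            raise ValueError(f"invalid range in {spec!r}")
        return list(range(start, end + 1, step))
    values = [int(tok) for tok in spec.split(",") if tok.strip()]
    if not values:
        raise ValueError("empty K grid")
    return values


print(parse_k_grid("5,10,20"))
print(parse_k_grid("4:12:4"))
```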
Selection behavior:
--stability-reps 1: select the best K by validation loss
--stability-reps > 1: select the best K by mean pairwise ARI across held-out-feature assignments
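The stability criterion can be sketched in a few lines. The code below is an illustration of mean pairwise ARI using only the standard library, not the package's implementation; the function names are hypothetical, and the label lists are toy values invented for the example. Picking the K with the highest mean ARI is assumed to be what "best" means here.

```python
from collections import Counter
from itertools import combinations
from math import comb


def adjusted_rand_index(a, b):
    """Adjusted Rand index between two label sequences of equal length."""
    if len(a) != len(b):
        raise ValueError("label sequences must have the same length")
    n = len(a)
    if n < 2:
        return 1.0
    # Pair counts from the contingency table and from each marginal.
    sum_ij = sum(comb(c, 2) for c in Counter(zip(a, b)).values())
    sum_a = sum(comb(c, 2) for c in Counter(a).values())
    sum_b = sum(comb(c, 2) for c in Counter(b).values())
    expected = sum_a * sum_b / comb(n, 2)
    max_index = (sum_a + sum_b) / 2
    if max_index == expected:
        return 1.0  # degenerate case, e.g. both partitions are trivial
    return (sum_ij - expected) / (max_index - expected)


def mean_pairwise_ari(assignments):
    """Average ARI over all pairs of replicate assignments."""
    pairs = list(combinations(assignments, 2))
    if not pairs:
        raise ValueError("need at least two replicates")
    return sum(adjusted_rand_index(x, y) for x, y in pairs) / len(pairs)


# Toy replicate assignments per candidate K (made-up labels).
reps_by_k = {
    2: [[0, 0, 1, 1], [1, 1, 0, 0], [0, 0, 1, 1]],
    3: [[0, 0, 1, 2], [0, 1, 1, 2], [2, 0, 1, 1]],
}
scores = {k: mean_pairwise_ari(reps) for k, reps in reps_by_k.items()}
best_k = max(scores, key=scores.get)  # ASSUMPTION: higher mean ARI wins
print(scores, best_k)
```

ARI is invariant to label permutation, which is why replicates that name the same modules differently still score as identical.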
Outputs:
results/<name>/sweep_k/sweep_results.csv
results/<name>/sweep_k/sweep_summary.json
results/<name>/sweep_k/k<K>/rep_<rep>/...
results/<name>/final_k<K>/... when final retraining is enabled
bsvae-networks
Post-training utilities for trained models.
Subcommands:
extract-networks
extract-modules
export-latents
latent-analysis
extract-networks
Build sparse feature-feature graphs from latent outputs.
bsvae-networks extract-networks \
--model-path results/run \
--dataset data/expression.csv \
--output-dir results/run/networks \
--methods mu_cosine gamma_knn \
--top-k 50
Options:
--methods choices: mu_cosine, gamma_knn
--top-k default 50
--batch-size default 128
--no-cuda
Outputs:
<method>_adjacency.npz
extract-modules
Extract GMM assignments and optional eigengenes or comparison clusters.
bsvae-networks extract-modules \
--model-path results/run \
--dataset data/expression.csv \
--output-dir results/run/modules \
--expr data/expression.csv \
--soft-eigengenes
Options:
--expr expression matrix for eigengene computation
--soft-eigengenes
--use-leiden
--leiden-resolution default 1.0
--tx2gene
--aggregate-to-gene
--batch-size default 128
--no-cuda
Outputs:
gamma.npz
hard_assignments.npz
soft_eigengenes.csv when requested
leiden_modules.csv when requested
gamma_gene.npz and hard_assignments_gene.npz when gene aggregation is requested
export-latents
Export latent arrays for all features.
bsvae-networks export-latents \
--model-path results/run \
--dataset data/expression.csv \
--output results/run/latents
Options:
--batch-size default 128
--no-cuda
Output:
latents.npz or <output>.npz with arrays mu, logvar, gamma, and feature_ids
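The exported archive can be read back with NumPy. The sketch below assumes a standard .npz file containing the four arrays named above; load_latents is a hypothetical helper, and the stand-in file it reads is generated on the spot with invented shapes (4 features, 2 latent dimensions, 3 modules) so the example is self-contained. The real array shapes and dtypes are not specified here.

```python
import os
import tempfile

import numpy as np

REQUIRED_ARRAYS = ("mu", "logvar", "gamma", "feature_ids")


def load_latents(path):
    """Load the exported arrays into a dict, failing early if one is absent."""
    # If feature_ids were stored as an object array, np.load would need
    # allow_pickle=True; this sketch assumes a plain string array.
    with np.load(path) as archive:
        missing = [k for k in REQUIRED_ARRAYS if k not in archive.files]
        if missing:
            raise KeyError(f"missing arrays in {path}: {missing}")
        return {k: archive[k] for k in REQUIRED_ARRAYS}


# Stand-in file with made-up shapes, only to make the sketch runnable.
demo_path = os.path.join(tempfile.mkdtemp(), "latents.npz")
np.savez(
    demo_path,
    mu=np.zeros((4, 2)),
    logvar=np.zeros((4, 2)),
    gamma=np.full((4, 3), 1.0 / 3.0),
    feature_ids=np.array(["f0", "f1", "f2", "f3"]),
)

latents = load_latents(demo_path)
for name, array in latents.items():
    print(name, array.shape)
```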
latent-analysis
Run clustering, embeddings, and optional covariate correlations on latent outputs.
bsvae-networks latent-analysis \
--model-path results/run \
--dataset data/expression.csv \
--output-dir results/run/latent_analysis \
--kmeans-k 10 \
--umap
Options:
--kmeans-k
--gmm-k
--umap
--tsne
--tsne-perplexity default 30.0
--covariates
--batch-size default 128
--no-cuda
Outputs may include:
latent_mu.csv
latent_logvar.csv
latent_clusters.csv
latent_embeddings.csv
latent_covariate_correlations.csv
bsvae-simulate
Simulation and benchmarking utilities.
Subcommands:
generate
benchmark
init-config
generate-grid
generate-scenario
validate-grid
generate
bsvae-simulate generate \
--output data/sim.csv \
--save-ground-truth data/gt.csv
Important options:
--n-features default 500
--n-samples default 200
--n-modules default 10
--within-corr default 0.8
--between-corr default 0.0
--noise-std default 0.2
--seed default 13
benchmark
bsvae-simulate benchmark \
--dataset data/sim.csv \
--ground-truth data/gt.csv \
--model-path results/run \
--output results/run/sim_metrics.json
Outputs JSON metrics including ari, nmi, and n_features.
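A downstream script might read the metrics file and check that the expected keys are present. This is a minimal sketch: load_benchmark_metrics is a hypothetical helper, the file is assumed to be a flat JSON object, and the zero values written to the stand-in file are placeholders, not real results.

```python
import json
import os
import tempfile


def load_benchmark_metrics(path, required=("ari", "nmi", "n_features")):
    """Read a metrics JSON file and verify the documented keys exist."""
    with open(path, encoding="utf-8") as handle:
        metrics = json.load(handle)
    missing = [key for key in required if key not in metrics]
    if missing:
        raise KeyError(f"metrics file {path} lacks keys: {missing}")
    return metrics


# Stand-in file with placeholder values so the sketch runs on its own.
demo_path = os.path.join(tempfile.mkdtemp(), "sim_metrics.json")
with open(demo_path, "w", encoding="utf-8") as handle:
    json.dump({"ari": 0.0, "nmi": 0.0, "n_features": 0}, handle)

metrics = load_benchmark_metrics(demo_path)
print(sorted(metrics))
```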
Scenario-grid commands
bsvae-simulate init-config --output sim.yaml
bsvae-simulate generate-grid --config sim.yaml --outdir results/sim_pub_v1 --reps 30 --base-seed 13
bsvae-simulate generate-scenario --config sim.yaml --scenario-id S001__... --rep 0 --outdir results/sim_pub_v1
bsvae-simulate validate-grid --grid-dir results/sim_pub_v1
Per-replicate outputs are written under:
<outdir>/scenarios/<scenario_id>/rep_<rep>/
Common files:
expr/features_x_samples.tsv.gz
expr/samples_x_features.tsv.gz
truth/modules_hard.csv
method_inputs.json
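The layout above can be turned into a small path helper for scripts that iterate over replicates. This sketch assumes the common files sit directly under the per-replicate directory; replicate_dir and missing_files are hypothetical names, and S000__hypothetical is a made-up scenario id used only for illustration.

```python
from pathlib import Path

# Relative paths taken from the "Common files" list.
COMMON_FILES = (
    "expr/features_x_samples.tsv.gz",
    "expr/samples_x_features.tsv.gz",
    "truth/modules_hard.csv",
    "method_inputs.json",
)


def replicate_dir(outdir, scenario_id, rep):
    """Return <outdir>/scenarios/<scenario_id>/rep_<rep>."""
    return Path(outdir) / "scenarios" / scenario_id / f"rep_{rep}"


def missing_files(rep_dir):
    """List the common files that are not present under rep_dir."""
    return [rel for rel in COMMON_FILES if not (rep_dir / rel).is_file()]


rep_dir = replicate_dir("results/sim_pub_v1", "S000__hypothetical", 0)
print(rep_dir.as_posix())
print(missing_files(rep_dir))  # all four unless the grid was generated
```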