Usage Guide

This page walks through the end-to-end CLI workflow: training, model selection, network and module extraction, latent export and analysis, and simulation.

Training

Direct training:

bsvae-train study1 \
  --dataset data/expression.csv \
  --epochs 120 \
  --n-modules 24 \
  --latent-dim 32

Recommended model-selection flow:

bsvae-sweep-k sweep1 \
  --dataset data/expression.csv \
  --k-grid 8,12,16,24,32 \
  --sweep-epochs 60 \
  --stability-reps 5 \
  --val-frac 0.1

This creates:

  • sweep metrics in results/sweep1/sweep_k/

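  • a final retrained model in results/sweep1/final_k<K>/, where <K> is the value selected from --k-grid

The remaining examples on this page use final_k16 as a placeholder; substitute the directory your sweep produced.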
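A minimal sketch for locating that directory programmatically, assuming the layout described here (exactly one final_k<K> directory directly under the sweep's results directory). The helper name find_final_model_dir is hypothetical and not part of BSVAE.

```python
import re
from pathlib import Path


def find_final_model_dir(sweep_dir):
    """Return (path, K) for the final_k<K> directory under sweep_dir.

    Assumes exactly one directory named final_k<K> sits directly inside
    the sweep's results directory, as described on this page.
    """
    matches = []
    for child in Path(sweep_dir).iterdir():
        m = re.fullmatch(r"final_k(\d+)", child.name)
        if child.is_dir() and m:
            matches.append((child, int(m.group(1))))
    if len(matches) != 1:
        raise RuntimeError(
            f"expected exactly one final_k<K> directory in {sweep_dir}, "
            f"found {len(matches)}"
        )
    return matches[0]
```

For example, find_final_model_dir("results/sweep1") would return the path to pass as --model-path in the extraction commands later on this page.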

Post-Training Outputs

Training directories contain:

  • model.pt

  • specs.json

  • train_losses.csv

  • model-<epoch>.pt when checkpointing is enabled
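A quick sanity check on a training directory can be sketched as follows. It only relies on the file names listed above; the contents of specs.json and the columns of train_losses.csv are not documented on this page, so the sketch reports whatever it finds instead of assuming names. summarize_run is a hypothetical helper.

```python
import csv
import json
from pathlib import Path

# File names taken from the list above; checkpoints (model-<epoch>.pt)
# are optional and therefore not checked here.
EXPECTED_FILES = ("model.pt", "specs.json", "train_losses.csv")


def summarize_run(run_dir):
    """Report missing outputs plus the specs keys and loss-table header."""
    run_dir = Path(run_dir)
    summary = {
        "missing": [n for n in EXPECTED_FILES if not (run_dir / n).is_file()],
        "spec_keys": [],
        "loss_columns": [],
    }
    if "specs.json" not in summary["missing"]:
        with open(run_dir / "specs.json") as fh:
            specs = json.load(fh)
        # Assumption: specs.json holds a JSON object of settings.
        if isinstance(specs, dict):
            summary["spec_keys"] = sorted(specs)
    if "train_losses.csv" not in summary["missing"]:
        with open(run_dir / "train_losses.csv", newline="") as fh:
            # Only the header row is read; no column names are assumed.
            summary["loss_columns"] = next(csv.reader(fh), [])
    return summary
```

An empty "missing" list means all three core outputs are present.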

Sweep directories additionally contain:

  • sweep_results.csv

  • sweep_summary.json

  • per-K replicate subdirectories
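The column names in sweep_results.csv are not listed on this page, so the sketch below takes the grouping column as an argument instead of assuming one. It groups rows by that column, for example to collect the replicate rows that share a K value. group_rows is a hypothetical helper, and the file presumably sits with the other sweep metrics under results/sweep1/sweep_k/, but that location is an assumption.

```python
import csv
from collections import defaultdict


def group_rows(csv_path, column):
    """Group the rows of a CSV file by the value found in `column`."""
    groups = defaultdict(list)
    with open(csv_path, newline="") as fh:
        reader = csv.DictReader(fh)
        if column not in (reader.fieldnames or []):
            raise KeyError(f"{column!r} not in header {reader.fieldnames}")
        for row in reader:
            # Values stay as strings; convert them once the schema is known.
            groups[row[column]].append(row)
    return dict(groups)
```

Inspect the header of your own sweep_results.csv first to find the column that holds K.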

Network Extraction

bsvae-networks extract-networks \
  --model-path results/sweep1/final_k16 \
  --dataset data/expression.csv \
  --output-dir results/sweep1/final_k16/networks \
  --methods mu_cosine gamma_knn

Use mu_cosine for a graph based on similarity between latent means. Use gamma_knn for a graph based on the GMM soft assignments; choose it only when faiss-cpu is installed.
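Whether gamma_knn is usable can be checked before launching the command. The sketch below assumes only that the faiss-cpu distribution is imported under the module name faiss; available_methods is a hypothetical helper, not part of BSVAE.

```python
import importlib.util


def available_methods():
    """Return the --methods values usable in the current environment."""
    methods = ["mu_cosine"]
    # The faiss-cpu distribution is imported under the module name "faiss".
    if importlib.util.find_spec("faiss") is not None:
        methods.append("gamma_knn")
    return methods
```

The result of " ".join(available_methods()) can be placed after --methods in the command above.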

Module Extraction

bsvae-networks extract-modules \
  --model-path results/sweep1/final_k16 \
  --dataset data/expression.csv \
  --output-dir results/sweep1/final_k16/modules \
  --expr data/expression.csv \
  --soft-eigengenes

Outputs:

  • gamma.npz

  • hard_assignments.npz

  • soft_eigengenes.csv when --soft-eigengenes is passed
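The array names stored inside gamma.npz and hard_assignments.npz are not documented here. Because an .npz file is a zip archive of .npy files, a stdlib-only sketch can list those names before anything is loaded. npz_array_names is a hypothetical helper.

```python
import zipfile


def npz_array_names(npz_path):
    """List the array names stored in an .npz archive without loading them."""
    with zipfile.ZipFile(npz_path) as archive:
        return [
            name[: -len(".npy")]
            for name in archive.namelist()
            if name.endswith(".npy")
        ]
```

With NumPy installed, numpy.load(path).files returns the same names, and indexing the loaded object by one of them yields the array.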

Optional extras:

  • --use-leiden to write leiden_modules.csv

  • --aggregate-to-gene --tx2gene to write gene-level assignment files

Latent Export and Analysis

Export:

bsvae-networks export-latents \
  --model-path results/sweep1/final_k16 \
  --dataset data/expression.csv \
  --output results/sweep1/final_k16/latents

Analyze:

bsvae-networks latent-analysis \
  --model-path results/sweep1/final_k16 \
  --dataset data/expression.csv \
  --output-dir results/sweep1/final_k16/latent_analysis \
  --kmeans-k 16 \
  --umap

Simulation Workflow

Generate one synthetic dataset:

bsvae-simulate generate \
  --output data/sim_expr.csv \
  --save-ground-truth data/sim_truth.csv

Create a scenario grid:

bsvae-simulate init-config --output sim.yaml

bsvae-simulate generate-grid \
  --config sim.yaml \
  --outdir results/sim_pub_v1 \
  --reps 30 \
  --base-seed 13

Validate the grid:

bsvae-simulate validate-grid --grid-dir results/sim_pub_v1

Each replicate directory contains input files prepared for BSVAE and for the comparator pipelines.
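The three grid steps above can be scripted so that a failure stops the sequence. The sketch below uses only the subcommands and flags shown on this page; the function names are hypothetical. It assumes the default sim.yaml is acceptable as written, whereas in practice you would likely edit the file between the first and second step.

```python
import subprocess


def simulation_grid_commands(config="sim.yaml", outdir="results/sim_pub_v1",
                             reps=30, base_seed=13):
    """Build the three bsvae-simulate invocations shown above as argv lists."""
    return [
        ["bsvae-simulate", "init-config", "--output", config],
        ["bsvae-simulate", "generate-grid",
         "--config", config,
         "--outdir", outdir,
         "--reps", str(reps),
         "--base-seed", str(base_seed)],
        ["bsvae-simulate", "validate-grid", "--grid-dir", outdir],
    ]


def run_all(commands):
    """Run each command in order; check=True raises on a non-zero exit."""
    for argv in commands:
        subprocess.run(argv, check=True)
```

Calling run_all(simulation_grid_commands()) requires bsvae-simulate to be on PATH.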