# Usage Guide This page describes the current end-to-end workflow for typical CLI users. ## Recommended Workflow 1. Prepare a `features x samples` matrix. 2. Run `bsvae-sweep-k` to choose the number of modules. 3. Use the retrained `final_k` model for downstream analysis. 4. Extract networks, module assignments, and latent outputs. 5. Use `bsvae-simulate` when you need synthetic benchmarking. ## Training Direct training: ```bash bsvae-train study1 \ --dataset data/expression.csv \ --epochs 120 \ --n-modules 24 \ --latent-dim 32 ``` Recommended model-selection flow: ```bash bsvae-sweep-k sweep1 \ --dataset data/expression.csv \ --k-grid 8,12,16,24,32 \ --sweep-epochs 60 \ --stability-reps 5 \ --val-frac 0.1 ``` This creates: - sweep metrics in `results/sweep1/sweep_k/` - a final retrained model in `results/sweep1/final_k/` ## Post-Training Outputs Training directories contain: - `model.pt` - `specs.json` - `train_losses.csv` - `model-.pt` when checkpointing is enabled Sweep directories additionally contain: - `sweep_results.csv` - `sweep_summary.json` - per-K replicate subdirectories ## Network Extraction ```bash bsvae-networks extract-networks \ --model-path results/sweep1/final_k16 \ --dataset data/expression.csv \ --output-dir results/sweep1/final_k16/networks \ --methods mu_cosine gamma_knn ``` Use `mu_cosine` when you want a graph based on latent-mean similarity. Use `gamma_knn` when you want a graph based on GMM soft assignments and have `faiss-cpu` available. ## Module Extraction ```bash bsvae-networks extract-modules \ --model-path results/sweep1/final_k16 \ --dataset data/expression.csv \ --output-dir results/sweep1/final_k16/modules \ --expr data/expression.csv \ --soft-eigengenes ``` Outputs: - `gamma.npz` - `hard_assignments.npz` - `soft_eigengenes.csv` when requested Optional extras: - `--use-leiden` to write `leiden_modules.csv` - `--aggregate-to-gene --tx2gene` to write gene-level assignment files ## Latent Export And Analysis Export: ```bash bsvae-networks export-latents \ --model-path results/sweep1/final_k16 \ --dataset data/expression.csv \ --output results/sweep1/final_k16/latents ``` Analyze: ```bash bsvae-networks latent-analysis \ --model-path results/sweep1/final_k16 \ --dataset data/expression.csv \ --output-dir results/sweep1/final_k16/latent_analysis \ --kmeans-k 16 \ --umap ``` ## Simulation Workflow Generate one synthetic dataset: ```bash bsvae-simulate generate \ --output data/sim_expr.csv \ --save-ground-truth data/sim_truth.csv ``` Create a scenario grid: ```bash bsvae-simulate init-config --output sim.yaml bsvae-simulate generate-grid \ --config sim.yaml \ --outdir results/sim_pub_v1 \ --reps 30 \ --base-seed 13 ``` Validate the grid: ```bash bsvae-simulate validate-grid --grid-dir results/sim_pub_v1 ``` Each replicate contains method-ready files for BSVAE and comparator pipelines.