Measure algorithm performance across CPU, CUDA, TPU, and Metal backends.
```bash
# Quick smoke test (TRG, small size, 1 trial)
python -m benchmarks.run --backend cpu --algorithm trg --size small --trials 1

# Full CPU baseline
python -m benchmarks.run --backend cpu -o benchmarks/results/cpu_baseline.json

# GPU comparison
python -m benchmarks.run --backend cuda -o benchmarks/results/cuda.json

# Show available backends
python -m benchmarks.run --list-backends
```
```bash
python -m benchmarks.run [OPTIONS]
```
| Flag | Description | Values |
|---|---|---|
| `-b, --backend` | Hardware backend | `cpu`, `cuda`, `tpu`, `metal` |
| `-a, --algorithm` | Algorithm(s) to benchmark | `dmrg`, `idmrg`, `trg`, `hotrg`, `ipeps`, `all` |
| `-s, --size` | Problem size(s) | `small`, `medium`, `large`, `all` |
| `-n, --trials` | Number of trials per config | integer (default varies) |
| `-o, --output` | Save results to JSON | file path |
| `--csv` | Save results to CSV | file path |
| `--list-backends` | Show available backends | — |
Each algorithm defines three problem sizes. Larger sizes stress-test bond dimension scaling and hardware throughput.
| Algorithm | Size | Parameters |
|---|---|---|
| DMRG | small | L=10, chi=20, 5 sweeps |
| DMRG | medium | L=20, chi=50, 10 sweeps |
| DMRG | large | L=40, chi=100, 10 sweeps |
| iDMRG | small | chi=16, 50 iterations |
| iDMRG | medium | chi=32, 100 iterations |
| iDMRG | large | chi=64, 200 iterations |
| TRG | small | chi=8, 16 steps |
| TRG | medium | chi=16, 20 steps |
| TRG | large | chi=32, 24 steps |
| HOTRG | small | chi=8, 12 steps |
| HOTRG | medium | chi=16, 16 steps |
| HOTRG | large | chi=32, 20 steps |
| iPEPS | small | D=2, chi=8, 100 SU steps |
| iPEPS | medium | D=3, chi=16, 200 SU steps |
| iPEPS | large | D=4, chi=24, 300 SU steps |
| Backend | Flag | Requirements |
|---|---|---|
| CPU | `cpu` | Default; works everywhere |
| NVIDIA GPU | `cuda` | `pip install tenax-tn[cuda13]` or `tenax-tn[cuda12]` |
| Google Cloud TPU | `tpu` | `pip install tenax-tn[tpu]` |
| Apple Silicon GPU | `metal` | `pip install tenax-tn[metal]` (macOS, experimental) |
Check what’s available on your machine:
```bash
python -m benchmarks.run --list-backends
```
Full structured output with timings, parameters, and device info:
```bash
python -m benchmarks.run -b cpu -a dmrg -s medium -n 3 -o results.json
```
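The saved JSON can then be post-processed in Python. The exact schema is not documented here, so the record fields below (`results`, `algorithm`, `size`, `trial_times_s`) are assumptions; adjust the keys to match your file. A minimal sketch:

```python
import json
import statistics

def mean_times(payload: str) -> dict:
    """Return {'algo/size': mean seconds} from a results JSON string.

    Assumes a hypothetical schema with a top-level "results" list whose
    entries carry "algorithm", "size", and per-trial "trial_times_s".
    """
    data = json.loads(payload)
    return {
        f'{r["algorithm"]}/{r["size"]}': statistics.mean(r["trial_times_s"])
        for r in data["results"]
    }

# Stand-in for the contents of results.json; real numbers will differ.
sample = '{"results": [{"algorithm": "dmrg", "size": "medium", "trial_times_s": [1.92, 1.88, 1.90]}]}'
print(mean_times(sample))
```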
Flat table for analysis in pandas, Excel, or plotting tools:
```bash
python -m benchmarks.run -b cpu -a all -s all --csv results.csv
```
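The flat CSV can be aggregated with nothing but the standard library. The column names used here (`algorithm`, `size`, `trial`, `time_s`) are hypothetical placeholders for whatever header `--csv` actually writes:

```python
import csv
import io
from collections import defaultdict

# Stand-in for results.csv; the real header and values will differ.
sample_csv = """algorithm,size,trial,time_s
trg,small,1,0.41
trg,small,2,0.39
trg,medium,1,2.10
"""

# Group per-trial timings by (algorithm, size).
times = defaultdict(list)
for row in csv.DictReader(io.StringIO(sample_csv)):
    times[(row["algorithm"], row["size"])].append(float(row["time_s"]))

# Report the best (minimum) time per configuration.
for (algo, size), ts in sorted(times.items()):
    print(f"{algo}/{size}: min {min(ts):.2f} s over {len(ts)} trials")
```

The same file loads directly into pandas via `pd.read_csv("results.csv")` if you prefer dataframes.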
Run the same configuration on both backends to compare:

```bash
python -m benchmarks.run -b cpu -a dmrg -s medium -n 3 -o cpu.json
python -m benchmarks.run -b cuda -a dmrg -s medium -n 3 -o gpu.json
```
GPU typically helps when chi >= 64. For smaller bond dimensions, CPU may be faster due to transfer overhead.
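Given the two JSON files, a per-configuration speedup is just the ratio of mean CPU time to mean GPU time. The payloads and field names below are invented for illustration (the real `cpu.json`/`gpu.json` schema may differ):

```python
import json

def mean(xs):
    return sum(xs) / len(xs)

# Stand-ins for cpu.json and gpu.json; real timings will differ.
cpu = json.loads('{"results": [{"algorithm": "dmrg", "size": "medium", "trial_times_s": [3.2, 3.1, 3.3]}]}')
gpu = json.loads('{"results": [{"algorithm": "dmrg", "size": "medium", "trial_times_s": [1.6, 1.5, 1.7]}]}')

# Index mean times by (algorithm, size) so matching configs line up.
cpu_t = {(r["algorithm"], r["size"]): mean(r["trial_times_s"]) for r in cpu["results"]}
gpu_t = {(r["algorithm"], r["size"]): mean(r["trial_times_s"]) for r in gpu["results"]}

for key in sorted(cpu_t.keys() & gpu_t.keys()):
    # Speedup > 1 means the GPU run was faster for that configuration.
    print(f"{key}: {cpu_t[key] / gpu_t[key]:.2f}x speedup")
```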
Fix the backend and vary size to study computational scaling:
```bash
python -m benchmarks.run -b cpu -a dmrg -s small medium large -n 5 --csv dmrg_scaling.csv
```
Use `--trials 5` or more for stable numbers.
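One way to read the resulting data is to fit a power law, time ~ chi^alpha, with a least-squares slope in log-log space. The chi values below come from the DMRG size table above; the timings are invented placeholders standing in for your `dmrg_scaling.csv` measurements:

```python
import math

# size label -> (chi, mean seconds); timings here are made up.
points = {"small": (20, 0.12), "medium": (50, 1.4), "large": (100, 10.5)}

# Least-squares slope of log(time) vs log(chi) estimates the exponent alpha.
xs = [math.log(chi) for chi, _ in points.values()]
ys = [math.log(t) for _, t in points.values()]
n = len(xs)
mx, my = sum(xs) / n, sum(ys) / n
alpha = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum((x - mx) ** 2 for x in xs)
print(f"estimated scaling exponent: {alpha:.2f}")
```

For DMRG you would expect an exponent near 3, since the dominant contractions cost O(chi^3).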