Performance
Plots in this section are generated by `uv run benchmarks performance-plot`.
Speed comparison
We benchmark the wall-clock time for decomposing all 4200 spectra in the GRS test field (424 channels each) using both PHSpectra and GaussPy+ (Riener et al. 2019). Each spectrum is processed individually through the full pipeline of each tool to ensure a fair per-spectrum comparison. Both algorithms are run with their recommended configurations:
- PHSpectra: default parameter settings, with the C-accelerated Levenberg-Marquardt solver and persistence-based peak detection
- GaussPy+: two-phase decomposition with the two trained smoothing parameters α1 and α2 (values from Riener et al. 2019, Sect. 4.1), SNR threshold = 3.0
Results
| Metric | PHSpectra | GaussPy+ | Factor |
|---|---|---|---|
| Total time (4200 spectra) | 746.4 s | 1477.4 s | 2.0× |
| Mean per spectrum | 177.7 ms | 350.5 ms | 2.0× |
| Median per spectrum | 51.6 ms | 40.4 ms | — |
| P95 per spectrum | 705.4 ms | 702.5 ms | — |
| P99 per spectrum | 2976.0 ms | 8407.5 ms | 2.8× |
| Mean components detected | 2.48 | 2.44 | — |
PHSpectra is 2× faster in total wall-clock time than GaussPy+. The median per-spectrum times are comparable (51.6 ms vs 40.4 ms), but the distributions have very different tails: GaussPy+'s P99 reaches 8.4 s compared to PHSpectra's 3.0 s, and these extreme outliers dominate the aggregate timing.

Figure 1. Performance benchmark for PHSpectra (solid) and GaussPy+ (dashed) on all 4200 GRS test-field spectra. Generated by `uv run benchmarks performance-plot`.
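As an illustration of how the summary rows in the table relate to the raw measurements, the sketch below computes the same kind of statistics from a list of per-spectrum timings. The function names are hypothetical, and the percentile convention (linear interpolation between closest ranks) is an assumption; the benchmark may use a different one.

```python
import math
import statistics


def percentile(values, q):
    """Percentile for q in [0, 100], using linear interpolation between ranks."""
    if not values:
        raise ValueError("values must be non-empty")
    ordered = sorted(values)
    rank = (len(ordered) - 1) * q / 100.0
    lo = math.floor(rank)
    hi = math.ceil(rank)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (rank - lo)


def summarize(times_ms):
    """Summary statistics for one tool, from per-spectrum times in milliseconds."""
    return {
        "total_s": sum(times_ms) / 1000.0,
        "mean_ms": statistics.fmean(times_ms),
        "median_ms": statistics.median(times_ms),
        "p95_ms": percentile(times_ms, 95),
        "p99_ms": percentile(times_ms, 99),
    }


def speed_factor(slower, faster):
    """Ratio reported in the Factor column (slower tool divided by faster tool)."""
    return slower / faster
```

With the totals from the table, `speed_factor(1477.4, 746.4)` rounds to 2.0, and `speed_factor(8407.5, 2976.0)` rounds to 2.8 for P99. A single slow spectrum shifts the mean and the upper percentiles far more than the median, which is why the two tools can have similar medians and very different totals.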
Timing characteristics
The two tools have similar median performance but differ sharply in tail behaviour:
- PHSpectra uses a custom C extension for both the bounded Levenberg-Marquardt solver (with analytic Jacobian) and persistence-based peak detection, keeping per-call overhead low. The timing distribution is tighter, with P99 at 3.0 s.
- GaussPy+ has comparable median performance but much higher variance (std dev 2013.8 ms vs 688.9 ms). A small fraction of spectra trigger long optimisation chains, with P99 reaching 8.4 s. These outliers dominate the mean and total time.
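To make "analytic Jacobian" concrete, here is a minimal pure-Python sketch of a multi-Gaussian model and its closed-form partial derivatives. It is not PHSpectra's C implementation. The parameterisation (amplitude, mean, standard deviation per component) and the function names are assumptions chosen for illustration.

```python
import math


def gaussian_sum(x, params):
    """Sum of Gaussians at x; params is a flat list [amp0, mean0, sigma0, amp1, ...]."""
    total = 0.0
    for k in range(0, len(params), 3):
        amp, mean, sigma = params[k:k + 3]
        z = (x - mean) / sigma
        total += amp * math.exp(-0.5 * z * z)
    return total


def gaussian_sum_jacobian(x, params):
    """Partial derivatives of gaussian_sum at x, in the same order as params."""
    row = []
    for k in range(0, len(params), 3):
        amp, mean, sigma = params[k:k + 3]
        z = (x - mean) / sigma
        e = math.exp(-0.5 * z * z)
        row.append(e)                          # derivative w.r.t. amplitude
        row.append(amp * e * z / sigma)        # derivative w.r.t. mean
        row.append(amp * e * z * z / sigma)    # derivative w.r.t. sigma
    return row
```

Supplying these derivatives directly spares a least-squares solver the extra model evaluations that a finite-difference Jacobian needs (at least one per parameter per iteration).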
Algorithmic differences
- No smoothing sweep. GaussPy+ convolves the spectrum with a family of Gaussian kernels, computing derivatives at every scale. The two-phase decomposition repeats this process twice (once per smoothing parameter). PHSpectra skips smoothing entirely: it operates directly on the raw spectrum using persistence-based peak detection, whose cost scales only with the number of channels.
- No training required. GaussPy+'s smoothing parameters must be trained per survey (or per survey region), which adds a separate computational cost not reflected in the per-spectrum timing. PHSpectra's parameter requires no training; the default value works across surveys.
- Tail behaviour. PHSpectra's execution time is more predictable (P99 3.0 s vs 8.4 s). GaussPy+'s derivative-based approach can trigger costly iterative refinement on complex spectra, leading to extreme outliers that dominate aggregate timing.
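For readers unfamiliar with persistence-based peak detection, the following is a generic textbook sketch of 0-dimensional persistence on a 1D signal, using a sort and a union-find structure. It is not PHSpectra's C implementation. The function name and the `min_persistence` threshold are hypothetical.

```python
import math


def persistence_peaks(signal, min_persistence=0.0):
    """Return (peak_index, persistence) pairs, most persistent first."""
    n = len(signal)
    if n == 0:
        return []
    # Visit channels from highest to lowest value; the sort dominates the cost.
    order = sorted(range(n), key=lambda i: signal[i], reverse=True)
    parent = [-1] * n   # -1 marks a channel that has not been visited yet
    peak = [-1] * n     # for each root: index of its component's highest channel

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    peaks = []
    for i in order:
        parent[i] = i
        peak[i] = i
        for j in (i - 1, i + 1):
            if j < 0 or j >= n or parent[j] == -1:
                continue
            a, b = find(i), find(j)
            if a == b:
                continue
            # Keep `a` as the component with the higher peak; `b` dies here.
            if signal[peak[a]] < signal[peak[b]]:
                a, b = b, a
            pers = signal[peak[b]] - signal[i]
            if pers > 0 and pers >= min_persistence:
                peaks.append((peak[b], pers))
            parent[b] = a
    # The component containing the global maximum never merges into a higher one.
    peaks.append((peak[find(order[0])], math.inf))
    peaks.sort(key=lambda p: p[1], reverse=True)
    return peaks
```

Each peak's persistence is its height above the level at which it merges into a taller neighbour, so noise bumps get small values and can be filtered with a single threshold, without smoothing the spectrum first.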
Benchmark details
- Hardware: single-core sequential processing for both tools (no parallelization)
- PHSpectra: native Python 3.14, C extension for LM solver and peak detection
- GaussPy+: Python 3.10 in Docker (required for compatibility with legacy numpy/scipy); each spectrum is processed individually through the full `GaussPyDecompose` pipeline (init, decompose, improve_fitting, save)
- Spectra: all 4200 GRS test-field pixels, 424 velocity channels each
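The sequential, per-spectrum measurement described above can be sketched as follows. This is a minimal illustration, not the benchmark's actual harness: `decompose` is a hypothetical callable standing in for either tool's full per-spectrum pipeline.

```python
import time


def time_per_spectrum(decompose, spectra):
    """Run decompose(spectrum) one spectrum at a time; return wall-clock times in ms."""
    times_ms = []
    for spectrum in spectra:
        start = time.perf_counter()
        decompose(spectrum)  # hypothetical: one tool's full pipeline for one spectrum
        times_ms.append((time.perf_counter() - start) * 1000.0)
    return times_ms
```

Timing each call separately, rather than the whole loop, is what makes per-spectrum medians and percentiles available in addition to the total.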