
Performance

Plots in this section are generated by

uv run benchmarks performance-plot

Speed comparison

We benchmark the wall-clock time for decomposing all 4200 spectra in the GRS test field (424 channels each) using both PHSpectra and GaussPy+ (Riener et al. 2019). Each spectrum is processed individually through the full pipeline of each tool to ensure a fair per-spectrum comparison. Both algorithms are run with their recommended configurations:

  • PHSpectra: β = 3.5 (default), C-accelerated Levenberg-Marquardt solver and persistence peak detection
  • GaussPy+: two-phase decomposition with α₁ = 2.89, α₂ = 6.65 (trained values from Riener et al. 2019, Sect. 4.1), SNR threshold = 3.0

Results

| Metric | PHSpectra | GaussPy+ | Factor |
|---|---|---|---|
| Total time (4200 spectra) | 746.4 s | 1477.4 s | 2.0× |
| Mean per spectrum | 177.7 ms | 350.5 ms | 2.0× |
| Median per spectrum | 51.6 ms | 40.4 ms | |
| P95 per spectrum | 705.4 ms | 702.5 ms | |
| P99 per spectrum | 2976.0 ms | 8407.5 ms | 2.8× |
| Mean components detected | 2.48 | 2.44 | |

PHSpectra is 2× faster in total wall-clock time than GaussPy+. The median per-spectrum times are comparable (51.6 ms vs 40.4 ms), but the distributions have very different tails: GaussPy+'s P99 reaches 8.4 s compared to PHSpectra's 3.0 s, and these extreme outliers dominate the aggregate timing.
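To illustrate how the metrics in the table relate to one another, here is a minimal sketch that computes them from a list of per-spectrum timings. The function names (`percentile`, `summarize`) and the synthetic timings are assumptions made for this example; they are not part of the benchmark code, and the benchmark may use a different percentile interpolation.

```python
import statistics


def percentile(sorted_values, q):
    """Linearly interpolated percentile (q in [0, 100]) of a pre-sorted list."""
    if not sorted_values:
        raise ValueError("percentile() needs at least one value")
    pos = (len(sorted_values) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(sorted_values) - 1)
    frac = pos - lo
    return sorted_values[lo] * (1 - frac) + sorted_values[hi] * frac


def summarize(times_ms):
    """Aggregate per-spectrum timings (milliseconds) into the table's metrics."""
    ordered = sorted(times_ms)
    return {
        "total_s": sum(ordered) / 1000.0,
        "mean_ms": statistics.fmean(ordered),
        "median_ms": statistics.median(ordered),
        "p95_ms": percentile(ordered, 95),
        "p99_ms": percentile(ordered, 99),
    }


# Synthetic timings (not benchmark data): 99 fast spectra and one slow outlier.
demo = [50.0] * 99 + [5000.0]
stats = summarize(demo)
```

In this synthetic case the single outlier leaves the median at 50 ms but roughly doubles the mean, which is the same mechanism by which a few slow spectra dominate the totals reported above.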


Figure 1. Performance benchmark for PHSpectra (solid) and GaussPy+ (dashed) on all 4200 GRS test-field spectra. Generated by uv run benchmarks performance-plot.

Timing characteristics

The two tools have similar median performance but differ sharply in tail behaviour:

  • PHSpectra uses a custom C extension for both the bounded Levenberg-Marquardt solver (with analytic Jacobian) and persistence-based peak detection, keeping per-call overhead low. The timing distribution is tighter, with P99 at 3.0 s.

  • GaussPy+ has comparable median performance but much higher variance (std dev 2013.8 ms vs 688.9 ms). A small fraction of spectra trigger long optimisation chains, with P99 reaching 8.4 s. These outliers dominate the mean and total time.

Algorithmic differences

  1. No smoothing sweep. GaussPy+ convolves the spectrum with a family of Gaussian kernels at each α scale, computing derivatives at every scale. The two-phase decomposition runs this process twice (once per α). PHSpectra skips smoothing entirely: it operates directly on the raw spectrum using persistence-based peak detection, which is O(n) in the number of channels.

  2. No training required. GaussPy+'s α parameters must be trained per survey (or per survey region), which adds a separate computational cost not reflected in the per-spectrum timing. PHSpectra's β parameter requires no training; the default value works across surveys.

  3. Tail behaviour. PHSpectra's execution time is more predictable (P99 3.0 s vs 8.4 s). GaussPy+'s derivative-based approach can trigger costly iterative refinement on complex spectra, leading to extreme outliers that dominate aggregate timing.
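As a rough illustration of what persistence-based peak detection on a raw spectrum involves, the sketch below implements the textbook 0-dimensional persistence of a 1D signal with a union-find structure. It is not PHSpectra's C implementation: the function name `persistent_peaks` and the `min_persistence` threshold are hypothetical, and this simple version sorts the channels first, so it runs in O(n log n) rather than linear time.

```python
def persistent_peaks(signal, min_persistence):
    """Return (index, persistence) for local maxima whose persistence
    is at least min_persistence, ordered by channel index."""
    n = len(signal)
    if n == 0:
        return []

    parent = [-1] * n   # -1 marks channels that have not been visited yet
    peak_of = {}        # component root -> index of that component's peak
    found = []

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Visit channels from the highest value down to the lowest.
    order = sorted(range(n), key=lambda i: signal[i], reverse=True)
    for i in order:
        parent[i] = i
        peak_of[i] = i
        for j in (i - 1, i + 1):
            if 0 <= j < n and parent[j] != -1:
                a, b = find(i), find(j)
                if a == b:
                    continue
                # The component with the lower peak dies at this level.
                if signal[peak_of[a]] < signal[peak_of[b]]:
                    a, b = b, a
                persistence = signal[peak_of[b]] - signal[i]
                if persistence > 0 and persistence >= min_persistence:
                    found.append((peak_of[b], persistence))
                parent[b] = a
                del peak_of[b]

    # The global maximum never merges into a higher peak; measure it
    # against the global minimum instead.
    top, bottom = order[0], order[-1]
    persistence = signal[top] - signal[bottom]
    if persistence >= min_persistence:
        found.append((top, persistence))
    return sorted(found)


toy = [0, 1, 0, 3, 1, 2, 0]
all_peaks = persistent_peaks(toy, 1)      # [(1, 1), (3, 3), (5, 1)]
strong_peaks = persistent_peaks(toy, 2)   # [(3, 3)]
```

No smoothing or derivative computation is involved: each peak is ranked by how far the signal must drop before it merges with a taller neighbour, and raising the threshold discards the shallow peaks.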

Benchmark details

  • Hardware: single-core sequential processing for both tools (no parallelization)
  • PHSpectra: native Python 3.14, C extension for LM solver and peak detection
  • GaussPy+: Python 3.10 in Docker (required for compatibility with legacy numpy/scipy); each spectrum is processed individually through the full GaussPyDecompose pipeline (init, decompose, improve_fitting, save)
  • Spectra: all 4200 GRS test-field pixels, 424 velocity channels each
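The single-core, one-spectrum-at-a-time measurement described above can be sketched as a small timing loop. This is an assumed structure, not the benchmark's actual code: `time_each` and `fake_decompose` are hypothetical names, and the stand-in workload only takes the place of a real decomposition call.

```python
import time


def time_each(process, spectra):
    """Run process() on each spectrum sequentially and return the
    per-spectrum wall-clock times in milliseconds."""
    times_ms = []
    for spectrum in spectra:
        start = time.perf_counter()
        process(spectrum)
        times_ms.append((time.perf_counter() - start) * 1000.0)
    return times_ms


def fake_decompose(spectrum):
    # Hypothetical stand-in for a per-spectrum decomposition call.
    return sum(spectrum)


# Ten synthetic spectra with 424 channels each (placeholder data).
spectra = [[0.0] * 424 for _ in range(10)]
timings = time_each(fake_decompose, spectra)
```

Timing each call separately, rather than the whole batch, is what makes per-spectrum statistics such as the median and P99 available alongside the total.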