Magnitude-Squared Coherence
- Magnitude-squared coherence is a frequency-dependent measure quantifying the linear correlation between two zero-mean time series, with values ranging from 0 to 1.
- Various estimation methods, including Welch and multitaper techniques, balance trade-offs between spectral resolution, bias, and variance in MSC analysis.
- MSC is widely applied in fields like geophysics and astrophysics to analyze oscillatory phenomena and validate hypotheses through statistical significance tests.
Magnitude-squared coherence (MSC) is a statistical measure that quantifies the strength of linear correlation between two zero-mean time series as a function of frequency. It is used across geophysics, heliophysics, and astrophysics, wherever spectral cross-relations between physical time series are of scientific interest. MSC is the squared modulus of the normalized cross-spectral density and is bounded between 0 and 1. Values near unity indicate that, at a given frequency, one process can be linearly predicted from the other with a fixed phase shift. The behavior, estimation, and interpretation of MSC depend strongly on the chosen estimation methodology, the sampling properties of the data, and the structure of the data under investigation.
1. Definition and Fundamental Properties
Given two zero-mean time series $x(t)$ and $y(t)$, the population MSC at frequency $f$ is defined by:
$$C_{xy}(f) = \frac{|S_{xy}(f)|^2}{S_{xx}(f)\, S_{yy}(f)}$$
where:
- $S_{xx}(f)$ and $S_{yy}(f)$ are the power spectral densities (PSDs) of $x$ and $y$, respectively,
- $S_{xy}(f)$ is the cross-spectral density between $x$ and $y$, typically obtained by averaging products $X^*(f)\, Y(f)$ over independent realizations or segments,
- $X(f)$ and $Y(f)$ denote the Fourier transforms of $x$ and $y$.
By construction, $0 \le C_{xy}(f) \le 1$. A value close to unity denotes strong linear phase-consistent correlation at frequency $f$; a value near zero denotes the absence of such a relationship.
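As a worked illustration of why values near unity imply linear predictability, consider the textbook linear-plus-noise model (an assumption introduced here for illustration, not taken from the cited works): $y$ is a linearly filtered copy of $x$ with transfer function $H(f)$, plus noise $n(t)$ that is uncorrelated with $x$. Writing $S_{xx}$ and $S_{nn}$ for the PSDs of $x$ and $n$, and $C_{xy}$ for the MSC, the model gives $|S_{xy}| = |H|\, S_{xx}$ and $S_{yy} = |H|^2 S_{xx} + S_{nn}$, so that

$$C_{xy}(f) = \frac{|H(f)|^2\, S_{xx}(f)^2}{S_{xx}(f)\left[\,|H(f)|^2 S_{xx}(f) + S_{nn}(f)\right]} = \frac{\mathrm{SNR}(f)}{1 + \mathrm{SNR}(f)}, \qquad \mathrm{SNR}(f) = \frac{|H(f)|^2\, S_{xx}(f)}{S_{nn}(f)}$$

Under this model the MSC tends to 1 at frequencies where the filtered signal dominates the noise and to 0 where the noise dominates, regardless of the phase of $H(f)$.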
2. MSC Estimation Methodologies
Several classes of estimators are in prevalent use. The choice of estimator is dictated by the stationarity, regularity, and sampling gaps in the data, as well as the statistical requirements of the analysis.
2.1 Classical and Welch Segment Averaging
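The fundamental approach involves dividing the time series into (possibly overlapping) segments and tapering each segment (e.g., with Kaiser or Blackman-Harris windows), followed by Fourier transformation to compute auto- and cross-spectra (Holm, 2015, Dodson-Robinson et al., 2022). For each segment $k = 1, \dots, K$, with tapered Fourier transforms $X_k(f)$ and $Y_k(f)$, one computes:
$$\hat{S}_{xy}^{(k)}(f) = X_k^*(f)\, Y_k(f)$$
The final estimates are averaged across segments (possibly with segment length normalization):
$$\hat{S}_{xy}(f) = \frac{1}{K} \sum_{k=1}^{K} \hat{S}_{xy}^{(k)}(f)$$
and analogously for the autospectra $\hat{S}_{xx}(f)$ and $\hat{S}_{yy}(f)$, yielding the pooled estimator:
$$\hat{C}_{xy}(f) = \frac{|\hat{S}_{xy}(f)|^2}{\hat{S}_{xx}(f)\, \hat{S}_{yy}(f)}$$
Welch’s method typically employs 50% overlap and strong tapering to reduce spectral leakage (Dodson-Robinson et al., 2022).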
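The segment-averaging recipe can be sketched with SciPy's `welch` and `csd` routines. This is a minimal illustration rather than the cited authors' implementation: the function name `welch_msc`, the Hann window, the segment length, and the synthetic test signal are all assumptions chosen for the example.

```python
import numpy as np
from scipy import signal


def welch_msc(x, y, fs=1.0, nperseg=256, window="hann"):
    """Estimate MSC by averaging tapered segment spectra (Welch-style)."""
    kwargs = dict(fs=fs, window=window, nperseg=nperseg,
                  noverlap=nperseg // 2)   # 50% overlap between segments
    f, sxx = signal.welch(x, **kwargs)     # averaged autospectrum of x
    _, syy = signal.welch(y, **kwargs)     # averaged autospectrum of y
    _, sxy = signal.csd(x, y, **kwargs)    # averaged cross-spectrum
    return f, np.abs(sxy) ** 2 / (sxx * syy)


# Synthetic test case (illustrative values): a shared sinusoid at f0 = 0.1
# cycles per sample, delayed by 3 samples in y, plus independent white noise.
rng = np.random.default_rng(0)
n, f0 = 4096, 0.1
t = np.arange(n)
x = np.sin(2 * np.pi * f0 * t) + rng.standard_normal(n)
y = np.sin(2 * np.pi * f0 * (t - 3)) + rng.standard_normal(n)

f, msc = welch_msc(x, y)
peak = msc[np.argmin(np.abs(f - f0))]
print(f"MSC near f0: {peak:.2f}; median MSC: {np.median(msc):.2f}")
```

The fixed delay in `y` changes only the phase of the cross-spectrum, so the MSC near `f0` stays high while the noise-dominated frequencies stay low. `scipy.signal.coherence` wraps the same computation in a single call.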
2.2 Multitaper Estimation
The multitaper approach uses orthogonal tapers (specifically, discrete prolate spheroidal sequences, or DPSS) to form multiple spectral estimates from a single segment, controlling the trade-off between spectral resolution (bandwidth), variance, and estimator bias. For gapless data, $K \approx 2NW - 1$ independent tapers with spectral concentration $\lambda_k \approx 1$ are employed, where $NW$ is the time-bandwidth product (Dodson-Robinson et al., 9 May 2025).
For gapped data sampled on an underlying uniform grid (common for heliospheric/solar data), missing-data Slepian sequences (MDSS) are computed via a reduced Slepian eigenproblem restricted to observed samples. Multitaper eigencoefficients are computed using the non-uniform FFT (NUFFT):
$$x_k(f) = \sum_{n} v_n^{(k)}\, x(t_n)\, e^{-2\pi i f t_n}, \qquad k = 1, \dots, K$$
where $v_n^{(k)}$ is the $k$-th taper evaluated at the observed time $t_n$, and analogously for $y_k(f)$. These are combined as:
$$\hat{C}_{xy}(f) = \frac{\left|\sum_{k=1}^{K} x_k^*(f)\, y_k(f)\right|^2}{\sum_{k=1}^{K} |x_k(f)|^2 \; \sum_{k=1}^{K} |y_k(f)|^2}$$
This method optimally balances bias, variance, and spectral resolution, and is robust to missing data (Dodson-Robinson et al., 9 May 2025).
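For the gapless case, a multitaper MSC estimate can be sketched with SciPy's `dpss` window generator. This sketch does not implement missing-data Slepian sequences or the NUFFT step; the function name `multitaper_msc`, the choice `nw=4`, the rule of keeping `2*nw - 1` tapers, and the test signal are assumptions made for illustration.

```python
import numpy as np
from scipy.signal.windows import dpss


def multitaper_msc(x, y, nw=4.0, fs=1.0):
    """Multitaper MSC for gapless, uniformly sampled series (DPSS tapers)."""
    x = np.asarray(x, dtype=float) - np.mean(x)
    y = np.asarray(y, dtype=float) - np.mean(y)
    n = x.size
    k = int(2 * nw) - 1                    # keep the well-concentrated tapers
    tapers = dpss(n, nw, Kmax=k)           # array of shape (k, n)
    xk = np.fft.rfft(tapers * x, axis=1)   # eigencoefficients of x
    yk = np.fft.rfft(tapers * y, axis=1)   # eigencoefficients of y
    sxy = np.sum(np.conj(xk) * yk, axis=0)
    sxx = np.sum(np.abs(xk) ** 2, axis=0)
    syy = np.sum(np.abs(yk) ** 2, axis=0)
    f = np.fft.rfftfreq(n, d=1.0 / fs)
    return f, np.abs(sxy) ** 2 / (sxx * syy)


# Illustrative data: a common sinusoid with a fixed phase offset in y,
# buried in independent white noise.
rng = np.random.default_rng(1)
n, f0 = 2048, 0.1
t = np.arange(n)
x = 2.0 * np.sin(2 * np.pi * f0 * t) + rng.standard_normal(n)
y = 2.0 * np.sin(2 * np.pi * f0 * t + 0.7) + rng.standard_normal(n)

f, msc = multitaper_msc(x, y)
peak = msc[np.argmin(np.abs(f - f0))]
print(f"MSC near f0: {peak:.2f}; median MSC: {np.median(msc):.2f}")
```

Because all tapers are applied to the full record, the averaging happens across tapers rather than across time segments, which is what lets the method keep the full-length frequency grid.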
2.3 Wavelet-Based and MEM Approaches
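Wavelet coherence uses the continuous wavelet transform (usually with a complex Morlet wavelet) to form time–frequency localized spectral estimates. Smoothing in both time and scale yields:
$$R^2(t, s) = \frac{\left|\langle W_{xy}(t, s) \rangle\right|^2}{\langle |W_x(t, s)|^2 \rangle \, \langle |W_y(t, s)|^2 \rangle}$$
where $W_{xy}$ is the cross-wavelet spectrum formed from the wavelet transforms $W_x$ and $W_y$ at time $t$ and scale $s$, and brackets denote localized smoothing (Holm, 2015).
The maximum entropy method (MEM) fits an autoregressive model in each window, sharpening spectral lines but assuming (local) stationarity (Holm, 2013).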
3. Statistical Inference and Bias–Variance Trade-Off
Correct statistical inference for MSC demands rigorous treatment of bias, variance, and significance—especially under the null hypothesis of no true coherence and in the presence of serial correlation or strong periodicity.
3.1 Degrees of Freedom
Segmenting and/or tapering produces $K$ nominal spectra, but overlapping windows and correlated tapers can lower the effective degrees of freedom, $\nu_{\mathrm{eff}}$, often estimated from the window autocorrelation or spectral concentration (for tapers) (Holm, 2015, Dodson-Robinson et al., 2022).
3.2 Independence Thresholds
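Under the null hypothesis $C_{xy}(f) = 0$, sample MSC estimates are biased upwards (they follow a Beta distribution in the multitaper case). Analytical “independence thresholds” (IT) at confidence level $p$, such as
$$\mathrm{IT}_p = 1 - (1 - p)^{1/(K - 1)}$$
for $K$ independent averages, are used to separate meaningful coherence from fluctuations due to limited independent averages (Holm, 2013).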
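One widely used closed form for such a threshold, valid when the $K$ averages are independent and the data are Gaussian, is $1 - (1 - p)^{1/(K - 1)}$. The sketch below evaluates it and checks it against simulated independent white noise; the function name `independence_threshold` and the simulation settings are assumptions for illustration, and overlapping or tapered segments would call for an effective $K$ instead.

```python
import numpy as np


def independence_threshold(k, p=0.95):
    """MSC level exceeded with probability 1 - p under the no-coherence null,
    assuming k independent averages (segments or tapers)."""
    if k < 2:
        raise ValueError("need at least two independent averages")
    return 1.0 - (1.0 - p) ** (1.0 / (k - 1))


# Monte Carlo sanity check with independent white noise and k non-overlapping,
# untapered segments, a setting where the averages are truly independent.
rng = np.random.default_rng(2)
k, nseg = 8, 4096
# Drop the DC and Nyquist bins, which are real-valued rather than complex.
xk = np.fft.rfft(rng.standard_normal((k, nseg)), axis=1)[:, 1:-1]
yk = np.fft.rfft(rng.standard_normal((k, nseg)), axis=1)[:, 1:-1]
msc = (np.abs(np.sum(np.conj(xk) * yk, axis=0)) ** 2
       / (np.sum(np.abs(xk) ** 2, axis=0) * np.sum(np.abs(yk) ** 2, axis=0)))

threshold = independence_threshold(k)
false_alarm_rate = np.mean(msc > threshold)
print(f"IT = {threshold:.3f}, empirical false-alarm rate = {false_alarm_rate:.3f}")
```

With only eight averages the 95% threshold is roughly 0.35, which shows how large an MSC value can arise by chance between unrelated series.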
3.3 Monte Carlo Significance for Periodic or Serial Data
Standard white- or red-noise nulls are inappropriate for data with strong periodicity. Non-parametric random phase surrogates preserve each series’ power spectrum while scrambling the phase, generating the correct null distribution for coherence (Holm, 2015):
- Randomize Fourier phases within each series.
- Invert to time domain to generate surrogate series.
- Recompute MSC over many surrogate realizations.
- Take the empirical 95th (or other chosen) percentile of the surrogate MSC values as the significance threshold; any observed MSC above it is significant at the corresponding level.
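The surrogate procedure listed above can be sketched as follows. This is a minimal illustration under stated assumptions: the helper names `random_phase_surrogate` and `surrogate_threshold`, the default of 200 surrogates, the per-frequency percentile, and the use of `scipy.signal.coherence` as the MSC estimator are all choices made for the example, not details taken from the cited works.

```python
import numpy as np
from scipy import signal


def random_phase_surrogate(x, rng):
    """Return a series with the same amplitude spectrum as x but random phases."""
    x = np.asarray(x, dtype=float)
    spec = np.fft.rfft(x)
    phases = rng.uniform(0.0, 2.0 * np.pi, spec.size)
    surr = np.abs(spec) * np.exp(1j * phases)
    surr[0] = spec[0]                  # keep the (real) mean term unchanged
    if x.size % 2 == 0:
        surr[-1] = spec[-1]            # Nyquist term must stay real
    return np.fft.irfft(surr, n=x.size)


def surrogate_threshold(x, y, msc_func, n_surr=200, percentile=95.0, seed=0):
    """Per-frequency percentile of MSC over random-phase surrogate pairs."""
    rng = np.random.default_rng(seed)
    null = np.array([msc_func(random_phase_surrogate(x, rng),
                              random_phase_surrogate(y, rng))
                     for _ in range(n_surr)])
    return np.percentile(null, percentile, axis=0)


def welch_msc(a, b):
    """MSC estimator under test (Welch averaging, illustrative settings)."""
    return signal.coherence(a, b, nperseg=128)[1]


# Illustrative data: two independent noise series, so few bins should pass.
rng = np.random.default_rng(3)
x = rng.standard_normal(1024)
y = rng.standard_normal(1024)

threshold = surrogate_threshold(x, y, welch_msc)
observed = welch_msc(x, y)
significant = observed > threshold
print(f"fraction of bins above the 95% surrogate threshold: {significant.mean():.2f}")
```

The same estimator must be used for the observed data and for the surrogates, so that the threshold inherits the estimator's bias and its effective degrees of freedom.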
3.4 Bias/Variance Correction
Small-sample bias can be managed using the Fisher atanh variance-stabilizing transform and jackknife techniques:
$$z(f) = \tanh^{-1}\!\left(\sqrt{\hat{C}_{xy}(f)}\right)$$
Jackknife leave-one-out estimates provide confidence intervals on $z(f)$, which are then mapped back to MSC space (Dodson-Robinson et al., 9 May 2025, Dodson-Robinson et al., 2022).
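A leave-one-out jackknife in the Fisher-transformed domain might look like the sketch below. It is one common recipe, not necessarily the one used in the cited works: the function name `jackknife_msc_ci`, the input layout (per-segment or per-taper Fourier coefficients), and the use of a Student-t critical value are assumptions for illustration.

```python
import numpy as np
from scipy import stats


def jackknife_msc_ci(xk, yk, level=0.95):
    """Leave-one-out jackknife interval for MSC via the Fisher atanh transform.

    xk, yk: complex arrays of shape (k, n_freq) holding per-segment or
    per-taper Fourier coefficients of the two series.
    """
    k = xk.shape[0]
    sxy, sxx, syy = np.conj(xk) * yk, np.abs(xk) ** 2, np.abs(yk) ** 2

    def fisher_z(cross, auto_x, auto_y):
        coh = np.abs(cross) / np.sqrt(auto_x * auto_y)   # coherence magnitude
        return np.arctanh(np.clip(coh, 0.0, 1.0 - 1e-12))

    z_all = fisher_z(sxy.sum(0), sxx.sum(0), syy.sum(0))
    # Row i of z_loo omits the i-th segment/taper from every sum.
    z_loo = fisher_z(sxy.sum(0) - sxy, sxx.sum(0) - sxx, syy.sum(0) - syy)
    se = np.sqrt((k - 1) / k * np.sum((z_loo - z_loo.mean(0)) ** 2, axis=0))
    half = stats.t.ppf(0.5 + level / 2.0, df=k - 1) * se
    lower = np.tanh(np.clip(z_all - half, 0.0, None)) ** 2   # back to MSC space
    upper = np.tanh(z_all + half) ** 2
    return np.tanh(z_all) ** 2, lower, upper


def complex_noise(rng, shape):
    """Complex Gaussian noise standing in for Fourier coefficients."""
    return rng.standard_normal(shape) + 1j * rng.standard_normal(shape)


# Illustrative coefficients: y shares x's signal plus weaker independent noise.
rng = np.random.default_rng(5)
k, n_freq = 8, 50
xk = complex_noise(rng, (k, n_freq))
yk = xk + 0.5 * complex_noise(rng, (k, n_freq))

msc, lower, upper = jackknife_msc_ci(xk, yk)
print(f"mean MSC = {msc.mean():.2f}, mean interval = [{lower.mean():.2f}, {upper.mean():.2f}]")
```

Working in the transformed variable keeps the interval inside the valid range once it is mapped back, which a symmetric interval built directly on MSC would not guarantee.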
4. Practical Implementation: Algorithmic Steps
| Estimator | Main Steps | Limitations/Notes |
|---|---|---|
| Welch | Segment → Window → FFT → Avg. periodograms & cross-spectra → Form MSC | Segment length and taper affect bias/resolution; not optimal for gapped/correlated series (Dodson-Robinson et al., 2022, Holm, 2013) |
| Multitaper | Compute (missing-data) Slepian tapers → Eigencoefficients (NUFFT if gapped) → Average | Discards low-concentration tapers; optimal bias/variance; needs specialized linear algebra (Dodson-Robinson et al., 9 May 2025) |
| Wavelet | Morlet wavelet transform → Cross-wavelet → Time/scale smoothing → MSC map | Edge effects at low frequency; time–frequency tradeoff (Holm, 2015) |
Implementation must account for:
- Segment/taper selection (bandwidth–variance trade-off).
- Gaps in the data: zero-insertion or interpolation is discouraged except when justified.
- Sidelobe leakage: window/taper selection is critical.
- Correct computation of independent degrees of freedom (overlaps/taper autocorrelation).
5. Applications and Case Studies
5.1 Geophysical and Climate Analysis
In solar–planetary–climate correlation studies, both wavelet and periodogram methods revealed substantial MSC peaks (e.g., at 15–22 and 50–60 year bands), but Monte Carlo testing with random-phase surrogates showed that these peaks lacked statistical significance. This undermined claims of planetary influence on climate variability via coherent oscillations, especially in the absence of a physical mechanism (Holm, 2015, Holm, 2013).
5.2 Solar and Heliospheric Physics
Multitaper MSC for missing-data time series (solar Lyman-α flux, geomagnetic Dst index) robustly recovered known mid-term solar–geomagnetic oscillations at periods of 50 days, 1.7 years, and 5 years, all exceeding the 99% significance threshold with jackknife confidence bands, validating the technique’s power and reliability for oscillatory solar/geomagnetic processes (Dodson-Robinson et al., 9 May 2025).
5.3 Stellar Activity and Exoplanet Detection
When radial-velocity (RV) measurements are compared against stellar activity indicators, segment-averaged, tapered MSC exposes oscillations (stellar rotation and its harmonics) that mimic planetary signals. Welch's method, combined with bias correction and the Fisher transform, enabled a robust separation of stellar and planetary signatures in challenging datasets (Dodson-Robinson et al., 2022).
6. Practical Guidelines and Methodological Caveats
- Detrend input series with care to avoid augmenting low-frequency spectral content; physical interpretation is not robust under arbitrary detrending (Holm, 2015).
- When using analytical thresholds, choose the segment length or taper bandwidth so that the number of effective degrees of freedom is adequate; otherwise, use Monte Carlo significance (Holm, 2013).
- For strongly periodic or serially correlated data, random-phase Monte Carlo surrogates provide the only suitable significance baseline (Holm, 2015).
- Avoid window/taper choices that produce high sidelobes or phase/duty-cycle artifacts; always inspect the effective spectral window (Dodson-Robinson et al., 2022).
- Report MSC with bias correction and error bars (Fisher/jackknife or MC percentile); raw MSC values tend to overestimate the strength of connection in small samples or strongly periodic data.
7. Limitations and Future Directions
- Stationarity assumptions: Strict stationarity is often violated over long geophysical time series. Local windowing (e.g., in MEM or wavelet) partially mitigates this, but interpretation remains challenging (Holm, 2013).
- Small sample bias: When few independent averages are available, the statistical robustness of coherence peaks is questionable; high MSC values can arise from chance alone.
- Gapped data: Multitaper with missing-data Slepian sequences is the preferred approach for uniformly sampled but incomplete series, superior to interpolation or zero-filling methods (Dodson-Robinson et al., 9 May 2025).
- Spectrally colored noise: Approximations based on white/red noise may under- or overstate significance. Only non-parametric surrogates are valid for complex periodic/colored records (Holm, 2015).
- Cross-domain applications: The efficacy of MSC relies on shared linear oscillatory features; nonlinear or phase-intermittent processes are not fully captured.
A plausible implication is that MSC, when properly estimated with adequate statistical safeguards and domain-specific model validation, serves as a rigorous diagnostic for frequency-domain linear relationships in a broad spectrum of physical systems. However, claims of physical causation should always rest on both statistical significance and plausible dynamical mechanisms.