Gaussian-Distributed Serial Intervals
- Gaussian-distributed serial intervals are statistical constructs that capture the time differences between primary and secondary cases, including negative values from presymptomatic transmission.
- The truncated Gaussian formulation, defined by parameters μ, σ, and a biologically motivated lower cutoff, corrects biases in the estimation of reproduction numbers.
- These models bridge epidemiological analysis and stationary process theory by linking persistence intervals between sign changes to the spectral properties of the underlying process.
Gaussian-distributed serial intervals (SIs) are statistical constructs used to capture the temporal relationship between linked events in processes exhibiting symmetric, potentially two-sided temporal distributions. In epidemiological modeling, particularly for diseases with presymptomatic transmission such as COVID-19, empirical SI histograms show substantial support for negative values. This motivates a Gaussian framework rather than traditional, strictly positive models. In the context of stationary Gaussian processes, related constructions arise from gap lengths (or serial intervals) between sign changes, whose persistence intervals similarly exhibit statistical regularities described by the spectral properties of the underlying process.
1. Conceptual Definition and Rationale
The serial interval for infectious diseases is defined as the difference in onset times between primary (infector) and secondary (infectee) cases. For diseases with exclusively postsymptomatic transmission, SI is inherently positive. Conversely, in COVID-19 and related infections, presymptomatic transmission allows infectees to manifest symptoms before their infectors, resulting in SIs with negative domain support. The empirical SI-histogram tends to be approximately symmetric, motivating the modeling of the SI as a Gaussian (normal) distribution. In stationary process theory, gap lengths between sign changes (serial intervals) are also naturally treated within a Gaussian regime, due to the stationarity and continuity of sample paths.
This continuous, symmetric modeling requires caution: physical and biological constraints imply that SIs cannot be arbitrarily negative—infectiousness only arises after infection, and symptoms appear after a characteristic incubation period. Accordingly, truncated normal (Gaussian) distributions are preferred, with lower cutoffs grounded in the incubation period.
2. Mathematical Formulation of the Truncated Gaussian Serial Interval
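Let $\mu$ and $\sigma$ denote the mean and standard deviation of the SI estimated from data, and let $\tau_{\min}$ specify the minimal feasible SI value (biologically motivated; the SARS-CoV-2 serial-interval fit tabulated in Section 5 uses a cutoff of $-5$ d). The truncated Gaussian probability density function (pdf) is given by:

$$f(\tau) = \frac{1}{\sigma\sqrt{2\pi}}\,\frac{\exp\!\left(-\frac{(\tau-\mu)^2}{2\sigma^2}\right)}{1-\Phi\!\left(\frac{\tau_{\min}-\mu}{\sigma}\right)}, \qquad \tau \ge \tau_{\min},$$

with $f(\tau) = 0$ for $\tau < \tau_{\min}$, and $\Phi$ the standard normal cumulative distribution function. The denominator ensures proper normalization, limiting the total probability mass to the physical domain.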
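As a minimal sketch of the truncated Gaussian density described above, the following Python snippet evaluates the pdf and checks numerically that it integrates to one. The function names and the midpoint-rule check are illustrative assumptions, not part of the source; the parameter values are taken from the Mainland China row of the table in Section 5.

```python
import math


def std_normal_cdf(x: float) -> float:
    """Standard normal CDF Phi(x), computed via the complementary error function."""
    return 0.5 * math.erfc(-x / math.sqrt(2.0))


def truncated_gaussian_pdf(tau: float, mu: float, sigma: float, tau_min: float) -> float:
    """Density of a Gaussian(mu, sigma) truncated below at tau_min (hypothetical helper)."""
    if tau < tau_min:
        return 0.0  # no probability mass below the lower cutoff
    z = (tau - mu) / sigma
    gauss = math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))
    kept_mass = 1.0 - std_normal_cdf((tau_min - mu) / sigma)
    return gauss / kept_mass


def integrate_pdf(mu: float, sigma: float, tau_min: float,
                  upper_sigmas: float = 10.0, steps: int = 20000) -> float:
    """Midpoint-rule integral of the pdf from tau_min to mu + upper_sigmas * sigma."""
    upper = mu + upper_sigmas * sigma
    h = (upper - tau_min) / steps
    return h * sum(
        truncated_gaussian_pdf(tau_min + (i + 0.5) * h, mu, sigma, tau_min)
        for i in range(steps)
    )


# Illustrative parameters: (mu, sigma) = (3.96, 4.75) days, cutoff -5 days.
total_mass = integrate_pdf(mu=3.96, sigma=4.75, tau_min=-5.0)
print(f"integrated probability mass: {total_mass:.6f}")
```

Dropping the division by `kept_mass` would leave the integral short of one by exactly the Gaussian mass lying below the cutoff, which is the normalization error the denominator prevents.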
3. Lotka–Euler Equation and Epidemiological Application
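During periods of exponential growth/decay, the renewal equation links incidence rates to the reproduction number $R$ via the serial interval distribution. With incidence $I(t) \propto e^{rt}$ for the exponential phase (rate $r$), substitution yields the Lotka–Euler equation for a truncated Gaussian SI with pdf $f(\tau)$, mean $\mu$, standard deviation $\sigma$, and lower cutoff $\tau_{\min}$:

$$\frac{1}{R} = \int_{\tau_{\min}}^{\infty} e^{-r\tau} f(\tau)\,d\tau.$$

Via completion of the square, which shifts the Gaussian mean from $\mu$ to $\mu - r\sigma^2$, this evaluates to

$$R = \exp\!\left(r\mu - \tfrac{1}{2}r^2\sigma^2\right)\,\frac{1-\Phi\!\left(\frac{\tau_{\min}-\mu}{\sigma}\right)}{1-\Phi\!\left(\frac{\tau_{\min}-\mu+r\sigma^2}{\sigma}\right)},$$

where $\Phi$ is the standard normal cumulative distribution function.

If the SI support is erroneously considered infinite ($\tau_{\min} \to -\infty$), the CDF ratio tends towards unity, yielding the commonly cited, but systematically biased, result:

$$R_{\infty} = \exp\!\left(r\mu - \tfrac{1}{2}r^2\sigma^2\right).$$

This infinite-Gaussian formula underestimates $R$ for a growing epidemic ($r > 0$), as it includes unphysical negative SIs.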
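The relationship between the truncated and infinite-support results can be sketched in a few lines of Python. This is an illustrative check under stated assumptions, not the source's implementation: the function names are hypothetical, the growth rate `GROWTH_RATE = 0.1` per day is an assumed value that does not come from the source, and the quadrature routine is included only to cross-check the closed form against the defining integral.

```python
import math


def normal_sf(x: float) -> float:
    """Upper tail 1 - Phi(x) of the standard normal distribution."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def r_infinite(r: float, mu: float, sigma: float) -> float:
    """Reproduction number assuming an untruncated Gaussian SI."""
    return math.exp(r * mu - 0.5 * (r * sigma) ** 2)


def r_truncated(r: float, mu: float, sigma: float, tau_min: float) -> float:
    """Closed-form reproduction number for a Gaussian SI truncated below at tau_min."""
    kept_mass = normal_sf((tau_min - mu) / sigma)
    shifted_mass = normal_sf((tau_min - mu + r * sigma ** 2) / sigma)
    return r_infinite(r, mu, sigma) * kept_mass / shifted_mass


def r_truncated_quadrature(r: float, mu: float, sigma: float, tau_min: float,
                           steps: int = 20000) -> float:
    """Same quantity from midpoint quadrature of 1/R = integral of exp(-r*tau) * f(tau)."""
    upper = mu + 12.0 * sigma  # the integrand is negligible beyond this point
    h = (upper - tau_min) / steps
    norm = sigma * math.sqrt(2.0 * math.pi) * normal_sf((tau_min - mu) / sigma)
    integral = 0.0
    for i in range(steps):
        tau = tau_min + (i + 0.5) * h
        integral += math.exp(-r * tau - 0.5 * ((tau - mu) / sigma) ** 2)
    return norm / (h * integral)


# (mu, sigma, cutoff) from the Mainland China row of the Section 5 table.
MU, SIGMA, TAU_MIN = 3.96, 4.75, -5.0
GROWTH_RATE = 0.1  # per day; ASSUMED for illustration only

print("truncated :", r_truncated(GROWTH_RATE, MU, SIGMA, TAU_MIN))
print("quadrature:", r_truncated_quadrature(GROWTH_RATE, MU, SIGMA, TAU_MIN))
print("infinite  :", r_infinite(GROWTH_RATE, MU, SIGMA))
```

For a positive growth rate the truncated value exceeds the infinite-support value, and pushing the cutoff far below the mean makes the two coincide, matching the limiting argument above.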
4. Statistical Bounds and Persistence Intervals in Gaussian Stationary Processes
The analysis of Feldheim–Feldheim (2013) quantifies the probability that a stationary Gaussian process $X(t)$ persists in sign over an interval $[0, N]$, defining the “serial interval” as the gap length between sign changes:

$$P(N) = \mathbb{P}\big(X(t) > 0 \ \text{for all } t \in [0, N]\big).$$

Given a covariance whose spectral measure has a density bounded on a neighborhood $[-a, a]$ of the origin (either upper-and-lower, or lower only), exponential bounds on the survival probability $P(N)$ are established, with positive constants $c$ and $C$:
- Upper bound (nondegeneracy): $P(N) \le e^{-cN}$ for all sufficiently large $N$,
- Lower bound: $P(N) \ge e^{-CN}$ for all sufficiently large $N$.
These statements depend crucially on the spectral density near the origin. A process whose spectral density vanishes to second order at zero falls outside these hypotheses, precluding exponential bounds of this form. This highlights the necessity of “flat” spectral density for exponential persistence.
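To make the notion of a persistence probability concrete, the sketch below estimates it by Monte Carlo for a discrete-time stationary Gaussian sequence. The AR(1) process, its correlation `rho = 0.5`, and all function names are assumptions chosen for illustration (they are not taken from Feldheim–Feldheim); an AR(1) sequence is used as a stand-in because its spectral density is bounded away from zero and infinity.

```python
import math
import random


def persistence_probabilities(rho: float, max_len: int, trials: int, seed: int = 0):
    """Estimate P(X_1 > 0, ..., X_N > 0) for N = 1..max_len.

    X is a stationary Gaussian AR(1) sequence with unit variance and
    lag-one correlation rho (an illustrative choice of process).
    """
    rng = random.Random(seed)
    innovation_sd = math.sqrt(1.0 - rho * rho)
    survived = [0] * max_len
    for _ in range(trials):
        x = rng.gauss(0.0, 1.0)  # stationary start: X_1 ~ N(0, 1)
        for n in range(max_len):
            if x <= 0.0:
                break  # sign change: the positive run ends here
            survived[n] += 1
            x = rho * x + innovation_sd * rng.gauss(0.0, 1.0)
    return [count / trials for count in survived]


probs = persistence_probabilities(rho=0.5, max_len=8, trials=20000)

# Successive log-ratios; roughly constant values indicate exponential decay in N.
log_slopes = [math.log(probs[n + 1] / probs[n]) for n in range(len(probs) - 1)]

for n, p in enumerate(probs, start=1):
    print(f"N={n}: P(N) ~ {p:.4f}")
```

The estimates are nested by construction, so they cannot increase with `N`; whether the log-ratios settle to a constant is what distinguishes exponential persistence from other decay regimes.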
5. Numerical Comparisons: Truncated vs. Infinite Gaussian SI Models
Empirical data, as compiled in Marsh (2025), demonstrate quantifiable downward bias in reproduction number estimation when infinite-Gaussian support is assumed:
| Fit (Location/Type) | $(\mu, \sigma)$ (days) | Lower cutoff (days) | $R$ (Truncated) | $R$ (Infinite) |
|---|---|---|---|---|
| Mainland China SI (Du et al. 2020) | (3.96, 4.75) | -5 | 1.67 | 1.29 |
| Tianjin GT | (2.90, 2.86) | +1 | 2.57 | 0.95 |
| Singapore GT | (3.86, 2.65) | +1 | 2.84 | <not used> |
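Discrepancies in the estimated reproduction number are on the order of 20–30% for the Mainland China fit (1.29 versus 1.67, about 23% lower) and larger still for the tabulated Tianjin fit, underscoring the need to apply lower cutoffs rigorously.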
6. Practical Guidelines and Implementation Recommendations
- If presymptomatic transmission is present or suspected, implement continuous SI models allowing negative values, but truncate at an incubation-period-motivated lower limit.
- Always normalize the truncated pdf, explicitly citing parameter values and cutoff.
- Apply the truncated Lotka–Euler equation to maintain biological and mathematical consistency; avoid the infinite Gaussian form except in the limit where the lower cutoff lies so far below the mean that the CDF ratio is effectively unity.
- In discrete renewal-equation computations of the reproduction number, enforce the same lower cutoff in summation indices, and ensure negative times are handled consistently (shifting to future incidence, as appropriate).
- For SI distributions fit by gamma or lognormal models, verify effective lower limits correspond to the presymptomatic window.
- Reporting practices should include $\mu$, $\sigma$, and the lower cutoff, together with an explicit rationale for how data below the cutoff were treated.
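The discrete-time guideline above can be sketched as follows. This is a minimal illustration under stated assumptions: the SI weights are obtained by evaluating the Gaussian at integer lags and renormalizing (one possible discretization, not necessarily the source's), the upper lag `s_max = 40` and growth rate `0.1` per day are assumed values, and the function names are hypothetical. Negative lags index future incidence, as the guideline describes.

```python
import math


def discretized_truncated_gaussian(mu: float, sigma: float, s_min: int, s_max: int):
    """Weights w[s] on integer lags s_min..s_max, renormalized to sum to 1.

    Lags below s_min receive no weight, which enforces the lower cutoff.
    """
    raw = {s: math.exp(-0.5 * ((s - mu) / sigma) ** 2) for s in range(s_min, s_max + 1)}
    total = sum(raw.values())
    return {s: value / total for s, value in raw.items()}


def reproduction_number(incidence, t: int, weights) -> float:
    """R_t = I_t / sum_s w_s * I_{t-s}; a negative lag s reads future incidence I_{t+|s|}."""
    denominator = 0.0
    for s, w in weights.items():
        idx = t - s
        if idx < 0 or idx >= len(incidence):
            raise ValueError(f"incidence series does not cover lag {s} at t={t}")
        denominator += w * incidence[idx]
    return incidence[t] / denominator


# Synthetic exponential incidence with an ASSUMED growth rate of 0.1 per day.
growth_rate = 0.1
incidence = [math.exp(growth_rate * day) for day in range(100)]

# (mu, sigma, cutoff) from the Mainland China row of the Section 5 table.
weights = discretized_truncated_gaussian(mu=3.96, sigma=4.75, s_min=-5, s_max=40)

r_day_50 = reproduction_number(incidence, 50, weights)
r_day_60 = reproduction_number(incidence, 60, weights)
# Discrete Lotka-Euler value that both estimates should reproduce.
r_expected = 1.0 / sum(w * math.exp(-growth_rate * s) for s, w in weights.items())
print(r_day_50, r_day_60, r_expected)
```

On purely exponential incidence the estimate is the same on every day and equals the discrete Lotka–Euler value, which gives a simple consistency check that the cutoff is applied identically in the weights and in the summation.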
7. Connections to Stationary Process Theory and Broader Implications
The statistical treatment of serial intervals as Gaussian-distributed quantities has analogs in stationary process theory, specifically in the distribution of persistence intervals between sign changes. In both epidemiological and physical contexts, exponentially decaying tails emerge from nondegenerate spectral density at the origin. Where this condition fails, the decay need not be exponential. Thus, the Gaussian-distributed SI model, when properly truncated, bridges theoretical statistical properties with applied real-world processes subject to biological and physical constraints.
A plausible implication is that neglecting these constraints can yield systematically biased quantitative inferences, notably in epidemic estimation and in physical persistence probability analyses. Strict adherence to normalized, truncated interpretations aligns analytic models with reality and preserves the fidelity of statistical predictions.