- The paper demonstrates that properly truncating Gaussian distributions for serial intervals is crucial in accurately estimating R0 and Rt values in COVID-19.
- It employs renewal equations and the Lotka-Euler formalism to rigorously compare generation time and serial interval-based reproduction numbers.
- Empirical evaluations reveal systematic underestimations in common methods, underscoring significant implications for epidemiological surveillance and intervention strategies.
Reproduction Numbers for COVID-19: Implications of Gaussian Serial Intervals with Presymptomatic Transmission
Overview
This paper provides a rigorous mathematical and epidemiological framework for calculating the basic and instantaneous reproduction numbers (R0, Rt) for COVID-19 in the presence of presymptomatic transmission. The author focuses on the consequences of using Gaussian-distributed serial intervals (SIs), which necessarily extend to negative values because infectiousness can begin before symptom onset, and discusses the implications for commonly used methods that relate incidence data to reproduction numbers. Particular attention is given to the need for lower truncation of the Gaussian distributions for both serial intervals and generation times, a step that earlier literature often neglects and whose omission leads to systematic underestimation of the reproduction numbers.
Mathematical Framework for Generation Times and Serial Intervals
The established approach links incidence of infection, serial intervals, and reproduction numbers via renewal equations:
$$C(t) = R_0\, S(t) \int_{T_m}^{\infty} C(t-\tau)\, g(\tau)\, d\tau$$
where g(τ) is the probability density function (PDF) for generation time or its proxy, the serial interval, and Tm is the lower truncation limit of that distribution.
In periods of exponential growth/decline, the Lotka-Euler equation connects the reproduction number with the exponential rate constant r:
$$\frac{1}{R_0} = \int_{T_m}^{\infty} e^{-r\tau}\, g(\tau)\, d\tau$$
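The Lotka-Euler relation can be evaluated numerically for any interval density. The following is a minimal sketch, not the paper's code: the function names, the trapezoid rule, and the exponential test density are assumptions chosen for illustration. The exponential case is used only because its answer is known in closed form (R0 = 1 + r/rate), which gives a check on the integration.

```python
import math


def lotka_euler_r0(r, pdf, t_min, t_max, steps=20000):
    """Return R0 = 1 / integral_{t_min}^{t_max} exp(-r*tau) * pdf(tau) dtau.

    The integral is approximated with the trapezoid rule; t_max stands in
    for the infinite upper limit and must be large enough that the
    integrand is negligible beyond it.
    """
    h = (t_max - t_min) / steps
    total = 0.0
    for i in range(steps + 1):
        tau = t_min + i * h
        weight = 0.5 if i in (0, steps) else 1.0  # end points count half
        total += weight * math.exp(-r * tau) * pdf(tau)
    return 1.0 / (total * h)


def exponential_pdf(rate):
    """Exponential density on tau >= 0 (illustrative choice, not from the paper)."""
    return lambda tau: rate * math.exp(-rate * tau) if tau >= 0 else 0.0


if __name__ == "__main__":
    # Hypothetical inputs: growth rate 0.1 per day, mean interval 4 days.
    print(lotka_euler_r0(0.1, exponential_pdf(0.25), 0.0, 200.0))
```

Passing a different `pdf` and a negative `t_min` is how a truncated serial-interval density would enter the same calculation.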
Necessity of Distribution Truncation
The paper emphasizes that, due to presymptomatic transmission in COVID-19, the serial interval can take negative values—unlike the actual generation time, which must be strictly positive. Using a Gaussian distribution for SIs yields a nonzero probability density for arbitrarily large negative intervals, which is unphysical given the constraints imposed by human incubation periods.
Explicit truncation at a lower limit (roughly the mean incubation duration) is essential:
- For generation time, Tm ≃ +1 day.
- For serial interval, Tm ≃ −5 to −6 days (the negative of the COVID-19 mean incubation period).
Failure to impose these limits, as is common when Gaussians are applied naively over the whole real line, results in underestimation of R0 and Rt; the paper substantiates this quantitatively.
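The direction of this bias can be illustrated with the Lotka-Euler integral, which has a closed form for a Gaussian. The sketch below is a hedged illustration, not the paper's implementation. It assumes the truncated Gaussian is renormalised to unit mass on [Tm, ∞), which the summary does not state, and it uses placeholder values for the mean, standard deviation, and growth rate r that are not taken from the paper. Only the truncation limit of −5 days comes from the text above.

```python
import math


def std_normal_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def r0_untruncated_gaussian(r, mean, sd):
    """Lotka-Euler R0 for a Gaussian on the whole real line (Tm -> -infinity)."""
    return math.exp(r * mean - 0.5 * (r * sd) ** 2)


def r0_truncated_gaussian(r, mean, sd, t_min):
    """Lotka-Euler R0 for a Gaussian restricted to tau >= t_min.

    Assumption: the density is renormalised to integrate to 1 on [t_min, inf).
    Multiplying a Gaussian by exp(-r*tau) shifts its mean to mean - r*sd^2,
    so the integral reduces to a ratio of two normal tail probabilities.
    """
    shifted_mean = mean - r * sd ** 2
    mass = std_normal_cdf((mean - t_min) / sd)          # normalising constant
    tail = std_normal_cdf((shifted_mean - t_min) / sd)  # weighted mass kept
    integral = math.exp(-r * mean + 0.5 * (r * sd) ** 2) * tail / mass
    return 1.0 / integral


if __name__ == "__main__":
    # Placeholder parameters (hypothetical): mean 4 d, sd 4 d, r = 0.2 per day.
    print("untruncated:", r0_untruncated_gaussian(0.2, 4.0, 4.0))
    print("truncated  :", r0_truncated_gaussian(0.2, 4.0, 4.0, -5.0))
```

For a growing epidemic (r > 0), the discarded negative tail is the part of the distribution that carries the largest weights exp(−rτ), so removing it shrinks the integral and raises R0 relative to the untruncated value.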
Empirical Evaluation and Results
Comparative Analysis: Generation Times vs Serial Intervals
Using parameterizations from Ganyani et al. (2020), the author evaluates reproduction numbers for COVID-19 in Germany across epidemic waves:
- GT means: 3.86 days (Singapore), 2.90 days (Tianjin)
- SI means: similar, but with greater variance
Key finding: Reproduction numbers calculated using generation times are systematically higher than those calculated using serial intervals, especially when using properly truncated Gaussian distributions. This divergence increases with distance from the Rt = 1 threshold.
For example:
- With the Singapore data, R0 is 2.84 for the Gaussian GT distribution, 2.27 for the gamma GT distribution, 1.67 for the Gaussian SI distribution, and 1.29 for the untruncated Gaussian (Tm → −∞).
- For Tianjin, the corresponding R0 values are 2.57 (Gaussian GT), 1.76 (gamma GT), 1.29 (Gaussian SI), and 0.95 (untruncated Gaussian).
The claim that standard application of the paper's Eq. 11 (untruncated Gaussian) "mostly predicts values of R0 that are too low" is strongly supported by these numerical comparisons, highlighting an underappreciated bias in the literature.
Serial Interval Distributions and Implications of Truncation
Analysis of the SI datasets from Ali et al. (2020) and Du et al. (2020) illustrates:
- Inclusion of negative intervals (presymptomatic transmission) leads to broader SI distributions and alters the inferred temporal profile of Rt.
- As interventions progress during an epidemic, the mean SI shortens (reflecting the impact of NPIs), with corresponding reductions in Rt.
- Excluding negative SIs (e.g., using only t > 0) erases this effect and leads to distribution mis-specification; in this regime the truncated data are best fit by lognormal rather than normal distributions.
Distributional Choices: Gaussian vs Gamma and Lognormal
For measured generation times (always positive), gamma distributions are more appropriate. The comparison in the appendix shows:
- Gamma-distributed GTs yield lower R0 and Rt values compared to Gaussians with the same moments, while both remain above those computed using SI proxies or untruncated Gaussians.
- The shapes of temporal Rt profiles are robust to the choice of distribution, but the excursion magnitudes vary.
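Part of the first point can be checked with closed-form Lotka-Euler expressions. The sketch below matches a gamma distribution to a given mean and standard deviation and compares it with an untruncated Gaussian with the same moments. It covers only the gamma-versus-untruncated comparison. The function names are hypothetical, the 3.86-day mean is the Singapore GT mean quoted above, and the standard deviation and growth rates are placeholders that are not taken from the paper.

```python
import math


def r0_gamma(r, mean, sd):
    """Lotka-Euler R0 for a gamma-distributed interval with given moments.

    Uses the Laplace transform of the gamma density, (1 + r*scale)**(-shape),
    which is valid for r*scale > -1.
    """
    shape = (mean / sd) ** 2
    scale = sd ** 2 / mean
    return (1.0 + r * scale) ** shape


def r0_gaussian_untruncated(r, mean, sd):
    """Lotka-Euler R0 for a Gaussian on the whole real line."""
    return math.exp(r * mean - 0.5 * (r * sd) ** 2)


if __name__ == "__main__":
    mean, sd = 3.86, 2.5  # sd is a placeholder, not a value from the paper
    for r in (0.05, 0.1, 0.2):  # hypothetical growth rates per day
        print(r, r0_gamma(r, mean, sd), r0_gaussian_untruncated(r, mean, sd))
```

For r > 0 the gamma result always exceeds the untruncated Gaussian result, because ln(1 + x) > x − x²/2 for x > 0; this agrees with the ordering reported above.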
Practical Implications
Epidemiological Surveillance and Model Calibration
- Truncation of SIs is essential for correct estimation of reproduction numbers in ongoing surveillance and model calibration tasks.
- Estimation of vaccination coverage and intervention effectiveness critically depends on accurate R0, necessitating proper treatment of serial interval distributions in renewal-based and Lotka-Euler-based approaches.
- Retrospective estimation is required for Rt when negative serial intervals are present because future incidence informs past infection events.
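The last point follows directly from a discretised renewal equation. The sketch below is an illustration under stated assumptions rather than the paper's estimator: it treats the product R0·S(t) as Rt, works with daily counts, and discretises a truncated Gaussian onto integer days with renormalised weights. All function names and parameter values are hypothetical.

```python
import math


def discrete_truncated_gaussian(mean, sd, t_min, t_max):
    """Weights for integer intervals tau in [t_min, t_max], summing to 1."""
    taus = list(range(t_min, t_max + 1))
    raw = [math.exp(-0.5 * ((tau - mean) / sd) ** 2) for tau in taus]
    total = sum(raw)
    return {tau: w / total for tau, w in zip(taus, raw)}


def estimate_rt(cases, weights):
    """Return {t: R_t} with R_t = C_t / sum_tau w_tau * C_{t - tau}.

    A negative tau points at a later day (t - tau > t), so days closer to
    the end of the series than |t_min| cannot be estimated yet.
    """
    t_min, t_max = min(weights), max(weights)
    first = max(t_max, 0)                   # enough history is available
    last = len(cases) - 1 + min(t_min, 0)   # enough future is available
    estimates = {}
    for t in range(first, last + 1):
        denom = sum(w * cases[t - tau] for tau, w in weights.items())
        if denom > 0:
            estimates[t] = cases[t] / denom
    return estimates


if __name__ == "__main__":
    # Hypothetical inputs: flat incidence; SI mean 4 d, sd 3 d, Tm = -5 d.
    weights = discrete_truncated_gaussian(4.0, 3.0, -5, 15)
    rt = estimate_rt([100.0] * 30, weights)
    print("latest day with an estimate:", max(rt))
```

With a lower limit of −5 days, the most recent five days of a series have no estimate until later counts arrive, which is the sense in which the estimate is retrospective.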
Guidance for Statistical Modelling
- When fitting SI data for COVID-19 or similar pathogens with presymptomatic transmission, Gaussian models must be truncated at the lower bound defined by biological constraints.
- For generation time estimation and prediction, gamma or lognormal distributions should be preferred unless the empirical distribution is decisively symmetric and sharply peaked.
Speculation on Future Developments
- As more pathogens with presymptomatic transmission are identified, standardized procedures for truncation and distribution selection must be codified in surveillance practice.
- Expansion to joint inference frameworks incorporating incubation periods, infectiousness profiles, and SI distributions will enhance reproduction number estimation.
- Further methodological research may develop more flexible distributions or copula approaches capturing the full range of observed intervals.
Conclusion
This paper systematically demonstrates that accurate estimation of reproduction numbers in epidemics with presymptomatic transmission, such as COVID-19, requires Gaussian distributions of serial intervals to be truncated at lower limits consistent with biological reality. Neglecting this truncation, as is common, leads to underestimation of R0 and Rt, with downstream consequences for intervention strategy and policy. The findings underscore the need for robust epidemiological methods and careful statistical practice in real-time infectious disease modelling. The theoretical developments and comparative analyses suggest avenues for improved surveillance, reporting, and calibration of dynamic models in future epidemic settings.