Modelling Under-Reported Data: Pitfalls of Naïve Approaches and a New Statistical Framework for Epidemic Curve Reconstruction

Published 12 Sep 2025 in stat.AP | (2509.10668v1)

Abstract: Count-valued autoregressions are widely used to analyse time-series of reported infectious-disease cases because of their close connection with discrete-time transmission models. However, when such models are applied directly to under-reported case counts, their mechanistic interpretation can break down. We establish new theoretical results quantifying the consequences of ignoring under-reporting in these models. To address this issue, reported cases are often modelled as a binomially thinned version of an underlying count process, but such models are difficult to fit because the unobserved true counts are serially correlated and integer-valued. We develop a new statistical framework for under-reported infectious-disease data that uses a normal-normal approximation to a broad class of thinned count autoregressions and then accurately maps this continuous process back to the integers. Through simulations and applications to rotavirus incidence in a German state and Covid-19 incidence in English conurbations, we demonstrate that our approach both retains the mechanistic appeal of thinned autoregressions and substantially simplifies inference.
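
The binomial-thinning setup described in the abstract can be illustrated with a small simulation. This is a minimal sketch, not the paper's framework: it assumes a Poisson autoregression of order 1 as the underlying count process, and the function name and the parameters `nu`, `phi`, and `report_prob` are hypothetical choices made for illustration only.

```python
import numpy as np


def simulate_thinned_ar(n_steps, nu, phi, report_prob, seed=0):
    """Simulate true counts from a Poisson AR(1) and thin them binomially.

    Assumed model (illustrative, not taken from the paper):
        true[t]     ~ Poisson(nu + phi * true[t - 1])
        reported[t] ~ Binomial(true[t], report_prob)
    """
    rng = np.random.default_rng(seed)
    true_counts = np.zeros(n_steps, dtype=np.int64)
    # Start near the stationary mean nu / (1 - phi), which requires phi < 1.
    true_counts[0] = rng.poisson(nu / (1.0 - phi))
    for t in range(1, n_steps):
        true_counts[t] = rng.poisson(nu + phi * true_counts[t - 1])
    # Each true case is reported independently with probability report_prob.
    reported = rng.binomial(true_counts, report_prob)
    return true_counts, reported


true_counts, reported = simulate_thinned_ar(
    n_steps=200, nu=5.0, phi=0.6, report_prob=0.3
)
```

Only `reported` would be available in practice. Because `true_counts` is latent, integer-valued, and serially correlated, fitting such a model directly is hard, which is the difficulty the abstract says motivates the continuous approximation.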
