Reducing Size Bias in Epidemic Network Modelling (2501.13195v3)
Abstract: Epidemiological models help policymakers mitigate disease spread by predicting transmission metrics based on disease dynamics and contact networks. Calibrating these models requires representative network sampling. We investigate the Random Walk (RW) and Metropolis-Hastings Random Walk (MHRW) algorithms for three network types: Erdős-Rényi (ER), Small-world (SW), and Scale-free (SF). Disease transmission is simulated using a stochastic susceptible-infected-recovered (SIR) framework. For ER and SW networks, RW overestimates the number of infected individuals and secondary infections by $25\%$ because of size bias, which favours highly connected nodes. MHRW, though more computationally intensive, reduces size bias and provides more representative samples. For time-to-infection, both algorithms provide representative estimates. However, neither algorithm samples SF networks representatively, and both exhibit significant variability. Furthermore, removing duplicate sample nodes reduces MHRW's accuracy across all three network types. We apply both algorithms to a cattle movement network of 46,512 farms that combines ER, SW, and SF features. RW overestimates infected farms by about $100\%$ and secondary infections by over $900\%$, reflecting significant size bias, while MHRW estimates align within $1\%$ of the cattle network values. RW underestimates time-to-infection by about $40\%$, while MHRW overestimates it by $10\%$. Accuracy again deteriorates when duplicate nodes are removed. Our findings guide algorithm selection and intervention strategies based on network structure and disease severity: RW's conservative estimates suit high-mortality, fast-spreading epidemics, while MHRW enables more precise interventions for slower epidemics.
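
The size bias described in the abstract can be illustrated with a minimal sketch of the two samplers. This is not the authors' implementation: the toy graph (one hub joined to a ring of ten nodes), the function names, and the use of mean sampled degree as the diagnostic are all assumptions made for illustration. The sketch uses the textbook MHRW acceptance rule, in which a move from node v to a uniformly chosen neighbour w is accepted with probability min(1, deg(v)/deg(w)).

```python
import random


def make_toy_graph(n_ring=10):
    """Hypothetical test graph: hub node 0 linked to every node of a ring 1..n_ring."""
    adj = {0: list(range(1, n_ring + 1))}
    for i in range(1, n_ring + 1):
        left = n_ring if i == 1 else i - 1
        right = 1 if i == n_ring else i + 1
        adj[i] = [0, left, right]
    return adj


def random_walk(adj, start, n_steps, rng):
    """Simple RW: always move to a uniformly chosen neighbour.

    Nodes are visited in proportion to their degree, which is the size bias.
    """
    node, sample = start, []
    for _ in range(n_steps):
        node = rng.choice(adj[node])
        sample.append(node)
    return sample


def mh_random_walk(adj, start, n_steps, rng):
    """MHRW: propose a uniform neighbour, accept with prob. min(1, deg(v)/deg(w)).

    A rejected proposal records the current node again, so the sample
    contains duplicates by construction.
    """
    node, sample = start, []
    for _ in range(n_steps):
        proposal = rng.choice(adj[node])
        if rng.random() < len(adj[node]) / len(adj[proposal]):
            node = proposal
        sample.append(node)
    return sample


def mean_degree(adj, sample):
    """Average degree over the sampled nodes, duplicates included."""
    return sum(len(adj[v]) for v in sample) / len(sample)


if __name__ == "__main__":
    rng = random.Random(0)
    adj = make_toy_graph()
    true_mean = sum(len(nbrs) for nbrs in adj.values()) / len(adj)
    print("true mean degree:", true_mean)
    print("RW sample mean  :", mean_degree(adj, random_walk(adj, 1, 50_000, rng)))
    print("MHRW sample mean:", mean_degree(adj, mh_random_walk(adj, 1, 50_000, rng)))
```

On this toy graph the RW visits nodes in proportion to degree, so its long-run mean sampled degree is sum(d^2)/sum(d) = 190/40 = 4.75, whereas the MHRW targets the uniform distribution over nodes and approaches the true mean of 40/11. The repeated entries produced by rejected proposals are what keep the MHRW sample uniform, which is consistent with the abstract's observation that removing duplicate nodes reduces its accuracy.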