Replication of SARS-CoV-2 mutation analysis suggests differences in per-protein mutation characteristics (2112.07770v2)
Abstract: The increasing spread of COVID-19, caused by the virus SARS-CoV-2, raises concerns about the extent to which mutations have occurred across the viral genome. We present a partial replication of a 2021 study by Wang, R. et al. that identified four substrains and eleven top mutations in the United States. We analyze a portion of the authors' data set to recreate Figure S1 from the paper, and our version recapitulates the features observed in the original figure. We further generate a summary of mutation characteristics for each of the 26 named proteins and confirm the significance of the spike protein, which accounts for roughly 24% of all recorded mutations. Our analysis suggests that factors other than protein length may affect the per-protein mutation rate.