Stochastic Mutation Theory of SARS-CoV-2 Variants (2502.10471v2)

Published 13 Feb 2025 in q-bio.QM, q-bio.CB, and q-bio.PE

Abstract: Predicting the future evolutionary trajectory of SARS-CoV-2 remains a critical challenge, particularly due to the pivotal role of spike protein mutations. Developing an evolutionary model capable of continuously integrating new experimental data is an urgent priority. By employing well-founded assumptions for mutant representation (four-letter and two-letter formats) and the n-mer distance algorithm, we constructed an evolutionary tree of SARS-CoV-2 mutations that accurately reflects observed viral strain evolution. We introduce a stochastic method for generating new strains on this tree based on spike protein mutations. For a given set A of existing mutation sites, we define a set X of x randomly generated sites on the spike protein. Our analysis reveals that the position of a generated strain on the tree is determined by x. Through large-scale stochastic sampling, we predict the emergence of new macro-lineages. As x increases, the proportions of macro-lineages shift: lineage O surpasses lineage N, lineage P overtakes O, and ultimately the new lineage Q surpasses P. We identified threshold values of x that distinguish between macro-lineages. Furthermore, we demonstrate that a linear regression of the number of mutated sites (x) against sample collection dates (t) provides a robust approximation, enabling the prediction of new lineage emergence based on the x-t relationship. In conclusion, we demonstrate that SARS-CoV-2 evolution adheres to statistical principles: the emergence of new strains on the evolutionary tree can be driven by randomly generated spike protein sites, and large-scale stochastic sampling uncovers evolutionary patterns governing the emergence of diverse macro-lineages.
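
The two quantitative steps named in the abstract, drawing a set X of x random spike sites given an existing set A and fitting a line to x against collection date t, can be illustrated with a minimal Python sketch. This is not the authors' implementation. The function names (`generate_sites`, `fit_line`, `time_to_reach`) are hypothetical, the choice to draw X only from positions outside A is an assumption the abstract does not confirm, and no real thresholds or data from the paper are used. The spike protein length of 1273 residues is the standard reference value.

```python
import random

SPIKE_LENGTH = 1273  # residues in the reference SARS-CoV-2 spike protein


def generate_sites(existing_sites, x, rng):
    """Draw a set X of x distinct spike positions.

    Assumption: X is sampled uniformly from positions not already in A
    (existing_sites). The abstract does not specify the sampling rule.
    """
    candidates = [p for p in range(1, SPIKE_LENGTH + 1)
                  if p not in existing_sites]
    return set(rng.sample(candidates, x))


def fit_line(ts, xs):
    """Ordinary least squares fit of x ~ slope * t + intercept."""
    n = len(ts)
    mean_t = sum(ts) / n
    mean_x = sum(xs) / n
    # Sum of squared deviations in t, and the t-x cross term
    s_tt = sum((t - mean_t) ** 2 for t in ts)
    s_tx = sum((t - mean_t) * (x - mean_x) for t, x in zip(ts, xs))
    slope = s_tx / s_tt
    intercept = mean_x - slope * mean_t
    return slope, intercept


def time_to_reach(threshold_x, slope, intercept):
    """Invert the fitted line: the t at which x reaches threshold_x."""
    return (threshold_x - intercept) / slope
```

Given a threshold value of x that separates two macro-lineages, `time_to_reach` turns the fitted x-t line into an estimated emergence date. Any threshold or (t, x) pairs passed in here would have to come from the paper's own data; none are supplied in the abstract.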