
Similarity analysis of DNA sequences through local distribution of nucleotides in strategic neighborhood (2303.14994v6)

Published 27 Mar 2023 in cs.DS

Abstract: We propose a new alignment-free algorithm that constructs a compact vector representation in $\mathbb{R}^{24}$ of a DNA sequence of arbitrary length. Each component of this vector is obtained from a representative sequence, the elements of which are the values realized by a function $\Gamma$. This function $\Gamma$ acts on neighborhoods of arbitrary radius located at strategic positions within the DNA sequence and, as a consequence of the uniqueness of the prime factorization of integers, carries complete information about the local distribution of nucleotide frequencies. The algorithm exhibits linear time complexity and consumes very little memory. The two natural parameters characterizing the radius and location of the neighborhoods are fixed by comparing the resulting phylogenetic tree with the benchmark for full genome sequences of fish mtDNA datasets. Using these fitted parameters, the method is applied to analyze a number of genome sequences from benchmark and other standard datasets. Our algorithm proves to be computationally efficient compared to other well-known algorithms when applied to a simulated dataset.
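
The role that unique prime factorization plays in the abstract can be illustrated with a minimal sketch. The abstract does not give the definition of $\Gamma$, so everything below is an assumption made for illustration: the prime assigned to each nucleotide, the product-of-prime-powers form, and the helper names `gamma`, `decode`, and `neighborhood` are hypothetical and are not taken from the paper. The sketch only shows why a single integer can carry the complete nucleotide counts of a neighborhood.

```python
from collections import Counter

# Hypothetical assignment of one prime per nucleotide (an assumption,
# not the mapping used in the paper).
PRIMES = {"A": 2, "C": 3, "G": 5, "T": 7}


def neighborhood(seq: str, center: int, radius: int) -> str:
    """Return the window of the given radius around `center`, clipped at the ends."""
    start = max(0, center - radius)
    return seq[start:center + radius + 1]


def gamma(window: str) -> int:
    """Encode nucleotide counts as a product of prime powers (illustrative only)."""
    counts = Counter(window)
    value = 1
    for base, prime in PRIMES.items():
        value *= prime ** counts.get(base, 0)
    return value


def decode(value: int) -> dict:
    """Recover the counts by repeated division; unique factorization makes this exact."""
    counts = {}
    for base, prime in PRIMES.items():
        n = 0
        while value % prime == 0:
            value //= prime
            n += 1
        counts[base] = n
    return counts


window = neighborhood("ACGTACGT", center=3, radius=1)  # "GTA"
code = gamma(window)
print(window, code, decode(code))
```

Because each count is read back as the exponent of its prime, two neighborhoods receive the same integer in this sketch exactly when they have the same nucleotide counts. How the paper selects the strategic positions and builds the 24 vector components is not specified in the abstract and is not attempted here.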
