
Expected Density of Random Minimizers (2410.16968v2)

Published 22 Oct 2024 in math.CO and q-bio.GN

Abstract: Minimizer schemes, or just minimizers, are a very important computational primitive in sampling and sketching biological strings. Assuming a fixed alphabet of size $\sigma$, a minimizer is defined by two integers $k,w\ge2$ and a total order $\rho$ on strings of length $k$ (also called $k$-mers). A string is processed by a sliding window algorithm that chooses, in each window of length $w+k-1$, its minimal $k$-mer with respect to $\rho$. A key characteristic of the minimizer is the expected density of chosen $k$-mers among all $k$-mers in a random infinite $\sigma$-ary string. Random minimizers, in which the order $\rho$ is chosen uniformly at random, are often used in applications. However, little is known about their expected density $\mathcal{DR}_\sigma(k,w)$ besides the fact that it is close to $\frac{2}{w+1}$ unless $w\gg k$. We first show that $\mathcal{DR}_\sigma(k,w)$ can be computed in $O(k\sigma^{k+w})$ time. Then we attend to the case $w\le k$ and present a formula that allows one to compute $\mathcal{DR}_\sigma(k,w)$ in just $O(w \log w)$ time. Further, we describe the behaviour of $\mathcal{DR}_\sigma(k,w)$ in this case, establishing the connection between $\mathcal{DR}_\sigma(k,w)$, $\mathcal{DR}_\sigma(k+1,w)$, and $\mathcal{DR}_\sigma(k,w+1)$. In particular, we show that $\mathcal{DR}_\sigma(k,w)<\frac{2}{w+1}$ (by a tiny margin) unless $w$ is small. We conclude with some partial results and conjectures for the case $w>k$.
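The sliding-window definition in the abstract can be illustrated with a small simulation. The following is a minimal sketch, not the paper's method: it estimates the density empirically on a finite random string rather than computing $\mathcal{DR}_\sigma(k,w)$ exactly. The function name `random_minimizer_density`, the parameters `n` and `seed`, and the leftmost tie-breaking rule for repeated $k$-mers within a window are all assumptions made for illustration.

```python
import random


def random_minimizer_density(k, w, sigma=4, n=200_000, seed=0):
    """Empirically estimate the density of a random minimizer.

    Hypothetical helper for illustration: samples a random sigma-ary
    string of length n, draws a random order on the k-mers that occur,
    and returns the fraction of k-mer positions chosen by some window.
    """
    rng = random.Random(seed)
    text = [rng.randrange(sigma) for _ in range(n)]

    # Lazily assign each distinct k-mer an i.i.d. uniform rank; sorting by
    # these ranks is a uniformly random order on the k-mers seen.
    rank = {}

    def rank_at(i):
        kmer = tuple(text[i:i + k])
        if kmer not in rank:
            rank[kmer] = rng.random()
        return rank[kmer]

    num_kmers = n - k + 1
    ranks = [rank_at(i) for i in range(num_kmers)]

    chosen = set()
    for start in range(num_kmers - w + 1):
        # A window of w + k - 1 characters contains w consecutive k-mers.
        # min() returns the first minimal item, so ties between repeated
        # k-mers are broken to the left (an assumption of this sketch).
        best = min(range(start, start + w), key=ranks.__getitem__)
        chosen.add(best)

    return len(chosen) / num_kmers


if __name__ == "__main__":
    k, w = 8, 5
    estimate = random_minimizer_density(k, w)
    print(f"estimated density: {estimate:.4f}, 2/(w+1) = {2 / (w + 1):.4f}")
```

The loop costs $O(nw)$ time, which is adequate for a quick check. For $w\le k$ the estimate should land near $\frac{2}{w+1}$, although sampling noise on a finite string is far too large to resolve the tiny margin that the abstract describes.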
