On the Complexity of the Median and Closest Permutation Problems (2311.17224v1)
Abstract: Genome rearrangements are events in which large blocks of DNA exchange places during evolution. The analysis of these events is a promising tool for understanding evolutionary genomics, providing data for phylogenetic reconstruction based on genome rearrangement measures. Many pairwise rearrangement distances have been proposed, each defined as the minimum number of rearrangement events of some predefined operation needed to transform one genome into the other. When more than two genomes are considered, we face the more challenging problem of rearrangement-based phylogeny reconstruction. Given a set of genomes and a distance notion, there are at least two natural ways to define the "target" genome. On the one hand, one may seek a genome that minimizes the sum of its distances to the input genomes, called the median genome. On the other hand, one may seek a genome that minimizes the maximum distance to any input genome, called the closest genome. When genomes are modeled as permutations, several distance metrics have been extensively studied. We investigate the median and closest problems on permutations under the following metrics: breakpoint, swap, block-interchange, short-block-move, and transposition. In biological applications some values are usually small, such as the solution value d or the number k of input permutations. For each of these metrics and for the parameters d or k, we analyze the closest and the median problems from the viewpoint of parameterized complexity. We obtain the following results: NP-hardness of finding the median/closest permutation for some metrics, even for k = 3; polynomial kernels for finding the median permutation under all studied metrics, with the target distance d as parameter; NP-hardness of finding the closest permutation by short-block-moves; and FPT algorithms and infeasibility of polynomial kernels for finding the closest permutation for some metrics parameterized by the target distance d.
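To make the two objectives concrete, the following minimal Python sketch computes a median and a closest permutation by exhaustive search on a tiny instance. It is only an illustration and not one of the paper's algorithms. It assumes that a "swap" exchanges any two elements of a permutation, so that the swap distance between two permutations of length n equals n minus the number of cycles of the mapping between them. The function names `swap_distance` and `brute_force_targets` are hypothetical, and the search enumerates all n! candidates, so it is usable only for very small n.

```python
from itertools import permutations


def swap_distance(p, q):
    """Minimum number of two-element exchanges turning p into q.

    Assumption: a swap may exchange any two positions, so the distance
    is n minus the number of cycles of the position mapping p -> q.
    """
    n = len(p)
    pos_in_q = {value: i for i, value in enumerate(q)}
    # mapping[i] is the position in q of the element at position i in p
    mapping = [pos_in_q[value] for value in p]
    seen = [False] * n
    cycles = 0
    for start in range(n):
        if not seen[start]:
            cycles += 1
            j = start
            while not seen[j]:
                seen[j] = True
                j = mapping[j]
    return n - cycles


def brute_force_targets(inputs):
    """Return ((median, sum of distances), (closest, max distance)).

    Exhaustive search over all n! permutations; illustrative only.
    """
    n = len(inputs[0])
    candidates = list(permutations(range(1, n + 1)))

    def total(c):
        return sum(swap_distance(c, p) for p in inputs)

    def radius(c):
        return max(swap_distance(c, p) for p in inputs)

    median = min(candidates, key=total)
    closest = min(candidates, key=radius)
    return (median, total(median)), (closest, radius(closest))


if __name__ == "__main__":
    genomes = [(1, 2, 3, 4), (2, 1, 3, 4), (1, 2, 4, 3)]
    (median, d_sum), (closest, d_max) = brute_force_targets(genomes)
    print("median:", median, "sum of distances:", d_sum)
    print("closest:", closest, "maximum distance:", d_max)
```

In this instance, with k = 3 input permutations, the identity permutation happens to optimize both objectives. In general the median and the closest permutation can differ, because one minimizes a sum and the other a maximum.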