
Subsequences With Generalised Gap Constraints: Upper and Lower Complexity Bounds (2404.10497v1)

Published 16 Apr 2024 in cs.DS

Abstract: For two strings u, v over some alphabet A, we investigate the problem of embedding u into v as a subsequence under the presence of generalised gap constraints. A generalised gap constraint is a triple (i, j, C_{i, j}), where 1 <= i < j <= |u| and C_{i, j} is a subset of A*. Embedding u as a subsequence into v such that (i, j, C_{i, j}) is satisfied means that if u[i] and u[j] are mapped to v[k] and v[l], respectively, then the induced gap v[k + 1..l - 1] must be a string from C_{i, j}. This generalises the setting recently investigated in [Day et al., ISAAC 2022], where only gap constraints of the form C_{i, i + 1} are considered, as well as the setting from [Kosche et al., RP 2022], where only gap constraints of the form C_{1, |u|} are considered. We show that subsequence matching under generalised gap constraints is NP-hard, and we complement this general lower bound with a thorough (parameterised) complexity analysis. Moreover, we identify several efficiently solvable subclasses that result from restricting the interval structure induced by the generalised gap constraints.
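
To make the definition in the abstract concrete, the following is a minimal brute-force sketch of the matching problem. It is not the paper's algorithm. The function name `find_embedding` and the representation of each constraint set C_{i, j} as a Python membership predicate are assumptions made for illustration only. The search tries every choice of |u| positions in v, so it runs in exponential time, which is unsurprising for a naive method given the NP-hardness stated above.

```python
from itertools import combinations
from typing import Callable, Dict, Optional, Tuple

# Hypothetical representation: C_{i,j} is given as a predicate on the gap string.
GapConstraint = Callable[[str], bool]


def find_embedding(
    u: str,
    v: str,
    constraints: Dict[Tuple[int, int], GapConstraint],
) -> Optional[Tuple[int, ...]]:
    """Return 1-based positions of an embedding of u into v that satisfies
    every constraint (i, j, C_{i,j}), or None if no such embedding exists.

    Keys of `constraints` are 1-based pairs (i, j) with 1 <= i < j <= len(u).
    """
    # Enumerate all strictly increasing 0-based position tuples of length |u|.
    for positions in combinations(range(len(v)), len(u)):
        # The chosen positions must spell out u.
        if any(v[p] != c for p, c in zip(positions, u)):
            continue
        # If u[i] -> v[k] and u[j] -> v[l] (1-based), the gap is v[k+1..l-1],
        # which is the slice v[k : l-1] in 0-based Python indexing.
        if all(
            member(v[positions[i - 1] + 1 : positions[j - 1]])
            for (i, j), member in constraints.items()
        ):
            return tuple(p + 1 for p in positions)
    return None
```

For instance, with u = "ab" and v = "acab", the constraint (1, 2, {ε}) forces the two letters to be adjacent in v, so `find_embedding("ab", "acab", {(1, 2): lambda g: g == ""})` selects positions 3 and 4 rather than positions 1 and 4.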

References (63)
  1. Tight hardness results for LCS and other sequence similarity measures. In IEEE 56th Annual Symposium on Foundations of Computer Science, FOCS 2015, Berkeley, CA, USA, 17-20 October, 2015, pages 59–78, 2015. doi:10.1109/FOCS.2015.14.
  2. Consequences of faster alignment of sequences. In Automata, Languages, and Programming - 41st International Colloquium, ICALP 2014, Copenhagen, Denmark, July 8-11, 2014, Proceedings, Part I, pages 39–51, 2014. doi:10.1007/978-3-662-43948-7_4.
  3. Longest common subsequence with gap constraints. In Combinatorics on Words - 14th International Conference, WORDS 2023, Umeå, Sweden, June 12-16, 2023, Proceedings, pages 60–76, 2023. doi:10.1007/978-3-031-33180-0_5.
  4. A refined laser method and faster matrix multiplication. In Dániel Marx, editor, Proceedings of the 2021 ACM-SIAM Symposium on Discrete Algorithms, SODA 2021, Virtual Conference, January 10 - 13, 2021, pages 522–539. SIAM, 2021. doi:10.1137/1.9781611976465.32.
  5. Complex event recognition languages: Tutorial. In Proceedings of the 11th ACM International Conference on Distributed and Event-based Systems, DEBS 2017, Barcelona, Spain, June 19-23, 2017, pages 7–10, 2017. doi:10.1145/3093742.3095106.
  6. Practical variable length gap pattern matching. In Experimental Algorithms - 15th International Symposium, SEA 2016, St. Petersburg, Russia, June 5-8, 2016, Proceedings, pages 1–16, 2016. doi:10.1007/978-3-319-38851-9_1.
  7. Ricardo A. Baeza-Yates. Searching subsequences. Theor. Comput. Sci., 78(2):363–376, 1991.
  8. String matching with variable length gaps. Theor. Comput. Sci., 443:25–34, 2012. doi:10.1016/j.tcs.2012.03.029.
  9. Hans L. Bodlaender. A linear-time algorithm for finding tree-decompositions of small treewidth. SIAM J. Comput., 25(6):1305–1317, 1996. doi:10.1137/S0097539793251219.
  10. Hans L. Bodlaender. A partial k-arboretum of graphs with bounded treewidth. Theor. Comput. Sci., 209(1-2):1–45, 1998. doi:10.1016/S0304-3975(97)00228-4.
  11. Sketching, streaming, and fine-grained complexity of (weighted) LCS. In Proc. FSTTCS 2018, volume 122 of LIPIcs, pages 40:1–40:16, 2018.
  12. Multivariate fine-grained complexity of longest common subsequence. In Proc. SODA 2018, pages 1216–1235, 2018.
  13. Unshuffling a square is NP-hard. J. Comput. Syst. Sci., 80(4):766–776, 2014. doi:10.1016/j.jcss.2013.11.002.
  14. Fast indexes for gapped pattern matching. In SOFSEM 2020: Theory and Practice of Computer Science - 46th International Conference on Current Trends in Theory and Practice of Informatics, SOFSEM 2020, Limassol, Cyprus, January 20-24, 2020, Proceedings, pages 493–504, 2020. doi:10.1007/978-3-030-38919-2_40.
  15. Stephen A. Cook. The complexity of theorem-proving procedures. In Michael A. Harrison, Ranan B. Banerji, and Jeffrey D. Ullman, editors, Proceedings of the 3rd Annual ACM Symposium on Theory of Computing, May 3-5, 1971, Shaker Heights, Ohio, USA, pages 151–158. ACM, 1971. doi:10.1145/800157.805047.
  16. Introduction to Algorithms, 3rd Edition. MIT Press, 2009. URL: http://mitpress.mit.edu/books/introduction-algorithms.
  17. Pathwidth of outerplanar graphs. J. Graph Theory, 55(1):27–41, 2007. URL: https://doi.org/10.1002/jgt.20218, doi:10.1002/JGT.20218.
  18. Subsequences with gap constraints: Complexity bounds for matching and analysis problems. In 33rd International Symposium on Algorithms and Computation, ISAAC 2022, December 19-21, 2022, Seoul, Korea, pages 64:1–64:18, 2022. URL: https://doi.org/10.4230/LIPIcs.ISAAC.2022.64, doi:10.4230/LIPICS.ISAAC.2022.64.
  19. Crossing numbers and cutwidths. J. Graph Algorithms Appl., 7(3):245–251, 2003. URL: https://doi.org/10.7155/jgaa.00069, doi:10.7155/JGAA.00069.
  20. Construction of Aho Corasick automaton in linear time for integer alphabets. Inf. Process. Lett., 98(2):66–72, 2006. URL: https://doi.org/10.1016/j.ipl.2005.11.019, doi:10.1016/J.IPL.2005.11.019.
  21. Faster matrix multiplication via asymmetric hashing. CoRR, abs/2210.10173, 2022. arXiv:2210.10173, doi:10.48550/arXiv.2210.10173.
  22. Graph separation and search number. In Proc. 1983 Allerton Conf. on Communication, Control, and Computing, 1983.
  23. The vertex separation and search number of a graph. Inf. Comput., 113(1):50–79, 1994. URL: https://doi.org/10.1006/inco.1994.1064, doi:10.1006/INCO.1994.1064.
  24. Matching patterns with variables under Simon’s congruence. In Reachability Problems - 17th International Conference, RP 2023, Nice, France, October 11-13, 2023, Proceedings, pages 155–170, 2023. doi:10.1007/978-3-031-45286-4_12.
  25. Testing k-binomial equivalence. In Multidisciplinary Creativity, a collection of papers dedicated to G. Păun 65th birthday, pages 239–248, 2015. Available in CoRR abs/1509.00622.
  26. Puzzling over subsequence-query extensions: Disjunction and generalised gaps. In Proceedings of the 15th Alberto Mendelzon International Workshop on Foundations of Data Management (AMW 2023), Santiago de Chile, Chile, May 22-26, 2023, 2023. URL: https://ceur-ws.org/Vol-3409/paper3.pdf.
  27. Complex event recognition in the big data era: a survey. VLDB J., 29(1):313–352, 2020. doi:10.1007/s00778-019-00557-w.
  28. Decidability, complexity, and expressiveness of first-order logic over the subword ordering. In Proc. LICS 2017, pages 1–12, 2017.
  29. Algorithms for computing the longest parameterized common subsequence. In Combinatorial Pattern Matching, 18th Annual Symposium, CPM 2007, London, Canada, July 9-11, 2007, Proceedings, pages 265–273, 2007. doi:10.1007/978-3-540-73437-6_27.
  30. On the complexity of k-SAT. J. Comput. Syst. Sci., 62(2):367–375, 2001. doi:10.1006/jcss.2000.1727.
  31. On the index of Simon’s congruence for piecewise testability. Inf. Process. Lett., 115(4):515–519, 2015.
  32. The height of piecewise-testable languages with applications in logical complexity. In Proc. CSL 2016, volume 62 of LIPIcs, pages 37:1–37:22, 2016.
  33. The height of piecewise-testable languages and the complexity of the logic of subwords. Log. Methods Comput. Sci., 15(2), 2019.
  34. Richard M. Karp. Reducibility among combinatorial problems. In Raymond E. Miller and James W. Thatcher, editors, Proceedings of a symposium on the Complexity of Computer Computations, held March 20-22, 1972, at the IBM Thomas J. Watson Research Center, Yorktown Heights, New York, USA, The IBM Research Symposia Series, pages 85–103. Plenum Press, New York, 1972. doi:10.1007/978-1-4684-2001-2_9.
  35. Discovering event queries from traces: Laying foundations for subsequence-queries with wildcards and gap-size constraints. In 25th International Conference on Database Theory, ICDT 2022, 29th March-1st April, 2022 Edinburgh, UK, 2022.
  36. Discovering multi-dimensional subsequence queries from traces - from theory to practice. In Datenbanksysteme für Business, Technologie und Web (BTW 2023), 20. Fachtagung des GI-Fachbereichs ,,Datenbanken und Informationssysteme” (DBIS), 06.-10, März 2023, Dresden, Germany, Proceedings, pages 511–533, 2023. doi:10.18420/BTW2023-24.
  37. Subsequences in bounded ranges: Matching and analysis problems. In Anthony W. Lin, Georg Zetzsche, and Igor Potapov, editors, Reachability Problems - 16th International Conference, RP 2022, Kaiserslautern, Germany, October 17-21, 2022, Proceedings, volume 13608 of Lecture Notes in Computer Science, pages 140–159. Springer, 2022. doi:10.1007/978-3-031-19135-0_10.
  38. Combinatorial algorithms for subsequence matching: A survey. In Henning Bordihn, Géza Horváth, and György Vaszil, editors, Proceedings 12th International Workshop on Non-Classical Models of Automata and Applications, NCMA 2022, Debrecen, Hungary, August 26-27, 2022, volume 367 of EPTCS, pages 11–27, 2022. doi:10.4204/EPTCS.367.2.
  39. Dietrich Kuske. The subtrace order and counting first-order logic. In Proc. CSR 2020, volume 12159 of Lecture Notes in Computer Science, pages 289–302, 2020.
  40. Languages ordered by the subword order. In Proc. FOSSACS 2019, volume 11425 of Lecture Notes in Computer Science, pages 348–364, 2019.
  41. Computing the k-binomial complexity of the Thue-Morse word. In Proc. DLT 2019, volume 11647 of Lecture Notes in Computer Science, pages 278–291, 2019.
  42. Generalized Pascal triangle for binomial coefficients of words. Electron. J. Combin., 24(1.44):36 pp., 2017.
  43. Efficiently mining closed subsequences with gap constraints. In SDM, pages 313–322. SIAM, 2008.
  44. Efficient mining of gap-constrained subsequences and its various applications. ACM Trans. Knowl. Discov. Data, 6(1):2:1–2:39, 2012.
  45. David Maier. The complexity of some problems on subsequences and supersequences. J. ACM, 25(2):322–336, April 1978.
  46. Subword histories and Parikh matrices. J. Comput. Syst. Sci., 68(1):1–21, 2004.
  47. T.A.J. Nicholson. Permutation procedure for minimising the number of crossings in a network. Proceedings of the Institution of Electrical Engineers, 115:21–26(5), January 1968.
  48. Rohit J Parikh. Language generating devices. Quarterly Progress Report, 60:199–212, 1961.
  49. On the piecewise complexity of words and periodic words. In SOFSEM 2024: Theory and Practice of Computer Science - 48th International Conference on Current Trends in Theory and Practice of Computer Science, SOFSEM 2024, Cochem, Germany, February 19-23, 2024, Proceedings, pages 456–470, 2024. doi:10.1007/978-3-031-52113-3_32.
  50. William E. Riddle. An approach to software system modelling and analysis. Comput. Lang., 4(1):49–66, 1979. doi:10.1016/0096-0551(79)90009-2.
  51. Another generalization of abelian equivalence: Binomial complexity of infinite words. Theor. Comput. Sci., 601:47–57, 2015.
  52. Arto Salomaa. Connections between subwords and certain matrix mappings. Theoret. Comput. Sci., 340(2):188–203, 2005.
  53. On arch factorization and subword universality for words and compressed words. In Combinatorics on Words - 14th International Conference, WORDS 2023, Umeå, Sweden, June 12-16, 2023, Proceedings, pages 274–287, 2023. doi:10.1007/978-3-031-33180-0_21.
  54. Shinnosuke Seki. Absoluteness of subword inequality is undecidable. Theor. Comput. Sci., 418:116–120, 2012.
  55. Alan C. Shaw. Software descriptions with flow expressions. IEEE Trans. Software Eng., 4(3):242–254, 1978. doi:10.1109/TSE.1978.231501.
  56. Imre Simon. Hierarchies of events with dot-depth one — Ph.D. thesis. University of Waterloo, 1972.
  57. Imre Simon. Piecewise testable events. In Autom. Theor. Form. Lang., 2nd GI Conf., volume 33 of LNCS, pages 214–222, 1975.
  58. Manfred Wiegers. Recognizing outerplanar graphs in linear time. In Gottfried Tinhofer and Gunther Schmidt, editors, Graphtheoretic Concepts in Computer Science, International Workshop, WG ’86, Bernried, Germany, June 17-19, 1986, Proceedings, volume 246 of Lecture Notes in Computer Science, pages 165–176. Springer, 1986. doi:10.1007/3-540-17218-1_57.
  59. Dan E. Willard. Log-logarithmic worst-case range queries are possible in space theta(n). Inf. Process. Lett., 17(2):81–84, 1983. doi:10.1016/0020-0190(83)90075-3.
  60. Ryan Williams. A new algorithm for optimal 2-constraint satisfaction and its implications. Theor. Comput. Sci., 348(2-3):357–365, 2005. doi:10.1016/j.tcs.2005.09.023.
  61. Virginia Vassilevska Williams. On some fine-grained questions in algorithms and complexity, pages 3447–3487. URL: https://www.worldscientific.com/doi/abs/10.1142/9789813272880_0188, doi:10.1142/9789813272880_0188.
  62. Georg Zetzsche. The complexity of downward closure comparisons. In Proc. ICALP 2016, volume 55 of LIPIcs, pages 123:1–123:14, 2016.
  63. On complexity and optimization of expensive queries in complex event processing. In International Conference on Management of Data, SIGMOD 2014, Snowbird, UT, USA, June 22-27, 2014, pages 217–228, 2014. doi:10.1145/2588555.2593671.