
Enhanced Graph Pattern Matching (2402.16205v1)

Published 25 Feb 2024 in cs.DS

Abstract: Pattern matching queries on strings can be solved in linear time by the Knuth-Morris-Pratt (KMP) algorithm. In 1973, Weiner introduced the suffix tree of a string [FOCS 1973] and showed that the seemingly more difficult problem of computing matching statistics can also be solved in linear time. Pattern matching queries on graphs are inherently more difficult: under the Orthogonal Vectors hypothesis, the graph pattern matching problem cannot be solved in subquadratic time [TALG 2023]. The complexity of graph pattern matching can be parameterized by the topological complexity of the considered graph, which is captured by a parameter $p$ [JACM 2023]. In this paper, we show that, as in the string setting, computing matching statistics on graphs is as difficult as solving standard pattern matching queries. To this end, we introduce a notion of longest common prefix (LCP) array for arbitrary graphs.
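As a point of reference for the string setting mentioned in the abstract, the following is a minimal Python sketch, not taken from the paper. It shows KMP pattern matching and one common formulation of matching statistics, in which entry `i` is the length of the longest prefix of `pattern[i:]` that occurs somewhere in `text`. The function names are illustrative assumptions, the paper's own definition may differ in orientation, and the matching-statistics routine is deliberately naive rather than the linear-time suffix-tree method.

```python
def prefix_function(pattern):
    """KMP failure table: fail[i] is the length of the longest proper
    prefix of pattern[:i + 1] that is also a suffix of it."""
    fail = [0] * len(pattern)
    k = 0
    for i in range(1, len(pattern)):
        while k > 0 and pattern[i] != pattern[k]:
            k = fail[k - 1]
        if pattern[i] == pattern[k]:
            k += 1
        fail[i] = k
    return fail


def kmp_find_all(text, pattern):
    """Return the start positions of all occurrences of pattern in text."""
    if not pattern:
        return list(range(len(text) + 1))
    fail = prefix_function(pattern)
    hits, k = [], 0
    for i, ch in enumerate(text):
        while k > 0 and ch != pattern[k]:
            k = fail[k - 1]
        if ch == pattern[k]:
            k += 1
        if k == len(pattern):
            hits.append(i - k + 1)
            k = fail[k - 1]
    return hits


def matching_statistics_naive(text, pattern):
    """ms[i] = length of the longest prefix of pattern[i:] occurring in text.
    Naive on purpose (not linear time); it only illustrates the definition."""
    ms = []
    for i in range(len(pattern)):
        length = 0
        while i + length < len(pattern) and pattern[i:i + length + 1] in text:
            length += 1
        ms.append(length)
    return ms
```

For instance, `kmp_find_all("abababa", "aba")` reports the overlapping occurrences at positions 0, 2 and 4. A pattern occurs in the text exactly when the first matching-statistics entry equals the pattern length, which is why matching statistics are at least as hard as plain pattern matching.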

References (72)
  1. Subtree isomorphism revisited. ACM Trans. Algorithms, 14(3), jun 2018. doi:10.1145/3093239.
  2. Subtree Isomorphism Revisited, pages 1256–1271. URL: https://epubs.siam.org/doi/abs/10.1137/1.9781611974331.ch88, arXiv:https://epubs.siam.org/doi/pdf/10.1137/1.9781611974331.ch88, doi:10.1137/1.9781611974331.ch88.
  3. Tight hardness results for LCS and other sequence similarity measures. In 2015 IEEE 56th Annual Symposium on Foundations of Computer Science, pages 59–78, 2015. doi:10.1109/FOCS.2015.14.
  4. Simulating branching programs with edit distance and friends: or: a polylog shaved is a lower bound made. In Proceedings of the Forty-Eighth Annual ACM Symposium on Theory of Computing, STOC ’16, page 375–388, New York, NY, USA, 2016. Association for Computing Machinery. doi:10.1145/2897518.2897653.
  5. Popular conjectures imply strong lower bounds for dynamic problems. In 2014 IEEE 55th Annual Symposium on Foundations of Computer Science, pages 434–443, 2014. doi:10.1109/FOCS.2014.53.
  6. Consequences of faster alignment of sequences. In Javier Esparza, Pierre Fraigniaud, Thore Husfeldt, and Elias Koutsoupias, editors, Automata, Languages, and Programming, pages 39–51, Berlin, Heidelberg, 2014. Springer Berlin Heidelberg.
  7. Practical compressed suffix trees. Algorithms, 6(2):319–351, 2013. URL: https://www.mdpi.com/1999-4893/6/2/319, doi:10.3390/a6020319.
  8. The Burrows-Wheeler Transform: Data Compression, Suffix Arrays, and Pattern Matching. Springer Publishing Company, Incorporated, 1 edition, 2008.
  9. Tatsuya Akutsu. A linear time pattern matching algorithm between a string and a tree. In Alberto Apostolico, Maxime Crochemore, Zvi Galil, and Udi Manber, editors, Combinatorial Pattern Matching, pages 1–10, Berlin, Heidelberg, 1993. Springer Berlin Heidelberg.
  10. Pattern matching in hypertext. Journal of Algorithms, 35(1):82–99, 2000. URL: https://www.sciencedirect.com/science/article/pii/S0196677499910635, doi:10.1006/jagm.1999.1063.
  11. Alberto Apostolico. The myriad virtues of subword trees. In Alberto Apostolico and Zvi Galil, editors, Combinatorial Algorithms on Words, pages 85–96, Berlin, Heidelberg, 1985. Springer Berlin Heidelberg.
  12. Edit distance cannot be computed in strongly subquadratic time (unless SETH is false). In Proceedings of the Forty-Seventh Annual ACM Symposium on Theory of Computing, STOC ’15, page 51–58, New York, NY, USA, 2015. Association for Computing Machinery. doi:10.1145/2746539.2746612.
  13. Which regular expression patterns are hard to match? In 2016 IEEE 57th Annual Symposium on Foundations of Computer Science (FOCS), pages 457–466, 2016. doi:10.1109/FOCS.2016.56.
  14. Edit distance cannot be computed in strongly subquadratic time (unless SETH is false). SIAM Journal on Computing, 47(3):1087–1097, 2018. arXiv:https://doi.org/10.1137/15M1053128, doi:10.1137/15M1053128.
  15. Brenda S. Baker. A theory of parameterized pattern matching: algorithms and applications. In Proceedings of the Twenty-Fifth Annual ACM Symposium on Theory of Computing, STOC ’93, page 71–80, New York, NY, USA, 1993. Association for Computing Machinery. doi:10.1145/167088.167115.
  16. Optimal wheeler language recognition. In Franco Maria Nardini, Nadia Pisanti, and Rossano Venturini, editors, String Processing and Information Retrieval, pages 62–74, Cham, 2023. Springer Nature Switzerland.
  17. The Fine-Grained Complexity of Episode Matching. In Hideo Bannai and Jan Holub, editors, 33rd Annual Symposium on Combinatorial Pattern Matching (CPM 2022), volume 223 of Leibniz International Proceedings in Informatics (LIPIcs), pages 4:1–4:12, Dagstuhl, Germany, 2022. Schloss Dagstuhl – Leibniz-Zentrum für Informatik. URL: https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.CPM.2022.4, doi:10.4230/LIPIcs.CPM.2022.4.
  18. Karl Bringmann. Why walking the dog takes time: Fréchet distance has no strongly subquadratic algorithms unless SETH fails. In Proceedings of the 2014 IEEE 55th Annual Symposium on Foundations of Computer Science, FOCS ’14, page 661–670, USA, 2014. IEEE Computer Society. doi:10.1109/FOCS.2014.76.
  19. Quadratic conditional lower bounds for string problems and dynamic time warping. In 2015 IEEE 56th Annual Symposium on Foundations of Computer Science, pages 79–97, 2015. doi:10.1109/FOCS.2015.15.
  20. M. Burrows and D. J. Wheeler. A block-sorting lossless data compression algorithm. Technical report, 1994.
  21. Simple deterministic wildcard matching. Information Processing Letters, 101(2):53–54, 2007. URL: https://www.sciencedirect.com/science/article/pii/S002001900600250X, doi:10.1016/j.ipl.2006.08.002.
  22. Verifying candidate matches in sparse and wildcard matching. In Proceedings of the Thiry-Fourth Annual ACM Symposium on Theory of Computing, STOC ’02, page 592–601, New York, NY, USA, 2002. Association for Computing Machinery. doi:10.1145/509907.509992.
  23. Tree pattern matching to subset matching in linear time. SIAM Journal on Computing, 32(4):1056–1066, 2003. arXiv:https://doi.org/10.1137/S0097539700382704, doi:10.1137/S0097539700382704.
  24. Co-lexicographically ordering automata and regular languages - part i. J. ACM, 70(4), aug 2023. doi:10.1145/3607471.
  25. On Indexing and Compressing Finite Automata, pages 2585–2599. URL: https://epubs.siam.org/doi/abs/10.1137/1.9781611976465.153, arXiv:https://epubs.siam.org/doi/pdf/10.1137/1.9781611976465.153, doi:10.1137/1.9781611976465.153.
  26. 125 Problems in Text Algorithms: with Solutions. Cambridge University Press, 2021.
  27. Faster tree pattern matching. In Proceedings [1990] 31st Annual Symposium on Foundations of Computer Science, pages 145–150 vol.1, 1990. doi:10.1109/FSCS.1990.89533.
  28. Faster tree pattern matching. J. ACM, 41(2):205–213, mar 1994. doi:10.1145/174652.174653.
  29. Tight Conditional Lower Bounds for Longest Common Increasing Subsequence. In Daniel Lokshtanov and Naomi Nishimura, editors, 12th International Symposium on Parameterized and Exact Computation (IPEC 2017), volume 89 of Leibniz International Proceedings in Informatics (LIPIcs), pages 15:1–15:13, Dagstuhl, Germany, 2018. Schloss Dagstuhl – Leibniz-Zentrum für Informatik. URL: https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.IPEC.2017.15, doi:10.4230/LIPIcs.IPEC.2017.15.
  30. Tight conditional lower bounds for longest common increasing subsequence. Algorithmica, 81(10):3968–3992, 2019. URL: https://doi.org/10.1007/s00453-018-0485-7, doi:10.1007/S00453-018-0485-7.
  31. On the Complexity of String Matching for Graphs. In Christel Baier, Ioannis Chatzigiannakis, Paola Flocchini, and Stefano Leonardi, editors, 46th International Colloquium on Automata, Languages, and Programming (ICALP 2019), volume 132 of Leibniz International Proceedings in Informatics (LIPIcs), pages 55:1–55:15, Dagstuhl, Germany, 2019. Schloss Dagstuhl – Leibniz-Zentrum für Informatik. URL: https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ICALP.2019.55, doi:10.4230/LIPIcs.ICALP.2019.55.
  32. Graphs cannot be indexed in polynomial time for sub-quadratic time string matching, unless SETH fails. In Tomáš Bureš, Riccardo Dondi, Johann Gamper, Giovanna Guerrini, Tomasz Jurdziński, Claus Pahl, Florian Sikora, and Prudence W.H. Wong, editors, SOFSEM 2021: Theory and Practice of Computer Science, pages 608–622, Cham, 2021. Springer International Publishing.
  33. Graphs cannot be indexed in polynomial time for sub-quadratic time string matching, unless SETH fails. Theor. Comput. Sci., 975:114128, 2023. URL: https://doi.org/10.1016/j.tcs.2023.114128, doi:10.1016/J.TCS.2023.114128.
  34. On the complexity of string matching for graphs. ACM Trans. Algorithms, 19(3), apr 2023. doi:10.1145/3588334.
  35. P. Ferragina and G. Manzini. Opportunistic data structures with applications. In Proceedings 41st Annual Symposium on Foundations of Computer Science, pages 390–398, 2000. doi:10.1109/SFCS.2000.892127.
  36. Indexing compressed text. J. ACM, 52(4):552–581, jul 2005. doi:10.1145/1082036.1082039.
  37. Johannes Fischer. Wee LCP. Information Processing Letters, 110(8):317–320, 2010. URL: https://www.sciencedirect.com/science/article/pii/S002001901000044X, doi:10.1016/j.ipl.2010.02.010.
  38. An(other) entropy-bounded compressed suffix tree. In Paolo Ferragina and Gad M. Landau, editors, Combinatorial Pattern Matching, pages 152–165, Berlin, Heidelberg, 2008. Springer Berlin Heidelberg.
  39. Faster entropy-bounded compressed suffix trees. Theoretical Computer Science, 410(51):5354–5364, 2009. Combinatorial Pattern Matching. URL: https://www.sciencedirect.com/science/article/pii/S0304397509006392, doi:10.1016/j.tcs.2009.09.012.
  40. String-matching and other products. Technical report, USA, 1974.
  41. Optimal-Time Text Indexing in BWT-runs Bounded Space, pages 1459–1477. URL: https://epubs.siam.org/doi/abs/10.1137/1.9781611975031.96, arXiv:https://epubs.siam.org/doi/pdf/10.1137/1.9781611975031.96, doi:10.1137/1.9781611975031.96.
  42. Fully functional suffix trees and optimal text searching in BWT-runs bounded space. J. ACM, 67(1), jan 2020. doi:10.1145/3375890.
  43. Daniel Gibney. An efficient elastic-degenerate text index? not likely. In Christina Boucher and Sharma V. Thankachan, editors, String Processing and Information Retrieval, pages 76–88, Cham, 2020. Springer International Publishing.
  44. Compressed suffix trees: Efficient computation and storage of LCP-values. ACM J. Exp. Algorithmics, 18, may 2013. doi:10.1145/2444016.2461327.
  45. Compressed suffix arrays and suffix trees with applications to text indexing and string matching (extended abstract). In Proceedings of the Thirty-Second Annual ACM Symposium on Theory of Computing, STOC ’00, page 397–406, New York, NY, USA, 2000. Association for Computing Machinery. doi:10.1145/335305.335351.
  46. Compressed suffix arrays and suffix trees with applications to text indexing and string matching. SIAM Journal on Computing, 35(2):378–407, 2005. arXiv:https://doi.org/10.1137/S0097539702402354, doi:10.1137/S0097539702402354.
  47. Dan Gusfield. Algorithms on Strings, Trees, and Sequences: Computer Science and Computational Biology. Cambridge University Press, 1997.
  48. Pattern matching in trees. J. ACM, 29(1):68–95, 1982. doi:10.1145/322290.322295.
  49. P. Indyk. Deterministic superimposed coding with applications to pattern matching. In Proceedings 38th Annual Symposium on Foundations of Computer Science, pages 127–136, 1997. doi:10.1109/SFCS.1997.646101.
  50. P. Indyk. Faster algorithms for string matching problems: matching the convolution bound. In Proceedings 39th Annual Symposium on Foundations of Computer Science (Cat. No.98CB36280), pages 166–173, 1998. doi:10.1109/SFCS.1998.743440.
  51. Adam Kalai. Efficient pattern-matching with don’t cares. In Proceedings of the Thirteenth Annual ACM-SIAM Symposium on Discrete Algorithms, SODA ’02, page 655–656, USA, 2002. Society for Industrial and Applied Mathematics.
  52. Fast pattern matching in strings. SIAM Journal on Computing, 6(2):323–350, 1977. arXiv:https://doi.org/10.1137/0206024, doi:10.1137/0206024.
  53. Longest common substring with approximately k mismatches. Algorithmica, 81(6):2633–2652, 2019. URL: https://doi.org/10.1007/s00453-019-00548-x, doi:10.1007/S00453-019-00548-X.
  54. S.R. Kosaraju. Efficient tree pattern matching. In 30th Annual Symposium on Foundations of Computer Science, pages 178–183, 1989. doi:10.1109/SFCS.1989.63475.
  55. Genome-Scale Algorithm Design: Bioinformatics in the Era of High-Throughput Sequencing (2nd edition). Cambridge University Press, 2023. URL: http://www.genome-scale.info/.
  56. Suffix arrays: a new method for on-line string searches. In Proceedings of the First Annual ACM-SIAM Symposium on Discrete Algorithms, SODA ’90, page 319–327, USA, 1990. Society for Industrial and Applied Mathematics.
  57. Suffix arrays: A new method for on-line string searches. SIAM Journal on Computing, 22(5):935–948, 1993. arXiv:https://doi.org/10.1137/0222058, doi:10.1137/0222058.
  58. Approximate string matching with arbitrary costs for text and hypertext. In Advances In Structural And Syntactic Pattern Recognition, pages 22–33. World Scientific, 1992.
  59. Space efficient suffix trees. Journal of Algorithms, 39(2):205–222, 2001. URL: https://www.sciencedirect.com/science/article/pii/S0196677400911519, doi:10.1006/jagm.2000.1151.
  60. Gonzalo Navarro. Improved approximate pattern matching on hypertext. Theor. Comput. Sci., 237(1–2):455–463, apr 2000. doi:10.1016/S0304-3975(99)00333-3.
  61. Gonzalo Navarro. Compact Data Structures: A Practical Approach. Cambridge University Press, 2016.
  62. Compressed full-text indexes. ACM Comput. Surv., 39(1):2–es, apr 2007. doi:10.1145/1216370.1216372.
  63. Gonzalo Navarro and Luís M. S. Russo. Fast fully-compressed suffix trees. In 2014 Data Compression Conference, pages 283–291, 2014. doi:10.1109/DCC.2014.40.
  64. Computing matching statistics and maximal exact matches on compressed full-text indexes. In Edgar Chavez and Stefano Lonardi, editors, String Processing and Information Retrieval, pages 347–358, Berlin, Heidelberg, 2010. Springer Berlin Heidelberg.
  65. String matching in hypertext. In Zvi Galil and Esko Ukkonen, editors, Combinatorial Pattern Matching, pages 318–329, Berlin, Heidelberg, 1995. Springer Berlin Heidelberg.
  66. Mihai Patrascu. Towards polynomial lower bounds for dynamic problems. In Proceedings of the Forty-Second ACM Symposium on Theory of Computing, STOC ’10, page 603–610, New York, NY, USA, 2010. Association for Computing Machinery. doi:10.1145/1806689.1806772.
  67. Adam Polak. Why is it hard to beat $O(n^2)$ for longest common weakly increasing subsequence? Information Processing Letters, 132:1–5, 2018. URL: https://www.sciencedirect.com/science/article/pii/S0020019017302016, doi:10.1016/j.ipl.2017.11.007.
  68. Aligning sequences to general graphs in $O(V + mE)$ time. bioRxiv, 2017. URL: https://www.biorxiv.org/content/early/2017/11/08/216127, arXiv:https://www.biorxiv.org/content/early/2017/11/08/216127.full.pdf, doi:10.1101/216127.
  69. Fully-compressed suffix trees. In Eduardo Sany Laber, Claudson Bornstein, Loana Tito Nogueira, and Luerbio Faria, editors, LATIN 2008: Theoretical Informatics, pages 362–373, Berlin, Heidelberg, 2008. Springer Berlin Heidelberg.
  70. Fully compressed suffix trees. ACM Trans. Algorithms, 7(4), sep 2011. doi:10.1145/2000807.2000821.
  71. Kunihiko Sadakane. Succinct representations of LCP information and improvements in the compressed suffix arrays. In Proceedings of the Thirteenth Annual ACM-SIAM Symposium on Discrete Algorithms, SODA ’02, page 225–232, USA, 2002. Society for Industrial and Applied Mathematics.
  72. Kunihiko Sadakane. Compressed suffix trees with full functionality. Theory Comput. Syst., 41(4):589–607, 2007. URL: https://doi.org/10.1007/s00224-006-1198-x, doi:10.1007/S00224-006-1198-X.
