
GLL-based Context-Free Path Querying for Neo4j (2312.11925v1)

Published 19 Dec 2023 in cs.DB

Abstract: We propose a GLL-based context-free path querying algorithm that handles queries in Extended Backus-Naur Form (EBNF) using Recursive State Machines (RSM). Using EBNF allows one to combine traditional regular expressions and mutually recursive patterns in constraints natively. The proposed algorithm solves both the reachability-only and the all-paths problems for the all-pairs and the multiple-sources cases. Evaluation on real-world graphs demonstrates that using RSMs increases the performance of query evaluation. Implemented as a stored procedure for Neo4j, our solution performs better than a similar solution for RedisGraph. For regular path queries, the performance of our solution is comparable to that of the native Neo4j solution, and in some cases our solution requires significantly less memory.
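To make the reachability-only, all-pairs problem from the abstract concrete, here is a minimal sketch in Python. It is not the GLL/RSM algorithm the paper proposes: it swaps in a simple worklist closure over a grammar in binary normal form, purely to illustrate what a context-free path query computes. The function name `cfpq_reachability`, the rule encoding, and the sample graph are all assumptions made for this illustration.

```python
from collections import deque


def cfpq_reachability(edges, terminal_rules, binary_rules, start):
    """All-pairs reachability for a context-free path query (illustrative only).

    edges:          iterable of (source, label, target) triples
    terminal_rules: dict mapping an edge label to the nonterminals deriving it
    binary_rules:   dict mapping (B, C) to the set of heads A with rule A -> B C
    start:          the start nonterminal of the query grammar

    Returns the set of (u, v) such that some path from u to v spells a word
    derivable from `start`.
    """
    facts = set()    # derived facts (nonterminal, u, v)
    work = deque()   # facts not yet combined with the others

    def add(fact):
        if fact not in facts:
            facts.add(fact)
            work.append(fact)

    # Seed: every labeled edge yields a fact for each matching nonterminal.
    for u, label, v in edges:
        for nt in terminal_rules.get(label, ()):
            add((nt, u, v))

    # Closure: join each new fact with adjacent known facts via binary rules.
    while work:
        nt, u, v = work.popleft()
        for other, x, y in list(facts):
            if y == u:  # path x ~> u followed by u ~> v
                for head in binary_rules.get((other, nt), ()):
                    add((head, x, v))
            if x == v:  # path u ~> v followed by v ~> y
                for head in binary_rules.get((nt, other), ()):
                    add((head, u, y))

    return {(u, v) for nt, u, v in facts if nt == start}


# Hypothetical query: S -> a S b | a b, rewritten in binary form as
# S -> A B | A S1, S1 -> S B, A -> a, B -> b.
terminal_rules = {"a": {"A"}, "b": {"B"}}
binary_rules = {("A", "B"): {"S"}, ("A", "S1"): {"S"}, ("S", "B"): {"S1"}}

# Hypothetical graph: a chain 0 -a-> 1 -a-> 2 -b-> 3 -b-> 4.
edges = [(0, "a", 1), (1, "a", 2), (2, "b", 3), (3, "b", 4)]

pairs = cfpq_reachability(edges, terminal_rules, binary_rules, "S")
```

On this chain, `pairs` contains `(1, 3)` (the path labeled `a b`) and `(0, 4)` (the path labeled `a a b b`). The balanced-bracket language in this query is not regular, which is the kind of constraint that context-free path querying adds on top of regular path queries.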

