
Semgrex and Ssurgeon, Searching and Manipulating Dependency Graphs (2404.16250v1)

Published 24 Apr 2024 in cs.CL

Abstract: Searching dependency graphs and manipulating them can be a time-consuming and challenging task to get right. We document Semgrex, a system for searching dependency graphs, and introduce Ssurgeon, a system for manipulating the output of Semgrex. The compact language used by these systems allows for easy command-line or API processing of dependencies. Additionally, integration with publicly released toolkits in Java and Python allows for searching text relations and attributes over natural text.
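
To make the search-then-manipulate idea in the abstract concrete, here is a minimal, self-contained Python sketch. It is not Semgrex or Ssurgeon and does not use their pattern language or API: the `Edge` class, the `search` and `relabel` helpers, the example sentence, and the relabeling from `obj` to `dobj` are all hypothetical, chosen only to show one step that finds edges in a dependency graph and a second step that edits what was found.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Edge:
    """One labeled dependency edge (hypothetical structure, not the Semgrex one)."""
    governor: int   # index of the head word
    dependent: int  # index of the dependent word
    relation: str   # dependency label, e.g. "nsubj"


# Toy graph for the sentence "Dogs chase cats"; "chase" is the head of both edges.
words = ["Dogs", "chase", "cats"]
edges = [Edge(1, 0, "nsubj"), Edge(1, 2, "obj")]


def search(edges, relation):
    """Search step: return every edge whose label equals `relation`."""
    return [e for e in edges if e.relation == relation]


def relabel(edges, matches, new_relation):
    """Edit step: return a new edge list with the matched edges relabeled."""
    matched = set(matches)
    return [
        Edge(e.governor, e.dependent, new_relation) if e in matched else e
        for e in edges
    ]


matches = search(edges, "obj")
edited = relabel(edges, matches, "dobj")

for e in edited:
    print(words[e.governor], f"-{e.relation}->", words[e.dependent])
```

The edit step consumes the output of the search step rather than re-scanning the graph, which mirrors the abstract's description of Ssurgeon as operating on the output of Semgrex. The real systems express both steps in a compact pattern language instead of Python functions.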

