Semgrex and Ssurgeon, Searching and Manipulating Dependency Graphs (2404.16250v1)
Published 24 Apr 2024 in cs.CL
Abstract: Searching and manipulating dependency graphs can be time-consuming and difficult to get right. We document Semgrex, a system for searching dependency graphs, and introduce Ssurgeon, a system for manipulating the output of Semgrex. The compact language used by these systems allows for easy command-line or API processing of dependencies. Additionally, integration with publicly released toolkits in Java and Python allows for searching text relations and attributes over natural text.
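To make the idea of searching a dependency graph concrete, here is a minimal Python sketch. It is not Semgrex and does not use its API or pattern language: the `Edge` type, the `find_relation` helper, and the hand-built toy graph are all hypothetical names introduced only for illustration. It shows the kind of query such a system answers, namely finding governor/dependent pairs joined by a given relation, optionally restricted by an attribute of the governor.

```python
from typing import NamedTuple


class Edge(NamedTuple):
    governor: int   # 1-based index of the head word
    relation: str   # dependency label, e.g. "nsubj"
    dependent: int  # 1-based index of the dependent word


# Hand-built toy graph for "The dog chased the cat" (an assumption, not parser output).
WORDS = ["The", "dog", "chased", "the", "cat"]
TAGS = ["DT", "NN", "VBD", "DT", "NN"]
EDGES = [
    Edge(2, "det", 1),
    Edge(3, "nsubj", 2),
    Edge(3, "obj", 5),
    Edge(5, "det", 4),
]


def find_relation(edges, tags, relation, governor_tag_prefix=""):
    """Return (governor, dependent) index pairs joined by `relation`,
    keeping only governors whose POS tag starts with the given prefix."""
    return [
        (e.governor, e.dependent)
        for e in edges
        if e.relation == relation
        and tags[e.governor - 1].startswith(governor_tag_prefix)
    ]


# Hypothetical query: verbs that govern a nominal subject.
matches = find_relation(EDGES, TAGS, "nsubj", governor_tag_prefix="VB")
for gov, dep in matches:
    print(f"{WORDS[gov - 1]} -nsubj-> {WORDS[dep - 1]}")
```

A dedicated query language replaces hand-written filters like this one with a compact pattern, which is the convenience the abstract describes.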