OpenPI2.0: An Improved Dataset for Entity Tracking in Texts (2305.14603v2)
Abstract: Much text describes a changing world (e.g., procedures, stories, newswires), and understanding it requires tracking how entities change. An earlier dataset, OpenPI, provided crowdsourced annotations of entity state changes in text. However, a major limitation was that those annotations were free-form and did not identify salient changes, hampering model evaluation. To overcome these limitations, we present an improved dataset, OpenPI2.0, where entities and attributes are fully canonicalized and additional entity salience annotations are added. In our fairer evaluation setting, we find that current state-of-the-art LLMs are far from competent. We also show that using state changes of salient entities as a chain-of-thought prompt improves downstream performance on tasks such as question answering and classical planning, outperforming the setting that involves all related entities indiscriminately. We offer OpenPI2.0 for the continued development of models that can understand the dynamics of entities in text.