
Promptly Predicting Structures: The Return of Inference (2401.06877v3)

Published 12 Jan 2024 in cs.CL

Abstract: Prompt-based methods have been used extensively across NLP to build zero- and few-shot label predictors. Many NLP tasks are naturally structured: that is, their outputs consist of multiple labels which constrain each other. Annotating data for such tasks can be cumbersome. Can the promise of the prompt-based paradigm be extended to such structured outputs? In this paper, we present a framework for constructing zero- and few-shot linguistic structure predictors. Our key insight is that we can use structural constraints -- and combinatorial inference derived from them -- to filter out inconsistent structures predicted by LLMs. We instantiated this framework on two structured prediction tasks, and five datasets. Across all cases, our results show that enforcing consistency not only constructs structurally valid outputs, but also improves performance over the unconstrained variants.


Summary

  • The paper presents an innovative inference framework that integrates with prompt-based methods to ensure structurally valid predictions.
  • It demonstrates significant performance enhancements across tasks like Semantic Role Labeling and coreference resolution, reducing inconsistencies.
  • The approach leverages LLMs without task-specific training, offering a cost-effective and scalable solution for complex NLP structured prediction challenges.

Introduction to Structured Prediction

Structured prediction is a critical area in NLP aimed at decoding complex relations within data, such as parsing grammatical structures or understanding semantic roles in a sentence. The task requires models to make multiple, interrelated decisions, posing a challenge for accurate and efficient prediction models.
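To make the interdependence of labels concrete, here is a minimal Python sketch (not from the paper) using BIO tagging, a common structured-prediction output format: an `I-X` tag is only valid if it continues a `B-X` or `I-X` of the same type, so per-token decisions made in isolation can produce an invalid structure.

```python
def is_valid_bio(tags):
    """Check the BIO structural constraint: every I-X must directly
    follow a B-X or I-X with the same type X."""
    prev = "O"
    for tag in tags:
        if tag.startswith("I-"):
            entity = tag[2:]
            if prev not in (f"B-{entity}", f"I-{entity}"):
                return False  # an I- tag with no matching predecessor
        prev = tag
    return True

print(is_valid_bio(["B-ARG0", "I-ARG0", "O"]))  # → True (consistent)
print(is_valid_bio(["O", "I-ARG1", "O"]))       # → False (inconsistent)
```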

Prompt-Based Paradigm and LLMs

With the advent of LLMs, the prompt-based paradigm has gained popularity. In this method, an LLM generates output based on a textual prompt. The advantage of this approach lies in its versatility, as it can extend to a multitude of tasks using the same pre-trained models without the need for extensive labeled datasets. For zero-shot or few-shot applications—where a model sees few or no examples before it's put to use—prompt-based methods have shown promising results.
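The basic zero-shot pattern can be sketched as follows: the task is verbalized as a prompt, the LLM scores each candidate label, and the highest-scoring label wins. The sketch below is illustrative only; `score_with_llm` is a hypothetical placeholder standing in for a real model's log-probability scoring.

```python
def score_with_llm(prompt: str) -> float:
    # Hypothetical stand-in: a real system would return the LLM's
    # log-probability for this prompt continuation.
    return -len(prompt)  # placeholder heuristic, for illustration only

def zero_shot_classify(sentence: str, question: str, labels: list) -> str:
    """Verbalize each label as a prompt continuation and pick the
    highest-scoring one."""
    scores = {
        label: score_with_llm(f"{sentence}\nQ: {question}\nA: {label}")
        for label in labels
    }
    return max(scores, key=scores.get)
```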

However, the application of such methods to structured prediction has been underexplored. Previous attempts have used prompts to resolve individual components of a structure without considering their interdependence, leading to structurally invalid or inconsistent outputs.

The Return of Inference

The paper presents a framework that reintroduces inference into the prompt-based approach for structured prediction. The framework filters out inconsistent predictions by leveraging the constraints inherent in the structure of the target output, pairing combinatorial inference algorithms with LLM predictions to guarantee structurally valid outputs.

In experiments across various structured prediction tasks, the framework not only produced structurally valid outputs by enforcing consistency but also improved overall performance compared to unconstrained variants. Importantly, this approach requires no task-specific training, as it combines off-the-shelf LLMs with inference algorithms.
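The paper's inference step is combinatorial (its references include integer linear programming solvers); the toy sketch below conveys the core idea with brute force instead: score structure components (mock scores here stand in for LLM-derived scores), enumerate candidate combinations, discard those violating a structural constraint, and keep the best consistent one. The constraint used, "argument spans must not overlap," is SRL-flavored and chosen for illustration.

```python
from itertools import combinations

def spans_overlap(a, b):
    # Half-open intervals (start, end) overlap iff each starts
    # before the other ends.
    return a[0] < b[1] and b[0] < a[1]

def best_consistent_structure(candidates):
    """candidates: list of ((start, end), score) span proposals.
    Return the highest-scoring subset with no overlapping spans."""
    best, best_score = (), float("-inf")
    for r in range(len(candidates) + 1):
        for subset in combinations(candidates, r):
            spans = [s for s, _ in subset]
            if any(spans_overlap(a, b) for a, b in combinations(spans, 2)):
                continue  # filter out structurally invalid combinations
            total = sum(score for _, score in subset)
            if total > best_score:
                best, best_score = subset, total
    return best

proposals = [((0, 3), 2.0), ((2, 5), 1.5), ((5, 8), 1.0)]
print(best_consistent_structure(proposals))
```

A real system would replace the brute-force enumeration with an ILP or dynamic-programming solver, but the filtering principle is the same: inconsistent structures never survive to the output.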

Exemplary Results and Implications

The practical implications of this are profound. For instance, in Semantic Role Labeling—a task that determines the relationships and roles in a sentence—the framework provided considerable improvements in both consistency and performance metrics. Furthermore, for the intricate task of coreference resolution, which groups mentions of the same entity within a text, significant gains in accuracy were achieved.
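For coreference specifically, the relevant structural constraint is transitivity: if A corefers with B and B with C, then A must corefer with C, yet independent pairwise predictions can violate this. The sketch below (an illustration, not the paper's algorithm) repairs pairwise links into valid clusters by taking the transitive closure with union-find; the mock links stand in for real model predictions.

```python
def cluster_mentions(mentions, links):
    """Turn pairwise coreference links into transitively closed
    clusters using union-find."""
    parent = {m: m for m in mentions}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path compression
            x = parent[x]
        return x

    for a, b in links:
        parent[find(a)] = find(b)  # merge the two clusters

    clusters = {}
    for m in mentions:
        clusters.setdefault(find(m), set()).add(m)
    return sorted(map(frozenset, clusters.values()), key=len, reverse=True)

mentions = ["Obama", "he", "the president", "Paris"]
links = [("Obama", "he"), ("he", "the president")]  # A~B and B~C
print(cluster_mentions(mentions, links))
```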

This progress suggests that incorporating inference can push the boundaries of what is achievable with prompt-based, few-shot methods in structured prediction tasks. The research indicates that structurally valid outputs are possible even with limited exposure to task-specific data, presenting a cost-effective and time-efficient direction for future NLP endeavors.
