
Constraints First: A New MDD-based Model to Generate Sentences Under Constraints (2309.12415v1)

Published 21 Sep 2023 in cs.AI and cs.CL

Abstract: This paper introduces a new approach to generating strongly constrained texts. We consider standardized sentence generation for a typical application, vision screening. To solve this problem, we formalize it as a discrete combinatorial optimization problem and use multivalued decision diagrams (MDDs), a well-known data structure for handling constraints. In our context, one key strength of MDDs is that they compute an exhaustive set of solutions without performing any search. Once the sentences are obtained, we apply an LLM (GPT-2) to keep the best ones. We detail this for English and also for French, where the agreement and conjugation rules are known to be more complex. Finally, with the help of GPT-2, we obtain hundreds of bona fide candidate sentences. Compared with the few dozen sentences usually available in the well-known vision screening test (MNREAD), this represents a major breakthrough in the field of standardized sentence generation. Because the approach can easily be adapted to other languages, it also has the potential to make the MNREAD test even more valuable and usable. More generally, this paper highlights MDDs as a convincing alternative for constrained text generation, especially when the constraints are hard to satisfy, and also for many other applications.
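To make the idea of enumerating all solutions without search more concrete, here is a minimal Python sketch of a layered decision diagram over word sequences. It is not the authors' implementation: the toy vocabulary, the single constraint (a fixed number of words and an exact character count including spaces), and the names `build_mdd` and `enumerate_sentences` are all assumptions chosen for illustration. Nodes in the same layer that share a state (characters used so far) are merged, arcs that cannot reach the target are pruned, and every remaining path is a valid sentence. The GPT-2 ranking step mentioned in the abstract is left out.

```python
from collections import defaultdict


def build_mdd(vocab, n_words, total_chars):
    """Build a layered diagram; layers[i] maps a state to its outgoing arcs.

    A state is the number of characters used so far (spaces included).
    Each arc is a (word, next_state) pair. Hypothetical toy constraint:
    exactly n_words words and exactly total_chars characters.
    """
    layers = []
    frontier = {0}
    for i in range(n_words):
        arcs = defaultdict(list)
        next_frontier = set()
        for state in frontier:
            for word in vocab:
                # One separating space before every word except the first.
                nxt = state + len(word) + (1 if i > 0 else 0)
                if nxt <= total_chars:
                    arcs[state].append((word, nxt))
                    next_frontier.add(nxt)
        layers.append(arcs)
        frontier = next_frontier

    # Backward pass: drop arcs and nodes that cannot reach the target state.
    alive = {total_chars}
    for arcs in reversed(layers):
        new_alive = set()
        for state in list(arcs):
            kept = [(w, nxt) for (w, nxt) in arcs[state] if nxt in alive]
            if kept:
                arcs[state] = kept
                new_alive.add(state)
            else:
                del arcs[state]
        alive = new_alive
    return layers


def enumerate_sentences(layers, state=0, depth=0):
    """Yield every root-to-terminal path as a list of words.

    After pruning, every arc lies on at least one valid path, so this
    traversal never hits a dead end and needs no backtracking search.
    """
    if depth == len(layers):
        yield []
        return
    for word, nxt in layers[depth].get(state, []):
        for rest in enumerate_sentences(layers, nxt, depth + 1):
            yield [word] + rest


if __name__ == "__main__":
    toy_vocab = ["a", "on", "cat", "sat", "the"]  # assumed toy lexicon
    mdd = build_mdd(toy_vocab, n_words=3, total_chars=11)
    for words in enumerate_sentences(mdd):
        print(" ".join(words))
```

In a fuller pipeline, each enumerated sentence would then be scored by a language model and only the best candidates kept, as the abstract describes; linguistic constraints such as agreement would need richer states than the character count used here.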
