
DECIDER: A Dual-System Rule-Controllable Decoding Framework for Language Generation (2403.01954v3)

Published 4 Mar 2024 in cs.CL, cs.AI, and cs.LO

Abstract: Constrained decoding approaches aim to control the meaning or style of text generated by a Pre-trained Language Model (PLM) using specific target words during inference. However, these methods often guide plausible continuations by greedily selecting targets, which, while completing the task, may disrupt the natural patterns of human language generation. In this work, we propose a novel decoding framework, DECIDER, which enables us to program rules on how we complete tasks to control a PLM. Differing from previous work, our framework transforms the encouragement of target words into the encouragement of all words that satisfy the rule. Specifically, DECIDER is a dual system in which a PLM is equipped with a First-Order Logic (FOL) reasoner to express and evaluate the rules, and a decision function merges the outputs from both systems to steer the generation. Experiments on CommonGen and PersonaChat demonstrate that DECIDER can effectively follow given rules to achieve generation tasks in a more human-like manner.
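The dual-system idea in the abstract, a PLM proposing next-token probabilities and a rule evaluator scoring every vocabulary word for rule satisfaction, with a decision function merging the two, can be sketched roughly as follows. This is a minimal illustrative sketch, not the paper's actual formulation: the toy vocabulary, the soft satisfaction scores, and the mixing weight `alpha` are all assumptions introduced here for clarity.

```python
# Hedged sketch of one dual-system decoding step in the spirit of DECIDER.
# A stand-in LM gives next-token probabilities; a rule evaluator scores
# every word for rule satisfaction; a decision function merges the two.
VOCAB = ["the", "dog", "runs", "cat", "sleeps"]

def lm_probs(prefix):
    # Placeholder for a pre-trained LM's next-token distribution
    # (uniform here; a real system would call the PLM).
    return {w: 1.0 / len(VOCAB) for w in VOCAB}

def rule_satisfaction(word, rule_words):
    # Soft FOL-style evaluation: full score if the word satisfies the
    # rule (e.g. relates to a target concept), else a small base score.
    return 1.0 if word in rule_words else 0.1

def decide(prefix, rule_words, alpha=0.5):
    """Merge LM probabilities with rule scores; return the best next token."""
    p = lm_probs(prefix)
    merged = {
        w: (1 - alpha) * p[w] + alpha * rule_satisfaction(w, rule_words) * p[w]
        for w in VOCAB
    }
    z = sum(merged.values())  # renormalize into a distribution
    merged = {w: v / z for w, v in merged.items()}
    return max(merged, key=merged.get)

print(decide("the", {"dog", "runs"}))
```

Note that, unlike greedy target-word forcing, the rule score here boosts every word that satisfies the rule rather than a single hard-coded target, which is the key distinction the abstract draws.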

Authors (11)
  1. Chen Xu
  2. Tian Lan
  3. Changlong Yu
  4. Wei Wang
  5. Jun Gao
  6. Yu Ji
  7. Qunxi Dong
  8. Kun Qian
  9. Piji Li
  10. Wei Bi
  11. Bin Hu
Citations (1)

