
Learning to Solve Constraint Satisfaction Problems with Recurrent Transformer (2307.04895v1)

Published 10 Jul 2023 in cs.AI and cs.LG

Abstract: Constraint satisfaction problems (CSPs) are about finding values of variables that satisfy the given constraints. We show that Transformer extended with recurrence is a viable approach to learning to solve CSPs in an end-to-end manner, having clear advantages over state-of-the-art methods such as Graph Neural Networks, SATNet, and some neuro-symbolic models. With the ability of Transformer to handle visual input, the proposed Recurrent Transformer can straightforwardly be applied to visual constraint reasoning problems while successfully addressing the symbol grounding problem. We also show how to leverage deductive knowledge of discrete constraints in the Transformer's inductive learning to achieve sample-efficient learning and semi-supervised learning for CSPs.

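As a rough illustration of the recurrence idea described in the abstract, below is a minimal sketch, assuming a PyTorch implementation with illustrative names and hyperparameters (RecurrentTransformerSolver, d_model, num_steps, and so on); it is not the authors' published architecture. It applies one shared Transformer encoder layer repeatedly over the cell embeddings of a 9x9 Sudoku board and reads out a digit distribution per cell at every step.

```python
# Minimal sketch (not the paper's exact model): a single Transformer encoder
# layer is reused at every reasoning step ("recurrence") over 81 Sudoku cells.
import torch
import torch.nn as nn

class RecurrentTransformerSolver(nn.Module):
    def __init__(self, num_values=9, num_cells=81, d_model=128, num_steps=16):
        super().__init__()
        self.num_steps = num_steps
        # Value 0 is reserved for "empty cell"; 1..num_values are given digits.
        self.value_embed = nn.Embedding(num_values + 1, d_model)
        self.pos_embed = nn.Embedding(num_cells, d_model)
        layer = nn.TransformerEncoderLayer(
            d_model=d_model, nhead=8, dim_feedforward=4 * d_model,
            batch_first=True)
        # One shared layer applied repeatedly = the recurrent part.
        self.shared_layer = nn.TransformerEncoder(layer, num_layers=1)
        self.readout = nn.Linear(d_model, num_values)

    def forward(self, board):
        # board: (batch, 81) integers in {0, ..., 9}, where 0 means empty.
        positions = torch.arange(board.size(1), device=board.device)
        h = self.value_embed(board) + self.pos_embed(positions)
        logits_per_step = []
        for _ in range(self.num_steps):
            h = self.shared_layer(h)              # same weights at every step
            logits_per_step.append(self.readout(h))
        return logits_per_step                    # supervise all steps or the last

# Usage sketch: cross-entropy against the solved board, averaged over steps.
model = RecurrentTransformerSolver()
board = torch.randint(0, 10, (2, 81))             # dummy puzzles
target = torch.randint(0, 9, (2, 81))             # dummy solutions (0..8 = digits 1..9)
loss = sum(nn.functional.cross_entropy(l.reshape(-1, 9), target.reshape(-1))
           for l in model(board)) / model.num_steps
loss.backward()
```

The abstract's point about leveraging deductive knowledge of discrete constraints would correspond to adding a constraint-derived loss term on these per-cell distributions (for example, penalizing duplicate digits within a row, column, or box), which is omitted from this sketch.
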
References (33)
  1. CLR-DRNets: Curriculum learning with restarts to solve visual combinatorial games. In 27th International Conference on Principles and Practice of Constraint Programming (CP 2021). Schloss Dagstuhl-Leibniz-Zentrum für Informatik, 2021.
  2. Graph transformer for graph-to-sequence learning. In Proceedings of the AAAI Conference on Artificial Intelligence, volume 34, pp. 7464–7471, 2020.
  3. Assessing SATNet’s ability to solve the symbol grounding problem. Advances in Neural Information Processing Systems, 33:1428–1439, 2020.
  4. Deep reasoning networks: Thinking fast and slow. arXiv preprint arXiv:1906.00855, 2019.
  5. BinaryConnect: Training deep neural networks with binary weights during propagations. In Proceedings of the 28th International Conference on Neural Information Processing Systems - Volume 2, pp. 3123–3131, 2015.
  6. Selection-inference: Exploiting large language models for interpretable logical reasoning. arXiv preprint arXiv:2205.09712, 2022.
  7. Transformer-XL: Attentive language models beyond a fixed-length context. arXiv preprint arXiv:1901.02860, 2019.
  8. Universal transformers. arXiv preprint arXiv:1807.03819, 2018.
  9. An image is worth 16x16 words: Transformers for image recognition at scale. arXiv preprint arXiv:2010.11929, 2020.
  10. A generalization of transformer networks to graphs. AAAI Workshop on Deep Learning on Graphs: Methods and Applications, 2021.
  11. Multi-modal transformer for video retrieval. In European Conference on Computer Vision, pp. 214–229. Springer, 2020.
  12. A new model for learning in graph domains. In Proceedings of the 2005 IEEE International Joint Conference on Neural Networks, volume 2, pp. 729–734. IEEE, 2005.
  13. Modeling recurrence for transformer. arXiv preprint arXiv:1904.03092, 2019.
  14. Reasoning with transformer-based models: Deep learning, but shallow reasoning. In 3rd Conference on Automated Knowledge Base Construction, 2021.
  15. Heterogeneous graph transformer. In Proceedings of The Web Conference 2020, pp. 2704–2710, 2020.
  16. Semi-supervised classification with graph convolutional networks. In Proceedings of the 5th International Conference on Learning Representations, ICLR 2017, 2017.
  17. Gradient-based learning applied to document recognition. Proceedings of the IEEE, 86(11):2278–2324, 1998.
  18. IsarStep: A benchmark for high-level mathematical reasoning. arXiv preprint arXiv:2006.09265, 2020.
  19. DeepProbLog: Neural probabilistic logic programming. In Proceedings of Advances in Neural Information Processing Systems, pp. 3749–3759, 2018.
  20. Improving coherence and consistency in neural sequence models with dual-system, neuro-symbolic reasoning. Advances in Neural Information Processing Systems, 34:25192–25204, 2021.
  21. Recurrent relational networks. In Proceedings of Advances in Neural Information Processing Systems, pp. 3368–3378, 2018.
  22. Self-supervised graph transformer on large-scale molecular data. Advances in Neural Information Processing Systems, 33:12559–12571, 2020.
  23. Techniques for symbol grounding with SATNet. Advances in Neural Information Processing Systems, 34, 2021.
  24. Neural-symbolic integration: A compositional perspective. In Proceedings of the AAAI Conference on Artificial Intelligence, pp. 5051–5060, 2021.
  25. Attention is all you need. In Advances in Neural Information Processing Systems, pp. 5998–6008, 2017.
  26. Graph Attention Networks. International Conference on Learning Representations, 2018. URL https://openreview.net/forum?id=rJXMpikCZ.
  27. SATNet: Bridging deep learning and logical reasoning using a differentiable satisfiability solver. In Proceedings of the 36th International Conference on Machine Learning (ICML), 2019.
  28. A semantic loss function for deep learning with symbolic knowledge. In Proceedings of the 35th International Conference on Machine Learning (ICML), July 2018. URL http://starai.cs.ucla.edu/papers/XuICML18.pdf.
  29. NeurASP: Embracing neural networks into answer set programming. In Proceedings of International Joint Conference on Artificial Intelligence (IJCAI), pp. 1755–1762, 2020. doi: 10.24963/ijcai.2020/243.
  30. Injecting logical constraints into neural networks via straight-through estimators. In International Conference on Machine Learning, pp. 25096–25122. PMLR, 2022.
  31. Do transformers really perform badly for graph representation? Advances in Neural Information Processing Systems, 34, 2021.
  32. Graph transformer networks. Advances in Neural Information Processing Systems, 32, 2019.
  33. PEGASUS: Pre-training with extracted gap-sentences for abstractive summarization. In International Conference on Machine Learning, pp. 11328–11339. PMLR, 2020.