Learning Disentangled Semantic Spaces of Explanations via Invertible Neural Networks (2305.01713v3)

Published 2 May 2023 in cs.CL and cs.AI

Abstract: Disentangled latent spaces usually have better semantic separability and geometrical properties, which lead to better interpretability and more controllable data generation. While this has been well investigated in Computer Vision, for tasks such as image disentanglement, sentence disentanglement in the NLP domain remains comparatively under-investigated. Most previous work has concentrated on disentangling task-specific generative factors, such as sentiment, within the context of style transfer. In this work, we focus on a more general form of sentence disentanglement, targeting the localised modification and control of more general sentence semantic features. To achieve this, we contribute a novel notion of sentence semantic disentanglement and introduce a flow-based invertible neural network (INN) mechanism integrated with a transformer-based language Autoencoder (AE) to deliver latent spaces with better separability properties. Experimental results demonstrate that the model can conform the distributed latent space into a more semantically disentangled sentence space, leading to improved language interpretability and controlled generation when compared to recent state-of-the-art language VAE models.
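
To make the mechanism described in the abstract concrete, below is a minimal, hypothetical sketch (not the authors' implementation) of a RealNVP-style affine coupling flow applied to a pooled sentence latent. It illustrates how an invertible mapping can reshape an autoencoder's latent space while remaining exactly reversible; the dimensions, the four-block stack, and the stand-in latents are assumptions made for illustration.

```python
# Minimal sketch, assuming a RealNVP-style affine coupling flow over a pooled
# sentence latent; an illustration only, not the paper's implementation.
import torch
import torch.nn as nn


class AffineCoupling(nn.Module):
    """Invertible affine coupling block: the first half of the latent
    parameterises a scale/shift applied to the second half."""

    def __init__(self, dim: int, hidden: int = 256):
        super().__init__()
        self.half = dim // 2
        self.net = nn.Sequential(
            nn.Linear(self.half, hidden), nn.ReLU(),
            nn.Linear(hidden, 2 * (dim - self.half)),
        )

    def forward(self, z: torch.Tensor) -> torch.Tensor:
        z1, z2 = z[:, :self.half], z[:, self.half:]
        log_s, t = self.net(z1).chunk(2, dim=-1)
        z2 = z2 * torch.exp(torch.tanh(log_s)) + t  # bounded scale keeps it stable
        return torch.cat([z1, z2], dim=-1)

    def inverse(self, z: torch.Tensor) -> torch.Tensor:
        z1, z2 = z[:, :self.half], z[:, self.half:]
        log_s, t = self.net(z1).chunk(2, dim=-1)
        z2 = (z2 - t) * torch.exp(-torch.tanh(log_s))
        return torch.cat([z1, z2], dim=-1)


# Hypothetical usage: z stands in for pooled sentence latents from a
# transformer-based AE encoder (batch of 8, 64-dimensional).
blocks = nn.ModuleList([AffineCoupling(dim=64) for _ in range(4)])
z = torch.randn(8, 64)

z_prime = z
for block in blocks:                      # map AE latents into the flow space
    z_prime = block(z_prime).flip(-1)     # flip so both halves get transformed

z_back = z_prime
for block in reversed(list(blocks)):      # exact inversion back to the AE space
    z_back = block.inverse(z_back.flip(-1))

assert torch.allclose(z, z_back, atol=1e-4)
```

Because each block is exactly invertible, edits made in the transformed (more separable) space can be mapped back and decoded by the original AE decoder, which is the property the abstract relies on for localised, controlled generation.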

Authors (3)
  1. Yingji Zhang (12 papers)
  2. Danilo S. Carvalho (23 papers)
  3. André Freitas (156 papers)
Citations (5)
