- The paper demonstrates that RNNs can implicitly encode compositional symbolic structures using tensor product representations.
- It introduces Tensor Product Decomposition Networks that reveal hidden role decompositions across various RNN architectures and datasets.
- The findings indicate that structure-dependent training tasks may enhance interpretable symbolic encodings and guide future neural model innovations.
Implicit Tensor Product Representations in RNNs
The paper examines Recurrent Neural Networks (RNNs) and their capacity to encode compositional symbolic structures through implicit tensor product representations (TPRs). This research hinges on the observation that neural networks, traditionally critiqued for lacking symbolic capabilities, often yield vector representations with linear regularities characteristic of compositional structures. The paper primarily introduces and leverages Tensor Product Decomposition Networks (TPDNs) to uncover the underlying TPRs in RNNs, probing both synthetic and naturally occurring data.
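To make the TPR idea concrete, here is a minimal sketch (not from the paper's codebase; the dimensions and random embeddings are illustrative assumptions): a tensor product representation binds each symbol (filler) to its position (role) via an outer product, and sums the results, so the encoding is linear in its constituents.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical embedding sizes, chosen for illustration only.
FILLER_DIM, ROLE_DIM = 4, 3

# Random filler (symbol) and role (position) embeddings.
fillers = {d: rng.normal(size=FILLER_DIM) for d in "0123456789"}
roles = [rng.normal(size=ROLE_DIM) for _ in range(6)]

def tpr(sequence):
    """Tensor product representation: sum of filler (x) role outer products."""
    T = np.zeros((FILLER_DIM, ROLE_DIM))
    for i, symbol in enumerate(sequence):
        T += np.outer(fillers[symbol], roles[i])
    return T

enc = tpr("4725")
print(enc.shape)  # (4, 3)
```

The linearity is what a TPDN exploits: the encoding of a sequence is exactly the sum of the encodings of its individual filler/role bindings, so a learned set of filler and role vectors can be fit to approximate an RNN's hidden states.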
The authors first test TPDNs on synthetic datasets, employing RNN autoencoders with unidirectional, bidirectional, and tree-based architectures. Notably, TPDNs approximate these autoencoders' representations of digit sequences with high accuracy, suggesting an inherent, albeit hidden, structure-sensitive encoding. The best-fitting role scheme tracks the architecture: unidirectional architectures are best approximated by bidirectional role decompositions, while tree-based RNNs align closely with tree-structured representations.
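The role schemes compared above differ only in how each position in a sequence is labeled. A rough sketch of several schemes (the function names are mine, not the paper's; tree roles, which label positions by their path in a parse tree, are omitted for brevity):

```python
def left_to_right_roles(seq):
    # Unidirectional: each element is labeled by its distance from the left.
    return list(range(len(seq)))

def right_to_left_roles(seq):
    # Unidirectional from the other end.
    return list(range(len(seq) - 1, -1, -1))

def bidirectional_roles(seq):
    # Each position is identified by (distance from left, distance from right).
    n = len(seq)
    return [(i, n - 1 - i) for i in range(n)]

def bag_of_words_roles(seq):
    # A single shared role: all position information is discarded.
    return [0] * len(seq)

seq = "4725"
print(left_to_right_roles(seq))  # [0, 1, 2, 3]
print(bidirectional_roles(seq))  # [(0, 3), (1, 2), (2, 1), (3, 0)]
```

A TPDN fit under a given scheme assigns one role vector per distinct label; the scheme whose fit best reconstructs the RNN's hidden states is taken as evidence for the structure the RNN implicitly encodes.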
When applied to models trained on real-world data such as InferSent, Skip-thought, and the Stanford Sentiment model, TPDNs reveal a distinct contrast. These representations largely resemble a bag-of-words model, with only marginal benefits from more sophisticated role schemes such as tree or bidirectional roles. The inference drawn is significant: current sentence embedding models may not robustly encode structure, likely because their training tasks do not demand it, supporting the hypothesis that alternative tasks or architectural innovations could enhance compositional representation learning.
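Why a bag-of-words fit is a weak result can be seen directly: with a single shared role vector, the TPR collapses to an order-insensitive sum of word embeddings, so sentences that permute the same words become indistinguishable. A small demonstration (the embeddings are random stand-ins, not from any of the models above):

```python
import numpy as np

rng = np.random.default_rng(1)
emb = {w: rng.normal(size=5) for w in ["the", "dog", "bit", "man"]}

def bow_encoding(words):
    # Bag-of-words = TPR with one shared role vector (folded into the sum):
    # word order is discarded entirely.
    return sum(emb[w] for w in words)

a = bow_encoding(["the", "dog", "bit", "the", "man"])
b = bow_encoding(["the", "man", "bit", "the", "dog"])
print(np.allclose(a, b))  # True: the two sentences get identical encodings
```

An encoder whose states are well approximated by this scheme therefore cannot reliably distinguish "the dog bit the man" from "the man bit the dog", which is the structural deficit the paper attributes to the training tasks.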
Further experiments isolate the influence of architecture and training task on the compositional behavior of RNNs. The paper finds that the learned representations depend far more on the decoder's architecture than on the encoder's. Training specifics again prove crucial: tasks that strictly require structural information, such as reversing a sequence, naturally encourage more accurate structural representations in RNNs.
In exploring the broader implications, this paper suggests that integrating tensor product representations into the standard training process of neural models might enhance their compositional capabilities. Practically, adopting more structure-dependent training tasks could guide RNNs and similar models towards richer, more interpretable symbolic representations.
Overall, this research underscores the latent potential within RNNs for learning structured representations and posits a path forward in both refining architectures and reconceptualizing training paradigms. These insights could inform developments in computational linguistics, improving neural networks' alignment with symbolic reasoning and how they interpret and generate linguistically complex data.