- The paper shows that replacing permutation-invariant aggregators with RNNs improves performance in neural algorithmic reasoning.
- It introduces the RNAR model, which uses LSTM aggregators and achieves an 87.08% micro-F1 score on the notoriously hard Quickselect task.
- The study highlights trade-offs in memory efficiency and suggests avenues for improved sequential aggregator design in future research.
Recurrent Aggregators in Neural Algorithmic Reasoning
In the paper Recurrent Aggregators in Neural Algorithmic Reasoning, Kaijia Xu and Petar Veličković investigate the use of recurrent neural networks (RNNs) as aggregation functions in neural algorithmic reasoners. The primary motivation is to question the field's reliance on Graph Neural Networks (GNNs), whose message-passing framework and permutation equivariance are usually treated as essential. The authors propose replacing the permutation-invariant aggregation function with an RNN, despite the apparent counter-intuitiveness of this choice.
Introduction
Neural algorithmic reasoning (NAR) explores how neural networks can learn and execute classical algorithms. GNNs have typically dominated NAR because of their alignment with dynamic programming, which supports out-of-distribution (OOD) generalization on standard algorithmic benchmarks such as CLRS-30. However, the authors argue that permutation equivariance, a key property of GNNs, can be a limitation rather than an advantage in tasks whose input elements possess a natural ordering, such as sequential algorithms.
Model Architecture
The authors introduce the Recurrent NAR (RNAR) model, which uses RNNs, specifically long short-term memory (LSTM) networks, as aggregation functions. This departs from the permutation-invariant aggregators commonly used in GNNs: within the standard message-passing framework, the LSTM consumes each node's incoming messages in a fixed order. Since the model operates over fully connected neighborhoods, the LSTM is run across every node's entire neighborhood, capturing order-dependent interactions that permutation-invariant aggregators discard.
Mathematically, the LSTM processes the messages arriving at node $u$ as
$$z_t^{(u)} = \mathrm{LSTM}\big(\psi(x_u, x_t),\, z_{t-1}^{(u)}\big),$$
where $z_0^{(u)}$ is initialized to a zero vector. The aggregated output is the LSTM's state at the final timestep $N$:
$$x_u' = \phi\big(x_u, z_N^{(u)}\big).$$
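To make the aggregation concrete, here is a minimal PyTorch sketch of one message-passing step with an LSTM aggregator over a fully connected neighborhood. This is not the authors' implementation: the class name `RecurrentAggregator` and the linear layers standing in for ψ and ϕ are illustrative assumptions; only the recurrence over messages mirrors the equations above.

```python
import torch
import torch.nn as nn

class RecurrentAggregator(nn.Module):
    """One message-passing step whose aggregation is an LSTM run over
    the (ordered) messages of a fully connected neighborhood."""

    def __init__(self, node_dim: int, hidden_dim: int):
        super().__init__()
        self.msg_fn = nn.Linear(2 * node_dim, hidden_dim)            # stands in for psi
        self.cell = nn.LSTMCell(hidden_dim, hidden_dim)              # recurrent aggregator
        self.update_fn = nn.Linear(node_dim + hidden_dim, node_dim)  # stands in for phi

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (N, node_dim) node features; every node receives from all N nodes,
        # in input order (this is where the natural ordering enters).
        n = x.shape[0]
        h = x.new_zeros(n, self.cell.hidden_size)                    # z_0 = 0
        c = x.new_zeros(n, self.cell.hidden_size)
        for t in range(n):
            # message from sender t to every receiver u: psi(x_u, x_t)
            m = torch.relu(self.msg_fn(torch.cat([x, x[t].expand(n, -1)], dim=-1)))
            h, c = self.cell(m, (h, c))                              # z_t = LSTM(psi(.), z_{t-1})
        return self.update_fn(torch.cat([x, h], dim=-1))             # x_u' = phi(x_u, z_N)
```

In this sketch, backpropagation must retain LSTM states for all N recurrent steps across all N receivers, which illustrates the kind of memory overhead discussed later.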
Evaluation
The evaluation uses the CLRS-30 benchmark, the standard suite for measuring NAR performance. RNAR is compared against Triplet-GMPNNs, Relational Transformers (RT), and G-ForgetNets.
Sequential Algorithms Results
The RNAR model notably excels in several sequential algorithms. The most striking result is its performance on the Quickselect task, achieving a micro-F1 score of 87.08%, a significant improvement over prior state-of-the-art methods. RNAR also demonstrates superior performance on other sequential tasks such as Bubble Sort, Heapsort, and Quicksort, with micro-F1 scores frequently surpassing 90%.
| Algorithm | Triplet-GMPNN | RT | G-ForgetNet | RNAR |
|---|---|---|---|---|
| Quickselect | 0.47% | 19.18% | 6.3% | 87.08% |
| Bubble Sort | 80.51% | 38.22% | 83.19% | 95.78% |
| Heapsort | 49.13% | 32.96% | 57.47% | 93.07% |
| Quicksort | 85.69% | 39.42% | 73.28% | 94.73% |
Overall Results
RNAR maintains competitive performance across all tasks, achieving an average micro-F1 score of 75.78% compared to 80.04% for Triplet-GMPNNs. This slight regression highlights the trade-off involved in departing from permutation invariance.
Discussion and Future Work
The promising results indicate that RNAR's approach to using a recurrent aggregator effectively captures the ordered nature of inputs in specific algorithmic tasks, where GNNs may struggle. However, challenges remain, notably in tasks like the Knuth-Morris-Pratt algorithm where RNAR's performance is suboptimal. The memory usage of LSTM aggregators also poses a limitation, which prevented its application to certain tasks without modifications.
Further research may explore:
- Alternative Sequential Aggregators: Investigating other forms of recurrent or non-commutative aggregators, such as Binary-GRUs.
- Optimized Architectures: Enhancing the memory efficiency of recurrent aggregators to handle more complex tasks without running out of memory.
- Incorporating Automata Alignment: Adapting models to better align with tasks requiring string-processing capabilities.
In conclusion, while RNAR is not universally superior across all tasks, its marked improvements on sequential algorithms indicate a valuable direction in neural algorithmic reasoning. Future work in this area may yield models that combine the strengths of both GNNs and RNNs, further narrowing the gap between neural and classical algorithmic performance.