- The paper shows that replacing permutation-invariant aggregators with RNNs improves performance in neural algorithmic reasoning.
- It introduces the RNAR model, which uses LSTM aggregators and achieves an 87.08% micro-F1 score on the notoriously hard Quickselect task.
- The study highlights trade-offs in memory efficiency and suggests avenues for improved sequential aggregator design in future research.
Recurrent Aggregators in Neural Algorithmic Reasoning
In the paper Recurrent Aggregators in Neural Algorithmic Reasoning, Kaijia Xu and Petar Veličković investigate the use of recurrent neural networks (RNNs) as aggregation functions in neural algorithmic reasoners. The primary motivation is to question the field's reliance on Graph Neural Networks (GNNs), whose message-passing framework and permutation equivariance are usually treated as essential. The authors propose replacing the permutation-invariant aggregation function with an RNN, despite the apparent counter-intuitiveness of this choice.
Introduction
Neural algorithmic reasoning (NAR) explores how neural networks can learn and execute classical algorithms. GNNs have typically dominated NAR because of their alignment with dynamic programming, which supports out-of-distribution (OOD) generalization on standard algorithmic benchmarks such as CLRS-30. However, the authors argue that permutation equivariance, a key property of GNNs, can be a limitation rather than an advantage in tasks whose input elements possess a natural ordering, such as sequential algorithms.
Model Architecture
The authors introduce the Recurrent NAR (RNAR) model, which uses RNNs, specifically long short-term memory (LSTM) networks, as aggregation functions. This departs from the permutation-invariant aggregators commonly used in GNNs: within the standard message-passing framework, the LSTM consumes each node's incoming messages in a fixed order. Since the model operates over fully connected neighborhoods, the LSTM is run across every node's entire neighborhood, capturing order-dependent interactions that permutation-invariant aggregators discard.
Mathematically, the LSTM processes the messages arriving at node $u$ as
$$z_t^{(u)} = \mathrm{LSTM}\big(\psi(x_u, x_t),\, z_{t-1}^{(u)}\big),$$
where $z_0^{(u)}$ is initialized to a zero vector. The aggregated output is the LSTM's state at the final timestep $N$:
$$x_u' = \phi\big(x_u, z_N^{(u)}\big).$$
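To make the aggregation concrete, here is a minimal PyTorch sketch of one message-passing step with an LSTM aggregator over a fully connected neighborhood. This is not the authors' implementation: the class name `RecurrentAggregator` and the linear layers standing in for ψ and ϕ are illustrative assumptions; only the recurrence over messages mirrors the equations above.

```python
import torch
import torch.nn as nn

class RecurrentAggregator(nn.Module):
    """One message-passing step whose aggregation is an LSTM run over
    the (ordered) messages of a fully connected neighborhood."""

    def __init__(self, node_dim: int, hidden_dim: int):
        super().__init__()
        self.msg_fn = nn.Linear(2 * node_dim, hidden_dim)            # stands in for psi
        self.cell = nn.LSTMCell(hidden_dim, hidden_dim)              # recurrent aggregator
        self.update_fn = nn.Linear(node_dim + hidden_dim, node_dim)  # stands in for phi

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (N, node_dim) node features; every node receives from all N nodes,
        # in input order (this is where the natural ordering enters).
        n = x.shape[0]
        h = x.new_zeros(n, self.cell.hidden_size)                    # z_0 = 0
        c = x.new_zeros(n, self.cell.hidden_size)
        for t in range(n):
            # message from sender t to every receiver u: psi(x_u, x_t)
            m = torch.relu(self.msg_fn(torch.cat([x, x[t].expand(n, -1)], dim=-1)))
            h, c = self.cell(m, (h, c))                              # z_t = LSTM(psi(.), z_{t-1})
        return self.update_fn(torch.cat([x, h], dim=-1))             # x_u' = phi(x_u, z_N)
```

In this sketch, backpropagation must retain LSTM states for all N recurrent steps across all N receivers, which illustrates the kind of memory overhead discussed later.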
Evaluation
The evaluation uses the CLRS-30 benchmark, the standard suite for measuring NAR performance. RNAR is compared against Triplet-GMPNNs, Relational Transformers (RT), and G-ForgetNets.
Sequential Algorithms Results
The RNAR model notably excels in several sequential algorithms. The most striking result is its performance on the Quickselect task, achieving a micro-F1 score of 87.08%, a significant improvement over prior state-of-the-art methods. RNAR also demonstrates superior performance on other sequential tasks such as Bubble Sort, Heapsort, and Quicksort, with micro-F1 scores frequently surpassing 90%.
| Algorithm | Triplet-GMPNN | RT | G-ForgetNet | RNAR |
|---|---|---|---|---|
| Quickselect | 0.47% | 19.18% | 6.3% | 87.08% |
| Bubble Sort | 80.51% | 38.22% | 83.19% | 95.78% |
| Heapsort | 49.13% | 32.96% | 57.47% | 93.07% |
| Quicksort | 85.69% | 39.42% | 73.28% | 94.73% |
Overall Results
RNAR maintains competitive performance across all tasks, achieving an average micro-F1 score of 75.78% compared to 80.04% for Triplet-GMPNNs. This slight regression highlights the trade-off involved in departing from permutation invariance.
Discussion and Future Work
The promising results indicate that RNAR's approach to using a recurrent aggregator effectively captures the ordered nature of inputs in specific algorithmic tasks, where GNNs may struggle. However, challenges remain, notably in tasks like the Knuth-Morris-Pratt algorithm where RNAR's performance is suboptimal. The memory usage of LSTM aggregators also poses a limitation, which prevented its application to certain tasks without modifications.
Further research may explore:
- Alternative Sequential Aggregators: Investigating other forms of recurrent or non-commutative aggregators, such as Binary-GRUs.
- Optimized Architectures: Enhancing the memory efficiency of recurrent aggregators to handle more complex tasks without running out of memory.
- Incorporating Automata Alignment: Adapting models to better align with tasks requiring string-processing capabilities.
In conclusion, while RNAR is not universally superior across all tasks, its marked improvements on sequential algorithms indicate a valuable direction in neural algorithmic reasoning. Future work in this area may yield models that combine the strengths of both GNNs and RNNs, further narrowing the gap between neural and classical algorithmic performance.