TensorNEAT: A GPU-accelerated Library for NeuroEvolution of Augmenting Topologies (2504.08339v1)

Published 11 Apr 2025 in cs.NE

Abstract: The NeuroEvolution of Augmenting Topologies (NEAT) algorithm has received considerable recognition in the field of neuroevolution. Its effectiveness is derived from initiating with simple networks and incrementally evolving both their topologies and weights. Although its capability across various challenges is evident, the algorithm's computational efficiency remains an impediment, limiting its scalability potential. To address these limitations, this paper introduces TensorNEAT, a GPU-accelerated library that applies tensorization to the NEAT algorithm. Tensorization reformulates NEAT's diverse network topologies and operations into uniformly shaped tensors, enabling efficient parallel execution across entire populations. TensorNEAT is built upon JAX, leveraging automatic function vectorization and hardware acceleration to significantly enhance computational efficiency. In addition to NEAT, the library supports variants such as CPPN and HyperNEAT, and integrates with benchmark environments like Gym, Brax, and gymnax. Experimental evaluations across various robotic control environments in Brax demonstrate that TensorNEAT delivers up to 500x speedups compared to existing implementations, such as NEAT-Python. The source code for TensorNEAT is publicly available at: https://github.com/EMI-Group/tensorneat.

Summary

  • The paper introduces tensorization to convert variable-sized neural networks into fixed-shape tensors, enabling efficient GPU acceleration for NEAT.
  • It details tensorized operations for structural mutations and inference, achieving up to 500x speedups and near-linear scaling across multiple GPUs.
  • Experiments on robotics tasks demonstrate faster convergence and higher-quality solutions compared to traditional CPU-based NEAT implementations.

This paper introduces TensorNEAT, a GPU-accelerated library designed to address the computational inefficiency of the NeuroEvolution of Augmenting Topologies (NEAT) algorithm. While NEAT is recognized for its ability to evolve both the structure and weights of neural networks starting from simple topologies, its traditional implementations suffer from performance bottlenecks that limit scalability. Existing NEAT libraries are typically either CPU-based and object-oriented, which incurs substantial overhead, or use the GPU only for network inference, leaving the core evolutionary operations unaccelerated.

The core innovation presented is tensorization, a method to represent NEAT's populations of networks with diverse topologies as uniformly shaped tensors. This involves:

  1. Tensorized Encoding: Representing individual nodes and connections as 1D tensors. The node set ($N$) and connection set ($C$) of a network are stored as 2D tensors ($\boldsymbol{N}$, $\boldsymbol{C}$). To handle varying network sizes within a population, these tensors are padded with NaN values up to predefined maximums ($|N|_\text{max}$, $|C|_\text{max}$), creating fixed-size tensors ($\hat{\boldsymbol{N}}$, $\hat{\boldsymbol{C}}$). The entire population ($P$) is then represented by batching these padded tensors into 3D tensors ($\boldsymbol{P}_N$, $\boldsymbol{P}_C$); see the first sketch after this list.
  2. Tensorized Operations: NEAT's structural mutations (adding/deleting nodes/connections) and attribute modifications (changing weights/biases) are translated into tensor indexing and assignment operations on these population tensors.
  3. Tensorized Inference: Network inference is split into a one-time 'transformation' step and a repeated 'calculation' step. For feedforward networks, the transformation computes a topological node order ($\boldsymbol{N}_{\text{order}}$) and an expanded connection tensor ($\boldsymbol{C}_{\text{exp}}$) suitable for parallel computation. The calculation step then computes node values iteratively, following the ordered nodes and connection tensors; see the second sketch after this list.
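
To make the encoding concrete, here is a minimal sketch of the NaN-padding idea in JAX. The shapes, row layouts, and variable names are illustrative assumptions, not TensorNEAT's actual data structures or API.

```python
import jax.numpy as jnp

N_MAX, C_MAX = 8, 16  # assumed capacities |N|_max and |C|_max

# Hypothetical row layouts: a node row holds (node_id, bias); a
# connection row holds (src_id, dst_id, weight). NaN rows mean "absent".
nodes = jnp.full((N_MAX, 2), jnp.nan)
nodes = nodes.at[:3].set(jnp.array([[0.0, 0.1],     # input node
                                    [1.0, -0.2],    # hidden node
                                    [2.0, 0.05]]))  # output node

conns = jnp.full((C_MAX, 3), jnp.nan)
conns = conns.at[:2].set(jnp.array([[0.0, 1.0, 0.7],
                                    [1.0, 2.0, -1.3]]))

# An "add connection" mutation becomes a pure tensor update: write the
# new row into the first NaN slot instead of growing an object graph.
free_row = jnp.argmax(jnp.isnan(conns[:, 0]))
conns = conns.at[free_row].set(jnp.array([0.0, 2.0, 0.4]))
```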

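The two-phase inference can be sketched the same way. Under the assumed layout above, a one-time transformation expands the padded connection list into a dense weight matrix, after which the calculation loop visits nodes in topological order using only fixed-shape operations (again a simplified sketch, not TensorNEAT's actual functions):

```python
import jax
import jax.numpy as jnp

N_MAX, N_INPUTS = 8, 2  # assumed node capacity and input count

def transform(conns):
    """One-time step: expand NaN-padded connection rows into a dense
    N_MAX x N_MAX weight matrix (zeros stand in for absent links)."""
    valid = ~jnp.isnan(conns[:, 0])
    src = jnp.where(valid, conns[:, 0], 0).astype(jnp.int32)
    dst = jnp.where(valid, conns[:, 1], 0).astype(jnp.int32)
    wgt = jnp.where(valid, conns[:, 2], 0.0)
    return jnp.zeros((N_MAX, N_MAX)).at[src, dst].add(wgt)

def calculate(w, order, x):
    """Calculation step: visit nodes in topological order, updating one
    node value per step; every operation keeps a fixed shape."""
    vals = jnp.zeros(N_MAX).at[:N_INPUTS].set(x)
    def step(vals, i):
        v = jnp.tanh(vals @ w[:, i])  # weighted sum flowing into node i
        keep = i < N_INPUTS           # input nodes keep their values
        return vals.at[i].set(jnp.where(keep, vals[i], v)), None
    vals, _ = jax.lax.scan(step, vals, order)  # order: e.g. jnp.arange(N_MAX)
    return vals
```
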
TensorNEAT is implemented using JAX, leveraging its capabilities for:

  • Hardware Acceleration: Automatic execution on GPUs/TPUs.
  • Automatic Vectorization: Using jax.vmap to parallelize operations across the population dimension.
  • Multi-Device Parallelism: Using jax.pmap for execution across multiple GPUs/TPUs.
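
As a generic illustration of how this vectorization applies, a single-genome evaluation function can be lifted over the whole population with one jax.vmap call (the evaluate function below is a hypothetical placeholder, not the library's API):

```python
import jax
import jax.numpy as jnp

POP = 1024  # population size

def evaluate(nodes, conns):
    # Hypothetical per-genome score; stands in for rolling out the
    # decoded network in a Gym/Brax episode.
    return -jnp.nansum(conns[:, 2] ** 2)

pop_nodes = jnp.zeros((POP, 8, 2))           # P_N: (pop, |N|_max, 2)
pop_conns = jnp.full((POP, 16, 3), jnp.nan)  # P_C: (pop, |C|_max, 3)

# vmap maps over the leading population axis; jit compiles the whole
# batch for the accelerator. jax.pmap shards that axis across devices.
scores = jax.jit(jax.vmap(evaluate))(pop_nodes, pop_conns)  # shape (POP,)
```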

The library also features:

  • User-friendly Interfaces: Customizable hyperparameters and problem/network definitions (supporting SNNs, BNNs).
  • Visualization Tools: Generating topology diagrams and network formulations (LaTeX, Python code).
  • Extensions: Support for NEAT variants like CPPN and HyperNEAT, and integration with benchmarks like Gym, Brax, and gymnax.

Experiments were conducted on Brax robotic control tasks (Swimmer, Hopper, HalfCheetah) comparing TensorNEAT (on GPUs and CPU) with NEAT-Python (CPU) and evosax (GPU-accelerated genetic algorithms). Key findings include:

  • vs. NEAT-Python: TensorNEAT achieves significant speedups (up to 500x on RTX 4090 in HalfCheetah) across various hardware. It shows much better runtime scalability with increasing population size and more stable per-generation runtimes. TensorNEAT also demonstrated faster convergence to higher fitness values.
  • Multi-GPU Scalability: TensorNEAT shows near-linear performance scaling with multiple GPUs (tested up to 8), primarily accelerating the evaluation phase, with diminishing returns due to communication overhead.
  • Population Size Impact: Larger populations in TensorNEAT generally lead to better solutions faster due to effective parallelism, although performance can be limited by hardware capacity on complex tasks.
  • vs. evosax: TensorNEAT (using NEAT) outperformed various genetic algorithms from evosax on the more complex tasks (Hopper, HalfCheetah) in terms of solution quality, despite roughly 10% higher runtime caused by the irregular computation patterns of evolved networks compared to fixed MLPs.

The paper concludes that tensorization significantly enhances NEAT's performance and scalability by enabling efficient GPU acceleration. TensorNEAT provides a practical and high-performance tool for neuroevolution research. Future work includes extending TensorNEAT to distributed computing environments and incorporating more advanced NEAT variants like DeepNEAT and CoDeepNEAT.
