HeatACO: Neural-ACO Decoder for TSP

Updated 30 January 2026
  • The paper introduces HeatACO, a decoding algorithm that blends neural priors with a Max-Min Ant System to construct feasible TSP tours under strict degree and single-cycle constraints.
  • It employs a candidate edge list and dynamic pheromone updates, using local distance heuristics and a heatmap exponent to balance exploration and error correction.
  • Optional 2-opt and 3-opt post-processing further refine solutions, yielding competitive gaps and CPU times on TSP instances up to 10K nodes.

HeatACO is a decoding algorithm introduced for large-scale Travelling Salesman Problems (TSP) that integrates neural "heatmap" predictions with a probabilistic Ant Colony Optimization (ACO) framework. It is designed to translate dense edge-probability matrices generated by neural predictors into feasible TSP tours that obey degree-2 and single-cycle constraints, offering high-quality solutions with computational efficiency at scale (Lin et al., 26 Jan 2026).

1. Problem Formulation and Decoding Challenges

The large-scale symmetric TSP is defined over $N$ points with coordinates $x_i \in \mathbb{R}^2$ and inter-point distances $d_{ij} = \|x_i - x_j\|_2$. A legal TSP tour satisfies two critical constraints: (i) each node has degree 2, enforced by $\sum_j A_{ij} = 2$ for the adjacency matrix $A \in \{0,1\}^{N\times N}$, and (ii) the tour forms a single cycle, excluding subtours.
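The two constraints above can be verified directly on an adjacency matrix. A minimal NumPy sketch (the function name and layout are illustrative, not from the paper):

```python
import numpy as np

def is_feasible_tour(A: np.ndarray) -> bool:
    """Check the two TSP feasibility constraints on a binary adjacency matrix A:
    (i) every node has degree 2, (ii) the edges form one single cycle."""
    n = A.shape[0]
    # Degree-2 constraint: each row of a symmetric A must sum to 2.
    if not np.array_equal(A, A.T) or np.any(A.sum(axis=1) != 2):
        return False
    # Single-cycle constraint: walk the cycle from node 0;
    # a legal tour must visit all n nodes before returning.
    visited = {0}
    prev, cur = 0, int(np.flatnonzero(A[0])[0])
    while cur != 0:
        visited.add(cur)
        nbrs = np.flatnonzero(A[cur])
        nxt = int(nbrs[1]) if nbrs[0] == prev else int(nbrs[0])
        prev, cur = cur, nxt
    return len(visited) == n
```

A degree-2 graph made of two disjoint subtours passes the first check but fails the cycle walk, which is exactly the case greedy decoders must guard against.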

Heatmap-based non-autoregressive TSP solvers output a confidence matrix $H \in [0,1]^{N\times N}$, where higher $H_{ij}$ signals greater neural confidence that edge $(i,j)$ belongs to a near-optimal solution. Decoding aims to map $(H, D)$ to a feasible Hamiltonian cycle. Standard greedy heuristics, such as edge selection by $H_{ij}$ ranking, aggregate errors at scale, yielding poor performance as $N$ increases. While MCTS-guided k-opt solvers mitigate such error cascades and enforce constraints accurately, their computational costs are prohibitive for high $N$.

HeatACO instead reframes decoding as constrained probabilistic construction. It samples tours from a distribution that blends three influences:

  • Local geometry ($d_{ij}$ as a distance heuristic),
  • Neural prior ($H_{ij}$ as a soft edge prior),
  • Global feedback (pheromone trails $\tau_{ij}$ learned during search).

2. HeatACO Algorithm: Max-Min Ant System Structure

HeatACO is instantiated as a Max-Min Ant System (MMAS) [Stützle & Hoos 2000], maintaining:

  • A pheromone matrix $\tau \in \mathbb{R}_{>0}^{N\times N}$ (dynamic global feedback),
  • A static distance heuristic $\eta_{ij} = 1/d_{ij}$,
  • A fixed heatmap $H_{ij}$.

Transition Probability:

Ants construct tours stepwise. At node $i$, the next node $j \in U_i$ (unvisited, feasible) is selected with

$$P_{ij} = \frac{[\tau_{ij}]^{\alpha}\,[\eta_{ij}]^{\beta}\,[\tilde H_{ij}]^{y}}{\sum_{k\in U_i} [\tau_{ik}]^{\alpha}\,[\eta_{ik}]^{\beta}\,[\tilde H_{ik}]^{y}},$$

where $\alpha, \beta > 0$ are exponents, $\tilde H_{ij} = \max(H_{ij}, \epsilon)$ avoids zero-probability transitions ($\epsilon = 10^{-9}$), and $y \ge 0$ is the heatmap exponent modulating reliance on the neural prior. Setting $y = 0$ recovers vanilla MMAS.
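The transition rule can be sketched as a single sampling step. This is a minimal NumPy illustration of the formula; the function name, the default exponents, and the seeded generator are illustrative assumptions, not values from the paper:

```python
import numpy as np

rng = np.random.default_rng(0)  # seeded for reproducibility (illustrative)

def sample_next(i, unvisited, tau, eta, H,
                alpha=1.0, beta=2.0, y=1.0, eps=1e-9):
    """Sample the next node j from the unvisited set U_i with probability
    P_ij proportional to tau^alpha * eta^beta * max(H, eps)^y."""
    U = np.asarray(unvisited)
    H_tilde = np.maximum(H[i, U], eps)          # floor: no edge is forbidden
    w = tau[i, U]**alpha * eta[i, U]**beta * H_tilde**y
    return int(rng.choice(U, p=w / w.sum()))
```

With $y=0$ the heatmap factor becomes 1 for every edge and the rule collapses to the classic MMAS transition probability.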

Candidate Edge Lists:

To achieve scalability, HeatACO restricts sampling and search to $O(Nk)$ candidate edges ($k \approx 20$). For each node:

  1. Retain edges with $H_{ij} \ge E_h$ ($E_h = 10^{-4}$).
  2. For each $i$, take the top-$k$ highest-$H_{ij}$ neighbors (if above threshold).
  3. If needed, pad to $k$ neighbors with the closest nodes by $d_{ij}$.
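The three steps above can be sketched as follows. This is a straightforward reading of the construction, assuming dense $H$ and $D$ matrices as input; the function name is illustrative:

```python
import numpy as np

def build_candidates(H, D, k=20, E_h=1e-4):
    """Per-node candidate lists: top-k neighbors by heatmap score above
    threshold E_h, padded with nearest neighbors by distance if needed."""
    n = H.shape[0]
    cands = []
    for i in range(n):
        scores = H[i].copy()
        scores[i] = -np.inf                           # exclude self-loop
        above = np.flatnonzero(scores >= E_h)         # step 1: threshold
        top = above[np.argsort(-scores[above])][:k]   # step 2: top-k by H
        chosen = list(top)
        if len(chosen) < k:                           # step 3: pad by distance
            for j in np.argsort(D[i]):
                if j != i and j not in chosen:
                    chosen.append(int(j))
                if len(chosen) == k:
                    break
        cands.append(np.array(chosen))
    return cands
```

The distance-based padding guarantees every node has exactly $k$ candidates even where the heatmap is uniformly low, which matters under the distribution shift discussed in Section 6.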

Pheromone Update:

After each batch of ant tours:

  • Evaporation: $\tau_{ij} \leftarrow (1-\rho)\tau_{ij}$
  • Reinforcement: if $(i,j)$ is on the elite tour $T^*$, $\tau_{ij} \leftarrow \tau_{ij} + \rho/\mathrm{Length}(T^*)$
  • Clamping: $\tau_{ij} \in [\tau_{\min}, \tau_{\max}]$, where $\tau_{\max} = 1/(\rho L_{\mathrm{nn}})$, $\tau_{\min} = \tau_{\max}/a$ ($a \sim N$), and $\rho$ is the evaporation rate.
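The three update steps above amount to one short routine per batch. A minimal sketch, assuming the elite tour is given as an ordered node sequence (function name and defaults are illustrative):

```python
import numpy as np

def update_pheromone(tau, best_tour, best_len,
                     rho=0.02, tau_min=None, tau_max=None):
    """One MMAS pheromone update: global evaporation, reinforcement along
    the elite tour's edges, then clamping to [tau_min, tau_max]."""
    tau *= (1.0 - rho)                          # evaporation
    n = len(best_tour)
    for a in range(n):                          # reinforcement on elite edges
        i, j = best_tour[a], best_tour[(a + 1) % n]
        tau[i, j] += rho / best_len
        tau[j, i] = tau[i, j]                   # symmetric TSP
    if tau_max is not None:                     # MMAS trail limits
        np.clip(tau, tau_min, tau_max, out=tau)
    return tau
```

The clamping bounds are what distinguish MMAS from plain Ant System: they prevent premature convergence onto a single tour by keeping every edge's trail within a bounded ratio of the maximum.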

3. Global Coordination and Correction of Local Errors

The heatmap $H$ serves as a soft prior: no edge is strictly forbidden, as even low-confidence options remain accessible. Tour feasibility constraints (degree, subtour) are enforced during sampling. Over multiple iterations, if a high-$H_{ij}$ edge consistently leads to infeasible or suboptimal tours, reinforcement is withheld and its pheromone decays, while effective edges are reinforced. This moderates local heatmap mis-rankings, correcting error cascades without intensive backtracking or search-tree expansion.

4. Post-Processing: 2-opt and 3-opt Local Search

Optional post-processing using 2-opt or 3-opt exchanges is performed on the $O(Nk)$ candidate edge set to further refine constructed tours. In 2-opt, pairs of edges are considered for replacement if the exchange reduces total tour length, with iterative improvement halted when no further gains are found. 3-opt iteratively attempts more complex triple-edge improvements.

Cost for these routines is $O(Nk\,p_2)$ for $p_2$ 2-opt passes and $O(Nk^2 p_3)$ for $p_3$ 3-opt passes. For $N$ up to 10,000, these searches typically complete within seconds.
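For concreteness, one 2-opt improvement pass looks as follows. This sketch scans all edge pairs in $O(N^2)$ for clarity; HeatACO's actual routine restricts moves to the $O(Nk)$ candidate set, which is what yields the stated cost:

```python
import numpy as np

def two_opt_pass(tour, D):
    """One 2-opt pass: replace edges (a,b),(c,d) with (a,c),(b,d) whenever
    that shortens the tour, by reversing the segment between them."""
    n = len(tour)
    improved = False
    for i in range(n - 1):
        # When i == 0, skip j == n-1: those edges share node tour[0].
        for j in range(i + 2, n - (1 if i == 0 else 0)):
            a, b = tour[i], tour[i + 1]
            c, d = tour[j], tour[(j + 1) % n]
            if D[a, c] + D[b, d] < D[a, b] + D[c, d] - 1e-12:
                tour[i + 1:j + 1] = tour[i + 1:j + 1][::-1]  # reverse segment
                improved = True
    return tour, improved
```

Repeating the pass until `improved` is `False` gives the "halt when no further gains are found" behavior described above.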

5. Experimental Results and Performance Benchmarks

HeatACO was evaluated on TSP500, TSP1K, and TSP10K datasets, with heatmaps derived from AttGCN (Fu et al., 2021), DIMES (Qiu et al., 2022), UTSP (Min et al., 2023), and DIFUSCO (Sun & Yang, 2023). Baselines include NAR + greedy merge (fast but brittle), published parallel MCTS combined with k-opt (Pan et al., 2024), and vanilla MMAS.

Key empirical outcomes for fixed heatmaps, with +2-opt, using $m = 32$ ants and 5000 iterations:

Dataset   Gap (%)   CPU Time
TSP500    0.11      ≈ 2 s
TSP1K     0.23      ≈ 5 s
TSP10K    1.15      ≈ 1 m

Further tightening with 3-opt achieved sub-0.01% gaps on TSP500, approximately 0.05% on TSP1K (tens of seconds), and approximately 0.4% on TSP10K (≈4 m). Greedy merge delivered significantly inferior results (gaps >10–40%), while MCTS/k-opt achieved gaps of 1–4% with much higher CPU times (50 s–16 m).

6. Heatmap Reliability and Distribution Shift Effects

Sparse $O(N)$ candidate sets with near-perfect recall are attainable by thresholding the heatmap. However, most candidates reside in low-confidence regions, complicating decoding since true tour edges concentrate in a mid-to-high confidence band. Under distribution shift (e.g., TSPLIB circuits, drilling instances), candidate set sizes inflate (edges per node $\gg 20$) and heatmap confidence can collapse, leading to degraded performance for greedy approaches. HeatACO remained robust, maintaining sub-1% gaps in seconds and matching or surpassing parallel MCTS at substantially reduced CPU burden.

Auxiliary diagnostics such as binary cross-entropy (CE) and class-weighted CE (WCE) of HH relative to the reference tour correlated with decoding difficulty, but did not fully predict performance.

7. Hyperparameterization and Practical Considerations

The heatmap exponent $y$ sharply modulates the influence of the neural prior. Sweeping $y$ across $\{0.1, 0.5, 1, 2\}$ is empirically sufficient. A greater $y$ sharpens the prior and can accelerate convergence but risks overcommitting to misranked edges or suffering from poor calibration, especially under aggressive local search. A smaller $y$ promotes exploration when $H$ is noisy. An entropy-based, label-free heuristic can also automate $y$ selection by targeting the effective support size of the heatmap-only proposal per node.
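One way such an entropy-based heuristic could work is sketched below. The source states only that the heuristic targets the effective support size of the heatmap-only proposal; the exponentiated-entropy measure, the target value, and both function names here are this sketch's assumptions:

```python
import numpy as np

def effective_support(H_row, y, eps=1e-9):
    """Effective number of candidates under the heatmap-only proposal
    p_j proportional to max(H_ij, eps)^y, measured as exp(entropy)."""
    p = np.maximum(H_row, eps) ** y
    p = p / p.sum()
    return np.exp(-np.sum(p * np.log(p)))

def pick_y(H_rows, target=5.0, grid=(0.1, 0.5, 1.0, 2.0)):
    """Label-free selection: choose the y from the sweep grid whose mean
    per-node effective support is closest to a target (target is an
    illustrative choice, not a value from the paper)."""
    means = [np.mean([effective_support(r, y) for r in H_rows]) for y in grid]
    return grid[int(np.argmin([abs(m - target) for m in means]))]
```

A larger $y$ concentrates the proposal and shrinks the effective support, so targeting a moderate support size balances exploitation of the prior against exploration, without needing reference tours.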

Parameter settings, full reproducibility instructions, and source code are available at https://github.com/bochenglin/HEATACO (Lin et al., 26 Jan 2026).
