
Zero-shot Structure Learning and Planning for Autonomous Robot Navigation using Active Inference (2510.09574v1)

Published 10 Oct 2025 in cs.RO

Abstract: Autonomous navigation in unfamiliar environments requires robots to simultaneously explore, localise, and plan under uncertainty, without relying on predefined maps or extensive training. We present a biologically inspired, Active Inference-based framework, Active Inference MAPping and Planning (AIMAPP). This model unifies mapping, localisation, and decision-making within a single generative model. Inspired by hippocampal navigation, it uses topological reasoning, place-cell encoding, and episodic memory to guide behaviour. The agent builds and updates a sparse topological map online, learns state transitions dynamically, and plans actions by minimising Expected Free Energy. This allows it to balance goal-directed and exploratory behaviours. We implemented a ROS-compatible navigation system that is sensor and robot-agnostic, capable of integrating with diverse hardware configurations. It operates in a fully self-supervised manner, is resilient to drift, and supports both exploration and goal-directed navigation without any pre-training. We demonstrate robust performance in large-scale real and simulated environments against state-of-the-art planning models, highlighting the system's adaptability to ambiguous observations, environmental changes, and sensor noise. The model offers a biologically inspired, modular solution to scalable, self-supervised navigation in unstructured settings. AIMAPP is available at https://github.com/decide-ugent/AIMAPP.

Summary

  • The paper introduces AIMAPP, a unified active inference framework that combines mapping, localisation, and planning for zero-shot navigation.
  • It employs variational inference and Monte Carlo Tree Search to balance exploration with goal-directed actions while managing environmental uncertainties.
  • Experimental results demonstrate near-teleoperation efficiency, robust drift resilience, and adaptability to dynamic changes and sensor failures.

Zero-shot Structure Learning and Planning for Autonomous Robot Navigation using Active Inference

Introduction and Motivation

This work introduces AIMAPP, an Active Inference-based navigation framework for autonomous robots that unifies mapping, localisation, and planning within a single generative model. The approach is inspired by hippocampal navigation mechanisms, leveraging topological reasoning, place-cell encoding, and episodic memory. AIMAPP is designed for zero-shot operation: it requires no pre-training, is sensor- and robot-agnostic, and operates in a fully self-supervised manner. The system is robust to odometric drift, ambiguous observations, and dynamic environmental changes, and is compatible with ROS-based robotic platforms.

Model Architecture and Active Inference Formulation

AIMAPP formalises the navigation problem as a POMDP, with a generative model that factorises the latent state into spatial position and discrete state variables. The agent maintains beliefs over states, positions, and observations, updating these via variational inference to minimise variational free energy (VFE). Planning is performed by simulating action sequences (policies) using Monte Carlo Tree Search (MCTS), scoring them by expected free energy (EFE), which balances epistemic (information gain) and pragmatic (goal-directed) value (Figure 1).

Figure 1: The POMDP structure of AIMAPP, integrating states, positions, and observations, with categorical distributions for transitions and likelihoods.

The generative model is specified by categorical distributions for state transitions ($B_s$), position transitions ($B_p$), observation likelihoods ($A_o$), and position likelihoods ($A_p$). The agent updates these parameters online, expanding the model structure as new states or observations are encountered. Model updates are governed by the reduction in free energy, ensuring that only model expansions that improve explanatory power are retained.
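
As a concrete illustration of this structure, the following Python sketch (not taken from the AIMAPP source; names such as `A_o`, `B_s`, and `keep_expansion_if_better` are assumptions) shows categorical likelihood and transition tables together with a free-energy check that accepts a structural expansion only when it improves explanatory power.

```python
import numpy as np

class GenerativeModel:
    """Minimal sketch of the categorical generative model parameters."""

    def __init__(self, n_states, n_obs, n_actions):
        # Likelihood A_o: P(observation | state); each column sums to 1.
        self.A_o = np.full((n_obs, n_states), 1.0 / n_obs)
        # Transitions B_s: P(state' | state, action); one matrix per action.
        self.B_s = np.stack([np.eye(n_states)] * n_actions)

    def free_energy(self, q_s, obs_idx):
        """Variational free energy of belief q_s for one observation,
        assuming a uniform prior over states:
        F = E_q[ln q(s) - ln P(o | s) - ln P(s)]."""
        eps = 1e-12
        ln_prior = -np.log(len(q_s))
        ln_lik = np.log(self.A_o[obs_idx] + eps)
        return float(np.sum(q_s * (np.log(q_s + eps) - ln_lik - ln_prior)))


def keep_expansion_if_better(model, expanded, q_s, q_s_expanded, obs_idx):
    """Retain a candidate model expansion only if it lowers free energy."""
    if expanded.free_energy(q_s_expanded, obs_idx) < model.free_energy(q_s, obs_idx):
        return expanded
    return model
```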

Topological Mapping and Node Creation

AIMAPP constructs a sparse topological map, where nodes represent distinct states associated with metric positions and panoramic observations. Node creation is triggered when the agent's predicted position exceeds an influence radius from existing nodes, ensuring efficient coverage without redundant state proliferation. The connectivity of the graph is determined by a discrete action set (e.g., 12 headings), and new nodes are only created if they respect minimum distance constraints (Figure 2).

Figure 2: Node influence and creation in the topological map, showing valid and invalid positions for new nodes based on influence radius and robot size.
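
A minimal sketch of this node-creation rule, assuming 2D metric positions and illustrative radius values (none of these names appear in the paper):

```python
import numpy as np

def should_create_node(predicted_pos, node_positions,
                       influence_radius=1.0, min_distance=0.5):
    """Propose a new node only when the predicted position lies outside the
    influence radius of every existing node and respects a minimum-distance
    (robot-size) constraint, mirroring the valid/invalid cases in Figure 2."""
    if len(node_positions) == 0:
        return True
    dists = np.linalg.norm(np.asarray(node_positions) - np.asarray(predicted_pos),
                           axis=1)
    return bool(np.all(dists > influence_radius) and np.all(dists > min_distance))
```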

Observations are stored as 360° panoramas, and similarity is assessed using SSIM. The map is updated dynamically as the agent explores, with new nodes and transitions added as required by the agent's experience (Figure 3).

Figure 3: Schematic of the topological map in a two-room environment, with nodes representing states and links denoting plausible transitions.
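
As an illustration of the SSIM-based observation matching mentioned above, the sketch below scores a current panorama against stored node panoramas with scikit-image; the threshold and array layout are assumptions, not values from the paper.

```python
import numpy as np
from skimage.metrics import structural_similarity as ssim

def best_matching_node(current_pano, node_panos, threshold=0.7):
    """Return the index and SSIM score of the most similar stored panorama,
    or (None, None) if no stored panorama is similar enough.
    Panoramas are assumed to be equal-sized 8-bit RGB arrays."""
    if not node_panos:
        return None, None
    scores = [ssim(current_pano, p, channel_axis=-1, data_range=255)
              for p in node_panos]
    best = int(np.argmax(scores))
    return (best, scores[best]) if scores[best] >= threshold else (None, None)
```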

Planning and Policy Selection

Planning is performed via MCTS, where candidate policies are evaluated by their EFE. The EFE for a policy is computed as a sum over future steps, incorporating expected information gain (epistemic value) and utility terms reflecting preferences over observations, positions, and states. An inductive prior term propagates utility from goal states to intermediate nodes, enabling efficient long-horizon planning even when the goal is not directly reachable within the planning horizon (Figure 4).

Figure 4: AIMAPP's planning and localisation process, showing the agent's topological map, visited and unvisited states, and planning through the graph.
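
The sketch below illustrates EFE scoring for a candidate policy using the standard risk-plus-ambiguity decomposition; it omits the paper's additional utility terms over positions and states and the inductive prior, and all names are assumptions.

```python
import numpy as np

def step_efe(q_s, A_o, log_C):
    """EFE contribution of one predicted step.
    q_s: predicted belief over states, A_o: P(o|s) of shape (n_obs, n_states),
    log_C: log-preferences over observations. Lower is better."""
    eps = 1e-12
    q_o = A_o @ q_s                                   # predicted observations
    ambiguity = float(q_s @ (-np.sum(A_o * np.log(A_o + eps), axis=0)))
    risk = float(q_o @ (np.log(q_o + eps) - log_C))   # KL(q(o) || C)
    return risk + ambiguity

def policy_efe(q_s0, B_s, actions, A_o, log_C):
    """Accumulate EFE along a candidate action sequence (policy)."""
    q_s, G = q_s0, 0.0
    for a in actions:
        q_s = B_s[a] @ q_s                            # propagate belief under action a
        G += step_efe(q_s, A_o, log_C)
    return G
```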

Policy selection is stochastic, controlled by a precision parameter, and the agent can flexibly balance exploration and exploitation by adjusting the weighting of epistemic and pragmatic terms.
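
A sketch of this selection rule, with the precision parameter (here `gamma`) scaling a softmax over negative EFE; the value is illustrative.

```python
import numpy as np

def select_policy(efe_values, gamma=4.0, rng=np.random.default_rng()):
    """Sample a policy index; higher gamma makes the choice more deterministic."""
    logits = -gamma * np.asarray(efe_values, dtype=float)
    p = np.exp(logits - logits.max())
    p /= p.sum()
    return int(rng.choice(len(efe_values), p=p))
```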

Localisation and Drift Robustness

Localisation is performed by inferring the most plausible state in the topological map given current observations and predicted motion. The system is robust to odometric drift, as localisation is belief-driven rather than strictly metric. When discrepancies arise between predicted and observed outcomes, the agent can re-localise, update its observation model, or enter an exploratory phase to recover (Figure 5).

Figure 5: Illustration of drift impact on localisation, showing possible belief updates depending on confidence in motion and observation.
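
A sketch of this belief-driven update, combining the motion-predicted prior with the observation likelihood; the confidence threshold and the returned status labels are assumptions used only for illustration.

```python
import numpy as np

def localise(prior_belief, A_o, obs_idx, confidence_threshold=0.3):
    """Posterior over map states given a motion-predicted prior and the
    current observation. A flat posterior signals ambiguity, in which case
    the agent may re-localise, revise its observation model, or explore."""
    eps = 1e-12
    posterior = A_o[obs_idx] * prior_belief
    posterior = posterior / (posterior.sum() + eps)
    status = "ambiguous" if posterior.max() < confidence_threshold else "confident"
    return posterior, status
```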

This approach allows the agent to maintain operational consistency even when physical odometry is unreliable, a critical property for real-world deployment.

System Integration and Modularity

AIMAPP is implemented as a modular, ROS-compatible system. The architecture separates perception, motion planning, and the core active inference loop, allowing integration with diverse hardware and sensor configurations (Figure 6).

Figure 6: System architecture, highlighting modular components and the central role of belief propagation in mapping, localisation, and planning.

The system prioritises believed odometry over raw sensor odometry, and user-specified preferences (e.g., goal observations) can be incorporated at runtime.
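
A minimal ROS 1 node skeleton of the kind this modular design suggests (this is not the AIMAPP source; topic names, message types, and the class name are assumptions to be adapted to the actual robot configuration):

```python
#!/usr/bin/env python3
import rospy
from nav_msgs.msg import Odometry
from sensor_msgs.msg import Image
from geometry_msgs.msg import PoseStamped

class AIMAPPNode:
    def __init__(self):
        rospy.init_node("aimapp_navigation")
        rospy.Subscriber("/odom", Odometry, self.on_odom)
        rospy.Subscriber("/panorama", Image, self.on_image)
        self.pose_pub = rospy.Publisher("/aimapp/believed_pose",
                                        PoseStamped, queue_size=1)
        self.last_odom = None

    def on_odom(self, msg):
        # Raw odometry is treated as evidence, not ground truth: the believed
        # pose from the generative model takes priority downstream.
        self.last_odom = msg

    def on_image(self, msg):
        # Update beliefs, expand the topological map if needed, and publish
        # the believed pose (inference logic omitted in this sketch).
        pass

if __name__ == "__main__":
    AIMAPPNode()
    rospy.spin()
```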

Experimental Evaluation

Exploration Efficiency

AIMAPP was evaluated in both simulated (Gazebo-based warehouses and houses) and real-world environments (warehouses, parking lots, homes) up to 325 m². Coverage efficiency (area explored per distance travelled) was used as the primary metric, with comparisons to state-of-the-art planners: FAEL, GBPlanner, and frontier-based exploration (Figure 7).

Figure 7: Coverage efficiency across models in large simulated and real environments, measured as area covered per distance travelled.
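
The metric itself is straightforward; a sketch, assuming the trajectory is a sequence of 2D positions in metres and the explored area is reported in m²:

```python
import numpy as np

def coverage_efficiency(explored_area_m2, trajectory_xy):
    """Area explored per distance travelled (m²/m)."""
    traj = np.asarray(trajectory_xy, dtype=float)
    distance = float(np.sum(np.linalg.norm(np.diff(traj, axis=0), axis=1)))
    return explored_area_m2 / max(distance, 1e-9)
```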

AIMAPP achieved coverage efficiency within 90% of manual teleoperation, outperforming or matching other planners, especially in large or cluttered environments where heuristic methods struggled.

Adaptation to Environmental Change

The agent dynamically adapts its map in response to environmental changes, such as moved obstacles, by weakening or reinforcing transitions based on observed feasibility (Figure 8).

Figure 8: Real-time adaptation to a displaced obstacle, with the map updating transition likelihoods and creating new nodes as needed.
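
A sketch of this adaptation, assuming Dirichlet-style counts over the transition model: a blocked transition is weakened, a successfully traversed one reinforced (names, rates, and the floor value are illustrative).

```python
import numpy as np

def update_transition(counts, action, s_from, s_to, success, rate=1.0):
    """counts: array of shape (n_actions, n_states_to, n_states_from)."""
    if success:
        counts[action, s_to, s_from] += rate                    # reinforce
    else:
        counts[action, s_to, s_from] = max(
            counts[action, s_to, s_from] - rate, 1e-3)          # weaken
    # Renormalise to transition probabilities P(s' | s, action)
    B_s = counts / counts.sum(axis=1, keepdims=True)
    return B_s
```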

Drift Quantification

In controlled experiments with ground-truth odometry, AIMAPP's drift was comparable to or better than baseline methods, with lower variance and higher robustness to sensor failures (Figure 9).

Figure 9: Comparison of internal belief, sensor odometry, and ground-truth trajectories, highlighting AIMAPP's drift resilience.

Goal-Directed Navigation

Goal-reaching was evaluated by specifying target observations and measuring path efficiency relative to A* on the agent's internal graph. AIMAPP achieved mean path lengths within 9% of optimal, with high robustness to drift and environmental changes (Figure 10).

Figure 10: MCTS-based path planning toward a goal observation, showing the evolution of state utility and the agent's trajectory.
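
The path-efficiency reference described above could be computed, for example, with A* over the internal topological graph; the sketch below uses networkx as an assumed convenience and takes edge weights to be metric distances.

```python
import math
import networkx as nx

def reference_path_length(graph, start, goal, positions):
    """Shortest feasible path length on the agent's topological graph."""
    def euclid(u, v):
        (x1, y1), (x2, y2) = positions[u], positions[v]
        return math.hypot(x1 - x2, y1 - y2)
    path = nx.astar_path(graph, start, goal, heuristic=euclid, weight="weight")
    return sum(graph[u][v]["weight"] for u, v in zip(path, path[1:]))

def path_efficiency(travelled_distance, graph, start, goal, positions):
    """Travelled distance relative to the A* reference (1.0 is optimal)."""
    return travelled_distance / reference_path_length(graph, start, goal, positions)
```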

Figure 11: Travelled distance to goals versus A* expected distance, demonstrating near-optimal goal-reaching across environments.

Figure 12: Example trajectories in warehouse and garage environments, comparing agent paths to ideal trajectories.

Computational Performance

AIMAPP's computational and memory requirements scale linearly with model size. On embedded hardware (Jetson Orin Nano), CPU and RAM usage remained well within operational limits, and execution time scaled predictably with the number of states (Figure 13).

Figure 13: Computational resource usage and execution time as a function of model size, confirming scalability.

Discussion

AIMAPP demonstrates that Active Inference provides a viable, unified framework for zero-shot navigation in unknown environments. The system achieves exploration and goal-reaching performance comparable to or exceeding state-of-the-art planners, with strong robustness to drift, ambiguous observations, and dynamic changes. The topological map structure enables efficient scaling, and the modular design supports integration with diverse robotic platforms.

Theoretical implications include the demonstration that biologically inspired, probabilistic generative models can support robust, scalable navigation without pre-training or globally consistent metric maps. Practically, AIMAPP's resilience to drift and sensor failures, combined with its ability to adapt online to environmental changes, makes it suitable for deployment in real-world, unstructured settings.

Limitations include reliance on panoramic visual observations, which may be sensitive to large-scale environmental changes, and the potential for diluted observation likelihoods as the map grows. Future work should explore hierarchical reasoning, semantic observation encoding, and improved long-horizon planning via message passing or chunked spatial representations.

Conclusion

AIMAPP establishes Active Inference as a practical, interpretable, and robust approach to autonomous navigation in unknown environments. By unifying mapping, localisation, and planning within a single generative model, and leveraging online structure learning, the system achieves strong empirical performance without pre-training or heavy reliance on metric consistency. The framework's modularity, scalability, and resilience position it as a promising candidate for real-world robotic autonomy, with future extensions likely to further enhance efficiency and robustness.
