Robustness of joint MLLM–WM embodied systems under sensor noise and partial observability
Develop methods that keep embodied AI architectures driven jointly by a multimodal large language model and a world model (MLLM–WM) reliable under sensor noise and partial observability in dynamic, real-world environments.
References
Additionally, training such systems requires vast multimodal datasets covering rare edge cases, while ensuring robustness against sensor noise and partial observability remains unsolved.
— Embodied AI: From LLMs to World Models
(2509.20021 - Feng et al., 24 Sep 2025) in Section 5.3 (Discussions)