MetaUrban: An Advanced Simulation Platform for Embodied AI in Urban Spaces
This paper introduces MetaUrban, a compositional simulation platform designed for Embodied AI research in public urban environments. Unlike existing simulation platforms, which focus predominantly on indoor or driving scenarios, MetaUrban targets the distinctive characteristics and challenges of urban spaces, such as streetscapes and plazas, that are increasingly shared by humans and mobile robots.
Key Features of MetaUrban
MetaUrban stands out for its procedural generation capabilities, which can compose an effectively unlimited number of interactive urban scenes. This flexibility rests on three major components:
- Hierarchical Layout Generation: This component produces diverse urban layouts by hierarchically assembling street blocks, sidewalks, and crosswalks. Scenes can be tailored to different urban settings by defining layout parameters, such as block types and geometric zones, which improves agent generalization across environments (see the layout sketch after this list).
- Scalable Object Retrieval: Drawing on large-scale 3D asset repositories and vision-language models (VLMs), MetaUrban performs open-vocabulary search to retrieve relevant objects and place them following real-world distribution patterns. This keeps the virtual environments close to real urban landscapes while allowing customized object placement (a retrieval sketch also follows the list).
- Cohabitant Populating: MetaUrban populates scenes with a diverse range of dynamic agents, including rigged human models, vulnerable road users, and mobile machines such as robot dogs and delivery bots. These agents bring life to the virtual environments, with their trajectories managed by path-planning algorithms that promote safety and social conformity (see the crowd-dynamics sketch below).
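To make the hierarchical layout idea concrete, here is a minimal sketch of parameterized block assembly. Everything in it (the block types, zone names, and `BlockSpec` structure) is illustrative rather than MetaUrban's actual API; it only shows how a few layout parameters can drive procedural variation.

```python
import random
from dataclasses import dataclass, field

# Hypothetical layout vocabulary; MetaUrban's real block and zone taxonomy may differ.
BLOCK_TYPES = ["straight", "intersection", "roundabout", "curve"]
SIDEWALK_ZONES = ["frontage", "clear", "furnishing", "buffer"]  # building line to curb

@dataclass
class BlockSpec:
    block_type: str
    sidewalk_width: float                      # meters
    zones: list = field(default_factory=list)  # ordered functional zones

def sample_block(rng: random.Random) -> BlockSpec:
    """Sample one street block: a block type plus a zoned sidewalk cross-section."""
    width = rng.uniform(2.0, 8.0)
    # Wider sidewalks get more functional zones (e.g., room for street furniture).
    n_zones = 2 if width < 4.0 else 4
    return BlockSpec(
        block_type=rng.choice(BLOCK_TYPES),
        sidewalk_width=width,
        zones=SIDEWALK_ZONES[:n_zones],
    )

def generate_layout(num_blocks: int, seed: int) -> list[BlockSpec]:
    """Hierarchically assemble a scene: blocks first, then per-block sidewalk zoning."""
    rng = random.Random(seed)
    return [sample_block(rng) for _ in range(num_blocks)]

if __name__ == "__main__":
    for spec in generate_layout(num_blocks=3, seed=42):
        print(spec)
```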
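For the retrieval component, the sketch below shows the shape of an open-vocabulary search: rank asset descriptions by embedding similarity to a free-form query. The encoder here is a deterministic random placeholder standing in for a real VLM, so the rankings it produces are meaningless; only the retrieval pattern is the point.

```python
import hashlib
import numpy as np

# Toy stand-in for a VLM encoder: in the real pipeline, 3D-asset descriptions and
# the query would be embedded by a vision-language model, not random vectors.
ASSET_NAMES = ["fire hydrant", "park bench", "trash can", "street lamp", "bicycle rack"]

def embed(text: str) -> np.ndarray:
    """Placeholder encoder: a deterministic random vector per string (demo only)."""
    seed = int.from_bytes(hashlib.md5(text.encode()).digest()[:4], "little")
    return np.random.default_rng(seed).normal(size=512)

ASSET_EMBEDDINGS = np.stack([embed(name) for name in ASSET_NAMES])

def retrieve(query: str, k: int = 2) -> list[str]:
    """Open-vocabulary retrieval: rank assets by cosine similarity to the query."""
    q = embed(query)
    sims = ASSET_EMBEDDINGS @ q / (
        np.linalg.norm(ASSET_EMBEDDINGS, axis=1) * np.linalg.norm(q)
    )
    return [ASSET_NAMES[i] for i in np.argsort(-sims)[:k]]

print(retrieve("somewhere for pedestrians to sit"))  # rankings here are arbitrary
```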
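The summary does not spell out the trajectory planner, so the following is a generic social-force-style sketch of how dynamic cohabitants can move toward goals while keeping socially comfortable distances from one another; MetaUrban's actual planner may differ substantially.

```python
import numpy as np

def social_force_step(pos, vel, goals, dt=0.1, desired_speed=1.3, repulse=2.0):
    """One step of a minimal social-force model: agents relax toward their goal
    velocity while being pushed apart by nearby neighbors."""
    to_goal = goals - pos
    dist = np.linalg.norm(to_goal, axis=1, keepdims=True) + 1e-8
    speed = np.minimum(desired_speed, dist)          # decelerate near the goal
    force = (to_goal / dist) * speed - vel           # steer toward desired velocity

    diff = pos[:, None, :] - pos[None, :, :]         # (N, N, 2) displacement i - j
    d = np.linalg.norm(diff, axis=-1)                # (N, N) pairwise distances
    np.fill_diagonal(d, np.inf)                      # no self-repulsion
    rep = (diff / d[..., None]) * np.exp(-d)[..., None]  # decaying push away
    force += repulse * rep.sum(axis=1)

    vel = vel + force * dt
    return pos + vel * dt, vel

# Demo: two pedestrians heading toward each other are nudged apart as they pass.
pos = np.array([[0.0, 0.0], [10.0, 0.1]])
vel = np.zeros_like(pos)
goals = np.array([[10.0, 0.0], [0.0, 0.0]])
for _ in range(200):
    pos, vel = social_force_step(pos, vel, goals)
print(pos.round(2))
```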
Experimentation and Results
The authors designed two primary tasks, Point Navigation and Social Navigation, to benchmark AI methodologies including Reinforcement Learning (RL), Safe Reinforcement Learning (SafeRL), Offline RL, and Imitation Learning (IL) within the MetaUrban framework. Results indicate that MetaUrban's compositional nature significantly improves the generalizability of trained models, making them more capable in unseen environments.
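Assuming a Gym-style interface (common for such platforms; the environment id below is a placeholder, not MetaUrban's documented API), a navigation benchmark loop would look roughly like this, with the random action swapped for an RL, SafeRL, Offline RL, or IL policy:

```python
import gymnasium as gym

# Hypothetical: assumes MetaUrban exposes a Gym-style point-navigation environment.
# The env id, observation space, and reward terms are illustrative only.
env = gym.make("MetaUrban-PointNav-v0")  # placeholder id

obs, info = env.reset(seed=0)
episode_return = 0.0
for step in range(1000):
    action = env.action_space.sample()  # stand-in for a learned policy
    obs, reward, terminated, truncated, info = env.step(action)
    episode_return += reward
    if terminated or truncated:
        print(f"episode return: {episode_return:.2f}")
        obs, info = env.reset()
        episode_return = 0.0
env.close()
```

In practice the sampled action would come from a trained agent, for example a PPO policy from a library such as stable-baselines3.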
Importantly, the experiments reveal that the complexity and scale of MetaUrban's scenes challenge current state-of-the-art techniques, highlighting room for future advances in safe and effective urban navigation. These findings are supported by the MetaUrban-12K dataset, which provides a rich training resource of over 12,000 scenes spanning a wide variety of urban layouts and conditions.
Implications and Future Directions
The introduction of MetaUrban carries both practical and theoretical implications:
- Practical Applications: MetaUrban supports the development of robust navigation systems for mobile robots that can safely and efficiently navigate through crowded urban streets. This is crucial for applications such as last-mile delivery and autonomous urban transportation.
- Theoretical Contributions: The platform encourages the exploration of Embodied AI's interactions within human-populated environments, advancing theories around robot-human cohabitation and interaction dynamics.
Looking forward, MetaUrban is well positioned to serve as a foundation for developing urban-specific AI models. Future work may focus on enhancing the simulation's realism by integrating acoustic simulation and more nuanced human-agent interaction models. The platform can also catalyze interdisciplinary research connecting AI with urban planning, sociology, and safety engineering. By fostering a comprehensive understanding of AI's role in urban settings, MetaUrban represents a significant step toward integrating intelligent systems into the fabric of modern cities.