Aria Everyday Activities Dataset (2402.13349v2)

Published 20 Feb 2024 in cs.CV, cs.AI, and cs.HC

Abstract: We present the Aria Everyday Activities (AEA) Dataset, an egocentric multimodal open dataset recorded using Project Aria glasses. AEA contains 143 daily activity sequences recorded by multiple wearers in five geographically diverse indoor locations. Each recording contains multimodal sensor data recorded through the Project Aria glasses. In addition, AEA provides machine perception data including high frequency globally aligned 3D trajectories, scene point cloud, per-frame 3D eye gaze vector and time aligned speech transcription. In this paper, we demonstrate a few exemplar research applications enabled by this dataset, including neural scene reconstruction and prompted segmentation. AEA is an open source dataset that can be downloaded from https://www.projectaria.com/datasets/aea/. We are also providing open-source implementations and examples of how to use the dataset in Project Aria Tools https://github.com/facebookresearch/projectaria_tools.


Summary

  • The paper presents a multimodal open dataset for egocentric AI research, comprising 143 sequences of daily activities recorded across five geographically diverse indoor locations.
  • Beyond raw sensor streams such as high-resolution RGB video, it provides machine perception outputs: globally aligned 3D trajectories, semi-dense point clouds, and calibrated per-frame eye gaze.
  • The dataset supports research in neural scene reconstruction and prompted segmentation, and all recordings are anonymized to protect personally identifiable information.

Comprehensive Overview of the Aria Everyday Activities (AEA) Dataset

Introduction to AEA Dataset

Rapid advances in augmented reality (AR) and AI are bringing AR devices and personal wearable AI into daily life. The Aria Everyday Activities (AEA) Dataset is a pivotal resource for researchers who want to work with the unique multimodal data streams these wearable devices can offer. The dataset provides egocentric recordings from Project Aria glasses, covering a wide variety of daily activities captured in five distinct indoor locations. Unlike most previous datasets, AEA spans a broad set of sensor modalities, including high-resolution RGB and monochrome video, eye tracking, spatial audio, and more, creating a robust foundation for egocentric AI research.

Dataset Overview

AEA comprises 143 sequences of daily activities recorded across five indoor environments. Each sequence pairs raw multimodal sensor data with machine perception outputs: globally aligned 3D trajectories, semi-dense point clouds, per-frame eye gaze vectors, and time-aligned speech transcriptions, filling a gap left by traditional egocentric datasets, which rarely combine 3D spatial context with multimodal streams. AEA is also distinctive in offering 4D (spatial plus temporal) longitudinal data, letting researchers study the temporal dynamics and spatial structure of everyday activities from a first-person perspective.
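
To make this concrete, the machine perception outputs ship as standalone files that can be read with the open-source projectaria_tools Python package. The sketch below is a minimal example assuming the package's mps reader functions and a typical AEA file layout; the exact file names and signatures should be verified against the installed release.

```python
# Minimal sketch: loading AEA machine perception outputs with
# projectaria_tools. Function names follow the open-source package;
# the file paths are placeholders for a downloaded AEA sequence.
from projectaria_tools.core import mps

# Globally aligned 6DoF device trajectory (one pose per timestamp).
trajectory = mps.read_closed_loop_trajectory("mps/slam/closed_loop_trajectory.csv")

# Semi-dense point cloud of the recording location.
points = mps.read_global_point_cloud("mps/slam/semidense_points.csv.gz")

# Per-frame 3D eye gaze estimates.
gaze = mps.read_eyegaze("mps/eye_gaze/general_eye_gaze.csv")

print(f"{len(trajectory)} poses, {len(points)} points, {len(gaze)} gaze records")
```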

Technological and Methodological Contributions

Key innovations brought forward by AEA include:

  • Updated Machine Perception Data: Generated with the latest Machine Perception Services, the dataset provides refined pose estimates, semi-dense point clouds, and calibrated eye gaze data, giving AI models richer spatial and attentional context.
  • Data Collection and Anonymization: Following Meta's Responsible Innovation Principles, all personally identifiable information in the recordings is anonymized, setting a precedent for ethical data handling in research.
  • Dataset Tools: The open-source Project Aria Tools have been updated alongside the AEA release to handle multimodal sensor data and machine perception outputs, including utilities for visualizing synchronized activities; a brief usage sketch follows this list.
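
As an illustration of the tooling, the sketch below reads an RGB frame from a recording through the projectaria_tools data provider. The calls follow the package's published quickstart (create_vrs_data_provider, get_stream_id_from_label, get_image_data_by_index), but treat them as assumptions to check against the installed version; recording.vrs is a placeholder path.

```python
# Minimal sketch: reading an RGB frame from an AEA recording with
# projectaria_tools. API names follow the package quickstart; verify
# against the installed release.
from projectaria_tools.core import data_provider

# Each AEA sequence is distributed as a VRS file (placeholder path here).
provider = data_provider.create_vrs_data_provider("recording.vrs")

# Look up the RGB camera stream by its label.
rgb_stream_id = provider.get_stream_id_from_label("camera-rgb")

# Fetch the first frame and its record (which carries the timestamp).
image_data, image_record = provider.get_image_data_by_index(rgb_stream_id, 0)
frame = image_data.to_numpy_array()
print(frame.shape, image_record.capture_timestamp_ns)
```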

Exemplar Research Applications Enabled by AEA

The richness of the AEA dataset supports a wide range of AI research. The paper highlights two directions:

  • 3D Neural Scene Reconstruction: The paper demonstrates reconstruction of high-quality 3D scenes from egocentric data, using the dataset's closed-loop trajectories and point clouds as inputs toward immersive AR/VR experiences.
  • Prompted Segmentation: By combining eye gaze and speech prompts with foundation models for object segmentation, AEA enables contextual object recognition and opens avenues for research in interactive AI systems; a sketch of the idea follows this list.
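
To make the prompted-segmentation idea concrete, the sketch below feeds a projected 2D gaze point to the Segment Anything (SAM) predictor as a point prompt. This is an illustrative pipeline, not the authors' exact implementation: the SAM calls follow the segment-anything package, while the gaze-to-pixel projection (which would use the per-frame gaze vector and camera calibration) is abstracted into placeholder coordinates.

```python
# Illustrative gaze-prompted segmentation with Segment Anything (SAM).
# This sketches the idea from the paper; it is not the authors' code.
import numpy as np
from segment_anything import SamPredictor, sam_model_registry

# Load a SAM checkpoint (placeholder path).
sam = sam_model_registry["vit_h"](checkpoint="sam_vit_h.pth")
predictor = SamPredictor(sam)

# `frame` stands in for an RGB frame (H, W, 3) from a recording, and
# (gx, gy) for the eye gaze direction projected into that frame's pixel
# coordinates (projection via camera calibration omitted for brevity).
frame = np.zeros((1408, 1408, 3), dtype=np.uint8)  # placeholder frame
gx, gy = 704, 704                                  # placeholder gaze point

predictor.set_image(frame)
masks, scores, _ = predictor.predict(
    point_coords=np.array([[gx, gy]]),
    point_labels=np.array([1]),  # 1 = foreground point prompt
)
best_mask = masks[np.argmax(scores)]  # segmentation of the gazed-at object
```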

Future Directions and Impact

The AEA dataset expands what is possible in egocentric AI research while setting a high bar for dataset utility and ethical practice. By providing a comprehensive suite of tools alongside the data, the authors lower the barrier to entry and encourage widespread adoption within the research community.

Looking forward, the AEA dataset has the potential to foster innovation in personalized assistive AI technologies, augmented reality applications, and beyond. It emphasizes the need for rich, contextual data in developing AI systems that understand and predict user intent and interaction with their surroundings, marking a significant step forward in creating more intuitive and immersive AI experiences.

Conclusion

The Aria Everyday Activities Dataset is a rich resource for research on personalized and contextualized AI. With its diverse sensor modalities, spatio-temporal alignment, and ethical data practices, AEA is well positioned to drive the development of intelligent systems that enhance human-computer interaction in daily life. As AR and AI become more deeply integrated into everyday experience, datasets like AEA will play a crucial role in realizing that vision.
