Papers
Topics
Authors
Recent
Search
2000 character limit reached

LLaSA: A Multimodal LLM for Human Activity Analysis Through Wearable and Smartphone Sensors

Published 20 Jun 2024 in cs.CL | (2406.14498v2)

Abstract: Integrating inertial measurement units (IMUs) with LLMs expands the potential of multimodal AI, enabling more nuanced human activity analysis. In this paper, we introduce LLaSA (Large Language and Sensor Assistant), a multimodal LLM built on LIMU-BERT and Llama, designed to interpret and answer queries related to human activities and motion analysis, leveraging sensor data and contextual reasoning. To develop LLaSA, we introduce two key datasets: SensorCaps, a comprehensive collection of 35,960 IMU-derived narratives with handcrafted features, and OpenSQA, an instruction-following dataset containing 179,727 question-answer pairs aware of the sensor and human activity context. These datasets provide diverse and rich inputs to train LLaSA for complex sensor-based queries. To optimize LLaSA's performance, we apply a unique hyperparameter tuning method, which significantly enhances its effectiveness in contextual question-answering tasks. Extensive evaluations, including a human-led assessment of the question-answering, demonstrate that LLaSA achieves superior data interpretation and context-aware responses compared to GPT-3.5-Turbo and Vicuna-1.5-13b-16K. These contributions advance the frontier of sensor-aware LLMs and create new opportunities for impactful multimodal research in healthcare, sports science, and human-computer interactions. Our code repository and datasets can be found at https://github.com/BASHLab/LLaSA.

Citations (2)

Summary

Paper to Video (Beta)

Whiteboard

No one has generated a whiteboard explanation for this paper yet.

Open Problems

We haven't generated a list of open problems mentioned in this paper yet.

Continue Learning

We haven't generated follow-up questions for this paper yet.

Collections

Sign up for free to add this paper to one or more collections.