
CAPTURE-24: A large dataset of wrist-worn activity tracker data collected in the wild for human activity recognition (2402.19229v1)

Published 29 Feb 2024 in cs.HC

Abstract: Existing activity tracker datasets for human activity recognition are typically obtained by having participants perform predefined activities in an enclosed environment under supervision. This results in small datasets with a limited number of activities and heterogeneity, lacking the mixed and nuanced movements normally found in free-living scenarios. As such, models trained on laboratory-style datasets may not generalise out of sample. To address this problem, we introduce a new dataset involving wrist-worn accelerometers, wearable cameras, and sleep diaries, enabling data collection for over 24 hours in a free-living setting. The result is CAPTURE-24, a large activity tracker dataset collected in the wild from 151 participants, amounting to 3883 hours of accelerometer data, of which 2562 hours are annotated. CAPTURE-24 is two to three orders of magnitude larger than existing publicly available datasets, which is critical to developing accurate human activity recognition models.

Comprehensive Analysis of the CAPTURE-24 Dataset for Enhancing Human Activity Recognition Models with In-the-wild Data

Background & Summary

The utility of wrist-worn accelerometers in health-related research is gaining traction, driven by their potential to offer objective, high-resolution insights into individual activity patterns. While accelerometers promise transformative applications in precision medicine, digital phenotyping, and epidemiological studies, the effectiveness of human activity recognition (HAR) models is limited by the quality and scope of available data. Existing datasets, predominantly collected under controlled laboratory conditions, fail to capture the complexity and variability of real-world human activities. The CAPTURE-24 dataset responds to this challenge by providing a large-scale, annotated collection of accelerometer data gathered from participants in free-living conditions. Spanning 3883 total hours, with 2562 hours of annotated data from 151 participants, it is significantly larger and more representative than prior datasets. This advancement promises HAR models that generalise better to the nuanced, mixed activities of daily life outside controlled environments.

Data Acquisition and Annotation

The CAPTURE-24 dataset was assembled from data collected with wrist-worn Axivity AX3 accelerometers, wearable cameras, and sleep diaries, with the aim of capturing a comprehensive 24-hour profile of participant activities. The innovative use of wearable cameras and sleep diaries as indirect measurement tools allowed detailed annotation of the accelerometer data while preserving participant privacy. This methodological choice not only enabled long-duration data collection in free-living conditions but also made the collection process more scalable and less labor-intensive. The dataset underwent rigorous processing and annotation, adhering to a strict ethical framework to manage the privacy concerns associated with wearable camera data.
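A typical first step when working with annotated accelerometer traces like these is to segment the continuous signal into fixed-length windows with one label each. The sketch below is illustrative only: the 100 Hz sample rate, 30-second window length, and majority-vote labelling are common choices in HAR pipelines, not necessarily the exact preprocessing used for CAPTURE-24.

```python
import numpy as np

def make_windows(signal, labels, sample_rate=100, window_sec=30):
    """Split a tri-axial accelerometer trace into fixed-length windows.

    signal: float array of shape (n_samples, 3) (x, y, z acceleration).
    labels: int array of per-sample activity annotations.
    Returns (X, y) where X has shape (n_windows, window_len, 3) and y
    holds one label per window, chosen by majority vote.
    """
    win = sample_rate * window_sec
    n = len(signal) // win           # drop the trailing partial window
    X = signal[: n * win].reshape(n, win, 3)
    # Majority label per window (a simple, common labelling convention).
    y = np.array([np.bincount(labels[i * win:(i + 1) * win]).argmax()
                  for i in range(n)])
    return X, y
```

Windows shorter than `window_sec` at the end of a recording are discarded rather than padded, which keeps every training example the same shape for the downstream models.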

Benchmarking and Model Evaluation

Benchmarking exercises on the CAPTURE-24 dataset employed a range of widely used and state-of-the-art models, including random forests, XGBoost, convolutional neural networks (CNNs), and recurrent neural networks (RNNs), with hidden Markov models (HMMs) added to capture temporal dependencies. These exercises underscored the importance of large, diverse datasets for the performance of deep learning models in HAR. Notably, the results show that models augmented with an HMM consistently outperform their unsmoothed counterparts across tasks and metrics, highlighting the value of incorporating temporal structure into activity recognition models.
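The HMM layer described above can be understood as temporal smoothing: the base classifier scores each window independently, and Viterbi decoding then finds the most likely label sequence given "sticky" transition probabilities, suppressing implausible single-window flips. The sketch below shows this idea in a minimal form; the transition and prior values are illustrative placeholders, not parameters from the paper.

```python
import numpy as np

def viterbi_smooth(log_probs, log_trans, log_prior):
    """Smooth per-window class scores with an HMM via Viterbi decoding.

    log_probs: (T, K) log class probabilities from a base classifier.
    log_trans: (K, K) log transition matrix, log_trans[i, j] = prev i -> cur j.
    log_prior: (K,) log prior over the first window's class.
    Returns the most likely length-T label sequence.
    """
    T, K = log_probs.shape
    dp = log_prior + log_probs[0]            # best score ending in each class
    back = np.zeros((T, K), dtype=int)       # backpointers for path recovery
    for t in range(1, T):
        scores = dp[:, None] + log_trans     # (K, K): previous -> current
        back[t] = scores.argmax(axis=0)
        dp = scores.max(axis=0) + log_probs[t]
    path = np.empty(T, dtype=int)
    path[-1] = dp.argmax()
    for t in range(T - 1, 0, -1):            # trace back the best path
        path[t - 1] = back[t, path[t]]
    return path
```

With self-transition probabilities near 1, a lone window whose classifier output weakly favors a different class gets relabelled to match its neighbors, which is exactly the behavior that lifts the HMM-augmented models in the benchmarks.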

Challenges and Future Directions

While the CAPTURE-24 dataset represents a significant leap forward, it also accentuates the inherent challenges in HAR, particularly in distinguishing between closely related activities in a free-living environment. The findings stress the complexity of real-world human activities and the consequent difficulty in achieving high granularity in activity classification. Looking ahead, the paper suggests avenues for further research, including the exploration of larger and more demographically diverse datasets, the development of multimodal monitoring techniques, and the application of advanced model architectures capable of capturing the rich temporal and contextual nuances of human activities.

Conclusion

The CAPTURE-24 dataset, with its substantial volume and real-world applicability, serves as a crucial resource for advancing HAR research. By offering a detailed, in-the-wild dataset, it addresses the critical need for comprehensive and authentic activity data to refine the accuracy and applicability of HAR models. As research continues to evolve in this field, the CAPTURE-24 dataset will likely play a pivotal role in shaping the future of activity recognition, facilitating the development of models that are both robust and sensitive to the subtlest nuances of human movement and behavior.

Authors (7)
  1. Shing Chan
  2. Hang Yuan
  3. Catherine Tong
  4. Aidan Acquah
  5. Abram Schonfeldt
  6. Jonathan Gershuny
  7. Aiden Doherty