OpenPack: A Large-scale Dataset for Recognizing Packaging Works in IoT-enabled Logistic Environments
Abstract: Unlike datasets of daily human activities, publicly available sensor datasets for work activity recognition in industrial domains are limited, because collecting realistic data requires close collaboration with industrial sites. This scarcity also limits research on, and development of, methods for industrial applications. To address these challenges and contribute to research on machine recognition of work activities in industrial domains, we introduce OpenPack, a new large-scale dataset for packaging work recognition. OpenPack contains 53.8 hours of multimodal sensor data, including acceleration data, keypoints, depth images, and readings from IoT-enabled devices (e.g., handheld barcode scanners), collected from 16 distinct subjects with different levels of packaging work experience. We apply state-of-the-art human activity recognition techniques to the dataset and, based on the results, outline future directions for complex work activity recognition research in the pervasive computing community. We believe that OpenPack will contribute to the sensor-based action/activity recognition community by providing challenging tasks. The OpenPack dataset is available at https://open-pack.github.io.