HaGRID - HAnd Gesture Recognition Image Dataset (2206.08219v2)
Abstract: This paper introduces HaGRID (HAnd Gesture Recognition Image Dataset), a large dataset for building hand gesture recognition (HGR) systems focused on interacting with and controlling devices. For this reason, all 18 chosen gestures carry a semiotic function and can be interpreted as a specific command. Although the gestures are static, they were chosen specifically to enable the design of several dynamic gestures, so a trained model can recognize not only static gestures such as "like" and "stop" but also dynamic gestures such as "swipes" and "drag and drop". HaGRID contains 554,800 images with bounding box annotations and gesture labels, supporting both the hand detection and the gesture classification task. The low variability in context and subjects of existing datasets motivated us to create a dataset without such limitations. Crowdsourcing platforms allowed us to collect samples recorded by 37,583 subjects in at least as many scenes, with subject-to-camera distances from 0.5 to 4 meters under various natural lighting conditions. The influence of these diversity characteristics was assessed in ablation study experiments. We also demonstrate HaGRID's suitability for pretraining models on HGR tasks. The dataset and pretrained models are publicly available.
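Since the dataset pairs each image with bounding boxes and gesture labels, a typical first step is converting annotations into absolute pixel coordinates for a detector. The sketch below assumes a HaGRID-style JSON record with normalized `[x, y, w, h]` boxes; the exact field names (`bboxes`, `labels`) and schema here are illustrative assumptions, not the official format.

```python
import json

# Hypothetical HaGRID-style annotation record: an image id maps to normalized
# bounding boxes ([x_min, y_min, width, height], each in [0, 1]) plus one
# gesture label per box. The schema shown is an assumption for illustration.
SAMPLE = json.loads("""
{
  "0001": {
    "bboxes": [[0.40, 0.30, 0.10, 0.20]],
    "labels": ["like"]
  }
}
""")

def to_pixel_boxes(record, img_w, img_h):
    """Convert normalized [x, y, w, h] boxes to absolute pixel coordinates."""
    boxes = []
    for (x, y, w, h), label in zip(record["bboxes"], record["labels"]):
        boxes.append({
            "label": label,
            "x": round(x * img_w),   # left edge in pixels
            "y": round(y * img_h),   # top edge in pixels
            "w": round(w * img_w),   # box width in pixels
            "h": round(h * img_h),   # box height in pixels
        })
    return boxes

# e.g. for a 1920x1080 frame the normalized box above becomes pixel units
print(to_pixel_boxes(SAMPLE["0001"], img_w=1920, img_h=1080))
```

Keeping boxes normalized in storage, as assumed here, lets the same annotation serve images of any resolution; the conversion is deferred to training or inference time.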