
Jointly Modeling Spatio-Temporal Features of Tactile Signals for Action Classification (2404.15279v1)

Published 21 Jan 2024 in eess.SP and cs.AI

Abstract: Tactile signals collected by wearable electronics are essential in modeling and understanding human behavior. One of the main applications of tactile signals is action classification, especially in healthcare and robotics. However, existing tactile classification methods fail to capture the spatial and temporal features of tactile signals simultaneously, which results in sub-optimal performance. In this paper, we design the Spatio-Temporal Aware tactility Transformer (STAT) to utilize continuous tactile signals for action classification. We propose spatial and temporal embeddings along with a new temporal pretraining task in our model, which aims to enhance the transformer in modeling the spatio-temporal features of tactile signals. Specifically, the designed temporal pretraining task differentiates the time order of tubelet inputs to model the temporal properties explicitly. Experimental results on a public action classification dataset demonstrate that our model outperforms state-of-the-art methods in all metrics.
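The abstract describes a transformer that tokenizes continuous tactile recordings into tubelets, adds spatial and temporal embeddings, and uses a temporal-order pretraining objective. The following is a minimal, illustrative sketch of that idea in PyTorch; it is not the authors' implementation, and all layer sizes, pooling choices, and names (e.g., `TactileTubeletTransformer`, the number of classes, the two-way order head) are assumptions made for demonstration only.

```python
# Minimal sketch (not the paper's code): a transformer over tactile "tubelets"
# with learnable spatial and temporal embeddings, plus a toy temporal-order
# head standing in for the pretraining task described in the abstract.
import torch
import torch.nn as nn


class TactileTubeletTransformer(nn.Module):
    def __init__(self, n_spatial=16, n_temporal=8, tubelet_dim=64, d_model=128, n_classes=10):
        super().__init__()
        self.proj = nn.Linear(tubelet_dim, d_model)            # tubelet -> token
        self.spatial_emb = nn.Embedding(n_spatial, d_model)    # which sensor patch
        self.temporal_emb = nn.Embedding(n_temporal, d_model)  # which time slice
        layer = nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)
        self.cls_head = nn.Linear(d_model, n_classes)  # action classification
        self.order_head = nn.Linear(d_model, 2)        # temporal-order pretraining (illustrative)

    def forward(self, tubelets, spatial_idx, temporal_idx):
        # tubelets: (batch, n_tokens, tubelet_dim); index tensors: (batch, n_tokens)
        x = self.proj(tubelets) + self.spatial_emb(spatial_idx) + self.temporal_emb(temporal_idx)
        h = self.encoder(x)         # (batch, n_tokens, d_model)
        pooled = h.mean(dim=1)      # simple mean pooling over tokens
        return self.cls_head(pooled), self.order_head(pooled)


# Toy usage: 2 samples, 8 time slices x 4 spatial patches = 32 tubelets each.
model = TactileTubeletTransformer()
tubelets = torch.randn(2, 32, 64)
spatial_idx = torch.arange(4).repeat(8).unsqueeze(0).repeat(2, 1)
temporal_idx = torch.arange(8).repeat_interleave(4).unsqueeze(0).repeat(2, 1)
action_logits, order_logits = model(tubelets, spatial_idx, temporal_idx)
print(action_logits.shape, order_logits.shape)  # torch.Size([2, 10]) torch.Size([2, 2])
```

In this sketch the order head is applied to a pooled representation only to keep the example short; the paper's actual pretraining task operates on the ordering of tubelet inputs, and its exact formulation is given in the paper rather than here.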
