
PressureVision++: Estimating Fingertip Pressure from Diverse RGB Images (2301.02310v3)

Published 5 Jan 2023 in cs.CV

Abstract: Touch plays a fundamental role in manipulation for humans; however, machine perception of contact and pressure typically requires invasive sensors. Recent research has shown that deep models can estimate hand pressure based on a single RGB image. However, evaluations have been limited to controlled settings since collecting diverse data with ground-truth pressure measurements is difficult. We present a novel approach that enables diverse data to be captured with only an RGB camera and a cooperative participant. Our key insight is that people can be prompted to apply pressure in a certain way, and this prompt can serve as a weak label to supervise models to perform well under varied conditions. We collect a novel dataset with 51 participants making fingertip contact with diverse objects. Our network, PressureVision++, outperforms human annotators and prior work. We also demonstrate an application of PressureVision++ to mixed reality where pressure estimation allows everyday surfaces to be used as arbitrary touch-sensitive interfaces. Code, data, and models are available online.
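The abstract's key idea is that a verbal prompt (e.g., "press firmly" vs. "hover without touching") can serve as a weak, image-level label for training a per-pixel pressure estimator. The paper does not publish its loss in this abstract, so the following is only an illustrative sketch of prompt-as-weak-label supervision: the predicted pressure map is pooled to an image-level contact probability and penalized with binary cross-entropy against the prompted label. The function name and pooling choice are assumptions for illustration, not the authors' implementation.

```python
import numpy as np

def weak_label_loss(pressure_logits: np.ndarray, contact_prompted: bool) -> float:
    """Illustrative weak-supervision loss (NOT the paper's actual objective).

    pressure_logits: per-pixel logits from a pressure-estimation network, shape (H, W).
    contact_prompted: weak label derived from the verbal prompt
                      (True for "press", False for "do not touch").
    """
    # Per-pixel contact probability via sigmoid.
    probs = 1.0 / (1.0 + np.exp(-pressure_logits))
    # Max-pool to an image-level probability: "some pixel is in contact".
    p = float(probs.max())
    y = 1.0 if contact_prompted else 0.0
    eps = 1e-7  # numerical stability for the log terms
    return float(-(y * np.log(p + eps) + (1.0 - y) * np.log(1.0 - p + eps)))
```

Under this sketch, a map predicting no contact is cheap when the prompt was "hover" and expensive when the prompt was "press", so the prompt alone steers the network even without ground-truth pressure measurements.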
