
Gaze-based dual resolution deep imitation learning for high-precision dexterous robot manipulation (2102.01295v3)

Published 2 Feb 2021 in cs.RO and cs.AI

Abstract: A high-precision manipulation task, such as needle threading, is challenging. Physiological studies have proposed connecting low-resolution peripheral vision and fast movement to transport the hand into the vicinity of an object, and using high-resolution foveated vision to achieve the accurate homing of the hand to the object. The results of this study demonstrate that a deep imitation learning based method, inspired by the gaze-based dual resolution visuomotor control system in humans, can solve the needle threading task. First, we recorded the gaze movements of a human operator who was teleoperating a robot. Then, we used only a high-resolution image around the gaze to precisely control the thread position when it was close to the target. We used a low-resolution peripheral image to reach the vicinity of the target. The experimental results obtained in this study demonstrate that the proposed method enables precise manipulation tasks using a general-purpose robot manipulator and improves computational efficiency.
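The abstract describes a dual-resolution pipeline: a small high-resolution crop centered on the human gaze point drives fine positioning near the target, while a downsampled image of the whole scene handles coarse reaching. The following is a minimal sketch of how such inputs could be prepared, not the authors' implementation; the function name, crop size, and peripheral resolution are illustrative assumptions.

```python
# Sketch (assumed, not from the paper) of dual-resolution input preparation:
# a high-resolution "foveal" crop around the gaze point plus a low-resolution
# "peripheral" view of the full frame.
import numpy as np
import cv2


def dual_resolution_inputs(image: np.ndarray,
                           gaze_xy: tuple[int, int],
                           foveal_size: int = 64,
                           peripheral_size: int = 64) -> tuple[np.ndarray, np.ndarray]:
    """Return (foveal_crop, peripheral_image) from a full-resolution frame.

    image   : HxWx3 uint8 camera frame.
    gaze_xy : (x, y) gaze point in pixel coordinates, e.g. from an eye tracker
              recorded during teleoperation or a gaze-prediction model.
    """
    h, w = image.shape[:2]
    half = foveal_size // 2

    # Clamp the crop window so it stays fully inside the frame.
    x = int(np.clip(gaze_xy[0], half, w - half))
    y = int(np.clip(gaze_xy[1], half, h - half))
    foveal_crop = image[y - half:y + half, x - half:x + half]

    # Downsample the whole scene to mimic low-resolution peripheral vision.
    peripheral = cv2.resize(image, (peripheral_size, peripheral_size),
                            interpolation=cv2.INTER_AREA)
    return foveal_crop, peripheral


if __name__ == "__main__":
    frame = np.zeros((480, 640, 3), dtype=np.uint8)   # placeholder camera frame
    fovea, periphery = dual_resolution_inputs(frame, gaze_xy=(320, 240))
    print(fovea.shape, periphery.shape)                # (64, 64, 3) (64, 64, 3)
```

Processing only the foveal crop at full resolution, rather than the entire frame, is what the abstract credits for both the precision and the improved computational efficiency.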
