Versatile Demonstration Interface: Toward More Flexible Robot Demonstration Collection (2410.19141v2)

Published 24 Oct 2024 in cs.RO

Abstract: Previous methods for Learning from Demonstration leverage several approaches for a human to teach motions to a robot, including teleoperation, kinesthetic teaching, and natural demonstrations. However, little previous work has explored more general interfaces that allow for multiple demonstration types. Given the varied preferences of human demonstrators and task characteristics, a flexible tool that enables multiple demonstration types could be crucial for broader robot skill training. In this work, we propose Versatile Demonstration Interface (VDI), an attachment for collaborative robots that simplifies the collection of three common types of demonstrations. Designed for flexible deployment in industrial settings, our tool requires no additional instrumentation of the environment. Our prototype interface captures human demonstrations through a combination of vision, force sensing, and state tracking (e.g., through the robot proprioception or AprilTag tracking). Through a user study where we deployed our prototype VDI at a local manufacturing innovation center with manufacturing experts, we demonstrated VDI in representative industrial tasks. Interactions from our study highlight the practical value of VDI's varied demonstration types, expose a range of industrial use cases for VDI, and provide insights for future tool design.

Summary

  • The paper introduces a versatile demonstration interface that supports multiple LfD modalities, overcoming limitations of single-approach methods.
  • It integrates vision, force, and proprioceptive sensing to enable teleoperation, kinesthetic teaching, and natural demonstrations in industrial settings.
  • A user study with manufacturing experts showed a strong preference for natural demonstrations for their speed, while access to the other modalities proved valuable when safety, precision, or physical exertion were concerns.

Versatile Demonstration Interface: Enhancing Flexibility in Robot Demonstration Collection

The paper proposes the Versatile Demonstration Interface (VDI), an attachment designed to augment collaborative robots in industrial settings by simplifying the collection of diverse types of demonstrations from human operators. This work addresses a significant limitation of current Learning from Demonstration (LfD) methodologies, which typically rely on a single approach such as teleoperation, kinesthetic teaching, or passive (natural) observation. Such constraints make it difficult to accommodate the varied preferences of human demonstrators and the demonstration requirements of specific tasks. The central contribution of this research is a prototype interface that integrates multiple demonstration modes, offering increased flexibility and adaptability for LfD in industrial applications.

Technical Contributions

The VDI is engineered as an end-of-arm attachment for collaborative robots, with a design centered on versatility, ease of deployment, and minimal environmental instrumentation. It combines vision, force sensing, and proprioceptive feedback to capture demonstrations in three broad modalities:

  1. Teleoperation: The interface supports remote operation through a 6D input device, enabling users to provide demonstrations without direct physical interaction with the robot. This modality is advantageous when operator safety is a concern or when motion scaling is needed for precision (a simple scaling sketch follows this list).
  2. Kinesthetic Teaching: Leveraging the intrinsic compliance of collaborative robots, the interface allows operators to physically guide the robot through desired motions. This method is particularly useful for quick, intuitive demonstrations of tasks where direct physical guidance of the robot is practical.
  3. Natural Demonstrations: By detaching part of the interface, demonstrators can execute tasks much as they would by hand, with the robot leveraging vision-based tracking to follow the tool's motions (see the tracking sketch after this list). This approach aims to capture demonstrations with high fidelity to human task execution strategies.
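
The precision-scaling point in the teleoperation mode can be made concrete with a small example. The sketch below is a minimal illustration, not the paper's implementation: it simply attenuates raw 6D input increments before they are sent to the robot, and the scale factors and 6-vector layout are assumptions.

```python
# Minimal sketch of motion scaling for teleoperated demonstrations: raw 6D
# increments from a handheld input device are attenuated before being sent to
# the robot, trading speed for precision. The scale factors and the
# [dx, dy, dz, droll, dpitch, dyaw] layout are illustrative assumptions.
import numpy as np

TRANSLATION_SCALE = 0.2  # pass 20% of the raw hand translation to the robot
ROTATION_SCALE = 0.3     # pass 30% of the raw wrist rotation to the robot

def scale_teleop_increment(delta_6d: np.ndarray) -> np.ndarray:
    """Scale a [dx, dy, dz, droll, dpitch, dyaw] increment for fine work."""
    scaled = np.asarray(delta_6d, dtype=float).copy()
    scaled[:3] *= TRANSLATION_SCALE  # translational part
    scaled[3:] *= ROTATION_SCALE     # rotational part
    return scaled
```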

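For the natural demonstration mode, the abstract notes that tool state can be tracked visually, for example with AprilTags. The sketch below shows one way such tracking could look from an RGB camera; it is a minimal sketch under assumptions, where the pupil_apriltags library, tag family, camera intrinsics, and tag size are illustrative choices rather than details from the paper.

```python
# Minimal sketch of fiducial-based tool tracking for natural demonstrations.
# Assumes an AprilTag is mounted on the demonstrator's tool and visible to an
# RGB camera; the intrinsics and tag size below are placeholder values.
import cv2
import numpy as np
from pupil_apriltags import Detector

CAMERA_PARAMS = (615.0, 615.0, 320.0, 240.0)  # (fx, fy, cx, cy), hypothetical
TAG_SIZE_M = 0.05                             # tag edge length in meters

detector = Detector(families="tag36h11")

def track_tool_pose(frame_bgr):
    """Return (R, t) of the first detected tag in the camera frame, or None."""
    gray = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2GRAY)
    detections = detector.detect(
        gray,
        estimate_tag_pose=True,
        camera_params=CAMERA_PARAMS,
        tag_size=TAG_SIZE_M,
    )
    if not detections:
        return None
    tag = detections[0]
    return np.asarray(tag.pose_R), np.asarray(tag.pose_t)

if __name__ == "__main__":
    cap = cv2.VideoCapture(0)  # any RGB camera observing the workspace
    trajectory = []            # recorded tool poses for one demonstration
    while cap.isOpened():
        ok, frame = cap.read()
        if not ok:
            break
        pose = track_tool_pose(frame)
        if pose is not None:
            trajectory.append(pose)
        if cv2.waitKey(1) & 0xFF == ord("q"):
            break
    cap.release()
```

In a complete pipeline, these tool poses would be time-synchronized with force and proprioceptive data before being handed to a learning algorithm.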
Empirical Evaluation

A user study with manufacturing experts was conducted to evaluate the practical implications of this multifaceted interface in two representative industrial tasks: a rolling task and a press-fitting task. Participants expressed a strong preference for natural demonstrations, citing their speed and intuitive similarity to familiar manual work. However, the study also demonstrated the value of having access to alternative modalities when task safety, precision, or the physical exertion of repetitive tasks came into consideration. Feedback emphasized the importance of improving tracking range and accuracy, as well as ergonomic enhancements for more seamless transitions between demonstration modes.

Implications and Forward-Looking Perspectives

The development of the VDI highlights several key theoretical and practical implications. By accommodating multiple demonstration modalities, the interface underscores the necessity for adaptable LfD systems that can be tailored to both the operator's preferences and the task-specific requirements. In practice, this flexibility has the potential to accelerate robot skill acquisition in complex environments, reduce training times, and expand the range of tasks to which robots can be effectively applied.
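
One practical reading of this flexibility is that demonstrations gathered through different modalities should land in a common representation, so that downstream learners need not care how a trajectory was captured. The sketch below is a hypothetical schema for such a record; the field names and types are assumptions for illustration, not the paper's data format.

```python
# Hypothetical mode-agnostic demonstration record: all three collection modes
# feed the same schema, so learning code can consume them uniformly.
from dataclasses import dataclass, field
from enum import Enum, auto
from typing import List

import numpy as np

class DemoMode(Enum):
    TELEOPERATION = auto()
    KINESTHETIC = auto()
    NATURAL = auto()

@dataclass
class DemoSample:
    timestamp: float
    tool_pose: np.ndarray  # 4x4 homogeneous transform of the tool
    wrench: np.ndarray     # 6D force/torque at the tool, if sensed
    source: DemoMode       # how this sample was captured

@dataclass
class Demonstration:
    task_name: str
    samples: List[DemoSample] = field(default_factory=list)

    def to_trajectory(self) -> np.ndarray:
        """Stack tool poses so a learner can ignore how the demo was collected."""
        return np.stack([s.tool_pose for s in self.samples])
```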

Future research stemming from this work could explore more sophisticated feedback mechanisms for teleoperation, apply sensor fusion techniques for more precise tracking of natural demonstrations, and investigate how incorporating diverse, multi-modal demonstrations affects the efficiency, robustness, and learning outcomes of downstream LfD algorithms.

The VDI represents a promising step toward more flexible robot programming interfaces that accommodate both human intuition and task complexity. This research contributes foundational insights into the hardware and software requirements of such interfaces, underscoring the role of multi-modal demonstration systems in advancing autonomous robotic applications in industry.
