Instructing Robots by Sketching: Learning from Demonstration via Probabilistic Diagrammatic Teaching (2309.03835v3)

Published 7 Sep 2023 in cs.RO and cs.LG

Abstract: Learning from Demonstration (LfD) enables robots to acquire new skills by imitating expert demonstrations, allowing users to communicate their instructions in an intuitive manner. Recent progress in LfD often relies on kinesthetic teaching or teleoperation as the medium for users to specify the demonstrations. Kinesthetic teaching requires physical handling of the robot, while teleoperation demands proficiency with additional hardware. This paper introduces an alternative paradigm for LfD called Diagrammatic Teaching. Diagrammatic Teaching aims to teach robots novel skills by prompting the user to sketch out demonstration trajectories on 2D images of the scene; these are then synthesised into a generative model of motion trajectories in 3D task space. Additionally, we present the Ray-tracing Probabilistic Trajectory Learning (RPTL) framework for Diagrammatic Teaching. RPTL extracts time-varying probability densities from the 2D sketches, applies ray-tracing to find corresponding regions in 3D Cartesian space, and fits a probabilistic model of motion trajectories to these regions. New motion trajectories, which mimic those sketched by the user, can then be generated from the probabilistic model. We empirically validate our framework both in simulation and on real robots, which include a fixed-base manipulator and a quadruped-mounted manipulator.

Authors (3)
  1. Weiming Zhi (28 papers)
  2. Tianyi Zhang (262 papers)
  3. Matthew Johnson-Roberson (72 papers)
Citations (7)

Summary

  • The paper introduces Diagrammatic Teaching, a novel LfD paradigm that enables intuitive robot instruction using 2D sketches.
  • It leverages Ray-tracing Probabilistic Trajectory Learning to convert sketched trajectories into precise 3D motion paths.
  • Experimental evaluations in simulation and on real robots confirm its effectiveness in replicating complex motions for practical tasks.

Instructing Robots by Sketching: Learning from Demonstration via Probabilistic Diagrammatic Teaching

This paper presents a novel approach to Learning from Demonstration (LfD) named Diagrammatic Teaching. This technique allows users to teach robots new skills by sketching desired demonstration trajectories on 2D images of the scene. The paper introduces a computational framework, Ray-tracing Probabilistic Trajectory Learning (RPTL), that transforms these sketches into a generative model of motion trajectories in 3D space.

Key Contributions:

  1. Diagrammatic Teaching Paradigm:
    • The authors propose Diagrammatic Teaching as an alternative paradigm for LfD. It removes both the direct physical handling of the robot required by kinesthetic teaching and the proficiency with additional hardware demanded by teleoperation.
    • Users provide demonstrations by sketching on static 2D images captured or generated by models like NeRF. The sketched trajectories approximate desired robot motions, allowing intuitive skill transfer from humans to robots.
  2. Ray-tracing Probabilistic Trajectory Learning (RPTL):
    • RPTL leverages time-varying probability densities extracted from the 2D sketches and applies ray-tracing to map these onto corresponding regions in the 3D task space (a back-projection sketch follows this list).
    • The framework fits a probabilistic model based on these 3D regions, allowing for the generation of new motion trajectories mimicking the user-provided sketches.
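
To make the ray-tracing step concrete, the snippet below back-projects a sketched pixel into a 3D ray using a standard pinhole-camera model. It is a minimal illustration under assumed inputs, not the authors' implementation: the intrinsics K, the world-to-camera pose (R, t), the helper names, and the sampling range are hypothetical placeholders.

```python
import numpy as np

def pixel_to_ray(u, v, K, R, t):
    """Back-project pixel (u, v) into a 3D ray expressed in world coordinates.

    K: 3x3 camera intrinsics; R, t: world-to-camera rotation and translation.
    All inputs are assumed calibrated beforehand (hypothetical placeholders).
    """
    d_cam = np.linalg.inv(K) @ np.array([u, v, 1.0])    # ray direction, camera frame
    origin = -R.T @ t                                    # camera centre, world frame
    direction = R.T @ d_cam                              # ray direction, world frame
    return origin, direction / np.linalg.norm(direction)

def sample_points_on_ray(origin, direction, near=0.1, far=2.0, n=64):
    """Candidate 3D points along the ray, to be weighted later by the 2D
    sketch density that each calibrated view assigns to their projections."""
    depths = np.linspace(near, far, n)
    return origin[None, :] + depths[:, None] * direction[None, :]
```

Points sampled along rays from two or more calibrated views can then be weighted by the sketch densities of those views, and the resulting high-density regions in 3D fitted with the probabilistic trajectory model.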

Technical Implementation:

The RPTL framework consists of several steps:

  • Density Estimation: Normalizing flow models are fitted to the sketched 2D trajectories to obtain probability densities over pixel coordinates (a minimal code sketch follows this list).
  • Ray-tracing for 3D Fitting: Using ray-tracing, the method identifies regions in 3D task space that correspond to high-density areas of the sketches across different viewpoints.
  • Conditional Trajectory Generation: The trajectories can be generated to start from novel positions, increasing the flexibility of the taught skills.
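
As an illustration of the density-estimation step, the snippet below fits a small time-conditioned normalizing flow (RealNVP-style affine coupling) to 2D sketch points. This is a minimal sketch under stated assumptions, not the paper's code: the coupling architecture, layer widths, and the dummy pixels/times tensors are illustrative only.

```python
import math
import torch
import torch.nn as nn

class AffineCoupling(nn.Module):
    """One RealNVP-style coupling layer over 2D points, conditioned on time."""
    def __init__(self, flip):
        super().__init__()
        self.flip = flip  # alternate which coordinate is transformed
        # Conditioner: (untouched coordinate, time) -> (log-scale, shift)
        self.net = nn.Sequential(nn.Linear(2, 64), nn.Tanh(), nn.Linear(64, 2))

    def forward(self, x, t):
        a, b = (x[:, :1], x[:, 1:]) if not self.flip else (x[:, 1:], x[:, :1])
        log_s, shift = self.net(torch.cat([a, t], dim=1)).chunk(2, dim=1)
        b = (b - shift) * torch.exp(-log_s)              # data -> latent direction
        z = torch.cat([a, b], dim=1) if not self.flip else torch.cat([b, a], dim=1)
        return z, -log_s.squeeze(1)                      # per-sample log|det Jacobian|

class SketchDensity(nn.Module):
    """Time-varying density over 2D sketch points (minimal illustration)."""
    def __init__(self, n_layers=4):
        super().__init__()
        self.layers = nn.ModuleList([AffineCoupling(flip=i % 2 == 1) for i in range(n_layers)])

    def log_prob(self, x, t):
        z, logdet = x, torch.zeros(x.shape[0])
        for layer in self.layers:
            z, ld = layer(z, t)
            logdet = logdet + ld
        base = -0.5 * (z ** 2).sum(dim=1) - math.log(2 * math.pi)  # standard 2D Gaussian
        return base + logdet

# Hypothetical training data: normalised pixel coordinates of the sketched
# strokes and each point's progress along the stroke in [0, 1].
pixels, times = torch.rand(512, 2), torch.rand(512, 1)

model = SketchDensity()
optimiser = torch.optim.Adam(model.parameters(), lr=1e-3)
for _ in range(2000):
    loss = -model.log_prob(pixels, times).mean()         # maximise log-likelihood
    optimiser.zero_grad()
    loss.backward()
    optimiser.step()
```

Evaluating the fitted density at the pixels that candidate 3D points project to (see the ray-tracing sketch above) gives per-view weights for locating high-density regions in task space; sampling the flow across a sweep of time values, optionally conditioned on a chosen start point, yields new trajectories in the spirit of the conditional generation step described above.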

Experimental Evaluation:

The paper validates RPTL through both simulation and physical robot experiments. Simulation results demonstrate that the model can accurately replicate complex trajectories, such as writing alphabetic characters, from minimal sketch input. Real-world tests involve tasks like drawer closing and object dropping, utilizing different robot platforms, including a fixed-base manipulator and a robotic dog equipped with an arm.

Implications and Future Directions:

The proposed approach shows promising implications for non-expert users to instruct robots intuitively through sketches. It lowers the barrier for robot programming and offers a more accessible interaction paradigm for robot skill acquisition. Future work could extend Diagrammatic Teaching by enhancing scene understanding via actively generated virtual perspectives and integrating additional user constraints to refine robot behavior further.

This research contributes to the LfD domain by introducing an innovative method that simplifies the interface between human instructors and robots. It lays a foundation for more fluid and interactive robot teaching systems, potentially broadening the adoption and adaptation of robots in various industries.
