Working Backwards: Learning to Place by Picking (2312.02352v4)
Abstract: We present placing via picking (PvP), a method to autonomously collect real-world demonstrations for a family of placing tasks in which objects must be manipulated to specific, contact-constrained locations. With PvP, we approach the collection of robotic object placement demonstrations by reversing the grasping process and exploiting the inherent symmetry of the pick-and-place problem. Specifically, we obtain placing demonstrations from a set of grasp sequences of objects initially located at their target placement locations. Our system can collect hundreds of demonstrations in contact-constrained environments without human intervention using two modules: compliant control for grasping and tactile regrasping. We train a policy directly from visual observations through behavioural cloning, using the autonomously collected demonstrations. By doing so, the policy can generalize to object placement scenarios outside of the training environment without privileged information (e.g., placing a plate picked up from a table). We validate our approach in home robot scenarios that include dishwasher loading and table setting. Our approach yields robotic placing policies that outperform policies trained with kinesthetic teaching, both in terms of success rate and data efficiency, while requiring no human supervision.
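The core idea of PvP is to collect a placing demonstration by recording a grasp of an object already sitting at its target location and playing the trajectory backwards. A minimal sketch of that time reversal is below; the `Step` container, its field names, and the delta-action convention are assumptions for illustration, and the sketch omits the compliant-control and tactile-regrasping modules the actual system relies on.

```python
from dataclasses import dataclass
from typing import List

import numpy as np


@dataclass
class Step:
    """One timestep of a demonstration: an observation and the action taken."""
    obs: np.ndarray    # e.g. an image embedding or robot state (assumed format)
    delta: np.ndarray  # commanded end-effector displacement (dx, dy, dz)
    gripper: int       # 1 = close, 0 = open


def reverse_grasp_to_place(grasp_demo: List[Step]) -> List[Step]:
    """Turn a grasp trajectory into a placing demonstration by time reversal.

    Reversing the step order, negating each motion delta, and flipping the
    gripper command produces a trajectory that carries the held object back
    to its original (i.e., target) location and releases it there.
    """
    place_demo = []
    for step in reversed(grasp_demo):
        place_demo.append(
            Step(obs=step.obs, delta=-step.delta, gripper=1 - step.gripper)
        )
    return place_demo
```

In practice the reversed demonstrations would then be fed to a standard behavioural-cloning pipeline (observations in, actions out); the exact observation/action alignment under reversal is a design detail this sketch glosses over.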