
Household navigation and manipulation for everyday object rearrangement tasks (2312.06129v1)

Published 11 Dec 2023 in cs.RO

Abstract: We consider the problem of building an assistive robotic system that can help humans in daily household cleanup tasks. Creating such an autonomous system in real-world environments is inherently quite challenging, as a general solution may not suit the preferences of a particular customer. Moreover, such a system consists of multi-objective tasks comprising -- (i) Detection of misplaced objects and prediction of their potentially correct placements, (ii) Fine-grained manipulation for stable object grasping, and (iii) Room-to-room navigation for transferring objects in unseen environments. This work systematically tackles each component and integrates them into a complete object rearrangement pipeline. To validate our proposed system, we conduct multiple experiments on a real robotic platform involving multi-room object transfer, user preference-based placement, and complex pick-and-place tasks. Project page: https://sites.google.com/eng.ucsd.edu/home-robot
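The three components in the abstract — (i) detecting misplaced objects and predicting their correct placements, (ii) grasping, and (iii) room-to-room transfer — form a sequential pipeline. The sketch below illustrates that control flow only; all names (`PREFERENCES`, `detect_misplaced`, `grasp`, `navigate_and_place`) are hypothetical stand-ins, not the paper's implementation, and the preference table stands in for the paper's learned user-preference model.

```python
from dataclasses import dataclass

@dataclass
class Obj:
    name: str
    room: str

# Hypothetical user-preference map: preferred room per object class.
# The paper personalizes placements; this dict is a toy stand-in.
PREFERENCES = {"mug": "kitchen", "book": "study", "towel": "bathroom"}

def detect_misplaced(objects):
    """Stage (i): flag objects whose current room differs from the
    preferred placement and predict the target room."""
    return [(o, PREFERENCES[o.name]) for o in objects
            if o.name in PREFERENCES and o.room != PREFERENCES[o.name]]

def grasp(obj):
    """Stage (ii): stand-in for fine-grained grasp planning
    (e.g. a 6-DoF grasp generator); here it always succeeds."""
    return True

def navigate_and_place(obj, target_room):
    """Stage (iii): stand-in for room-to-room navigation; simply
    records the object's new room."""
    obj.room = target_room

def rearrange(objects):
    """Full pipeline: detect -> grasp -> transfer. Returns the names
    of objects that were moved."""
    moved = []
    for obj, target in detect_misplaced(objects):
        if grasp(obj):
            navigate_and_place(obj, target)
            moved.append(obj.name)
    return moved
```

For example, `rearrange([Obj("mug", "bedroom"), Obj("book", "study")])` moves only the mug, since the book is already in its preferred room.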

