TBD Pedestrian Data Collection: Towards Rich, Portable, and Large-Scale Natural Pedestrian Data (2309.17187v2)
Abstract: Social navigation and pedestrian behavior research has shifted towards machine learning-based methods and converged on modeling inter-pedestrian and pedestrian-robot interactions. This requires large-scale datasets that contain rich information. We describe a portable data collection system coupled with a semi-autonomous labeling pipeline. As part of the pipeline, we designed a label correction web app that facilitates human verification of automated pedestrian tracking outcomes. Our system enables large-scale data collection in diverse environments and fast trajectory label production. Unlike existing pedestrian data collection methods, our system combines three components: top-down and ego-centric views, natural human behavior in the presence of a socially appropriate "robot", and human-verified labels grounded in metric space. To the best of our knowledge, no prior data collection system combines all three. We further introduce the TBD Pedestrian Dataset, an ever-expanding dataset from this ongoing data collection effort, and show that our collected data is larger in scale, contains richer information than prior datasets with human-verified labels, and supports new research opportunities.
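The semi-autonomous labeling pipeline described above pairs automated pedestrian tracking with human verification through a label correction web app. The sketch below is a minimal illustration of that hand-off under assumed conventions: the `Tracklet` class, the confidence and length thresholds, and the metric-space sample layout are hypothetical and do not reflect the paper's actual data format or tooling.

```python
# Hypothetical sketch of a semi-autonomous labeling pass: automated tracker
# output is screened, and uncertain tracklets are queued for human review.
# Class names, thresholds, and the sample layout are illustrative assumptions,
# not the TBD pipeline's actual format.
from dataclasses import dataclass, field


@dataclass
class Tracklet:
    track_id: int
    # (timestamp_s, x_m, y_m) samples in a metric ground-plane frame
    points: list = field(default_factory=list)
    mean_confidence: float = 1.0


def needs_review(t: Tracklet, min_conf: float = 0.6, min_len: int = 10) -> bool:
    """Flag tracklets that the automated tracker is likely to have gotten wrong."""
    return t.mean_confidence < min_conf or len(t.points) < min_len


def split_for_verification(tracklets):
    """Separate auto-accepted labels from those sent to a correction web app."""
    accepted, queued = [], []
    for t in tracklets:
        (queued if needs_review(t) else accepted).append(t)
    return accepted, queued


if __name__ == "__main__":
    demo = [
        Tracklet(0, [(0.0, 1.2, 3.4), (0.1, 1.3, 3.5)] * 10, 0.9),
        Tracklet(1, [(0.0, 5.0, 2.0)], 0.4),  # short, low-confidence -> review
    ]
    accepted, queued = split_for_verification(demo)
    print(f"auto-accepted: {len(accepted)}, queued for human review: {len(queued)}")
```

Under this kind of split, annotators would only touch the tracks the tracker is unsure about, which is what makes large-scale, human-verified label production tractable.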