Transfer-LMR: Heavy-Tail Driving Behavior Recognition in Diverse Traffic Scenarios (2405.05354v1)
Abstract: Recognizing driving behaviors is important for downstream tasks such as reasoning, planning, and navigation. Existing video recognition approaches work well for common behaviors (e.g., "drive straight", "brake", "turn left/right"), but their performance is sub-par for underrepresented/rare behaviors that typically lie in the tail of the behavior class distribution. To address this shortcoming, we propose Transfer-LMR, a modular training routine that improves recognition performance across all driving behavior classes. We extensively evaluate our approach on the METEOR and HDD datasets, which contain rich yet heavy-tailed distributions of driving behaviors and span diverse traffic scenarios. The experimental results demonstrate the efficacy of our approach, especially for recognizing underrepresented/rare driving behaviors.
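The abstract does not detail Transfer-LMR's training routine, but the long-tailed recognition work it cites (e.g., Kang et al., "Decoupling representation and classifier for long-tailed recognition") suggests the general shape of such a recipe: learn representations on the natural, head-dominated distribution, then rebalance the classifier toward tail classes. Below is a minimal PyTorch sketch of that decoupled two-stage recipe, not the authors' method; all names (`class_balanced_loader`, `train_two_stage`, `backbone`, `classifier`) and hyperparameters are hypothetical.

```python
# Illustrative sketch of a decoupled two-stage long-tailed training routine
# (cf. Kang et al., 2019). This is NOT the Transfer-LMR algorithm; names and
# hyperparameters are assumptions for illustration only.
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, WeightedRandomSampler


def class_balanced_loader(dataset, labels, batch_size=32):
    """Resample so every behavior class is drawn roughly uniformly,
    countering a heavy-tailed class distribution."""
    labels = torch.as_tensor(labels)
    counts = torch.bincount(labels)
    weights = 1.0 / counts[labels].float()  # rare classes get larger weights
    sampler = WeightedRandomSampler(weights, num_samples=len(labels))
    return DataLoader(dataset, batch_size=batch_size, sampler=sampler)


def train_two_stage(backbone, classifier, dataset, labels, epochs=10):
    criterion = nn.CrossEntropyLoss()

    # Stage 1: learn representations on the natural (instance-balanced)
    # distribution, where head classes dominate.
    loader = DataLoader(dataset, batch_size=32, shuffle=True)
    params = list(backbone.parameters()) + list(classifier.parameters())
    opt = torch.optim.SGD(params, lr=1e-2, momentum=0.9)
    for _ in range(epochs):
        for clips, y in loader:
            loss = criterion(classifier(backbone(clips)), y)
            opt.zero_grad()
            loss.backward()
            opt.step()

    # Stage 2: freeze the backbone and retrain only the classifier on a
    # class-balanced stream, sharpening tail-class decision boundaries.
    for p in backbone.parameters():
        p.requires_grad_(False)
    loader = class_balanced_loader(dataset, labels)
    opt = torch.optim.SGD(classifier.parameters(), lr=1e-3, momentum=0.9)
    for _ in range(epochs):
        for clips, y in loader:
            loss = criterion(classifier(backbone(clips)), y)
            opt.zero_grad()
            loss.backward()
            opt.step()
```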
- P. Koopman, “The heavy tail safety ceiling,” in Automated and Connected Vehicle Systems Testing Symposium, vol. 1145. SAE, 2018, pp. 8950–8961.
- R. Chandra, X. Wang, M. Mahajan, R. Kala, R. Palugulla, C. Naidu, A. Jain, and D. Manocha, “Meteor: A dense, heterogeneous, and unstructured traffic dataset with rare behaviors,” in 2023 IEEE International Conference on Robotics and Automation (ICRA). IEEE, 2023, pp. 9169–9175.
- V. Ramanishka, Y.-T. Chen, T. Misu, and K. Saenko, “Toward driving scene understanding: A dataset for learning driver behavior and causal reasoning,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2018, pp. 7699–7707.
- R. Chandra, U. Bhattacharya, A. Bera, and D. Manocha, “Densepeds: Pedestrian tracking in dense crowds using front-rvo and sparse features,” in 2019 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS). IEEE, 2019, pp. 468–475.
- R. Chandra, U. Bhattacharya, T. Randhavane, A. Bera, and D. Manocha, “Roadtrack: Realtime tracking of road agents in dense and heterogeneous environments,” in 2020 IEEE International Conference on Robotics and Automation (ICRA). IEEE, 2020, pp. 1270–1277.
- R. Chandra, “Towards autonomous driving in dense, heterogeneous, and unstructured traffic,” Ph.D. dissertation, University of Maryland, College Park, 2022.
- M. Xu, M. Gao, Y.-T. Chen, L. S. Davis, and D. J. Crandall, “Temporal recurrent networks for online action detection,” in Proceedings of the IEEE/CVF International Conference on Computer Vision, 2019, pp. 5532–5541.
- M. Shimosaka, T. Kaneko, and K. Nishi, “Modeling risk anticipation and defensive driving on residential roads with inverse reinforcement learning,” in 17th International IEEE Conference on Intelligent Transportation Systems (ITSC). IEEE, 2014, pp. 1694–1700.
- R. Chandra, U. Bhattacharya, A. Bera, and D. Manocha, “Traphic: Trajectory prediction in dense and heterogeneous traffic using weighted interactions,” in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2019, pp. 8483–8492.
- R. Chandra, U. Bhattacharya, C. Roncal, A. Bera, and D. Manocha, “Robusttp: End-to-end trajectory prediction for heterogeneous road-agents in dense traffic with noisy sensor inputs,” in Proceedings of the 3rd ACM Computer Science in Cars Symposium, 2019, pp. 1–9.
- R. Chandra, T. Guan, S. Panuganti, T. Mittal, U. Bhattacharya, A. Bera, and D. Manocha, “Forecasting trajectory and behavior of road-agents using spectral clustering in graph-lstms,” IEEE Robotics and Automation Letters, vol. 5, no. 3, pp. 4882–4890, 2020.
- C. Li, Y. Meng, S. H. Chan, and Y.-T. Chen, “Learning 3d-aware egocentric spatial-temporal interaction via graph convolutional networks,” in 2020 IEEE International Conference on Robotics and Automation (ICRA). IEEE, 2020, pp. 8418–8424.
- C. Noguchi and T. Tanizawa, “Ego-vehicle action recognition based on semi-supervised contrastive learning,” in Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, 2023, pp. 5988–5998.
- X. Zhang, Z. Wu, Z. Weng, H. Fu, J. Chen, Y.-G. Jiang, and L. S. Davis, “Videolt: Large-scale long-tailed video recognition,” in Proceedings of the IEEE/CVF International Conference on Computer Vision, 2021, pp. 7960–7969.
- T. Perrett, S. Sinha, T. Burghardt, M. Mirmehdi, and D. Damen, “Use your head: Improving long-tail video recognition,” in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023, pp. 2415–2425.
- X. Li and H. Xu, “Meid: Mixture-of-experts with internal distillation for long-tailed video recognition,” in Proceedings of the AAAI Conference on Artificial Intelligence, vol. 37, no. 2, 2023, pp. 1451–1459.
- W. Moon, H. S. Seong, and J.-P. Heo, “Minority-oriented vicinity expansion with attentive aggregation for video long-tailed recognition,” in Proceedings of the AAAI Conference on Artificial Intelligence, vol. 37, no. 2, 2023, pp. 1931–1939.
- D. Tran, L. Bourdev, R. Fergus, L. Torresani, and M. Paluri, “Learning spatiotemporal features with 3d convolutional networks,” in Proceedings of the IEEE International Conference on Computer Vision, 2015, pp. 4489–4497.
- J. Carreira and A. Zisserman, “Quo vadis, action recognition? A new model and the kinetics dataset,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2017, pp. 6299–6308.
- M. Patrick, D. Campbell, Y. Asano, I. Misra, F. Metze, C. Feichtenhofer, A. Vedaldi, and J. F. Henriques, “Keeping your eye on the ball: Trajectory attention in video transformers,” Advances in Neural Information Processing Systems, vol. 34, pp. 12493–12506, 2021.
- R. Chandra, A. Bera, and D. Manocha, “Using graph-theoretic machine learning to predict human driver behavior,” IEEE Transactions on Intelligent Transportation Systems, vol. 23, no. 3, pp. 2572–2585, 2021.
- R. Chandra, U. Bhattacharya, T. Mittal, A. Bera, and D. Manocha, “Cmetric: A driving behavior measure using centrality functions,” in 2020 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS). IEEE, 2020, pp. 2035–2042.
- R. Chandra, U. Bhattacharya, T. Mittal, X. Li, A. Bera, and D. Manocha, “Graphrqi: Classifying driver behaviors using graph spectrums,” in 2020 IEEE International Conference on Robotics and Automation (ICRA). IEEE, 2020, pp. 4350–4357.
- A. Mavrogiannis, R. Chandra, and D. Manocha, “B-gap: Behavior-rich simulation and navigation for autonomous driving,” IEEE Robotics and Automation Letters, vol. 7, no. 2, pp. 4718–4725, 2022.
- N. Oliver and A. P. Pentland, “Graphical models for driver behavior recognition in a smartcar,” in Proceedings of the IEEE Intelligent Vehicles Symposium 2000 (Cat. No. 00TH8511). IEEE, 2000, pp. 7–12.
- D. Mitrovic, “Reliable method for driving events recognition,” IEEE Transactions on Intelligent Transportation Systems, vol. 6, no. 2, pp. 198–205, 2005.
- Y. Ma, Z. Zhang, S. Chen, Y. Yu, and K. Tang, “A comparative study of aggressive driving behavior recognition algorithms based on vehicle motion data,” IEEE Access, vol. 7, pp. 8028–8038, 2018.
- M. Matousek, M. Yassin, R. van der Heijden, F. Kargl, et al., “Robust detection of anomalous driving behavior,” in 2018 IEEE 87th Vehicular Technology Conference (VTC Spring). IEEE, 2018, pp. 1–5.
- M. Matousek, E.-Z. Mohamed, F. Kargl, C. Bösch, et al., “Detecting anomalous driving behavior using neural networks,” in 2019 IEEE Intelligent Vehicles Symposium (IV). IEEE, 2019, pp. 2229–2235.
- H. Xu, Y. Gao, F. Yu, and T. Darrell, “End-to-end learning of driving models from large-scale video datasets,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2017, pp. 2174–2182.
- B. Kang, S. Xie, M. Rohrbach, Z. Yan, A. Gordo, J. Feng, and Y. Kalantidis, “Decoupling representation and classifier for long-tailed recognition,” arXiv preprint arXiv:1910.09217, 2019.
- S. Zhang, Z. Li, S. Yan, X. He, and J. Sun, “Distribution alignment: A unified framework for long-tail visual recognition,” in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2021, pp. 2361–2370.
- S. Alshammari, Y.-X. Wang, D. Ramanan, and S. Kong, “Long-tailed recognition via weight balancing,” in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022, pp. 6897–6907.
- K. Tang, M. Tao, J. Qi, Z. Liu, and H. Zhang, “Invariant feature learning for generalized long-tailed classification,” in European Conference on Computer Vision. Springer, 2022, pp. 709–726.
- D. Damen, H. Doughty, G. M. Farinella, A. Furnari, E. Kazakos, J. Ma, D. Moltisanti, J. Munro, T. Perrett, W. Price, et al., “Rescaling egocentric vision,” arXiv preprint arXiv:2006.13256, 2020.
- K. Soomro, A. R. Zamir, and M. Shah, “Ucf101: A dataset of 101 human actions classes from videos in the wild,” arXiv preprint arXiv:1212.0402, 2012.
- A. Karpathy, G. Toderici, S. Shetty, T. Leung, R. Sukthankar, and L. Fei-Fei, “Large-scale video classification with convolutional neural networks,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2014, pp. 1725–1732.
- K. He, X. Zhang, S. Ren, and J. Sun, “Deep residual learning for image recognition,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2016, pp. 770–778.
- A. Dosovitskiy, L. Beyer, A. Kolesnikov, D. Weissenborn, X. Zhai, T. Unterthiner, M. Dehghani, M. Minderer, G. Heigold, S. Gelly, et al., “An image is worth 16x16 words: Transformers for image recognition at scale,” arXiv preprint arXiv:2010.11929, 2020.