Synthetic Data Generation Framework, Dataset, and Efficient Deep Model for Pedestrian Intention Prediction (2401.06757v2)
Abstract: Pedestrian intention prediction is crucial for autonomous driving. In particular, knowing whether pedestrians are going to cross in front of the ego-vehicle is core to performing safe and comfortable maneuvers. Creating accurate and fast models that predict such intentions from sequential images is challenging. One contributing factor is the lack of datasets with diverse crossing and non-crossing (C/NC) scenarios. We address this scarcity by introducing a framework, named ARCANE, which allows programmatically generating synthetic datasets consisting of C/NC video clip samples. As an example, we use ARCANE to generate a large and diverse dataset named PedSynth. We show how PedSynth complements widely used real-world datasets such as JAAD and PIE, thereby enabling more accurate models for C/NC prediction. Considering the onboard deployment of C/NC prediction models, we also propose a deep model named PedGNN, which is fast and has a very low memory footprint. PedGNN is based on a GNN-GRU architecture that takes a sequence of pedestrian skeletons as input to predict crossing intentions.
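To make the GNN-GRU idea concrete, the following is a minimal NumPy sketch of how a sequence of skeletons could be turned into a crossing probability. It is not the PedGNN implementation: the class name `TinyGnnGru`, the layer sizes, the 5-joint toy skeleton, the mean pooling over joints, and the random untrained weights are all assumptions made for illustration. The graph layer follows the normalized-adjacency propagation of Kipf and Welling, and the recurrent step is a standard GRU cell.

```python
import numpy as np


def normalized_adjacency(edges, num_joints):
    """Symmetrically normalized adjacency with self-loops: D^-1/2 (A + I) D^-1/2."""
    A = np.eye(num_joints)
    for i, j in edges:
        A[i, j] = A[j, i] = 1.0
    d = A.sum(axis=1)  # every degree is >= 1 because of the self-loops
    D_inv_sqrt = np.diag(1.0 / np.sqrt(d))
    return D_inv_sqrt @ A @ D_inv_sqrt


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


class TinyGnnGru:
    """Hypothetical toy model: one graph convolution per frame, then a GRU over time."""

    def __init__(self, edges, num_joints, in_dim=2, gnn_dim=8, hid_dim=16, seed=0):
        rng = np.random.default_rng(seed)
        self.A_hat = normalized_adjacency(edges, num_joints)
        self.W_gnn = rng.normal(0.0, 0.1, (in_dim, gnn_dim))
        # GRU parameters stacked as [update gate z, reset gate r, candidate n].
        self.W = rng.normal(0.0, 0.1, (3, gnn_dim, hid_dim))
        self.U = rng.normal(0.0, 0.1, (3, hid_dim, hid_dim))
        self.b = np.zeros((3, hid_dim))
        self.w_out = rng.normal(0.0, 0.1, hid_dim)
        self.hid_dim = hid_dim

    def frame_embedding(self, X):
        """X: (num_joints, in_dim) joint coordinates for a single frame."""
        H = np.maximum(self.A_hat @ X @ self.W_gnn, 0.0)  # graph conv + ReLU
        return H.mean(axis=0)  # pool over joints (an assumption of this sketch)

    def gru_step(self, x, h):
        z = sigmoid(x @ self.W[0] + h @ self.U[0] + self.b[0])
        r = sigmoid(x @ self.W[1] + h @ self.U[1] + self.b[1])
        n = np.tanh(x @ self.W[2] + (r * h) @ self.U[2] + self.b[2])
        return (1.0 - z) * n + z * h

    def predict(self, skeletons):
        """skeletons: (num_frames, num_joints, in_dim). Returns P(crossing)."""
        h = np.zeros(self.hid_dim)
        for X in skeletons:
            h = self.gru_step(self.frame_embedding(X), h)
        return float(sigmoid(h @ self.w_out))


# Hypothetical 5-joint skeleton and a random 10-frame clip of 2D joint positions.
EDGES = [(0, 1), (1, 2), (2, 3), (1, 4)]
model = TinyGnnGru(EDGES, num_joints=5)
clip = np.random.default_rng(1).normal(size=(10, 5, 2))
p_cross = model.predict(clip)
```

With untrained weights the output carries no meaning. The sketch only shows the data flow: per-frame graph features, a recurrent state carried across frames, and a single C/NC probability at the end. Keeping the input to a handful of joint coordinates per frame, rather than image crops, is one plausible reason such a model can stay small.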