Virtual KITTI 2: An Enhanced Synthetic Dataset for Autonomous Driving Applications
- The paper introduces enhanced photorealism and additional data modalities, such as stereo imaging and semantic annotations, to advance autonomous driving research.
- It leverages Unity’s HDRP for realistic rendering and validates dataset performance through experiments on tracking, segmentation, and depth estimation.
- The findings suggest that models trained on Virtual KITTI 2's synthetic data achieve performance comparable to models trained on real-world data under diverse environmental conditions.
"Virtual KITTI 2" presents a significant update to the earlier Virtual KITTI dataset, a synthetic dataset designed as an ancillary tool for training and evaluating autonomous driving systems. This paper details enhancements to the original dataset through improved photorealism and expanded features, enabling robust algorithm testing under varied synthetic conditions. Utilizing recent advancements in the Unity game engine's capabilities, Virtual KITTI 2 aims to narrow the realism gap between synthetic and real-world data.
Dataset Enhancements
Virtual KITTI 2 retains the core design of its predecessor: it recreates sequence clones of the KITTI tracking benchmark, allowing controlled manipulation of environmental conditions and camera parameters. The dataset provides RGB, depth, and semantic data, along with additional modalities such as instance segmentation and scene flow. Enhancements include moving to Unity 2018.4 LTS and its High Definition Render Pipeline (HDRP), which delivers advanced lighting and post-processing and significantly improves photorealism. The inclusion of stereo image pairs, absent from the original release, extends the dataset's applicability to stereo vision tasks.
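To illustrate how the released data might be consumed, the following Python sketch loads an RGB frame and its corresponding depth map. The directory layout (scene / variation / frames / modality / camera) and the 16-bit, centimeter-encoded depth PNGs follow the public release, but the specific scene, variation, and camera names used here (Scene01, clone, Camera_0) are illustrative assumptions.

```python
import numpy as np
from PIL import Image

# Illustrative paths following the Virtual KITTI 2 release layout;
# the exact names are assumptions for this sketch.
rgb_path = "Scene01/clone/frames/rgb/Camera_0/rgb_00000.jpg"
depth_path = "Scene01/clone/frames/depth/Camera_0/depth_00000.png"

rgb = np.asarray(Image.open(rgb_path))  # H x W x 3, uint8

# Depth is stored as a 16-bit PNG encoding distance in centimeters,
# so dividing by 100 yields meters.
depth_cm = np.asarray(Image.open(depth_path), dtype=np.float32)
depth_m = depth_cm / 100.0

print(rgb.shape, depth_m.min(), depth_m.max())
```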
Experimental Evaluations
The paper presents a series of experiments with contemporary computer vision algorithms to validate the dataset's utility. A key experiment re-evaluates multi-object tracking using Faster R-CNN detections; the results indicate that performance obtained with synthetic data closely matches that obtained with real data, echoing findings from the earlier version. Comparative tests across environmental variants show that fog and rain remain challenging, while most geometric manipulations, such as altered camera viewpoints, have minimal impact.
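To make the detection side of such a pipeline concrete, here is a minimal sketch that runs an off-the-shelf Faster R-CNN from torchvision on a single frame; the paper's exact model configuration and weights may differ, and the image path is assumed.

```python
import torch
import torchvision
from torchvision.transforms.functional import to_tensor
from PIL import Image

# Off-the-shelf detector; the paper's exact backbone/weights may differ.
model = torchvision.models.detection.fasterrcnn_resnet50_fpn(weights="DEFAULT")
model.eval()

img = Image.open("Scene01/clone/frames/rgb/Camera_0/rgb_00000.jpg")  # assumed path
with torch.no_grad():
    preds = model([to_tensor(img)])[0]

# Keep confident car detections; in COCO's label indexing, class 3 is "car".
keep = (preds["scores"] > 0.7) & (preds["labels"] == 3)
print(preds["boxes"][keep])
```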
The paper also explores stereo matching by deploying GANet and demonstrates that the synthetic stereo pairs yield performance comparable to that obtained on real data under controlled conditions. Substantial variation arises under challenging conditions such as fog, emphasizing the dataset's role in stress-testing algorithms before deployment.
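Because the stereo pairs are the headline addition, it helps to recall how a predicted disparity map converts to metric depth via the standard pinhole-stereo relation depth = f · B / d. The sketch below implements that conversion; the focal length and baseline shown are placeholder values, not the dataset's published calibration.

```python
import numpy as np

def disparity_to_depth(disparity, focal_px, baseline_m, eps=1e-6):
    """Standard pinhole-stereo conversion: depth = f * B / disparity.

    disparity : predicted disparities in pixels (e.g., the output of a
                stereo network such as GANet)
    focal_px  : horizontal focal length in pixels
    baseline_m: stereo baseline in meters
    """
    return focal_px * baseline_m / np.maximum(disparity, eps)

# Placeholder calibration for illustration only; in practice use the
# intrinsics/extrinsics shipped with the dataset.
depth = disparity_to_depth(np.full((375, 1242), 40.0),
                           focal_px=725.0, baseline_m=0.53)
print(depth[0, 0])  # ~9.6 m for a 40 px disparity
```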
For monocular depth and pose estimation, results obtained with SfMLearner indicate that the realistic lighting and spatial coherence of Virtual KITTI 2 allow trained models to transfer to similar real-world tasks with promising accuracy, especially under the unmodified clone conditions.
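Monocular methods like SfMLearner predict depth only up to an unknown scale, so evaluations conventionally align predictions to ground truth with per-image median scaling before computing error metrics. The sketch below shows that common protocol for the absolute relative error; it is a generic implementation, not the paper's evaluation code.

```python
import numpy as np

def abs_rel_error(pred, gt, min_depth=1e-3, max_depth=80.0):
    """Median-scaled absolute relative error, the usual protocol for
    scale-ambiguous monocular methods such as SfMLearner."""
    mask = (gt > min_depth) & (gt < max_depth)
    pred, gt = pred[mask], gt[mask]
    pred = pred * np.median(gt) / np.median(pred)  # resolve scale ambiguity
    pred = np.clip(pred, min_depth, max_depth)
    return np.mean(np.abs(pred - gt) / gt)

# Toy example: a prediction off by a constant scale scores (near) zero error.
gt = np.random.uniform(2.0, 60.0, size=(375, 1242))
print(abs_rel_error(0.5 * gt, gt))  # ~0.0 after median alignment
```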
Lastly, semantic segmentation experiments using AdapNet++ show that RGB models achieve robust performance across the varied environmental conditions in Virtual KITTI 2, underscoring its utility for domain adaptation research in semantic segmentation.
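Segmentation quality in such experiments is typically reported as mean intersection-over-union (mIoU). The following self-contained sketch computes mIoU from integer label maps via a confusion matrix; it is a generic metric implementation, not AdapNet++'s own evaluation code.

```python
import numpy as np

def mean_iou(pred, gt, num_classes):
    """Compute mean IoU from integer label maps via a confusion matrix."""
    conf = np.zeros((num_classes, num_classes), dtype=np.int64)
    np.add.at(conf, (gt.ravel(), pred.ravel()), 1)
    tp = np.diag(conf)
    union = conf.sum(0) + conf.sum(1) - tp  # per-class size of pred-or-gt
    iou = tp / np.maximum(union, 1)
    return iou[union > 0].mean()            # ignore classes absent from both

# Toy example with 3 classes on a small label map.
gt = np.array([[0, 1], [2, 2]])
pred = np.array([[0, 1], [2, 1]])
print(mean_iou(pred, gt, num_classes=3))  # 0.666...
```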
Implications and Future Directions
Virtual KITTI 2 demonstrates the role synthetic datasets can play in reducing reliance on costly real-world data collection, affording researchers a flexible and controlled platform to iterate and test autonomous vehicle algorithms under diverse conditions. Enhanced photorealism and comprehensive ground-truth data support a wide array of computer vision tasks, offering a valuable resource for advancing semi-supervised and unsupervised learning methodologies.
Future work might focus on integrating more advanced environment dynamics or exploring augmentation techniques that preserve data fidelity while increasing training diversity. Additionally, leveraging Virtual KITTI 2 for domain adaptation research could further blur the line between synthetic and real data, establishing a baseline for future synthetic dataset development.