
Paris-Lille-3D: a large and high-quality ground truth urban point cloud dataset for automatic segmentation and classification (1712.00032v2)

Published 30 Nov 2017 in cs.LG, cs.CV, and stat.ML

Abstract: This paper introduces a new Urban Point Cloud Dataset for Automatic Segmentation and Classification acquired by Mobile Laser Scanning (MLS). We describe how the dataset is obtained from acquisition to post-processing and labeling. This dataset can be used to learn classification algorithm, however, given that a great attention has been paid to the split between the different objects, this dataset can also be used to learn the segmentation. The dataset consists of around 2km of MLS point cloud acquired in two cities. The number of points and range of classes make us consider that it can be used to train Deep-Learning methods. Besides we show some results of automatic segmentation and classification. The dataset is available at: http://caor-mines-paristech.fr/fr/paris-lille-3d-dataset/

Authors (3)
  1. Xavier Roynard (4 papers)
  2. Jean-Emmanuel Deschaud (29 papers)
  3. François Goulette (28 papers)
Citations (264)

Summary

An Overview of the Paris-Lille-3D Urban Point Cloud Dataset for Segmentation and Classification

The paper introduces Paris-Lille-3D, a large, high-quality urban point cloud dataset for automatic segmentation and classification. Acquired by Mobile Laser Scanning (MLS), the dataset responds to the demand for large-scale, accurately segmented data needed to train modern machine learning algorithms, particularly deep-learning methods for segmentation and classification.

Dataset Details and Acquisition Methodology

The Paris-Lille-3D dataset covers approximately 2 kilometers of urban streets in two cities, Paris and Lille. It contains 143.1 million labeled points distributed across 50 classes, a scale and class range that the authors consider sufficient for training deep-learning methods.
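Because the dataset separates individual objects as well as assigning classes, each point can be thought of as carrying two labels. The following is a minimal sketch of that idea in Python. The field names `object_id` and `class_id` and the toy values are hypothetical and are not taken from the dataset's files.

```python
from collections import Counter, defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class LabeledPoint:
    x: float
    y: float
    z: float
    object_id: int  # hypothetical field: object instance the point belongs to
    class_id: int   # hypothetical field: semantic class of that object


def points_per_class(points):
    """Count points per semantic class (the classification view)."""
    return Counter(p.class_id for p in points)


def points_per_object(points):
    """Group point indices by object instance (the segmentation view)."""
    groups = defaultdict(list)
    for index, p in enumerate(points):
        groups[p.object_id].append(index)
    return dict(groups)


# Toy data, not taken from the dataset: two objects of class 7, one of class 3.
toy_points = [
    LabeledPoint(0.0, 0.0, 0.0, object_id=1, class_id=7),
    LabeledPoint(0.1, 0.0, 0.0, object_id=1, class_id=7),
    LabeledPoint(5.0, 0.0, 0.0, object_id=2, class_id=7),
    LabeledPoint(9.0, 1.0, 0.5, object_id=3, class_id=3),
]
class_counts = points_per_class(toy_points)
object_groups = points_per_object(toy_points)
```

Keeping the two labels separate lets the same points supervise a classifier (via the class) and a segmentation method (via the object split).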

The dataset's quality depends on the MLS system, which produces dense, detailed point clouds. The platform combines a multi-layer LiDAR (Velodyne HDL-32E) with a GPS receiver and an IMU. The sensors are mounted to optimize data precision, and post-processing does not rely on SLAM, cloud registration, or loop closure.
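To illustrate what building a point cloud from LiDAR, GPS, and IMU data involves, here is a minimal sketch of georeferencing one LiDAR return with a vehicle pose. It is not the authors' pipeline. The function names, the Z-Y-X Euler convention, and the omission of the LiDAR-to-IMU mounting offset are all simplifying assumptions.

```python
import math


def rotation_matrix(roll, pitch, yaw):
    """Sensor-to-world rotation, R = Rz(yaw) * Ry(pitch) * Rx(roll), angles in radians."""
    cr, sr = math.cos(roll), math.sin(roll)
    cp, sp = math.cos(pitch), math.sin(pitch)
    cy, sy = math.cos(yaw), math.sin(yaw)
    return [
        [cy * cp, cy * sp * sr - sy * cr, cy * sp * cr + sy * sr],
        [sy * cp, sy * sp * sr + cy * cr, sy * sp * cr - cy * sr],
        [-sp, cp * sr, cp * cr],
    ]


def georeference(point, position, orientation):
    """Map a point from the sensor frame to the world frame.

    point:       (x, y, z) measured by the LiDAR, in meters
    position:    (x, y, z) of the sensor in the world frame (assumed to come from GPS)
    orientation: (roll, pitch, yaw) of the sensor (assumed to come from the IMU)
    """
    rot = rotation_matrix(*orientation)
    return tuple(
        sum(rot[i][j] * point[j] for j in range(3)) + position[i]
        for i in range(3)
    )


# Toy values: a return 1 m ahead of a sensor that is rotated 90 degrees about the vertical axis.
world_point = georeference(
    point=(1.0, 0.0, 0.0),
    position=(10.0, 20.0, 30.0),
    orientation=(0.0, 0.0, math.pi / 2),
)
```

In this direct approach every point is placed using only the pose at its acquisition time, so no scan-to-scan alignment step is needed, but any pose error carries over directly into the cloud.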

Comparison with Existing Datasets

The paper compares Paris-Lille-3D with existing urban 3D point cloud datasets, including Oakland, Semantic3D, Paris-rue-Madame, and IQmulus. The quantitative comparison emphasizes point count and class diversity. For instance, Paris-Lille-3D offers a larger number of classes (50), which matters when training robust classification algorithms.

Methodology for Segmentation and Classification

To demonstrate the dataset's utility, the paper reports experiments with automatic segmentation and classification methods. Segmentation relies on connectivity and geometric properties, and classification uses descriptors fed to a Random Forest. The basic segmentation method still shows limitations, such as over-segmentation in densely populated scenes.
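As a rough illustration of connectivity-based segmentation, the sketch below groups points whose occupied voxels touch (26-connectivity) using a breadth-first search. This is a generic technique, not the paper's method. The function names and the `voxel_size` parameter are assumptions, and ground removal and geometric criteria are omitted.

```python
import math
from collections import deque
from itertools import product

# The 26 neighbors of a voxel in a 3D grid.
NEIGHBOR_OFFSETS = [o for o in product((-1, 0, 1), repeat=3) if o != (0, 0, 0)]


def voxel_key(point, voxel_size):
    """Integer grid cell containing the point."""
    return tuple(int(math.floor(c / voxel_size)) for c in point)


def segment_by_connectivity(points, voxel_size=0.5):
    """Return one segment id per point; points in connected occupied voxels share an id."""
    voxels = {}
    for index, point in enumerate(points):
        voxels.setdefault(voxel_key(point, voxel_size), []).append(index)

    segment_of_voxel = {}
    next_segment = 0
    for start in voxels:
        if start in segment_of_voxel:
            continue
        # Flood-fill every occupied voxel reachable from this seed.
        segment_of_voxel[start] = next_segment
        queue = deque([start])
        while queue:
            vx, vy, vz = queue.popleft()
            for dx, dy, dz in NEIGHBOR_OFFSETS:
                neighbor = (vx + dx, vy + dy, vz + dz)
                if neighbor in voxels and neighbor not in segment_of_voxel:
                    segment_of_voxel[neighbor] = next_segment
                    queue.append(neighbor)
        next_segment += 1

    labels = [0] * len(points)
    for key, indices in voxels.items():
        for index in indices:
            labels[index] = segment_of_voxel[key]
    return labels


# Toy points: two clusters several meters apart.
toy_points = [(0.1, 0.1, 0.1), (0.6, 0.1, 0.1), (5.0, 5.0, 0.2), (5.4, 5.1, 0.2)]
toy_labels = segment_by_connectivity(toy_points, voxel_size=0.5)
```

The voxel size controls the trade-off: a small value can split a sparsely sampled object into several segments (over-segmentation), while a large value can merge neighboring objects.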

Implications and Future Directions

The Paris-Lille-3D dataset can support the development and benchmarking of point cloud processing methods, particularly deep-learning models that require large amounts of labeled data. By providing high-resolution, accurately classified urban scenes, it is relevant to urban scene understanding, autonomous driving, and smart city applications.

Looking forward, the dataset can help in developing algorithms that address harder aspects of 3D point cloud segmentation and classification, such as occlusion and variation in point density. Its diverse class set and large number of points make Paris-Lille-3D a useful resource for researchers in computer vision and robotics.

In summary, the Paris-Lille-3D dataset gives researchers a labeled urban point cloud resource on which to build and compare methods for automatic segmentation and classification.