POLARIS-53K: Polarimetric & Multimodal Datasets

Updated 24 April 2026
  • POLARIS-53K is a suite of three independent, large-scale polarimetric datasets tailored for automotive, exoplanetary, and maritime perception tasks.
  • Each dataset introduces sensor modalities that are new to its domain (synchronized RGB-polarimetric, high-contrast astronomical, and multimodal maritime data) to support machine learning.
  • The datasets provide comprehensive calibration, benchmark splits, and state-of-the-art baseline results that drive advancements in representation learning and perception algorithms.

POLARIS-53K refers to three distinct, independently developed datasets in large-scale polarimetric imaging and perception: (1) an automotive driving benchmark providing synchronized RGB-polarimetric, lidar, GNSS/INS, and segmentation data (Baltaxe et al., 2023); (2) a high-contrast, polarimetric imaging collection for exoplanetary disk representation learning (Cao et al., 4 Jun 2025); and (3) a large-scale, multimodal maritime object detection and tracking dataset (Choi et al., 2024). Each is notable for pioneering new data modalities or scales within its domain and is closely tied to state-of-the-art machine learning for perception and representation.

1. Automotive Perception: POLARIS-53K RGB-Polarimetric Driving Dataset

The POLARIS-53K automotive dataset is the first large-scale, time-synchronized, and spatially calibrated collection to integrate RGB-polarimetric imagery, dense lidar, high-rate GNSS/IMU navigation, and pixel-level free-space segmentation masks for outdoor driving scenarios (Baltaxe et al., 2023). Designed to enable perception algorithms leveraging light polarization, it contains the following:

| Modality | Quantity / Resolution | Format / Sensor |
|---|---|---|
| RGB-Polarimetric | 12,627 frames / 1.25MP (1624×1224) | Lucid Vision TRI050S-QC |
| Lidar | 12,627 scans / 128 channels | Velodyne Alpha Prime |
| GNSS/IMU | Continuous @ 100Hz (pose, velocity) | OxTS RT3000, .csv/rosbag |
| Free-space labels | 8,141 masks / pixel-level | PNG (1=drivable, 0=non-drivable) |

Polarimetric channels are captured in four orientations (0°, 45°, 90°, 135°), providing per-pixel calculation of Stokes parameters and derived quantities:

  • $S_0 = \frac{1}{2}(P_0 + P_{45} + P_{90} + P_{135})$ (total intensity)
  • $S_1 = P_0 - P_{90}$
  • $S_2 = P_{45} - P_{135}$
  • $\mathrm{DoLP} = \frac{\sqrt{S_1^2 + S_2^2}}{S_0}$
  • $\mathrm{AoLP} = \frac{1}{2}\,\mathrm{atan2}(S_2, S_1)$
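The Stokes relations above can be sketched per pixel as follows. This is a minimal illustration using only the Python standard library, not the dataset's own tooling; the function names and the `eps` guard against a zero `S0` are assumptions introduced here. A real pipeline would typically vectorize the same arithmetic over whole images.

```python
import math


def stokes_from_intensities(p0, p45, p90, p135):
    """Return (S0, S1, S2) for one pixel from the four polarizer-angle intensities."""
    s0 = 0.5 * (p0 + p45 + p90 + p135)  # total intensity
    s1 = p0 - p90
    s2 = p45 - p135
    return s0, s1, s2


def dolp_aolp(s0, s1, s2, eps=1e-12):
    """Degree and angle (radians) of linear polarization.

    eps is a hypothetical guard against division by zero for dark pixels.
    """
    dolp = math.hypot(s1, s2) / max(s0, eps)
    aolp = 0.5 * math.atan2(s2, s1)
    return dolp, aolp


# Unit-intensity light fully polarized at 0 degrees: P0=1, P90=0, P45=P135=0.5
s0, s1, s2 = stokes_from_intensities(1.0, 0.5, 0.0, 0.5)
dolp, aolp = dolp_aolp(s0, s1, s2)

# Unpolarized light: all four intensities are equal
dolp_unpol, _ = dolp_aolp(*stokes_from_intensities(0.5, 0.5, 0.5, 0.5))
```

In the first case the sketch yields a DoLP of 1 and an AoLP of 0; in the second the DoLP is 0, matching the low values the text reports for matte surfaces.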

Supervised free-space annotations (1/0 mask) are derived semi-automatically using Segment Anything (SAM) with manual refinement. Depth ground truth corresponds to projected lidar points, with masks provided for occluded pixels.

Calibration

  • Intrinsics: OpenCV chessboard
  • Extrinsics: Planar least-squares using chessboard planes
  • Camera-to-lidar: $u = K\,[R|t]\,X_{\mathrm{lidar}}$
  • GNSS/INS: WGS-84 and local-ENU, synchronized by timestamp
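The camera-to-lidar relation in the list above is the standard pinhole projection. The sketch below shows how a single lidar point could be mapped to a pixel and a depth value. The intrinsic matrix, the identity extrinsics, and the function name are hypothetical placeholders chosen for illustration, not the dataset's calibration.

```python
def project_lidar_point(K, R, t, X):
    """Project a 3D lidar point X into the image via u = K [R|t] X.

    Returns (u, v, depth), or None when the point lies behind the camera.
    """
    # Rigid transform into the camera frame: Xc = R X + t
    xc = [sum(R[i][j] * X[j] for j in range(3)) + t[i] for i in range(3)]
    if xc[2] <= 0:
        return None
    # Apply the intrinsics, then divide by the homogeneous coordinate
    uvw = [sum(K[i][j] * xc[j] for j in range(3)) for i in range(3)]
    return uvw[0] / uvw[2], uvw[1] / uvw[2], xc[2]


# Hypothetical calibration values, for illustration only
K = [[1000.0, 0.0, 812.0],
     [0.0, 1000.0, 612.0],
     [0.0, 0.0, 1.0]]
R = [[1.0, 0.0, 0.0],
     [0.0, 1.0, 0.0],
     [0.0, 0.0, 1.0]]
t = [0.0, 0.0, 0.0]

pixel = project_lidar_point(K, R, t, [1.0, -0.5, 10.0])
behind = project_lidar_point(K, R, t, [0.0, 0.0, -2.0])
```

Writing the returned depth into a sparse image at `(u, v)` is one plausible way to obtain depth ground truth from projected lidar points, as described for this dataset.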

Benchmark Splits and Statistics

| Task | Train | Validation | Test | Note |
|---|---|---|---|---|
| Free-space detection | 6,206 | 856 | 969 | Geographically disjoint |
| Monocular depth | 6,116 | 778 | 778 | Vehicles >15 km/h |

  • DoLP histogram: Peaks < 0.2 on matte, up to 0.8 on glass
  • Scene coverage: Six suburban, fair-weather, midday; street depths 5–50 m

Baseline Results

Free-space detection (KITTI-road metrics):

  • RGBP-RoadSeg (RGB + polarization): IoU = 0.939, AP = 0.994
  • Adding polarization yields +1.5% IoU gain versus RGB
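For readers unfamiliar with the IoU figure quoted above, the following sketch computes a plain pixel-wise IoU between two binary free-space masks (1 = drivable, 0 = non-drivable). It is not the KITTI-road evaluation protocol itself; the function name and the convention of returning 1.0 when both masks are empty are assumptions made here.

```python
def mask_iou(pred, truth):
    """Pixel-wise intersection-over-union of two binary masks (lists of rows)."""
    inter = 0
    union = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            if p == 1 and t == 1:
                inter += 1
            if p == 1 or t == 1:
                union += 1
    # Assumed convention: two empty masks agree perfectly
    return inter / union if union else 1.0


pred = [[1, 1, 0],
        [1, 0, 0]]
truth = [[1, 1, 0],
         [0, 0, 0]]
iou = mask_iou(pred, truth)  # 2 shared drivable pixels out of 3 in the union
```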

Monocular depth (Eigen metrics):

  • RGB-Depth: RMSE = 6.389, $\delta < 1.25$ = 0.904
  • RGBP-Depth: RMSE = 6.172, $\delta < 1.25$ = 0.911
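The two depth metrics reported above can be sketched as below, using their commonly used definitions: RMSE over valid pixels, and the fraction of pixels whose ratio max(pred/gt, gt/pred) falls under 1.25. Treating `gt <= 0` as "no lidar return" and assuming strictly positive predictions are assumptions of this sketch, not details confirmed by the source.

```python
import math


def depth_metrics(pred, gt):
    """Return (rmse, delta1) over pixels with valid ground truth.

    pred, gt: flat sequences of depths in meters. Ground-truth values <= 0
    are treated as missing (assumption); predictions are assumed positive.
    """
    pairs = [(p, g) for p, g in zip(pred, gt) if g > 0]
    if not pairs:
        raise ValueError("no valid ground-truth pixels")
    rmse = math.sqrt(sum((p - g) ** 2 for p, g in pairs) / len(pairs))
    delta1 = sum(1 for p, g in pairs if max(p / g, g / p) < 1.25) / len(pairs)
    return rmse, delta1


# Last pixel has no lidar return and is ignored
rmse, delta1 = depth_metrics([10.0, 20.0, 30.0, 5.0], [10.0, 24.0, 20.0, 0.0])
```

Here the three valid pixels have errors of 0, -4, and 10 m, and two of the three ratios fall below the 1.25 threshold.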

Only minimal changes to the DNN architecture are needed: expanding the network input to $[RGB, P_{\mathrm{feat}}]$, where $P_{\mathrm{feat}}$ denotes the polarimetric feature channels, suffices to yield state-of-the-art results (Baltaxe et al., 2023).

2. Exoplanetary Imaging: POLARIS-53K High-Contrast Polarimetric Benchmark

POLARIS-53K in astrophysics is a high-quality, uniformly reduced polarimetric imaging dataset used for direct imaging of exoplanetary disks with public VLT/SPHERE-IRDIS dual-beam polarimetric data collected from 2014–2024 (Cao et al., 4 Jun 2025). Its composition:

| Data Type | # of Samples | Resolution / Format |
|---|---|---|
| Preprocessed exposures | 75,910 | 4n × 1024×1024 FITS cubes (Q+, Q−, U+, U−) |
| PDI postprocessed images | 921 | 1024×1024 FITS |
| Cropped polarimetric images | 53,000 | 256×256 NumPy, FITS, .jpeg |

The “53K” refers to central 256×256 crops from the 75,910 exposures, excluding the coronagraph-masked star core. Uniform IRDAP preprocessing (bad-pixel removal, flat-fielding, background subtraction, astrometric alignment, and dual-beam combination) standardizes all inputs.

Polarimetric Representations

  • [Four definitions of polarimetric quantities derived from the (Q+, Q−, U+, U−) exposures; formulas not recoverable]
  • All frames are log-transformed and rescaled, with separate target ranges for exposures and for polarization images

Labeling and Splits

801 of the 921 PDI postprocessed frames are pseudo-labeled via spectral clustering, which substantially reduces the manual annotation requirement. Classes: reference star (no disk) vs. target disk system (strong scattered-light disk). Class imbalance (96 manually labeled disks, 825 references) is mitigated by stratified sampling.

Recommended ML splits:

  • Train: 67/67 per class, Val: 15/15, Test: 14/14
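A balanced split of this shape could be produced along the following lines. The sketch draws 96 samples per class at random and partitions them 67/15/14; the random undersampling of the larger reference class, the seed, the function name, and the integer sample IDs are all assumptions for illustration, and the source does not state how the recommended split was actually drawn.

```python
import random


def balanced_split(ids_by_class, n_train=67, n_val=15, n_test=14, seed=0):
    """Draw equally sized train/val/test subsets from every class."""
    rng = random.Random(seed)
    need = n_train + n_val + n_test
    splits = {"train": [], "val": [], "test": []}
    for label, ids in ids_by_class.items():
        if len(ids) < need:
            raise ValueError(f"class {label!r} has fewer than {need} samples")
        # Sampling without replacement also undersamples the larger class
        chosen = rng.sample(ids, need)
        splits["train"] += [(i, label) for i in chosen[:n_train]]
        splits["val"] += [(i, label) for i in chosen[n_train:n_train + n_val]]
        splits["test"] += [(i, label) for i in chosen[n_train + n_val:]]
    return splits


# Hypothetical sample IDs: 96 disks and 825 references, as in the text
ids_by_class = {
    "disk": list(range(96)),
    "reference": list(range(1000, 1825)),
}
splits = balanced_split(ids_by_class)
```

Each resulting subset holds the same number of disks and references, so downstream accuracy figures are not dominated by the majority class.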

Benchmarking Tasks

  • Representation learning: Masked Autoencoder, DeepCluster, SimCLR, Diff-SimCLR (proposed)
  • Diff-SimCLR achieves 93.0% mean supervised accuracy (SVC, 32D embedding), 77.3% unsupervised clustering
  • Vision-language zero-shot (GPT-4.1, Gemini-2.0-Flash): max 75% accuracy

Applications

  • Automated RDI reference selection for starlight subtraction
  • Disk morphology/anomaly classification
  • Generative background imputation for PSF subtraction
  • Transfer to other telescopes (e.g., GPI/CHARIS, Roman)

3. Maritime Perception: PoLaRIS-53K Object Detection and Tracking Dataset

PoLaRIS-53K addresses multimodal maritime perception: 360,000 frames from 5 diverse video sequences on the Pohang Canal, annotated for detection and multi-object tracking of ships and buoys (Choi et al., 2024).

| Modality | Resolution | Frequency |
|---|---|---|
| Stereo RGB | 2048×1080 | 10 Hz |
| Thermal IR | 2048×1080 | 10 Hz |
| 3D Lidar | - | 10 Hz |
| 2D Radar | BEV (x,y) | ∼1 Hz |

Annotations and Calibration

  • 190,000 2D boxes (∼120,000 ship, ∼70,000 buoy)
  • LiDAR point-level per-object labels: ∼80,000 per sequence
  • Radar cluster labels via DBSCAN
  • Track-IDs for all dynamic objects
  • Minimum box: 10×10 px (0.0045% of the frame), manual refinement for <5% of objects
  • Known extrinsic/intrinsic matrices for cross-modal projections.
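The quoted frame fraction for the smallest annotated box can be checked with a few lines of arithmetic, assuming a 10×10 px box (the minimum size given in the Recommendations) on the 2048×1080 frame listed in the modality table.

```python
frame_w, frame_h = 2048, 1080  # RGB / TIR resolution from the modality table
box_w, box_h = 10, 10          # minimum annotated box size

# Share of the frame area covered by the smallest box, in percent
fraction_percent = 100.0 * (box_w * box_h) / (frame_w * frame_h)
print(f"{fraction_percent:.4f}%")  # prints 0.0045%
```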

Detection and Tracking Protocols

Metrics:

  • IoU: $\mathrm{IoU} = \frac{|A \cap B|}{|A \cup B|}$ for a predicted box $A$ and a ground-truth box $B$
  • mAP, mAP@0.5:0.95, MOTA, IDF1, IDP, IDR
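Two of the listed metrics are simple enough to sketch directly. The code below computes the IoU of two axis-aligned boxes and the MOTA score in its usual CLEAR-MOT form, 1 − (FN + FP + IDSW) / GT. The `(x1, y1, x2, y2)` box layout, the function names, and the example counts are assumptions for illustration; the source does not specify the evaluation code used.

```python
def box_iou(a, b):
    """IoU of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    # Clamp at zero so disjoint boxes give an empty intersection
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def mota(false_negatives, false_positives, id_switches, num_gt):
    """Multi-object tracking accuracy: 1 - (FN + FP + IDSW) / GT."""
    if num_gt <= 0:
        raise ValueError("num_gt must be positive")
    return 1.0 - (false_negatives + false_positives + id_switches) / num_gt


iou = box_iou((0, 0, 10, 10), (5, 0, 15, 10))  # half-overlapping boxes
score = mota(false_negatives=10, false_positives=5, id_switches=0, num_gt=1000)
```

With these made-up counts, 15 errors over 1,000 ground-truth objects give a MOTA of 0.985.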

Baseline closed-set models (COCO-pretrained):

| Model | Day-RGB mAP | Night-RGB mAP | Day-TIR mAP | Night-TIR mAP |
|---|---|---|---|---|
| YOLOv8-L | 79.6% | 84.2% | 58.4% | 64.0% |
| YOLOv10-L | 80.1% | 82.9% | 54.9% | 59.1% |
| RT-DETR-R50 | 78.4% | 69.1% | 53.4% | 55.2% |

  • RGB tracking: OC-SORT achieves MOTA 98.5% (day), ByteTrack 99.6% (night)
  • TIR tracking lags RGB; best: MOTA 75.5% (day), 88.0% (night)

Known Limitations

  • Only two object classes; no heavy-weather or high-severity occlusion
  • TIR images are low-res/blurry; radar clusters miss distant targets
  • Geographic coverage restricted to Pohang Canal

Recommendations

  • For TIR, train from scratch; low-light enhancements are beneficial
  • Fuse depth and radar for robust detection/tracking
  • Minimum 10×10 px objects stress small-object detection capacity

4. Comparison and Nomenclature Across Domains

The “POLARIS-53K” designation is non-unique, referencing three unrelated datasets:

| Domain | Dataset Purpose/Type | Primary Ref |
|---|---|---|
| Automotive | RGB-polarimetric driving, spatiotemporal alignment | (Baltaxe et al., 2023) |
| Astronomy | Exoplanet disk polarimetric imaging, representation | (Cao et al., 4 Jun 2025) |
| Maritime | Stereo-RGB/TIR/Lidar/Radar for object detection/tracking | (Choi et al., 2024) |

Each dataset is tailored for advanced machine learning tasks in its field, but inter-dataset interoperability is neither claimed nor supported.

5. Impact on Perception Algorithms and Research

The introduction of these large, well-annotated polarimetric and multimodal datasets has enabled:

  • State-of-the-art perception benchmarks leveraging polarization, surpassing prior RGB-only performance (automotive).
  • New paradigms for unsupervised and supervised representation learning in high-dynamic-range, high-contrast scientific imaging, advancing both astrophysics and machine learning (astronomy).
  • Extensive evaluation and training of object detection and tracking models in challenging small-object, low-light, multi-sensor, real-world maritime environments (maritime).

For all domains, the release of POLARIS-53K datasets has been pivotal in quantifying the benefits of polarimetric and multimodal sensing and providing robust baselines for the next generation of perception algorithms.

6. Access, Usage, and Best Practices

Methodological best practices are detailed in each reference, with a focus on task-appropriate normalization, calibration, and augmentation pipelines. Stratified sampling is vital where labeling is highly imbalanced. For detection/tracking tasks, cross-sensor calibration and careful manual refinement remain critical.


Principal References:

  • "Polarimetric Imaging for Perception" (Baltaxe et al., 2023)
  • "POLARIS: A High-contrast Polarimetric Imaging Benchmark Dataset for Exoplanetary Disk Representation Learning" (Cao et al., 4 Jun 2025)
  • "PoLaRIS Dataset: A Maritime Object Detection and Tracking Dataset in Pohang Canal" (Choi et al., 2024)
