
NIR Dataset Overview: Acquisition & Applications

Updated 5 January 2026
  • NIR datasets are collections of images or spectra captured at near-infrared wavelengths (typically 700–1000 nm), often with specialized sensors that ensure pixel-level alignment with visible channels.
  • They employ rigorous acquisition protocols, including precise sensor calibration and advanced preprocessing methods like Gaussian pyramids and dark/white corrections.
  • These datasets support diverse applications such as semantic segmentation, multimodal fusion, and biometric analysis, with tailored splits and annotation regimes for robust evaluation.

A Near-Infrared (NIR) dataset is a data collection comprising images or spectra acquired at near-infrared wavelengths (typically 700–1000 nm, sometimes extending to ~1700 nm or 2500 nm in spectral applications). NIR datasets provide critical information beyond the visible spectrum and are foundational in multimodal computer vision, remote sensing, spectral analysis, biomedical diagnostics, and various scene understanding tasks. The precise structure of a NIR dataset—its sensor configuration, calibration, acquisition protocol, content, splits, format, and public accessibility—varies sharply by application domain and research objective.

1. Acquisition Hardware and Sensor Protocols

NIR image datasets require carefully specified optics and sensor design to guarantee pixel-wise correspondence between NIR and visible channels. The approach of Limmer et al. uses a JAI AD-080CL two-CCD color camera, in which a dichroic prism splits visible (RGB) and NIR light onto distinct sensors, ensuring pixel-to-pixel registration and synchronized frame acquisition. The NIR sensor has no Bayer filter and the IR-cut filter is removed, so the sensor records a full-band grayscale NIR image at 10 bits per pixel and 1024×768 px (Limmer et al., 2016). The spectral bandpass of such NIR channels is typically defined by the absence of the IR-cut filter, capturing wavelengths ≥700 nm, though specific cut-on/cut-off values are not always reported.

Datasets targeting robot vision or precision agriculture employ even more complex acquisition rigs. For example, NIRPlant uses Allied Vision Alvium 1800 U-501 monochrome cameras (nominal center ~850 nm, bandwidth ~125 nm), radiometrically calibrated against NIST-traceable reflectance standards, and collects data in parallel with RGB, depth (ZED 2i stereo), and LiDAR (Neuvition Titan). In all cases, rigid mechanical mounting and synchronized triggering across all sensors are required for sub-pixel alignment and cross-modality fusion (Chang et al., 20 Aug 2025).
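
Hardware triggering keeps exposures synchronized, but downstream tooling still has to associate frames across sensor streams. Below is a minimal sketch of nearest-timestamp pairing; the function name and the skew tolerance are illustrative assumptions, not any dataset's published procedure:

```python
import numpy as np

def associate_frames(ts_a, ts_b, max_skew=0.005):
    """Pair frames from two triggered sensor streams by nearest timestamp.

    ts_a, ts_b: per-stream capture times in seconds; max_skew is an
    illustrative tolerance (5 ms), not a published value.
    """
    ts_b = np.asarray(ts_b, dtype=np.float64)
    pairs = []
    for i, t in enumerate(ts_a):
        j = int(np.argmin(np.abs(ts_b - t)))  # closest frame in stream B
        if abs(ts_b[j] - t) <= max_skew:
            pairs.append((i, j))
    return pairs
```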

Other datasets, such as hyperspectral face cubes (Ng et al., 2023), apply pushbroom Specim FX10 cameras, capturing at hundreds of contiguous wavelengths (Δλ ≈ 1.34 nm) spanning both visible and NIR bands. Calibration involves dark and white reference correction, linearization, and resampling to fixed bands.
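
The band-resampling step can be as simple as linear interpolation of each calibrated spectrum onto a fixed wavelength grid; a minimal NumPy sketch (the target grid below is an assumption for illustration, not the Specim FX10's actual band layout):

```python
import numpy as np

def resample_bands(spectrum, wavelengths, target_wavelengths):
    """Linearly resample a calibrated spectrum onto a fixed wavelength grid.

    wavelengths must be increasing; target_wavelengths is the desired grid.
    """
    return np.interp(target_wavelengths, wavelengths, spectrum)

# Example: a uniform 400-1000 nm grid with 1.34 nm spacing.
target = np.arange(400.0, 1000.0, 1.34)
```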

2. Content, Environment, and Annotation Regimes

The environmental focus and annotation richness of NIR datasets are highly domain-specific. Limmer et al.'s dataset comprises ~38,495 aligned NIR–RGB road-scene frame pairs (32,795 train, 800 eval), collected exclusively under summer, sunny conditions in the Ulm/Tübingen (Germany) region. Image diversity is limited to rural highways, vegetation, and scattered built elements; no semantic or instance annotations exist, only pixel-aligned RGB ground truth. The split strategy isolates certain days/tracks for evaluation to enforce scene-level generalization (Limmer et al., 2016).

In contrast, NIRPlant offers botanical breadth: 34 plant scenes × 360 views × 4 modalities (NIR, RGB, depth, LiDAR), under indoor/outdoor, variable illumination, and augmented with phenological/meteorological text metadata. All frames are co-registered; train/val/test splits are by held-out scenes (Chang et al., 20 Aug 2025).

Medical or spectral datasets—e.g., NIR-SC-UFES for in-vivo skin cancer detection—acquire single-point NIR spectra (125 bands, 900–1700 nm) directly from lesions, with clinical diagnosis as ground truth and a strict protocol for measurement and annotation (Loss et al., 2024). Datasets focusing on face recognition (LAMP-HQ) or affect (Oulu-CASIA NIR MorphSet) curate large, subject-diverse image collections, systematically varying pose, expression, and accessories, often with manual verification, demographic balancing, and standard splits for benchmarking (Yu et al., 2019, Chen et al., 2022).

Spatial and semantic annotation varies. Some multiclass segmentation datasets (e.g., TAS-NIR) provide per-pixel, class-indexed masks aligned to fine-grained semantic categories (e.g., grass, bush, tree) across all registered VIS–NIR pairs (Mortimer et al., 2022).
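
The scene-, subject-, or day-level hold-outs described above can be generated with a simple grouped split. A minimal sketch, assuming a dictionary mapping frame identifiers to scene IDs (all names here are illustrative):

```python
import random

def scene_level_split(frame_to_scene, train_frac=0.8, val_frac=0.1, seed=0):
    """Assign whole scenes to train/val/test so no scene spans two splits."""
    scenes = sorted(set(frame_to_scene.values()))
    random.Random(seed).shuffle(scenes)
    n_train = int(len(scenes) * train_frac)
    n_val = int(len(scenes) * val_frac)
    bucket = {s: "train" for s in scenes[:n_train]}
    bucket.update({s: "val" for s in scenes[n_train:n_train + n_val]})
    split = {"train": [], "val": [], "test": []}
    for frame, scene in frame_to_scene.items():
        split[bucket.get(scene, "test")].append(frame)
    return split
```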

3. Preprocessing, Calibration, and Augmentation

Effective use of NIR data hinges on robust normalization and calibration. Limmer et al.'s approach is representative: raw frames are processed into multi-level Gaussian pyramids (downsampled at each scale), with per-patch local mean subtraction and standard-deviation normalization, yielding detail images as elementwise products $h = I' \circ \sigma$ (Limmer et al., 2016). For spectral datasets, calibration to reflectance employs a standardized dark/white correction:

$$R(\lambda) = \frac{I_{\text{raw}}(\lambda) - I_{\text{dark}}(\lambda)}{I_{\text{white}}(\lambda) - I_{\text{dark}}(\lambda)}$$

as in the barley germination dataset, followed by a log-transform to obtain absorbance (Engstrøm et al., 23 Apr 2025).
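
A NumPy sketch of both preprocessing steps follows, assuming the reflectance correction above, absorbance as $-\log_{10} R$, and a per-patch detail normalization; the patch size is an illustrative assumption, not the published configuration:

```python
import numpy as np

def calibrate_reflectance(i_raw, i_dark, i_white, eps=1e-8):
    """R(lambda) = (I_raw - I_dark) / (I_white - I_dark), per band."""
    return (i_raw - i_dark) / (i_white - i_dark + eps)

def absorbance(reflectance, eps=1e-8):
    """Log-transform reflectance to absorbance: A = -log10(R)."""
    return -np.log10(np.clip(reflectance, eps, None))

def local_detail(image, patch=16, eps=1e-8):
    """Per-patch mean subtraction and std normalization of a grayscale
    NIR frame; the patch size is an assumption for illustration."""
    img = image.astype(np.float64)
    out = np.empty_like(img)
    h, w = img.shape
    for y in range(0, h, patch):
        for x in range(0, w, patch):
            block = img[y:y + patch, x:x + patch]
            out[y:y + patch, x:x + patch] = (block - block.mean()) / (block.std() + eps)
    return out
```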

Data augmentation is typically task-specific. For time-series PPG NIR datasets, heart-rate variability is synthetically expanded through temporal warping, enabling the model to generalize across a broader physiological range (Hanley et al., 2023). In fire detection, synthetic NIR images are generated from RGB via a conditional GAN (FIRE-GAN) or grayscale conversion, dramatically expanding the dataset scale and diversity (Khai et al., 29 Dec 2025). For segmentation or detection, patch extraction, geometric transforms, or color normalization may be applied.
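
As an illustration of the temporal-warping idea for PPG segments, the sketch below uniformly resamples a 1-D signal so the waveform plays back faster or slower; the actual augmentation in (Hanley et al., 2023) may use a more elaborate warp:

```python
import numpy as np

def temporal_warp(signal, rate_factor):
    """Uniformly time-warp a 1-D segment; rate_factor > 1 speeds the
    waveform up (simulating a higher heart rate) and shortens the output."""
    signal = np.asarray(signal, dtype=np.float64)
    n = len(signal)
    # Fractional read positions into the original segment.
    src = np.arange(0.0, n - 1, rate_factor)
    return np.interp(src, np.arange(n), signal)
```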

4. Format, Distribution, and Access

NIR datasets are commonly distributed as archives or repositories containing aligned image or spectral data, metadata (sensor, timestamp, environmental conditions), and annotation files. File types include PNG, TIFF, RAW for images; CSV, JSON for labels and metadata; and PCD, EXR for depth or point clouds. Organization is often hierarchical by scene or modality.
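
For concreteness, a small loader sketch under one hypothetical scene/modality layout; the directory names, file formats, and Pillow dependency are assumptions for illustration, not any particular dataset's actual structure:

```python
import json
from pathlib import Path

import numpy as np
from PIL import Image  # requires Pillow

def load_frame(scene_dir: Path, frame_id: str):
    """Load a co-registered RGB/NIR pair plus metadata from a layout like:
      scene_dir/rgb/<frame_id>.png
      scene_dir/nir/<frame_id>.png
      scene_dir/meta/<frame_id>.json
    """
    rgb = np.asarray(Image.open(scene_dir / "rgb" / f"{frame_id}.png"))
    nir = np.asarray(Image.open(scene_dir / "nir" / f"{frame_id}.png"))
    meta = json.loads((scene_dir / "meta" / f"{frame_id}.json").read_text())
    return rgb, nir, meta
```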

Public accessibility varies widely, ranging from fully open licenses to on-request or restricted distribution. The following table summarizes selected NIR datasets:

Dataset        | Modality/Scene            | Size         | Split                  | Annotation        | Access
Limmer et al.  | Road RGB–NIR              | ~38k pairs   | 32.8k train / 800 eval | RGB only          | Not public
NIRPlant       | Plant RGB/NIR/depth/LiDAR | >12k/scene   | 27/4/3 scenes          | Multimodal + text | CC BY 4.0, GitHub
LAMP-HQ        | Face NIR–VIS              | 73k images   | 300/273 subjects       | ID/scene/pose     | Open on request
ParkSeg12k     | Satellite RGB+NIR         | 12.6k pairs  | train/val/test         | Binary mask       | MIT, GitHub
NIR-SC-UFES    | Skin NIR spectra          | 971 lesions  | 80/20% split           | Histopathology    | Forthcoming, open

5. Downstream Use Cases and Standard Evaluation

NIR datasets are central in a range of vision and analysis tasks:

  • Colorization and Fusion: Direct learnable mapping from NIR to RGB space for photorealistic restoration, with high-frequency NIR features enhancing RGB estimation (Limmer et al., 2016).
  • 3D Reconstruction and SLAM: Multimodal NIR–RGB–depth–LiDAR fusion with cross-attention architectures enhances structural priors, view synthesis, and robustness under challenging illumination (Chang et al., 20 Aug 2025, Kim et al., 2024).
  • Semantic Segmentation: Vegetation indices (NDVI, EVI) derived from NIR and VIS bands are combined with CNN outputs via logit fusion and CRFs in unstructured terrain; see the index sketch after this list (Mortimer et al., 2022, Qiam et al., 2024).
  • Face Recognition and Biometry: Large-pose, multi-spectral NIR–VIS datasets evaluated for cross-modal matching, style transfer, and identity preservation under exhaustive demographic and attribute variation (Yu et al., 2019).
  • Diagnosis and Chemometrics: Regression (PLSR, MLP, CNN) or classification (LightGBM, SVM, 1D-CNN) of biochemical content or disease state from preprocessed NIR spectra (Loss et al., 2024, Chiniadis et al., 2023, Engstrøm et al., 23 Apr 2025).
  • Detection Tasks: Nighttime fire detection with annotated NIR imagery, leveraging synthetic or real data; detection metrics mAP, AP, precision, recall standardized for evaluation (Khai et al., 29 Dec 2025).
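
The vegetation indices referenced in the segmentation bullet follow their standard definitions; a brief sketch, assuming the band arrays hold reflectance values (the EVI coefficients are the common MODIS defaults):

```python
import numpy as np

def ndvi(nir, red, eps=1e-8):
    """Normalized Difference Vegetation Index: (NIR - Red) / (NIR + Red)."""
    nir, red = nir.astype(np.float64), red.astype(np.float64)
    return (nir - red) / (nir + red + eps)

def evi(nir, red, blue, g=2.5, c1=6.0, c2=7.5, l=1.0):
    """Enhanced Vegetation Index with the common MODIS coefficients."""
    nir, red, blue = (a.astype(np.float64) for a in (nir, red, blue))
    return g * (nir - red) / (nir + c1 * red - c2 * blue + l)
```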

Performance metrics and benchmarking protocols are dataset- and task-specific, including mean Intersection-over-Union (mIoU), pixel accuracy, MAE, PSNR, SSIM, LPIPS, RMSE, and spectral/structural similarity for spectral cubes.
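
As one concrete example among these metrics, mIoU over class-indexed masks reduces to a per-class intersection/union average; a minimal sketch:

```python
import numpy as np

def mean_iou(pred, target, num_classes, ignore_index=None):
    """Mean IoU over integer-labeled masks of identical shape."""
    ious = []
    for c in range(num_classes):
        if c == ignore_index:
            continue
        p, t = pred == c, target == c
        union = np.logical_or(p, t).sum()
        if union == 0:
            continue  # class absent from both masks; skip it
        ious.append(np.logical_and(p, t).sum() / union)
    return float(np.mean(ious)) if ious else 0.0
```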

6. Recommendations for Generic Dataset Construction

Best practices evident across canonical NIR datasets include:

  • Hardware-Based Pixel Alignment: Employ dichroic-prism dual-CCD/camera systems for inherent spatial correspondence, obviating post-hoc registration (Limmer et al., 2016, Kim et al., 2024).
  • Scene and Condition Diversity: Capture data across multiple days, tracks, seasons, and environmental conditions for generalization. Integrate indoor/outdoor, variable illumination, and multiple object categories (Chang et al., 20 Aug 2025).
  • Extensible Modality Stack: Where possible, combine NIR with RGB, depth, LiDAR, thermal, and rich textual/semantic metadata to facilitate a spectrum of research tasks.
  • Robust Splits and Hold-Outs: Stratify train/validation/test sets at the scene, subject, or day level to explicitly prevent overfitting to event-specific variation.
  • Transparent Preprocessing: Standardize calibration, normalization, and augmentation pipelines; document all corrections, reference standards, and derived features.
  • Public Distribution and Licensing: Prefer open, unrestricted access with thorough documentation, sample code, and clear citation policies.

Adherence to such practices will result in datasets both suited for their target research area and broadly extensible to emerging NIR-driven applications.
