
ReCo-Data: Multi-Domain Research Datasets

Updated 25 December 2025
  • ReCo-Data is a multi-domain resource that aggregates synthetic and real datasets to support zero-shot segmentation, in-context video editing, smart urban planning, and magnetics research.
  • Its vision component uses CLIP-based retrieval and co-segmentation to generate high-purity pseudo-masks, significantly boosting performance on segmentation benchmarks.
  • The collection also features large-scale vector data for urban layouts and ab initio computed parameters for rare-earth magnets, enabling practical applications in smart cities and materials discovery.

ReCo-Data refers to several high-impact datasets and synthetic data generation workflows in vision, language, urban design, and materials science. The term ReCo-Data is associated with at least four distinct resources: (1) a synthetic segmentation corpus for zero-shot transfer in vision (Shin et al., 2022); (2) a large-scale, high-quality instructional video editing dataset for region-constrained in-context video generation (Zhang et al., 19 Dec 2025); (3) a detailed vector dataset for real-world residential community layout planning (Chen et al., 2022); and (4) a parameter-rich dataset for rare-earth–cobalt magnets from high-throughput ab initio computations (Zhang et al., 8 Jan 2025). Each serves as a foundational resource for its respective domain, with distinct methodologies, statistical profiles, and application targets.

1. ReCo-Data for Zero-Shot Semantic Segmentation

ReCo-Data (Shin et al., 2022) is a synthetic, concept-centric segmentation corpus designed to facilitate zero-shot and unsupervised transfer in semantic segmentation. The core workflow involves retrieving k unlabeled images most relevant to a target concept from large unlabelled pools (e.g., ImageNet-1K, LAION-5B) via CLIP-based embedding similarity, and generating high-purity pseudo-masks for each by co-segmentation.

Data Generation Pipeline

  • Retrieval: For each concept $c$, compute $f_\text{img}(I) = \psi_I(I) \in \mathbb{R}^e$ for all images $I$ and $f_\text{txt}(c) = \psi_T(c) \in \mathbb{R}^e$. Select the top-$k$ images by $S(I, c) = f_\text{img}(I) \cdot f_\text{txt}(c)$ (cosine similarity).
  • Co-segmentation: For each set of $k$ images, dense features are extracted and a global block-wise similarity matrix $A$ is constructed to identify seed pixels that are maximally consistent across the archive. These seed features are averaged and used to generate initial masks, refined via language-guided CLIP saliency (“DenseCLIP”).
  • Context Elimination: To avoid segmenting background common to several concepts (sky, road, person), spurious regions are suppressed using language-instructed saliency maps, precomputed for frequent distractors.
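The retrieval step above can be sketched in a few lines. This is a minimal illustration assuming precomputed embeddings; in practice the vectors come from the CLIP image and text encoders, which are not reproduced here.

```python
# Sketch of CLIP-based retrieval: rank unlabeled images by cosine similarity
# between image embeddings and a concept (text) embedding, keep the top-k.
# The "embeddings" below are toy 3-d vectors, not real CLIP outputs.

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    nu = sum(a * a for a in u) ** 0.5
    nv = sum(b * b for b in v) ** 0.5
    return dot / (nu * nv)

def retrieve_top_k(image_embs, text_emb, k):
    """Return indices of the k images most similar to the concept embedding."""
    scores = [(i, cosine(e, text_emb)) for i, e in enumerate(image_embs)]
    scores.sort(key=lambda t: t[1], reverse=True)
    return [i for i, _ in scores[:k]]

# Toy archive of three image embeddings and one concept vector.
archive = [[1.0, 0.0, 0.0], [0.9, 0.1, 0.0], [0.0, 1.0, 0.0]]
concept = [1.0, 0.0, 0.0]
print(retrieve_top_k(archive, concept, k=2))  # → [0, 1]
```

With real embeddings the archive would be ImageNet-1K or LAION-5B scale, so the sort would be replaced by an approximate nearest-neighbour index; the ranking logic is the same.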

Dataset Statistics

| Split (example) | Concepts (C) | Archive Size (k) | Images | Image Res | Masks |
|---|---|---|---|---|---|
| PASCAL-Context | 59 | 50 | 2,950 | 320×320 | 2,950 |
| COCO-Stuff | 171 | 50 | 8,550 | N/A | 8,550 |
| Cityscapes | 27 | 50 | 1,350 | 320×320 | 1,350 |
| KITTI-STEP | 19 | 50 | 950 | Original | 950 |

Mean object size varies by domain; on PASCAL-Context, the mask covers ∼25% of the 320×320 frame, while in Cityscapes it is ∼15%.

Training and Evaluation

DeepLabv3+ (ResNet-101) is trained with these synthetic masks. Data augmentation includes random scale, crop, flip, color jitter, and Gaussian blur. Loss is pixel-wise cross-entropy. Zero-shot transfer and unsupervised adaptation on standard segmentation benchmarks show significant improvement over prior unsupervised and CLIP-based baselines.
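The pixel-wise cross-entropy objective used here can be illustrated with a toy computation on raw probability maps. This is not the actual DeepLabv3+ training code; shapes and values are illustrative.

```python
import math

def pixelwise_cross_entropy(probs, labels):
    """Mean negative log-likelihood over all pixels.
    probs:  H x W x C nested lists of per-class probabilities.
    labels: H x W nested lists of integer class indices (the pseudo-masks)."""
    total, count = 0.0, 0
    for prob_row, label_row in zip(probs, labels):
        for p, y in zip(prob_row, label_row):
            total += -math.log(p[y])
            count += 1
    return total / count

# A 1x2 "image" with 2 classes: one confident correct pixel, one uncertain one.
probs = [[[0.9, 0.1], [0.5, 0.5]]]
labels = [[0, 1]]
print(round(pixelwise_cross_entropy(probs, labels), 4))  # → 0.3993
```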

Key results:

  • Zero-shot, PASCAL-Context: 26.3% mIoU vs. 19.6% (DenseCLIP)
  • Unsupervised adaptation, Cityscapes: 24.2% mIoU vs. 21.0% (STEGO)

Ablation confirms the contribution of larger kk, CLIP refinement, and context gating.

Limitations

Performance is bounded by CLIP's concept coverage; rare or novel concepts underrepresented in CLIP pretraining are not captured. Small objects tend to be missed due to the coarse stride in seed selection. The method incurs higher inference cost than backbone-only models. The dataset's visual domain is object-centric and photographer-biased (Shin et al., 2022).

2. ReCo-Data: Instruction-Based Video Editing Corpus

ReCo-Data for instructional video editing (Zhang et al., 19 Dec 2025) is a large-scale, high-quality dataset aimed at training and benchmarking region-constrained video editing models.

Dataset Characteristics

  • Size: 500,000 instruction–video pairs
  • Clip Specs: 81 frames, 16 fps, 480×832 pixels, 5 seconds each
  • Tasks: Instance-level object addition, removal, replacement, and global style transfer (balanced: ∼125k per task)
  • Quality: >91% high-quality retention in human evaluation, outperforming prior datasets (InsV2V, InsViE, Senorita: 17.9–29.2%)
  • Sources: Raw videos from HD-VG, OpenS2V-Nexus, Pexels; captions and editing instructions generated using Qwen-2.5-VL-32B and Gemini-2.5-Flash-Thinking
  • Mask Extraction: GroundingDINO + SAM v2 guided by spaCy NER for object entity detection

Data Structure

Each data entry comprises:

  • Source video
  • Target-edited video
  • Task label (Add, Remove, Replace, Style)
  • Binary mask sequence (for local edits)
  • Plain-text edit instruction and target caption
  • Per-video metadata (duration, frame rate, etc.)
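The entry structure above can be sketched as a simple container. This is a minimal illustration; the field names and types are assumptions mirroring the listed components, not the official schema.

```python
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class RecoEditEntry:
    """Hypothetical container for one ReCo-Data video-editing entry."""
    source_video: str                  # path to the source clip
    target_video: str                  # path to the target-edited clip
    task: str                          # "Add" | "Remove" | "Replace" | "Style"
    instruction: str                   # plain-text edit instruction
    target_caption: str
    masks: Optional[List[str]] = None  # binary mask sequence (local edits only)
    fps: int = 16                      # per the clip specs above
    num_frames: int = 81

entry = RecoEditEntry(
    source_video="src.mp4", target_video="tgt.mp4",
    task="Remove", instruction="remove the red car",
    target_caption="an empty street", masks=["m_000.png"])
print(entry.task, entry.num_frames)  # → Remove 81
```

Style-transfer entries would leave `masks` as `None`, since global edits have no local mask sequence.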

Evaluation Protocol

A VLLM (Gemini-2.5-Flash-Thinking) benchmark evaluates 480 held-out samples (120/task) using structured metrics:

  • Edit Accuracy: $S_{EA} = \sqrt[3]{SA \cdot SP \cdot CP}$
  • Video Naturalness: $S_{VN} = \sqrt[3]{AN \cdot SN \cdot MN}$
  • Video Quality: $S_{VQ} = \sqrt[3]{VF \cdot TS \cdot ES}$
  • Overall Score: $S = (S_{EA} + S_{VN} + S_{VQ})/3$

Manual splits are not provided; users must partition the data for their own experiments.
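The score aggregation is mechanical: each axis is the geometric mean of three sub-scores, and the overall score is their arithmetic mean. A minimal sketch, with hypothetical sub-score values:

```python
# Aggregate the benchmark's structured metrics. Sub-score tuples correspond to
# (SA, SP, CP), (AN, SN, MN), (VF, TS, ES); the numbers below are made up.

def geo_mean3(a, b, c):
    """Geometric mean of three sub-scores."""
    return (a * b * c) ** (1.0 / 3.0)

def overall_score(ea_parts, vn_parts, vq_parts):
    s_ea = geo_mean3(*ea_parts)  # Edit Accuracy
    s_vn = geo_mean3(*vn_parts)  # Video Naturalness
    s_vq = geo_mean3(*vq_parts)  # Video Quality
    return (s_ea + s_vn + s_vq) / 3.0

print(round(overall_score((8, 8, 8), (9, 9, 9), (7, 7, 7)), 2))  # → 8.0
```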

Limitations

No formal inter-annotator agreement or guideline documentation is provided. The annotation and generation process introduces synthetic pipeline biases. The dataset is limited to four core editing tasks, excluding, for example, structured multi-object edits or denoising (Zhang et al., 19 Dec 2025).

3. ReCo-Data: Residential Community Layout Planning Dataset

In architectural and urban design, ReCo-Data denotes the largest open-source vector database of residential community layouts (Chen et al., 2022).

Core Attributes

  • Total Communities: 37,646
  • Total Buildings: 598,728 (mean ≈15.9 per community; range 7–143)
  • Data Formats: JSON and GeoJSON (primary), with conversion to Shapefile, PNG/SVG raster, and 3D (OBJ/GLTF)
  • Per-Building Fields: ID, polygon (Mercator x/y), number of storeys (used to infer height as 3 m × floors)
  • Per-Community Fields: Unique ID, city label, boundary polygon, associated buildings
  • Desensitization: Coordinates offset to avoid matching real-world geolocations

| Statistic | Value |
|---|---|
| Communities | 37,646 |
| Buildings | 598,728 |
| Mean buildings/community | 15.9 |
| Min/max per community | 7/143 |
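The per-building conventions above (height inferred as 3 m per storey, coordinates offset for desensitization) can be sketched as follows; the offset values and function names are illustrative, not taken from the released scripts.

```python
def building_height_m(storeys, metres_per_storey=3.0):
    """Infer building height from storey count (3 m per floor convention)."""
    return storeys * metres_per_storey

def desensitize(polygon, dx, dy):
    """Shift Mercator coordinates by a fixed offset so they no longer match
    real-world geolocations. Offset values here are illustrative."""
    return [(x + dx, y + dy) for x, y in polygon]

footprint = [(0.0, 0.0), (10.0, 0.0), (10.0, 8.0), (0.0, 8.0)]
print(building_height_m(5))                     # → 15.0
print(desensitize(footprint, 100.0, -50.0)[0])  # → (100.0, -50.0)
```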

Data Curation

Raw vectors from OpenStreetMap and Google Earth Engine (footprints/heights) and Baidu Map APIs (parcels) are merged. Cleaning discards communities with extreme building counts and aligns all geometries in a common Transverse Mercator CRS. Building-to-community assignment is computed by point-in-polygon.
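The building-to-community assignment described above reduces to a standard point-in-polygon test; a minimal ray-casting sketch (the actual curation scripts likely use a GIS library instead):

```python
def point_in_polygon(pt, polygon):
    """Ray-casting test: a point is inside if a horizontal ray from it crosses
    the polygon boundary an odd number of times. Used here to assign a
    building (e.g. its centroid) to a community boundary polygon."""
    x, y = pt
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        if (y1 > y) != (y2 > y):  # edge straddles the ray's y-level
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside

community = [(0, 0), (4, 0), (4, 4), (0, 4)]
print(point_in_polygon((2, 2), community))  # → True
print(point_in_polygon((5, 2), community))  # → False
```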

Applications and Benchmarks

Tasks include generative layout design (GAN-based models), morphological pattern recognition (e.g., clustering, GCN-based classification), and spatial metrics (FAR, BCR, sunlight analysis). Baseline GANs (unconditional and boundary-constrained pix2pix) are trained and evaluated using Fréchet Inception Distance (FID) and Intersection-over-Union (IoU).
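Of the two evaluation metrics, IoU is the simpler; a minimal sketch over binary layout masks (the mask contents are toy values, and FID is omitted since it requires a pretrained feature network):

```python
# Intersection-over-Union between two binary layout masks, represented as
# nested 0/1 lists, e.g. rasterized building footprints vs. ground truth.

def iou(mask_a, mask_b):
    inter = union = 0
    for row_a, row_b in zip(mask_a, mask_b):
        for a, b in zip(row_a, row_b):
            inter += 1 if (a and b) else 0
            union += 1 if (a or b) else 0
    return inter / union if union else 0.0

a = [[1, 1, 0],
     [1, 0, 0]]
b = [[1, 0, 0],
     [1, 1, 0]]
print(iou(a, b))  # → 0.5
```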

| Dataset | DCGAN FID | cGAN FID |
|---|---|---|
| h_city_60 | 75.1 | 70.8 |
| city_40 | 68.9 | 63.7 |
| city_60 | 62.3 | 57.4 |
| ReCo (full) | 56.4 | 49.2 |

Larger dataset size correlates with lower FID; conditional models yield better results, especially when handling irregular boundaries. Code snippets for data loading, batch preparation, and visualization are provided in the official repository (Chen et al., 2022).

4. "ReCo Dataset" in Rare-Earth Cobalt Magnetics

In the context of rare-earth–cobalt magnetics, "ReCo dataset" denotes the compilation of calculated structural, electronic, magnetic, and crystal-field parameters for RECo₅ and related materials (Zhang et al., 8 Jan 2025).

Computational Pipeline

  • Structure Generation: Enumerate RE elements (La–Lu, excluding Ce), Co/Fe doping ratios, and lattice distortions
  • Ab initio Calculations:
    • Open-core DFT with Hund's-rule penalty for 4f spin/orbital moment enforcement
    • DFT+Hubbard I for multiplet effects and crystal-field extraction
    • QSGW for self-energy corrections
  • Parameter Extraction: 4f spin and orbital magnetic moments, Stevens crystal-field parameters ($A_l^m$), uniaxial anisotropy constants ($K_1$, $K_2$), exchange fields, energy above hull (thermodynamic stability), and Curie temperature estimates

Key example:

| RE | $M_S \parallel c$ ($\mu_B$) | $M_L \parallel c$ ($\mu_B$) | $A_2^0$ (meV) | $A_4^0$ (meV) | $A_6^0$ (meV) | $K_1$ (meV) |
|---|---|---|---|---|---|---|
| Nd | 3.2 | 2.1 | –45 | +5.1 | –0.4 | –4.0 |
| Sm | 4.8 | 3.4 | –75 | +7.6 | –0.6 | –19 |
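The anisotropy constants $K_1$ and $K_2$ enter the standard uniaxial energy expansion $E(\theta) = K_1 \sin^2\theta + K_2 \sin^4\theta$, where $\theta$ is the angle between magnetization and the c-axis. A toy evaluation (the $K_1$, $K_2$ values below are illustrative, not dataset entries):

```python
import math

def anisotropy_energy(theta, k1, k2):
    """Uniaxial magnetocrystalline anisotropy energy
    E(theta) = K1*sin^2(theta) + K2*sin^4(theta)."""
    s2 = math.sin(theta) ** 2
    return k1 * s2 + k2 * s2 * s2

# With K1 > 0 the easy axis is theta = 0 (energy minimum along c).
print(anisotropy_energy(0.0, k1=5.0, k2=1.0))              # → 0.0
print(round(anisotropy_energy(math.pi / 2, 5.0, 1.0), 1))  # → 6.0
```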

Use Cases

  • Rapid screening of novel RE–Co compounds for permanent magnet applications
  • Machine-learning ranking for high-anisotropy, high-moment, low-stability energy candidates
  • Crystal-field mapping for anisotropy design

The pipeline is robust for the computational discovery of rare-earth–cobalt magnets and scales to broader structural variants (Zhang et al., 8 Jan 2025).
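A screening pass of the kind described above might look like the following. The thresholds, field names, and candidate records are hypothetical (the moments are loosely modelled on the key-example table), not values drawn from the dataset.

```python
# Hypothetical rapid-screening filter over computed candidates: keep compounds
# with large anisotropy magnitude, high total moment, and low energy above hull.

candidates = [
    {"formula": "NdCo5", "K1_meV": -4.0,  "moment_muB": 5.3, "e_hull_meV": 0},
    {"formula": "SmCo5", "K1_meV": -19.0, "moment_muB": 8.2, "e_hull_meV": 0},
    {"formula": "LaCo5", "K1_meV": -1.0,  "moment_muB": 4.0, "e_hull_meV": 40},
]

def screen(rows, min_abs_k1=3.0, min_moment=5.0, max_e_hull=25):
    """Return formulas passing all three (illustrative) thresholds."""
    return [r["formula"] for r in rows
            if abs(r["K1_meV"]) >= min_abs_k1
            and r["moment_muB"] >= min_moment
            and r["e_hull_meV"] <= max_e_hull]

print(screen(candidates))  # → ['NdCo5', 'SmCo5']
```

In practice this filter would be the first stage before machine-learning ranking over the full parameter set.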

5. Accessibility, Licensing, and Limitations

  • Vision/Linguistic corpora: Datasets are publicly released (or linked from project web pages/repositories), typically under research licenses; explicit license terms must be checked with each issuing group.
  • Video editing corpus: Download available for research; license not explicitly stated.
  • Urban planning: Dataset and scripts are published on Kaggle and GitHub.
  • Magnetic materials: Data and workflow details are taken from the publication, with code/workflow provided in supplementary materials.

Known limitations across most ReCo-Data resources include synthetic biases (automatic curation, limited domain diversity), absence of standardized train/val/test splits, and incomplete annotation documentation (no exhaustive guidelines or agreement metrics).

6. Domain-Specific Impact and Research Directions

ReCo-Data variants enable cross-domain progress:

  • In vision, synthetic corpus generation mitigates annotation costs, enabling scalable zero-shot transfer and open-vocabulary segmentation.
  • In video, region-constrained in-context data strongly supports compositional and spatially-aware modeling for instruction-based editing.
  • In urban design, large-scale geometric vectorization supports deep generative models and high-resolution spatial analytics for smart-city scenarios.
  • In materials science, parameter-rich datasets underpin both physical insights and machine-guided exploration of functional magnetic materials.

Collectively, ReCo-Data exemplifies automated, large-scale, annotation-efficient data curation and benchmarking, serving as a critical reference point for future efforts in data-driven research across multiple domains.

Representative references: (Shin et al., 2022, Zhang et al., 19 Dec 2025, Chen et al., 2022, Zhang et al., 8 Jan 2025)
