Unpaired Anime Scenery Dataset
- An unpaired anime scenery dataset is a curated collection of high-resolution anime background images, extracted from films such as Your Name, without paired real-world references.
- It supports image-to-image translation by offering diverse scene types, including mountains, cityscapes, and various lighting conditions for stylization and restoration.
- Advanced curation protocols, such as brightness separation and segmentation-based filtering, ensure quality, semantic diversity, and robust benchmarks in research.
An unpaired anime scenery dataset refers to a corpus of anime-style environment images collected without explicit ground-truth correspondences to real-world natural scenes. These datasets enable domain adaptation, image-to-image translation, and enhancement pipelines that do not rely on paired supervision, supporting research on stylization, restoration, and illumination correction in the domain of anime scene synthesis and processing.
1. Dataset Composition, Scope, and Structure
The most widely cited unpaired anime scenery datasets originate from two primary sources: the “Shinkai-style anime scenes” collection introduced in Scenimefy (Jiang et al., 2023), and the multi-source compilation for low-illumination enhancement described in the DRU framework (Gao et al., 26 Dec 2025). The Shinkai-style dataset comprises 5 958 high-resolution (1080×1080, 1:1) key-frame images manually extracted from nine Makoto Shinkai films (e.g., Your Name, Weathering with You, Children Who Chase Lost Voices). The focus is on outdoor, scene-centric visuals with substantial stylistic uniformity: mountains, cityscapes, rivers, lakes, and expansive skies. Frames dominated by characters or of insufficient quality are excluded.
The DRU-enhancement dataset aggregates images from multiple anime sources:
- 5 952 Shinkai-style scenes (Scenimefy pipeline);
- 6 196 frames from four AnimeGAN sources (Miyazaki, Shinkai, Hosoda films);
- 6 656 natural landscape images translated to anime style using a pretrained Scenimefy generator (CycleGAN pipeline).
These collections are disjoint and unpaired with respect to natural photos, i.e., there is no one-to-one scene content correspondence.
Summary Table (adapted from (Jiang et al., 2023)):
| Property | Shinkai-style Dataset | DRU-enhancement Aggregate |
|---|---|---|
| Total images | 5 958 | 18 804 |
| Source films | 9 Shinkai titles | Multiple (Shinkai, Miyazaki, Hosoda, pseudo-anime) |
| Resolution | 1080×1080 (uniform) | 720×406 to 1920×1080 (varies) |
| Scene focus | Outdoor, city, nature | Broad: rural, urban, weather, illumination |
| Public access | Project page (forthcoming) | Provided via code repository (per paper) |
2. Unpaired Structure and Domain Characteristics
All reported anime scenery datasets are constructed to be unpaired: no manual alignment or correspondence exists between images in the anime domain and real-world photographs. The Shinkai-style set provides a pure anime domain reference; real images from LHQ or CycleGAN-translated pseudo-anime scenes are held separately for baseline or training input.
Neither benchmark provides pixel-level semantic masks, metadata, or textual scene descriptions with the image release. Internal curation may involve automatic semantic segmentation (e.g., Mask2Former) for pseudo-pair filtration or diversity analysis, but these labels are not distributed to end users. In low-illumination enhancement contexts, each frame is tagged as {dark|bright} with an associated RP uncertainty score, facilitating data-centric experiments involving uncertainty-aware learning (Gao et al., 26 Dec 2025).
Stylistically, color palettes range from pastel sunsets to neon-lit nightscapes, covering diverse lighting conditions (day, dusk, night, interiors, atmospheric effects) while maintaining consistent hand-drawn background layering and organic texture signatures, especially in the Shinkai dataset. The DRU-enhancement aggregate displays a broader stylistic range, reflecting its multi-author, pseudo-anime, and weather-varied components.
3. Dataset Curation, Filtering, and Annotation Protocols
Both primary datasets are extracted via hand curation procedures from feature films and pre-existing anime datasets. Key curation steps for the Shinkai-style set (Jiang et al., 2023):
- Manual frame extraction from source films emphasizing diversity of environments.
- Filtering for high resolution (≥1080×1080), removal of duplicates, exclusion of close character portraits, low-quality, or irrelevant frames.
- Emphasis on “scene-only” content, ensuring ≥5 semantic categories per image where feasible.
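The resolution and de-duplication steps above can be sketched as a small filtering pass. This is an illustrative reconstruction, not the authors' actual tooling (curation was largely manual): the `Frame` record and its precomputed perceptual hash `phash` are assumptions for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Frame:
    """Metadata for one extracted key frame (hypothetical record)."""
    path: str
    width: int
    height: int
    phash: str  # precomputed perceptual hash, assumed available


def curate(frames, min_side=1080):
    """Keep frames whose shorter side meets the resolution floor,
    dropping perceptual-hash duplicates (first occurrence wins)."""
    seen = set()
    kept = []
    for f in frames:
        if min(f.width, f.height) < min_side:
            continue  # below the >=1080x1080 resolution floor
        if f.phash in seen:
            continue  # duplicate key frame
        seen.add(f.phash)
        kept.append(f)
    return kept
```

In practice a perceptual hash (rather than an exact byte hash) is what makes near-duplicate key frames from adjacent shots collapse to one entry.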
For the DRU dataset (Gao et al., 26 Dec 2025), a structured three-stage pipeline is applied:
- Data aggregation across real and pseudo-anime sources.
- Coarse brightness separation using Quartile Average Brightness (QAB): partitioning each image into quartiles and computing mean pixel intensity, with lower and upper thresholds defining initial dark/bright/uncertain groups.
- Refined classification using a binary ResNet-18 classifier (trained on high-confidence dark/bright images); uncertain samples are reassigned, median-corrected, and manually reviewed at low confidence.
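The QAB stage of this pipeline can be sketched as follows. The quartile geometry (four horizontal bands) and the threshold values here are placeholders for illustration; the paper's exact parameters are not reproduced.

```python
import numpy as np

# Illustrative thresholds only; the DRU paper's actual values differ.
T_DARK, T_BRIGHT = 60.0, 140.0


def qab_label(gray):
    """Coarse brightness label from Quartile Average Brightness (QAB):
    split a grayscale image into four spatial quartiles (here, horizontal
    bands), average each band's mean intensity, then threshold the result
    into dark / bright / uncertain groups."""
    bands = np.array_split(gray, 4, axis=0)       # four spatial quartiles
    qab = float(np.mean([band.mean() for band in bands]))
    if qab < T_DARK:
        return "dark", qab
    if qab > T_BRIGHT:
        return "bright", qab
    return "uncertain", qab
```

Frames landing in the "uncertain" band are exactly those handed to the ResNet-18 refinement stage described above.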
No paired supervision or pixel-aligned scenes exist; masking and semantic consistency constraints are utilized only for internal filtering and pseudo-pair generation.
4. Distribution, Splitting, and Usage in I2I and Enhancement Benchmarks
The standard storage paradigm organizes anime images as JPEGs in directory trees suitable for typical deep learning workflows:
```
anime_scenes/
├─ train/   (~5,000)
├─ val/     (~500)
└─ test/    (~458)
```
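A directory tree like this can be consumed by a minimal unpaired sampler: each domain is shuffled and drawn independently, with no content correspondence between the two batches. This is a plain-Python sketch with a hypothetical class name; a real training pipeline would typically wrap the same logic in a PyTorch `DataLoader`.

```python
import random


class UnpairedSampler:
    """Yields (anime, real) batches with no scene correspondence:
    each domain list is sampled independently, as unpaired
    image-to-image training requires."""

    def __init__(self, anime_paths, real_paths, seed=0):
        self.anime = list(anime_paths)
        self.real = list(real_paths)
        self.rng = random.Random(seed)  # seeded for reproducibility

    def batch(self, n):
        """Draw n items from each domain, independently."""
        return (self.rng.sample(self.anime, n),
                self.rng.sample(self.real, n))
```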
All 5 958 Shinkai frames are used for generator fine-tuning in Scenimefy’s original experiment; no standard split is prescribed, though a 10% validation/test holdout is suggested when re-creating splits. For the DRU-enhancement task, splits are defined by illumination label:
| Split | Dark images | Bright images |
|---|---|---|
| trainDark | 8 240 | — |
| trainBright | — | 8 501 |
| testDark | 2 063 | — |
No one-to-one correspondence or paired exposure exists; all usage is strictly unpaired. Associated methods for low-light enhancement (e.g., EnlightenGAN variants, ZeroDCE++, RUAS) operate using the trainDark/trainBright splits (Gao et al., 26 Dec 2025).
5. Scene and Domain Diversity
Explicit semantic diversity quantification is performed via segmentation-based analysis (e.g., Mask2Former), confirming the presence of at least five semantic categories in most Shinkai-style images (Jiang et al., 2023). Scene types span mountains, forests, grasslands, rivers, lakes, seascapes, cityscapes, and meteorological conditions (cloud formations, city lights, mist). The DRU dataset includes broader stylistic diversity due to its inclusion of pseudo-anime translations (covering, e.g., seasonal changes, varied weather, rare lighting situations).
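The "at least five semantic categories" check can be operationalized as a simple count over a per-pixel label mask (e.g., Mask2Former output). The `min_pixels` noise floor below is an assumption for illustration; the papers do not specify how tiny regions were handled.

```python
import numpy as np


def category_count(mask, min_pixels=50):
    """Number of distinct semantic classes covering at least
    `min_pixels` pixels in a per-pixel integer label mask.
    The min_pixels floor (assumed here) suppresses stray labels."""
    labels, counts = np.unique(mask, return_counts=True)
    return int((counts >= min_pixels).sum())
```

An image would then pass the diversity criterion when `category_count(mask) >= 5`.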
A plausible implication is that domain shift robustness, stylistic generalization, and performance in data-uncertain conditions can all be empirically tested using these datasets, given their semantic richness and source diversity.
6. Benchmarking Practices and Empirical Results
Anime scenery datasets underpin objective comparisons on image-to-image translation and enhancement benchmarks. Scenimefy reports the following Fréchet Inception Distance (FID) scores against the Shinkai-style dataset as the reference domain (Jiang et al., 2023):
| Method | FID ↓ |
|---|---|
| Real (LHQ vs anime) | 121.81 |
| CartoonGAN | 67.20 |
| AnimeGAN | 67.74 |
| White-box | 61.97 |
| CTSS | 66.73 |
| VToonify | 90.58 |
| Scenimefy (Ours) | 48.92 |
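FID scores like those above are computed from Gaussian statistics (mean and covariance) of Inception features extracted from each image set. A minimal sketch of the closed-form distance, assuming the feature statistics have already been extracted (feature extraction itself is omitted):

```python
import numpy as np
from scipy import linalg


def fid(mu1, sigma1, mu2, sigma2):
    """Frechet Inception Distance between two Gaussians fitted to
    Inception features:
        ||mu1 - mu2||^2 + Tr(S1 + S2 - 2 (S1 S2)^{1/2})."""
    diff = mu1 - mu2
    covmean = linalg.sqrtm(sigma1 @ sigma2)  # matrix square root
    if np.iscomplexobj(covmean):
        covmean = covmean.real  # drop tiny imaginary parts from sqrtm
    return float(diff @ diff + np.trace(sigma1 + sigma2 - 2.0 * covmean))
```

Libraries such as pytorch-fid or torchmetrics bundle the feature extraction and this formula; the reported numbers depend on the reference set (here, the Shinkai-style dataset) and the Inception checkpoint used.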
User preference (fraction of votes, 30 subjects):
| Method | Style | Content | Overall |
|---|---|---|---|
| CartoonGAN | 0.067 | 0.087 | 0.073 |
| AnimeGAN | 0.083 | 0.080 | 0.077 |
| White-box | 0.110 | 0.103 | 0.103 |
| CTSS | 0.043 | 0.123 | 0.057 |
| VToonify | 0.010 | 0.030 | 0.017 |
| Scenimefy | 0.687 | 0.577 | 0.673 |
For enhancement tasks (Gao et al., 26 Dec 2025), suggested metrics include BRISQUE, PIQE, NIMA, and PI. Performance comparison is fair only when protocols respect the unpaired nature of the data and the domain-specific illumination labels.
7. Licensing, Release, and Practical Considerations
The Shinkai-style dataset is scheduled for public, non-commercial access at the Scenimefy project page, though no explicit license was announced in (Jiang et al., 2023); users are advised to contact the authors for usage terms. The DRU-enhancement collection is distributed via code repositories associated with the framework, where open-source details (e.g., MIT, CC BY) may be specified (Gao et al., 26 Dec 2025). Preprocessing, resizing, and filtering strategies may be adopted according to task-specific requirements, especially given that input resolutions and augmentation conventions differ across subtasks.
Researchers employing these datasets are able to benchmark scene stylization, domain adaptation, and enhancement methods under conditions that isolate anime-specific phenomena, illumination variability, and unpaired image structure. This has enabled rigorous comparative evaluation and model development beyond paired or natural reference paradigms.