
ChessReD: Chess Recognition Dataset

Updated 21 November 2025
  • Chess Recognition Dataset (ChessReD) is a large-scale, annotated collection of chessboard images captured under diverse real-world conditions.
  • It provides thorough annotations including FEN strings, board corners, and piece bounding boxes to support end-to-end chess piece configuration extraction.
  • The dataset serves as a benchmark for evaluating both deep learning and conventional methods using metrics such as configuration accuracy and per-square error rate.

The Chess Recognition Dataset (ChessReD) is a large-scale, real-world annotated dataset designed to advance the field of chessboard configuration recognition from images. It comprises high-resolution photographs of real chess games, annotated at both board and square level for end-to-end chess piece configuration extraction tasks. ChessReD serves as a challenging benchmark for both conventional pipeline and novel end-to-end deep learning systems, with emphasis on real-world conditions, sensor and viewing diversity, and rigorous annotation protocols (Masouris et al., 2023, Abeykoon et al., 14 Nov 2025).

1. Composition and Acquisition Protocol

ChessReD contains 10,800 RGB photographs acquired throughout 100 unique, complete chess games—a comprehensive coverage enabled by selecting diverse ECO opening codes and capturing each board state after every half-move. The game-level split (train: 6,479 images from 60 games; val: 2,192 from 20; test: 2,129 from 20) precludes near-duplicate frames across splits. Each image is unique to one game state; consecutive frames differ by exactly one move (Masouris et al., 2023).
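
The game-level splitting policy above can be sketched as follows. This is an illustrative sketch: the image-to-game index and the 60/20/20 partition mirror the dataset's split sizes but do not reproduce its actual identifiers.

```python
import random

def game_level_split(image_index, seed=0):
    """Assign images to train/val/test by game, never by image.

    image_index: dict mapping image_id -> game_id.
    Returns a dict mapping split name -> set of image_ids.
    """
    games = sorted(set(image_index.values()))
    rng = random.Random(seed)
    rng.shuffle(games)
    # 60/20/20 games, mirroring ChessReD's split sizes.
    cut1, cut2 = 60, 80
    assignment = {g: "train" for g in games[:cut1]}
    assignment.update({g: "val" for g in games[cut1:cut2]})
    assignment.update({g: "test" for g in games[cut2:]})
    splits = {"train": set(), "val": set(), "test": set()}
    for img, game in image_index.items():
        splits[assignment[game]].add(img)
    return splits
```

Because every image of a given game lands in one split, consecutive board states (which differ by a single move) can never straddle the train/test boundary.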

Capture Setup

Images were obtained with three different smartphones:

  • Apple iPhone 12 (3,024 × 3,024 px)
  • Huawei P40 Pro (3,072 × 3,072 px)
  • Samsung Galaxy S8 (3,024 × 3,024 px)

Photographs span a wide range of camera viewpoints: from nearly top-down (0°) to highly oblique angles (>60°), emulating white-player, black-player, and spectator perspectives. Captures were conducted under varied lighting conditions (natural daylight, fluorescent, and mixed), using a single club-style physical chess set. This acquisition protocol introduces photometric, geometric, and background diversity absent from synthetic datasets (Masouris et al., 2023, Abeykoon et al., 14 Nov 2025).

A secondary, manually annotated subset—ChessReD2K—comprising 2,078 images (from 20 games) provides precise chessboard corner points and per-piece bounding boxes, with 2 px mean corner difference and bounding box IoU > 0.95 between annotators (Masouris et al., 2023).

2. Annotation Methodology and Data Structure

Primary image-level annotations are derived directly from the FEN string corresponding to each game state. For each image, the FEN string is parsed to enumerate all occupied squares, yielding tuples of the form (square_id, piece_class), where piece_class ∈ {P, N, B, R, Q, K, p, n, b, r, q, k}. Empty squares are implicit. JSON is the primary annotation format; each entry records the image's FEN string together with its per-square piece classes.

The organizational structure is as follows:

| Directory | Content | Format |
|-----------|---------|--------|
| images/ | train/, val/, test/ | JPGs; filenames by state/cam/view |
| labels/ | *.json | FEN, per-square classes |

Strict quality assurance is enforced: FEN-based labeling is deterministic, and ChessReD2K manual annotations are double-checked, with <2 px mean corner error and >0.95 IoU for bounding boxes (Masouris et al., 2023).
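
The FEN-to-label derivation described above can be sketched as follows. The square numbering (0–63, rank 8 first, so a8 = 0 and h1 = 63) is an assumed convention for illustration; the dataset's actual indexing may differ.

```python
def fen_to_squares(fen):
    """Parse the placement field of a FEN string into a list of
    (square_id, piece_class) tuples; empty squares are implicit.

    Squares are numbered 0..63 with rank 8 first (a8=0 .. h1=63) --
    an illustrative convention, not necessarily ChessReD's.
    """
    placement = fen.split()[0]          # first FEN field: piece placement
    labels, square = [], 0
    for ch in placement:
        if ch == "/":                   # rank separator
            continue
        if ch.isdigit():                # run of empty squares
            square += int(ch)
        else:                           # one of P,N,B,R,Q,K,p,n,b,r,q,k
            labels.append((square, ch))
            square += 1
    return labels
```

On the starting position this yields exactly 32 tuples, one per piece, with the 32 empty squares left implicit.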

3. Dataset Statistics and Diversity

Each of the 10,800 images represents a unique full-board chess configuration. The total number of square-level instances is 691,200 (10,800 × 64). The piece-class distribution is imbalanced: empty squares dominate, and piece-type frequency varies with game phase. For illustration, typical proportions are approximately:

  • Empty: ≈ 345,600 (50%)
  • White Pawn: ≈ 86,400 (12.5%)
  • Black Pawn: ≈ 86,400 (12.5%)
  • All other pieces: ≈ 17,280 (2.5% each)

Exact counts differ by phase; empirical statistics per split are reported in (Masouris et al., 2023). For example, in the training split, pawn instances number 35,888 (white) and 35,021 (black); for kings, 6,479 (each color), and for queens, 4,076 (white) and 3,996 (black).
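
Empirical class statistics like those above can be tallied directly from the per-image FEN labels. A minimal sketch, counting occupied squares per class and crediting the digit runs in the placement field as empties:

```python
from collections import Counter

def class_distribution(fens):
    """Count square-level instances per piece class across boards.

    Empty squares come from the digit runs in the FEN placement
    field, so every board contributes exactly 64 instances.
    """
    counts = Counter()
    for fen in fens:
        for ch in fen.split()[0]:
            if ch == "/":
                continue
            if ch.isdigit():
                counts["empty"] += int(ch)
            else:
                counts[ch] += 1
    return counts
```

Running this over a split's label files reproduces per-class counts such as the pawn and king figures quoted above.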

The dataset covers early, middlegame, and endgame frames, offering a full spectrum of common play. Lighting ranges from dim (~200 lux) to bright (~600 lux), with both natural and synthetic sources, further augmenting appearance variability (Abeykoon et al., 14 Nov 2025). Board backgrounds include wood, cloth, and plastic, along with scene clutter and floor/table surfaces.

The ChessReD2K subset provides bounding box annotations generated and cross-validated by dual annotators for a subset of images, supporting additional tasks (detection, localization) with high annotation reliability (Masouris et al., 2023).

4. Task Definitions, Evaluation Metrics, and Benchmarking

ChessReD is used for multiple computer vision tasks, primarily:

  • Board detection and homography estimation (from annotated corners)
  • Square-wise classification (occupancy + 13-way piece type)
  • End-to-end board configuration prediction (FEN string recovery from full image)
  • Domain adaptation between synthetic and real imagery
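
Board rectification from the four annotated corners reduces to estimating a 3×3 homography. A minimal numpy sketch of the exact 4-point case via a direct linear solve (in practice one might use OpenCV's getPerspectiveTransform instead):

```python
import numpy as np

def four_point_homography(src, dst):
    """Solve for H (3x3, H[2,2]=1) mapping 4 src points to 4 dst points."""
    A, b = [], []
    for (x, y), (u, v) in zip(src, dst):
        # u = (h0 x + h1 y + h2) / (h6 x + h7 y + 1), likewise for v
        A.append([x, y, 1, 0, 0, 0, -u * x, -u * y])
        A.append([0, 0, 0, x, y, 1, -v * x, -v * y])
        b += [u, v]
    h = np.linalg.solve(np.array(A, float), np.array(b, float))
    return np.append(h, 1.0).reshape(3, 3)

def warp_point(H, pt):
    """Apply homography H to a single (x, y) point."""
    x, y, w = H @ np.array([pt[0], pt[1], 1.0])
    return x / w, y / w
```

Mapping the annotated board corners to a canonical square (e.g. 800×800 px) rectifies the board before it is sliced into 64 tiles for square-wise classification.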

Primary evaluation metrics are:

  • Board-level configuration accuracy:

$$\text{ConfigAccuracy} = \frac{1}{N} \sum_{i=1}^{N} \mathbb{1}\left(\hat{C}_i = C_i\right)$$

where $C_i$ is the 64-square ground-truth configuration of image $i$ and $\hat{C}_i$ the predicted board.

  • Per-square error rate:

$$\text{PerSquareErrorRate} = \frac{1}{64N} \sum_{i=1}^{N} \#\left(\text{wrong squares}_i\right)$$
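
Both metrics follow directly from per-square predictions. A sketch assuming each board is encoded as a length-64 list of class labels:

```python
def config_accuracy(preds, targets):
    """Fraction of boards whose full 64-square configuration is exact."""
    assert len(preds) == len(targets)
    exact = sum(1 for p, t in zip(preds, targets) if p == t)
    return exact / len(preds)

def per_square_error_rate(preds, targets):
    """Fraction of individual squares labeled incorrectly."""
    wrong = sum(pc != tc
                for p, t in zip(preds, targets)
                for pc, tc in zip(p, t))
    return wrong / (64 * len(preds))
```

Note the strictness of ConfigAccuracy: a single misclassified square makes the whole board count as wrong, which is why board-level figures are far below square-level accuracy.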

Baseline Performance

  • ResNeXt-101 (32×8d) end-to-end model (trained from scratch, cross-entropy loss, Adam optimizer, 200 epochs) yields:
    • ConfigAccuracy = 15.26%
    • Mean incorrect squares per board = 3.40
    • PerSquareErrorRate = 5.31%
    • on the official test split (N=2,129) (Masouris et al., 2023)
  • Chesscog pipeline (board detection → square localization → occupancy → piece classification) achieves:
    • ConfigAccuracy = 2.30% overall (up to 6.69% on ~34% of images where board was detected)
    • Mean incorrect squares/board = 42.87
    • PerSquareErrorRate = 73.64%
  • SVM (HOG features + RBF kernel) baseline:
    • Square-level accuracy: 68%
  • Residual CNN baseline:
    • Non-empty tiles: 97.11%; board-level perfect classification: 63.96%
    • Noted confusions: empty squares vs. pawns; the black king is the most difficult class (lowest accuracy) (Abeykoon et al., 14 Nov 2025)

These results highlight the substantive gap in performance between synthetic datasets and challenging, realistic photographs.

5. Technical Challenges and Variability

ChessReD introduces significant challenges not present in prior synthetic datasets:

  • Perspective variability (0° to >60° off-axis)
  • Frequent piece occlusions and mutual shadowing
  • Strong viewpoint foreshortening and severe lighting artifacts (shadows, glare)
  • Non-uniform, cluttered backgrounds and worn boards
  • Diverse smartphone cameras, differing in sensor characteristics and optics

Figure samples from (Masouris et al., 2023) detail scenarios such as low-angle views causing overlapping crowns, extreme foreshortening, and top-down perspectives that obscure critical piece features.

This diversity, together with the natural image acquisition protocol, makes ChessReD uniquely difficult compared to rendered datasets such as the one in (Wölflein et al., 2021), which uses a single 3D model, uniform backgrounds, and a restricted set of lighting/camera variations.

6. Usage, Access, and Licensing

ChessReD is openly accessible from the 4TU.ResearchData repository under a CC BY 4.0 license. The dataset structure is standardized for direct integration into common deep learning workflows: all images and JSON-formatted label files are organized by split (train, validation, test), with metadata including device, original resolution, lighting flag, and capture angle (Abeykoon et al., 14 Nov 2025).

  • Preprocessing:
    • Resize to consistent short-side (e.g., 800 px) or network input (e.g., 224×224)
    • Normalize RGB channels to ImageNet statistics
    • Optional board rectification via provided corner coordinates
  • Augmentation:
    • Random rotation (±15°), perspective jitter
    • Brightness/contrast/hue color jitter
    • Synthetic occlusions (random patches or cutout)
    • Blur or Gaussian noise
  • Splitting: All camera views of a unique board state are grouped in the same split to preclude leakage (Masouris et al., 2023).
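
A minimal numpy sketch of two of the augmentations listed above, brightness jitter and cutout-style synthetic occlusion (torchvision or albumentations provide equivalent transforms; this only illustrates the operations themselves):

```python
import numpy as np

def brightness_jitter(img, rng, low=0.8, high=1.2):
    """Scale pixel intensities by a random factor, clipped to [0, 255]."""
    factor = rng.uniform(low, high)
    return np.clip(img.astype(np.float32) * factor, 0, 255).astype(np.uint8)

def cutout(img, rng, size=32):
    """Zero out a random square patch, simulating an occluder."""
    h, w = img.shape[:2]
    y = rng.integers(0, max(1, h - size))
    x = rng.integers(0, max(1, w - size))
    out = img.copy()
    out[y:y + size, x:x + size] = 0
    return out
```

Both operate on HWC uint8 arrays and take a `numpy.random.Generator`, so augmentation stays reproducible under a fixed seed.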

Citation is required per dataset license; see (Abeykoon et al., 14 Nov 2025) for the canonical citation form.

7. Significance, Applications, and Future Directions

ChessReD fills a critical gap for benchmarking and advancing robust chess recognition under realistic conditions. It supports:

  • Benchmarking full-board, piece-wise, and detection/localization models under real-world variation
  • Transfer learning and domain adaptation studies from synthetic to real data and vice versa
  • Fine-grained error analysis by split, viewpoint, lighting, and phase

Potential research extensions identified in (Masouris et al., 2023):

  • Inclusion of multiple physical chess sets for cross-design generalization
  • Synthesis of board textures/backgrounds via GANs
  • Annotating video streams, move detection, and chess clock/capture information
  • Adding check/checkmate flags and additional chess metadata

The ChessReD dataset is regarded as a benchmark resource for rigorous evaluation of chessboard and piece recognition systems and is used in recent frameworks such as CVChess (Abeykoon et al., 14 Nov 2025), which implements FEN extraction using deep residual CNNs trained and validated on ChessReD images.

Comparative Table: ChessReD vs. Synthetic Chess Datasets

| Aspect | ChessReD (Masouris et al., 2023) | Chesscog Synthetic (Wölflein et al., 2021) |
|--------|----------------------------------|--------------------------------------------|
| Source | Real photos (physical set, club env.) | Rendered images (3D model) |
| No. of board images | 10,800 | 4,888 |
| Camera/viewpoint diversity | Multiple phones, 0°–60° angles | 45°–60° elevations, one set, flash/spotlight |
| Lighting conditions | Natural + artificial, real world | Simulated, limited |
| Board background | Wood, cloth, plastic, cluttered | Single wooden style |
| Piece/board variation | Single set, real-world wear/occlusion | Single model, synthetic variation |
| Annotations | FEN, corners, bboxes (ChessReD2K) | FEN, corners, square occupancy, bboxes |
| License | CC BY 4.0 | CC BY-NC (see OSF repo) |

ChessReD’s unique combination of real-world acquisition, annotation rigor, and comprehensive diversity provides an essential foundation for developing and benchmarking chess configuration recognition algorithms that generalize beyond synthetic scenarios (Masouris et al., 2023, Abeykoon et al., 14 Nov 2025).
