JOBlight Benchmark: Join Estimation Insights
- JOBlight Benchmark is a lightweight variant of JOB designed to evaluate database query optimizer performance on join-size estimation and underestimation bias.
- It leverages a downsized IMDb-derived schema with 70 SQL queries to create challenging scenarios using correlated joins and selective predicates.
- The benchmark employs metrics like xBound lower bounds to correct optimizer estimates, improving cardinality estimation accuracy in modern database systems.
JOBlight Benchmark is a database benchmark designed as a “light” variant of the classic Join Order Benchmark (JOB), specifically constructed to assess the behavior of database query optimizers in the presence of correlated joins and selective predicates, with a strong focus on cardinality estimation and its associated errors. The benchmark provides a systematic substrate for evaluating the accuracy and robustness of cardinality estimators, especially regarding join-size underestimation—a long-standing challenge in the field. It is used notably in the “xBound: Join Size Lower Bounds” framework to empirically and algorithmically investigate optimizer deficiencies and the efficacy of provable lower-bound methods (Stoian et al., 19 Jan 2026).
1. Schema, Dataset Characteristics, and Query Design
JOBlight adopts the IMDb-derived schema and data backbone from JOB, but downsizes it for tractability and reproducibility. The dataset is approximately 1–1.5 GB on disk and encompasses eight tables, each joined via a common key (movie_id), as detailed below:
| Table Name | Approximate Row Count |
|---|---|
| title | ≈ 800 K |
| name | ≈ 1 M |
| cast_info | ≈ 2.5 M |
| cast_info_idx | ≈ 2.5 M |
| movie_info | ≈ 6 M |
| movie_info_idx | ≈ 6 M |
| movie_keyword | ≈ 3 M |
| keyword | ≈ 26 K |
The benchmark includes 70 SQL queries, crafted to create challenging scenarios for optimizers by employing correlated joins and high-selectivity predicates, such as equality on “info_type” or narrow range constraints.
2. Join Subexpressions and Query Patterns
JOBlight’s query structure is defined to exhaustively cover the space of self-join-free inner join subexpressions. All 696 such subexpressions present in the workload are Berge-acyclic (i.e., their join graphs do not contain cycles), and every join is executed over the movie_id key.
- Join Span: Maximum of four tables are joined in any given subexpression.
- Join Graph: All subexpressions exhibit a “star” topology over movie_id. However, depending on predicate arrangements, the optimizer’s logical view may interpolate to chains or snowflakes.
- Selectivities: Highly variable. Two-way joins can retain only 0.01%–5% of input pairs; three- and four-way joins can be extremely selective, with selectivities in the range 10⁻⁶ to 10⁻³.
3. Measurement and Definition of Join-Size Underestimation
Cardinality estimation for a subexpression Q is formally analyzed by comparing the true output cardinality, true(Q), against the optimizer’s estimate, est(Q). A subexpression is classified as underestimated if
The evaluation uses two quantitative metrics:
- %Underestimates: The proportion of subexpressions for which est(Q) < true(Q).
- Median One-sided Q-error underestimation: The median value of true(Q)/est(Q) taken only over subexpressions exhibiting underestimation.
To detect provably incorrect underestimation, the xBound framework computes a hard lower bound LB(Q) on true(Q). Any case where est(Q) < LB(Q) is flagged as a violation.
A core formula used for lower-bounding the number of distinct joining keys in an n-way join leverages a lightweight inclusion–exclusion principle:
where denotes the set of join keys in the -th table.
Subsequently, once the lower bound is determined, xBound applies reverse Hölder–type inequalities (e.g., the Pólya–Szegő inequality for n = 2) on degree sequence prefixes to yield lower bounds for the join size.
4. Baseline Systems and Comparative Metrics
JOBlight was used to empirically test and compare multiple modern database systems and estimation methodologies:
- Systems Under Test:
- DuckDB v1.4
- PostgreSQL 18
- Microsoft Fabric Data Warehouse (DW)
- Reference Bound Frameworks:
- xBound (join-size lower bounds)
- LpBound (join-size upper bounds using ℓ₀, ℓ₁, ℓ₂, ℓ_∞ norms)
For each system and subexpression, the metrics collected include:
- % Underestimates ()
- % Hard-lower-bound violations ( xBound LB)
- % Hard-upper-bound violations ( LpBound UB)
A summary of these violations on JOBlight is presented in the following table:
| System | % Underest (est<true) | % est < xBound LB | % est > LpBound UB |
|---|---|---|---|
| DuckDB 1.4 | 85.3% | 14.9% | 8.3% |
| PostgreSQL 18 | 79.3% | 6.9% | 0.4% |
| Fabric DW | 53.4% | 0.0% | 6.0% |
These figures show widespread underestimation, particularly in open-source systems, and that xBound’s lower bounds catch a substantial fraction of hard underestimation errors.
5. Impact of Lower-Bound Corrections on Optimizer Estimates
By “clipping” an optimizer’s estimate to be at least xBound’s lower bound for every subexpression, one-sided Q-error metrics improve, particularly for low-degree joins. Table 2 summarizes the impact on median one-sided Q-error for 1-, 2-, and 3-way joins:
| System | 1-join | 2-join | 3-join |
|---|---|---|---|
| DuckDB 1.4 | 2.34 → 2.05 | 29.45 → 28.30 | 417.50 (no chg.) |
| PostgreSQL18 | 2.00 → 1.76 | 1.61 → 1.61 | 2.09 (no chg.) |
In its final configuration, xBound (with 256 partitions, norm-stitching, MCVs, etc.) corrects 17.5% of DuckDB’s and 8.7% of PostgreSQL’s subquery underestimates. On these subexpressions, estimates below xBound’s proven lower bound are immediately rectified by clipping, thus eliminating a significant class of underestimation errors (Stoian et al., 19 Jan 2026).
6. Methodological Principles in xBound on JOBlight
The xBound approach on JOBlight consists of the following steps for n-way joins sharing a key:
- Lower-bound : Use zonemaps (min/max per column) and optional range partitioning to estimate the intersection cardinality of join keys.
- Fetch Norms of Degree Sequences: Calculate ℓ₁, ℓ₂, ℓ_∞, and ℓ₋∞ (min) prefix norms of the base tables’ degree sequences up to rank , using “norm stitching” on power-of-two prefixes.
- Apply Reverse Dot-Product Inequalities: Utilize Pólya–Szegő for and generalized reverse Hölder inequalities for , as well as “min-degree” bounds, to obtain multiple candidate lower bounds.
- Aggregate Candidates: Set as the final lower bound.
The Pólya–Szegő inequality for vectors of length (with and similarly for ) is given by:
with the reverse Hölder inequality employed for generalization to multi-way joins.
7. Significance and Visualization of Results
Analysis on JOBlight demonstrates a persistent optimizer bias toward underestimation, which prior upper-bound frameworks such as LpBound cannot remediate. Figure 1 in “xBound: Join Size Lower Bounds” (not reproduced here) depicts the joint distribution of underestimates, hard lower-bound violations (est < xBound LB), and upper-bound violations, illustrating that xBound’s lower bounds capture a previously undiagnosed subset of severe cardinality underestimates.
A key implication is that work on join-size lower bounds—embodied by JOBlight’s empirical profile—offers a concrete pathway to arrest the optimizer’s most damaging mode of failure: underplanning for query execution due to erroneously low cardinality estimates. This establishes JOBlight as a critical diagnostic and benchmarking instrument for advancing the robustness of modern database optimizers (Stoian et al., 19 Jan 2026).