HybridCORELS: Certifiably Optimal Hybrid Models
- HybridCORELS is an algorithmic framework that integrates interpretable rule-list classifiers with complex black-box models to deliver certifiably optimal solutions under explicit transparency constraints.
- It extends the CORELS branch-and-bound approach to optimize a mixed loss function that balances misclassification, rule-list complexity, and controlled transparency in binary classification tasks.
- Empirical evaluations on benchmarks like COMPAS and ACS demonstrate that HybridCORELS achieves comparable or superior accuracy to pure black-box methods while ensuring user-specified transparency.
HybridCORELS is an algorithmic framework for constructing hybrid models that combine interpretable rule-list classifiers with complex black-box models, delivering certifiably optimal solutions under explicit transparency constraints. Through its extension of the CORELS branch-and-bound approach, HybridCORELS enables precise control over the division of input space between interpretable and black-box components while providing strong theoretical and empirical guarantees (Ferry et al., 2023).
1. Model Composition and Gating
A HybridCORELS classifier addresses binary classification tasks on an input space $\mathcal{X}$ with outputs in $\{0, 1\}$. It consists of a triplet $(r, B, \Omega)$, where $r$ is an interpretable rule list (the simple model), $B$ is a pre-trained black-box classifier, and $\Omega \subseteq \mathcal{X}$ is the region of input space covered by the rule list. The inference-time prediction is governed by a gating function $g$, which routes $x$ to $r$ if $x \in \Omega$ and otherwise to $B$. The transparency of the hybrid model, $\tau = \Pr(x \in \Omega)$, is the probability under the data distribution that an input is handled by the interpretable component. In practice, the empirical transparency is enforced as $\hat{\tau} \geq \tau_{\min}$ for a user-chosen transparency level $\tau_{\min}$.
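The gating mechanism above can be sketched in a few lines of Python. This is an illustrative stand-in for the paper's $(r, B, \Omega)$ triplet, not the authors' implementation: `rule_list` is a hypothetical list of `(predicate, label)` pairs, and `black_box_predict` is any callable classifier.

```python
def hybrid_predict(x, rule_list, black_box_predict):
    """Gating function g: route x to the first firing rule (x is then in
    the covered region Omega), otherwise fall back to the black box."""
    for predicate, label in rule_list:
        if predicate(x):
            return label, "interpretable"
    return black_box_predict(x), "black-box"

def empirical_transparency(X, rule_list):
    """Empirical transparency tau_hat: fraction of examples covered by
    the rule list."""
    covered = sum(any(p(x) for p, _ in rule_list) for x in X)
    return covered / len(X)
```

Because coverage is determined purely by which antecedents fire, the empirical transparency of a candidate rule list can be computed exactly on the training set and checked against $\tau_{\min}$.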
2. Optimization Objective
The HybridCORELS optimization problem seeks a rule list satisfying the transparency constraint while minimizing overall misclassification:
$\min_{r,\,\Omega}\; \mathcal{L}(r, B, \Omega) + \lambda K \quad \text{s.t.}\quad \hat{\tau}(\Omega) \geq \tau_{\min}$
where $\mathcal{L}(r, B, \Omega)$ is the empirical 0-1 loss of the hybrid model on the training data, $K$ is the length of the rule list, and $\lambda$ is a small penalty (used only to break ties) since $\hat{\tau} \geq \tau_{\min}$ is a hard constraint. The objective thus balances empirical error across both model regions, rule-list complexity, and coverage, with formal prioritization of strict transparency thresholds.
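A minimal sketch of evaluating this objective for a candidate rule list, under the same hypothetical `(predicate, label)` rule representation as before (this is illustration, not the paper's evaluation code):

```python
def hybrid_objective(X, y, rule_list, black_box_predict, lam, tau_min):
    """Empirical hybrid objective: 0-1 error of the gated model plus the
    tie-breaking penalty lam * K. Returns None when the hard transparency
    constraint tau_hat >= tau_min is violated (candidate is infeasible)."""
    errors = covered = 0
    for x, label in zip(X, y):
        pred = None
        for predicate, rule_label in rule_list:
            if predicate(x):
                pred = rule_label
                covered += 1
                break
        if pred is None:
            pred = black_box_predict(x)  # uncovered region: black box
        errors += int(pred != label)
    tau_hat = covered / len(X)
    if tau_hat < tau_min:
        return None  # hard constraint: no trade-off against accuracy
    return errors / len(X) + lam * len(rule_list)
```

Returning `None` for infeasible candidates reflects the paper's formal prioritization: transparency is a constraint, not a term weighed against error.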
HybridCORELS has two major training paradigms:
- Post–black-box: The black box $B$ is pre-trained and fixed, and the rule list $r$ (with its coverage $\Omega$) is optimized with $B$ held constant.
- Pre–black-box: The rule list is fit first, and the regions it does not cover are handled by a black box trained subsequently.
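The pre–black-box paradigm can be sketched as follows, using scikit-learn's `RandomForestClassifier` (one of the paper's baselines) as the black box; the helper name and rule representation are hypothetical:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

def fit_pre_black_box(X, y, rule_list):
    """Pre-black-box paradigm (sketch): the rule list is fixed first, and
    the black box is then trained only on the examples the rules do not
    cover, specializing it to the opaque region of the input space."""
    covered = np.array([any(p(x) for p, _ in rule_list) for x in X])
    bb = RandomForestClassifier(n_estimators=50, random_state=0)
    bb.fit(X[~covered], y[~covered])  # train on the uncovered region only
    return bb
```

Training the black box only on uncovered examples is what lets the pre–black-box variant specialize its complex component, which the paper links to its stronger accuracy at intermediate transparency.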
3. Algorithmic Structure: CORELS Extension
HybridCORELS extends the CORELS (Certifiably Optimal RulE ListS) algorithm, originally formulated for pure rule-lists, by modifying the objective and update steps. CORELS uses a prefix-tree representing partial rule-lists, a priority queue sorted by valid lower bounds, and a branch-and-bound scheme to search for provably optimal solutions. HybridCORELS integrates the new hybrid objective and transparency constraint. The lower bound remains valid because black-box error can only decrease with rule-list extension and transparency penalties can be eliminated by additional coverage.
If the queue empties, HybridCORELS guarantees that the found rule list is globally optimal under the given transparency constraint.
| Component | CORELS | HybridCORELS |
|---|---|---|
| Objective | 0-1 loss + complexity penalty | Mixed hybrid loss + complexity + coverage penalty |
| Constraint | None | Hard transparency |
| Search Structure | Prefix-tree, PQ | Prefix-tree, PQ with transparency updates |
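The prefix-tree branch-and-bound scheme shared by CORELS and HybridCORELS can be sketched generically. This toy version (not the authors' code) searches over prefixes of antecedent indices with a priority queue ordered by lower bound; `objective` and `lower_bound` are caller-supplied stand-ins for the hybrid loss and its valid bound:

```python
import heapq

def branch_and_bound(antecedents, lower_bound, objective, max_len=3):
    """Generic CORELS-style best-first search: a prefix is pruned whenever
    its lower bound proves it cannot beat the best complete solution."""
    best_obj, best_prefix = float("inf"), ()
    queue = [(0.0, ())]  # (lower bound, prefix of antecedent indices)
    while queue:
        lb, prefix = heapq.heappop(queue)
        if lb >= best_obj:
            continue  # certifiable pruning via the valid lower bound
        obj = objective(prefix)
        if obj < best_obj:
            best_obj, best_prefix = obj, prefix
        if len(prefix) < max_len:
            for a in antecedents:
                if a not in prefix:
                    child = prefix + (a,)
                    child_lb = lower_bound(child)
                    if child_lb < best_obj:
                        heapq.heappush(queue, (child_lb, child))
    return best_prefix, best_obj  # optimal once the queue empties
```

The optimality certificate is the loop's exit condition: when the queue empties, every unexplored prefix was discarded by a bound that provably excluded improvement.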
4. Theoretical Foundation and Generalization
Under mild finiteness assumptions on both the interpretable and complex hypothesis spaces, the class of HybridCORELS models is PAC-learnable. For finite rule-list ($\mathcal{H}_s$) and black-box ($\mathcal{H}_c$) spaces, with an "oracle" hybrid model of true risk zero, the PAC bound for excess risk satisfies:
$\Pr\bigl[\exists\,(r, B, \Omega)\ \text{ERM on data}:\ R(r, B, \Omega)>\epsilon\bigr] \leq \sum_{\Omega\in\mathcal{P}} B(\epsilon, C_\Omega, |\mathcal{H}_c|, |\mathcal{H}_s|, M)$
where the bound captures contributions from both model parts and $M$ is the sample size. For a fixed oracle region $\Omega^*$, the union over $\mathcal{P}$ disappears. The minimizer of the PAC bound as a function of the transparency $\tau$ quantifies an optimal "sweet-spot" transparency: the risk can be better for a hybrid than for either component alone, reflecting a fundamental hybrid regularization benefit.
5. Empirical Performance
HybridCORELS was evaluated on three benchmarks: COMPAS recidivism prediction (6k examples), UCI Adult Income (49k), and ACS Employment (200k). Black-box baselines included Random Forests, AdaBoost, and Gradient-Boosted Trees (scikit-learn, cross-validated). Comparisons across Hybrid Rule Set (HyRS), Companion Rule List (CRL), and both HybridCORELS variants revealed:
- Transparency under HybridCORELS increases monotonically with the required level $\tau_{\min}$, displaying negligible run-to-run variance. In contrast, HyRS/CRL exhibit substantial stochasticity at given transparency levels.
- On Adult and ACS, transparency of approximately 0.7–0.8 can be achieved with no loss in overall accuracy relative to pure black-box models.
- On COMPAS, HybridCORELSPre surpasses black-box accuracy at intermediate transparency (0.5–0.6), by up to 2 percentage points. This aligns with the "sweet-spot self-regularization" predicted by theory.
- HybridCORELS consistently matches or exceeds HyRS/CRL performance at all tested transparencies.
As an example, on ACS with AdaBoost, the pure black-box achieved approximately 74% accuracy; HybridCORELSPre matched this at 75% transparency, while HyRS/CRL trailed by 1–2 percentage points at comparable transparency (Ferry et al., 2023).
6. Strengths, Limitations, and Future Directions
HybridCORELS provides:
- Certifiably optimal hybridization: Retains the CORELS global optimality guarantee under hard transparency.
- Black-box agnosticism: Compatible with any pre-trained black-box classifier, optionally supporting instance weighting.
- User-controllable transparency: Enforcement of hard constraints on transparency yields empirically realizable, user-specified transparency levels; the stochastic instability common to HyRS/CRL is eliminated.
- Empirical regularization sweet-spot: Demonstrated capacity to attain accuracy superior to either component alone at intermediate transparency levels.
Limitations and open challenges include:
- Scalability to large antecedent pools: The prefix-tree search can become computationally expensive as the rule base grows, or when extreme transparency is required.
- Dependency on quality of pre-mined antecedents: The algorithm's success hinges on effective antecedent mining, an external, potentially non-trivial pipeline step.
- Lack of end-to-end training: Current methods require a fixed two-stage pipeline (pre–/post–black-box). Joint optimization of the rule list and the black box remains unaddressed.
- Generality beyond rule lists: While extending to other interpretable classes is in principle straightforward, it requires new analysis and PAC bounds.
Future work includes exploring end-to-end hybrid learning with guarantees, adaptive (data-dependent) transparency, multiclass extensions, and investigating the empirical-theoretical gap in PAC bounds and observed sweet-spots (Ferry et al., 2023).