HybridCORELS: Certifiably Optimal Hybrid Models
- HybridCORELS is an algorithmic framework that integrates interpretable rule-list classifiers with complex black-box models to deliver certifiably optimal solutions under explicit transparency constraints.
- It extends the CORELS branch-and-bound approach to optimize a mixed loss function that balances misclassification, rule-list complexity, and controlled transparency in binary classification tasks.
- Empirical evaluations on benchmarks like COMPAS and ACS demonstrate that HybridCORELS achieves comparable or superior accuracy to pure black-box methods while ensuring user-specified transparency.
HybridCORELS is an algorithmic framework for constructing hybrid models that combine interpretable rule-list classifiers with complex black-box models, delivering certifiably optimal solutions under explicit transparency constraints. Through its extension of the CORELS branch-and-bound approach, HybridCORELS enables precise control over the division of input space between interpretable and black-box components while providing strong theoretical and empirical guarantees (Ferry et al., 2023).
1. Model Composition and Gating
A HybridCORELS classifier addresses binary classification tasks on an input space $\mathcal{X}$ with outputs in $\{0, 1\}$. It consists of a triplet $(r, B, \Omega)$, where $r$ is an interpretable rule list (the simple model), $B$ is a pre-trained black-box classifier, and $\Omega \subseteq \mathcal{X}$ is the region of input space covered by the rule list. The inference-time prediction is governed by a gating function $g$, which routes $x$ to $r$ if $x \in \Omega$ and otherwise to $B$. The transparency of the hybrid model, $\tau = \Pr(x \in \Omega)$, is the probability under the data distribution that an input is handled by the interpretable component. In practice, the empirical transparency is enforced as $\hat{\tau} \geq \tau_{\min}$ for a user-chosen transparency level $\tau_{\min}$.
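The gating mechanism above can be sketched in a few lines of Python. This is an illustrative stand-in for the paper's $(r, B, \Omega)$ triplet, not the authors' implementation: `rule_list` is a hypothetical list of `(predicate, label)` pairs, and `black_box_predict` is any callable classifier.

```python
def hybrid_predict(x, rule_list, black_box_predict):
    """Gating function g: route x to the first firing rule (x is then in
    the covered region Omega), otherwise fall back to the black box."""
    for predicate, label in rule_list:
        if predicate(x):
            return label, "interpretable"
    return black_box_predict(x), "black-box"

def empirical_transparency(X, rule_list):
    """Empirical transparency tau_hat: fraction of examples covered by
    the rule list."""
    covered = sum(any(p(x) for p, _ in rule_list) for x in X)
    return covered / len(X)
```

Because coverage is determined purely by which antecedents fire, the empirical transparency of a candidate rule list can be computed exactly on the training set and checked against $\tau_{\min}$.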
2. Optimization Objective
The HybridCORELS optimization problem seeks a rule list satisfying the transparency constraint while minimizing overall misclassification:
$\min_{r,\,\Omega}\; \mathcal{L}(r, B, \Omega) + \lambda K \quad \text{s.t.}\quad \hat{\tau}(\Omega) \geq \tau_{\min}$
where $\mathcal{L}(r, B, \Omega)$ is the empirical 0-1 loss of the hybrid model on the training data, $K$ is the length of the rule list, and $\lambda$ is a small penalty (used only to break ties) since $\hat{\tau} \geq \tau_{\min}$ is a hard constraint. The objective thus balances empirical error across both model regions, rule-list complexity, and coverage, with formal prioritization of strict transparency thresholds.
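A minimal sketch of evaluating this objective for a candidate rule list, under the same hypothetical `(predicate, label)` rule representation as before (this is illustration, not the paper's evaluation code):

```python
def hybrid_objective(X, y, rule_list, black_box_predict, lam, tau_min):
    """Empirical hybrid objective: 0-1 error of the gated model plus the
    tie-breaking penalty lam * K. Returns None when the hard transparency
    constraint tau_hat >= tau_min is violated (candidate is infeasible)."""
    errors = covered = 0
    for x, label in zip(X, y):
        pred = None
        for predicate, rule_label in rule_list:
            if predicate(x):
                pred = rule_label
                covered += 1
                break
        if pred is None:
            pred = black_box_predict(x)  # uncovered region: black box
        errors += int(pred != label)
    tau_hat = covered / len(X)
    if tau_hat < tau_min:
        return None  # hard constraint: no trade-off against accuracy
    return errors / len(X) + lam * len(rule_list)
```

Returning `None` for infeasible candidates reflects the paper's formal prioritization: transparency is a constraint, not a term weighed against error.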
HybridCORELS has two major training paradigms:
- Post–black-box: The black box $B$ is pre-trained and fixed, and the rule list $r$ (with its coverage $\Omega$) is optimized with $B$ held constant.
- Pre–black-box: The rule list is fit first, and the regions it does not cover are handled by a black box trained subsequently.
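The pre–black-box paradigm can be sketched as follows, using scikit-learn's `RandomForestClassifier` (one of the paper's baselines) as the black box; the helper name and rule representation are hypothetical:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

def fit_pre_black_box(X, y, rule_list):
    """Pre-black-box paradigm (sketch): the rule list is fixed first, and
    the black box is then trained only on the examples the rules do not
    cover, specializing it to the opaque region of the input space."""
    covered = np.array([any(p(x) for p, _ in rule_list) for x in X])
    bb = RandomForestClassifier(n_estimators=50, random_state=0)
    bb.fit(X[~covered], y[~covered])  # train on the uncovered region only
    return bb
```

Training the black box only on uncovered examples is what lets the pre–black-box variant specialize its complex component, which the paper links to its stronger accuracy at intermediate transparency.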
3. Algorithmic Structure: CORELS Extension
HybridCORELS extends the CORELS (Certifiably Optimal RulE ListS) algorithm, originally formulated for pure rule-lists, by modifying the objective and update steps. CORELS uses a prefix-tree representing partial rule-lists, a priority queue sorted by valid lower bounds, and a branch-and-bound scheme to search for provably optimal solutions. HybridCORELS integrates the new hybrid objective and transparency constraint. The lower bound remains valid because black-box error can only decrease with rule-list extension and transparency penalties can be eliminated by additional coverage.
If the queue empties, HybridCORELS guarantees that the found rule list is globally optimal under the given transparency constraint.
| Component | CORELS | HybridCORELS |
|---|---|---|
| Objective | 0-1 loss + complexity penalty | Mixed hybrid loss + complexity + coverage penalty |
| Constraint | None | Hard transparency |
| Search Structure | Prefix-tree, PQ | Prefix-tree, PQ with transparency updates |
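The prefix-tree branch-and-bound scheme shared by CORELS and HybridCORELS can be sketched generically. This toy version (not the authors' code) searches over prefixes of antecedent indices with a priority queue ordered by lower bound; `objective` and `lower_bound` are caller-supplied stand-ins for the hybrid loss and its valid bound:

```python
import heapq

def branch_and_bound(antecedents, lower_bound, objective, max_len=3):
    """Generic CORELS-style best-first search: a prefix is pruned whenever
    its lower bound proves it cannot beat the best complete solution."""
    best_obj, best_prefix = float("inf"), ()
    queue = [(0.0, ())]  # (lower bound, prefix of antecedent indices)
    while queue:
        lb, prefix = heapq.heappop(queue)
        if lb >= best_obj:
            continue  # certifiable pruning via the valid lower bound
        obj = objective(prefix)
        if obj < best_obj:
            best_obj, best_prefix = obj, prefix
        if len(prefix) < max_len:
            for a in antecedents:
                if a not in prefix:
                    child = prefix + (a,)
                    child_lb = lower_bound(child)
                    if child_lb < best_obj:
                        heapq.heappush(queue, (child_lb, child))
    return best_prefix, best_obj  # optimal once the queue empties
```

The optimality certificate is the loop's exit condition: when the queue empties, every unexplored prefix was discarded by a bound that provably excluded improvement.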
4. Theoretical Foundation and Generalization
Under mild finiteness assumptions on both the interpretable and complex hypothesis spaces, the class of HybridCORELS models is PAC-learnable. For finite rule-list ($\mathcal{H}_s$) and black-box ($\mathcal{H}_c$) spaces, with an "oracle" hybrid model of true risk zero, the PAC bound for excess risk satisfies:
$\Pr\bigl[\exists\,(r, B, \Omega)\ \text{ERM on data}:\ R(r, B, \Omega)>\epsilon\bigr] \leq \sum_{\Omega\in\mathcal{P}} B(\epsilon, C_\Omega, |\mathcal{H}_c|, |\mathcal{H}_s|, M)$
where the bound captures contributions from both model parts and $M$ is the sample size. For a fixed oracle region $\Omega^*$, the union over $\mathcal{P}$ disappears. The minimizer of the PAC bound as a function of the transparency $\tau$ quantifies an optimal "sweet-spot" transparency: the risk can be better for a hybrid than for either component alone, reflecting a fundamental hybrid regularization benefit.
5. Empirical Performance
HybridCORELS was evaluated on three benchmarks: COMPAS recidivism prediction (6k examples), UCI Adult Income (49k), and ACS Employment (200k). Black-box baselines included Random Forests, AdaBoost, and Gradient-Boosted Trees (scikit-learn, cross-validated). Comparisons across Hybrid Rule Set (HyRS), Companion Rule List (CRL), and both HybridCORELS variants revealed:
- Transparency under HybridCORELS increases monotonically with the required level $\tau_{\min}$, displaying negligible run-to-run variance. In contrast, HyRS/CRL exhibit substantial stochasticity at given transparency levels.
- On Adult and ACS, transparency of approximately 0.7–0.8 can be achieved with no loss in overall accuracy relative to pure black-box models.
- On COMPAS, HybridCORELSPre surpasses black-box accuracy at intermediate transparency (0.5–0.6), by up to 2 percentage points. This aligns with the "sweet-spot self-regularization" predicted by theory.
- HybridCORELS consistently matches or exceeds HyRS/CRL performance at all tested transparencies.
As an example, on ACS with AdaBoost, the pure black-box achieved approximately 74% accuracy; HybridCORELSPre matched this at 75% transparency, while HyRS/CRL trailed by 1–2 percentage points at comparable transparency (Ferry et al., 2023).
6. Strengths, Limitations, and Future Directions
HybridCORELS provides:
- Certifiably optimal hybridization: Retains the CORELS global optimality guarantee under hard transparency.
- Black-box agnosticism: Compatible with any pre-trained black-box classifier, optionally supporting instance weighting.
- User-controllable transparency: Enforcement of hard constraints on transparency yields empirically realizable, user-specified transparency levels; the stochastic instability common to HyRS/CRL is eliminated.
- Empirical regularization sweet-spot: Demonstrated capacity to attain accuracy superior to either component alone at intermediate transparency levels.
Limitations and open challenges include:
- Scalability to large antecedent pools: The prefix-tree search can become computationally expensive as the rule base grows, or when extreme transparency is required.
- Dependency on quality of pre-mined antecedents: The algorithm's success hinges on effective antecedent mining, an external, potentially non-trivial pipeline step.
- Lack of end-to-end training: Current methods require a fixed two-stage pipeline (pre–/post–black-box). Joint optimization of the rule list and the black box remains unaddressed.
- Generality beyond rule lists: While extending to other interpretable classes is in principle straightforward, it requires new analysis and PAC bounds.
Future work includes exploring end-to-end hybrid learning with guarantees, adaptive (data-dependent) transparency, multiclass extensions, and investigating the empirical-theoretical gap in PAC bounds and observed sweet-spots (Ferry et al., 2023).