OPCE-CASH: Multi-Objective Algorithm Selection
- The paper demonstrates OPCE-CASH's ability to optimize classification F1-score, prediction confidence, and inference latency with a multi-objective metaheuristic.
- OPCE-CASH integrates algorithm selection and hyperparameter tuning into a unified framework, delivering a Pareto frontier of trade-off optimal models.
- Empirical evaluations reveal up to 99.7% F1-score and sub-millisecond inference times, validating its effectiveness in resource-constrained cybersecurity settings.
Optimized Performance, Confidence, and Efficiency-based Combined Algorithm Selection and Hyperparameter Optimization (OPCE-CASH) is a multi-objective extension of the Combined Algorithm Selection and Hyperparameter Optimization (CASH) problem. OPCE-CASH targets not only classification effectiveness, but also prediction confidence and computational efficiency, formulating algorithm selection and configuration as a three-objective optimization problem. This approach is particularly relevant in resource-constrained domains, such as IoT and edge-deployed cybersecurity, where it is critical to jointly optimize for detection efficacy, reliable confidence estimates, and deployment costs. OPCE-CASH leverages multi-objective metaheuristics—specifically, Multi-Objective Particle Swarm Optimization (MOPSO)—to produce a Pareto frontier of configurations that trade off these criteria, departing from single-objective or mean-based bandit approaches that dominate traditional CASH settings (Yang et al., 11 Nov 2025).
1. Multi-Objective Formulation of OPCE-CASH
OPCE-CASH generalizes classic CASH by expanding its objective space. Given a dataset $D$, a set of candidate algorithms $\mathcal{A} = \{A^{(1)}, \dots, A^{(m)}\}$ (for example, XGBoost and LightGBM), and associated hyperparameter spaces $\Lambda^{(j)}$, OPCE-CASH seeks the Pareto-optimal configurations of

$$\max_{A^{(j)} \in \mathcal{A},\; \lambda \in \Lambda^{(j)}} \mathbf{F}\big(A^{(j)}_{\lambda}, D\big), \qquad \mathbf{F} = (f_1, f_2, f_3),$$

where:
- $f_1$ is the $F_1$-score averaged over cross-validation folds,
- $f_2$ is the average prediction confidence,
- $f_3$ is a sigmoid-normalized inference latency (mapped so that lower latency yields a higher score).
Model configurations are thereby evaluated on effectiveness (via $f_1$), reliability (mean predictive certainty), and efficiency (runtime). Each metric is batch-computed via $K$-fold cross-validation per candidate solution.
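The three objectives above can be sketched concretely as follows; the sigmoid midpoint and slope used for latency normalization are illustrative placeholders, not values given in the paper:

```python
import math

def objective_vector(fold_f1_scores, pred_confidences, latency_ms,
                     lat_midpoint=1.0, lat_slope=5.0):
    """Return (f1, confidence, efficiency), each scaled to [0, 1].

    lat_midpoint and lat_slope are illustrative sigmoid parameters.
    """
    f1 = sum(fold_f1_scores) / len(fold_f1_scores)        # mean CV F1-score
    conf = sum(pred_confidences) / len(pred_confidences)  # mean predicted prob.
    # Sigmoid normalization: low latency -> score near 1, high -> near 0.
    eff = 1.0 / (1.0 + math.exp(lat_slope * (latency_ms - lat_midpoint)))
    return (f1, conf, eff)
```

Because all three components land in $[0, 1]$, the resulting objective vectors can be compared directly during nondominated sorting.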
2. Decision Variables and Encoding
Each candidate in the optimization is represented as a vector combining discrete algorithm selection and real-valued (or integer) hyperparameter settings:
- An algorithm identifier (e.g., a binary index selecting XGBoost or LightGBM),
- Individual hyperparameters, typically including the learning rate, maximum tree depth, and number of estimators for tree ensembles,
- Additional hyperparameters are optionally encoded (e.g., subsample ratios or regularization).
Categorical (algorithm) variables are handled by partitioning the candidate vector, and standard rounding or probabilistic flips are applied during the optimization trajectory to maintain feasibility. Hyperparameter domains are bounded to ranges suitable for resource-constrained operation (e.g., shallow trees and modest ensemble sizes) (Yang et al., 11 Nov 2025).
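A minimal decoding sketch makes the rounding-and-clipping scheme concrete; the vector layout and bounds below are illustrative assumptions, not the paper's exact encoding:

```python
def decode_particle(x):
    """Decode a continuous particle vector into a model configuration.

    Layout (illustrative, not from the paper):
      x[0] -> algorithm id (rounded; 0 = XGBoost, 1 = LightGBM)
      x[1] -> learning rate, clipped to [0.01, 0.3]
      x[2] -> max depth, rounded and clipped to [3, 10]
      x[3] -> number of estimators, rounded and clipped to [50, 300]
    """
    algo = "XGBoost" if round(min(max(x[0], 0.0), 1.0)) == 0 else "LightGBM"
    lr = min(max(x[1], 0.01), 0.3)
    depth = int(round(min(max(x[2], 3), 10)))
    n_est = int(round(min(max(x[3], 50), 300)))
    return {"algorithm": algo, "learning_rate": lr,
            "max_depth": depth, "n_estimators": n_est}
```

Rounding on lookup keeps the swarm's continuous dynamics intact while every evaluated configuration stays feasible.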
3. Multi-Objective Optimization Strategy
A Multi-Objective Particle Swarm Optimization (MOPSO) metaheuristic is used for search:
- Each swarm particle encodes a candidate algorithm-and-hyperparameter setting.
- Position and velocity are updated per standard PSO rules, with categorical decisions treated by mapping or rounding within each iteration.
- The Pareto archive is updated at each iteration by nondominated sorting of particles' objective vectors.
- All objectives are normalized to $[0, 1]$ to facilitate balanced multi-objective optimization.
- No explicit constraints are imposed, as latency and model size are driven to optimality by inclusion as objectives (Yang et al., 11 Nov 2025).
The algorithm proceeds for a fixed number of generations or until the Pareto front stabilizes, resulting in a frontier of trade-off-optimal models.
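The per-iteration mechanics can be sketched as follows: a standard PSO position/velocity update plus a Pareto-archive update via pairwise dominance checks. The inertia and acceleration constants are illustrative, not the paper's settings:

```python
import random

def dominates(u, v):
    """u dominates v if u is at least as good in every (maximized)
    objective and strictly better in at least one."""
    return all(a >= b for a, b in zip(u, v)) and any(a > b for a, b in zip(u, v))

def update_archive(archive, candidate):
    """Keep the archive nondominated after inserting (config, objectives)."""
    _, obj = candidate
    if any(dominates(a_obj, obj) for _, a_obj in archive):
        return archive                              # candidate is dominated
    kept = [(c, o) for c, o in archive if not dominates(obj, o)]
    kept.append(candidate)
    return kept

def pso_step(pos, vel, pbest, gbest, w=0.7, c1=1.5, c2=1.5):
    """One standard PSO update; w, c1, c2 are illustrative constants."""
    new_vel = [w * v + c1 * random.random() * (pb - p)
                     + c2 * random.random() * (gb - p)
               for p, v, pb, gb in zip(pos, vel, pbest, gbest)]
    new_pos = [p + v for p, v in zip(pos, new_vel)]
    return new_pos, new_vel
```

The archive naturally discards configurations beaten on all three objectives while retaining genuine trade-offs, which is what yields the final frontier.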
4. Pipeline Integration and Workflow
OPCE-CASH typically operates as the concluding phase of a multi-stage AutoML pipeline:
- Automated data preprocessing (AutoDP),
- Feature selection (e.g., OIP-AutoFS with MOPSO),
- OPCE-CASH (joint algorithm and hyperparameter search under multi-objective optimization).
After feature selection, OPCE-CASH operates on the reduced dataset, exploring both algorithm and hyperparameter spaces to optimize detection, confidence, and latency. The result is a Pareto set rather than a single solution, allowing users to select models according to deployment constraints (e.g., prioritizing latency for edge devices versus F1 for centralized analysis). All evaluation metrics are computed via $K$-fold CV for robustness (Yang et al., 11 Nov 2025).
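Picking from the Pareto set under a deployment constraint can be as simple as the hypothetical helper below, which assumes the objective tuples $(f_1, \text{conf}, \text{eff})$ are already normalized to $[0, 1]$:

```python
def pick_for_deployment(pareto, min_efficiency=0.0):
    """From a Pareto set of (config, (f1, conf, eff)) pairs, return the
    highest-F1 entry whose efficiency score meets the deployment floor."""
    feasible = [p for p in pareto if p[1][2] >= min_efficiency]
    return max(feasible, key=lambda p: p[1][0]) if feasible else None
```

An edge deployment would raise `min_efficiency`; a centralized analysis pipeline would leave it at zero and take the most accurate frontier point.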
5. Constraints, Regularization, and Practicalities
No explicit hard constraints (such as strict memory caps) are enforced at the OPCE-CASH stage. Instead,
- Efficiency and model size are optimized directly,
- XGBoost includes regularization terms (e.g., $L_1$/$L_2$ penalties) to discourage large, overcomplex trees,
- LightGBM applies built-in GOSS/EFB heuristics for efficient tree construction.

Implementation details include swarm sizes typically in the range 20–50, 30–100 PSO iterations, and parallel evaluation of swarms using scikit-learn, XGBoost, and LightGBM (Yang et al., 11 Nov 2025).
6. Empirical Performance
In ablation experiments on cybersecurity datasets (CICIDS2017, IoTID20), OPCE-CASH alone yields high $F_1$-scores (e.g., 99.7% for XGBoost on CICIDS2017), low inference latency (0.002–0.0036 ms/sample), compact model sizes, and high average prediction confidence. These metrics are achieved for both XGBoost and LightGBM, validating the multi-objective design. When OPCE-CASH is combined with feature selection, further improvements in latency and model compactness are observed while maintaining Pareto-optimal solutions. The approach is effective for both edge and cloud applications, allowing flexible deployment based on trade-offs (Yang et al., 11 Nov 2025).
7. Connections to Bandit-Based and Single-Objective Approaches
OPCE-CASH extends prior CASH methods such as ER-UCB and MaxUCB, which model algorithm selection as a multi-armed bandit structured around maximizing extreme-region or maximum-reward statistics rather than average reward. ER-UCB instantiates a two-level procedure: bandit-based algorithm selection overlays independent HPO subroutines, with arms chosen according to a confidence-weighted mean-plus-variance index targeting the upper tail of performance (Hu et al., 2019). MaxUCB further adapts the strategy to bounded, light-tailed reward distributions for CASH, with bonuses tuned to the concentration properties of such distributions (Balef et al., 8 May 2025). OPCE-CASH's explicit handling of confidence and efficiency aligns with the emerging view that effective AutoML systems must optimize for deployment-specific properties—not just model accuracy but tunable trade-offs along the axes of reliability and resource use.
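For contrast with the multi-objective search above, a generic UCB-style arm index of the kind these bandit methods refine can be sketched as follows. This is plain UCB1 with an optional variance bonus to suggest the mean-plus-variance flavor; it is not the exact ER-UCB or MaxUCB formula:

```python
import math

def bandit_index(rewards, total_pulls, c=2.0, var_weight=0.0):
    """Mean + optional variance bonus + exploration term (UCB1-style).

    var_weight > 0 gives a crude mean-plus-variance index loosely
    reminiscent of extreme-region bandits; the published ER-UCB and
    MaxUCB indices differ in their exact form.
    """
    n = len(rewards)
    mean = sum(rewards) / n
    var = sum((r - mean) ** 2 for r in rewards) / n
    return mean + var_weight * var + c * math.sqrt(math.log(total_pulls) / n)
```

The exploration term shrinks as an arm (algorithm) accumulates pulls, so under-explored algorithms keep getting tried; OPCE-CASH instead spends its budget on a population-based search over the joint objective space.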
| Objective | Metric | Typical Value (CICIDS2017) |
|---|---|---|
| Effectiveness | $F_1$-score | up to 99.7% |
| Confidence | Avg. prediction probability | high |
| Efficiency | Inference time | $0.002$–$0.003$ ms/sample |
OPCE-CASH provides a template for next-generation AutoML and AutoAI pipelines in settings requiring careful trade-offs between predictive performance, confidence calibration, and efficient inference, as documented in the MOO-AutoML IDS study (Yang et al., 11 Nov 2025), with conceptual roots in cascaded and bandit-based approaches (Hu et al., 2019, Balef et al., 8 May 2025).