- The paper introduces a formal decomposition of strategic MSE into predictive loss, manipulability gain, and heterogeneity gap, clarifying how each component contributes to post-manipulation error.
- The study derives optimality gap bounds that guide the joint tuning of feature selection and ridge regularization, demonstrating near-optimal performance under specific manipulation cost conditions.
- Empirical results from synthetic benchmarks and a Medicare Advantage simulation show that joint strategic feature selection can achieve up to a 40% reduction in post-manipulation MSE.
Motivation and Problem Setting
Strategic manipulation of input features in high-stakes algorithmic decision-making, such as risk adjustment in healthcare payments, compromises predictive validity and incentivizes undesirable behaviors like upcoding. Prior work on adversarial robustness and strategic classification has focused on choosing predictors that perform well against strategic agents. Practical deployment is more constrained: redesigning prediction pipelines is often infeasible due to institutional inertia, so decision makers in real-world policy settings typically rely on two coarse levers, feature selection and ridge regularization, to mitigate strategic manipulation. This paper formally analyzes the efficacy of these levers, characterizing their interplay and the resulting performance gap relative to the strategic optimum.
Theoretical Contributions
Strategic MSE Decomposition
The paper develops a linear strategic learning framework with quadratic manipulation costs, modeling the agent's best response as $a^* = H^{-1}\theta$, where $\theta$ is the deployed coefficient vector and $H$ the manipulation cost matrix. The strategic MSE after manipulation is decomposed into three principal components: predictive loss from omitting features, strategic burden relieved by restricting the support, and a heterogeneity gap induced by cost structure:
- Predictive loss: $L_{\mathrm{pred}}(S)$ quantifies irrecoverable signal from dropped features.
- Manipulability gain: $-(\Gamma([d]) - \Gamma(S))$ measures the reduction in strategic vulnerability.
- Heterogeneity gap: $C_H(S)\,\delta_H(S)^2$ captures loss due to manipulation cost anisotropy among retained features.
This explicit decomposition rigorously demonstrates that feature selection based solely on manipulability is suboptimal. Optimal supports balance signal, manipulability, and cost geometry, a phenomenon illustrated in a detailed synthetic example.
Figure 1: The best subset balances predictive loss, manipulability gain, and heterogeneity gap—optimality is achieved by supports that exploit joint structure.
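As a concrete reading of the best-response model, the sketch below computes $a^* = H^{-1}\theta$ and the resulting post-manipulation MSE of a zero-intercept linear rule. It rests on assumptions the summary does not state: a diagonal cost matrix, zero-mean independent features, an outcome that manipulation leaves unchanged, and an agent utility of the form $\theta^\top(x + a) - \tfrac{1}{2} a^\top H a$. All function names are illustrative.

```python
def best_response(theta, h_diag):
    """Agent best response a* = H^{-1} theta for a diagonal cost matrix H."""
    return [t / h for t, h in zip(theta, h_diag)]


def strategic_shift(theta, h_diag):
    """Score inflation theta^T H^{-1} theta induced by the best response."""
    return sum(t * a for t, a in zip(theta, best_response(theta, h_diag)))


def strategic_mse(theta, theta_star, var_diag, h_diag, noise_var=0.0):
    """Population post-manipulation MSE of a zero-intercept rule theta.

    Assumed model (not taken from the paper): y = theta_star^T x + noise,
    x zero-mean with independent coordinates of variance var_diag, and
    manipulation shifts the features but leaves y unchanged.
    """
    fit = sum(v * (t - ts) ** 2
              for v, t, ts in zip(var_diag, theta, theta_star))
    return fit + noise_var + strategic_shift(theta, h_diag) ** 2


if __name__ == "__main__":
    theta_star = [1.0, 2.0]
    h_diag = [2.0, 4.0]          # feature 0 is cheaper to manipulate
    var_diag = [1.0, 1.0]
    print(best_response(theta_star, h_diag))
    print(strategic_mse(theta_star, theta_star, var_diag, h_diag))
```

Under these assumptions, deploying the true coefficients leaves no fitting error but still pays the squared shift, which is the quantity that support restriction and shrinkage trade against predictive loss.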
Optimality Gap Bounds
A lower bound on achievable strategic MSE is established, showing an irreducible gap between zero-intercept ridge estimators and the strategic optimum; this gap vanishes for high-cost manipulation directions (i.e., when $\theta^{*\top} H^{-1} \theta^*$ is small).
An upper bound shows conditions under which support-restricted ridge is near-optimal: when there exists a support with minimal predictive loss, sufficient reduction in strategic burden, and homogeneous manipulation costs ($\delta_H(S) \approx 0$). Scalar ridge achieves the restricted oracle exactly when manipulation costs are isotropic; generalized ridge is only needed when retained features have heterogeneous costs. This is formalized for both isotropic and two-level cost regimes.
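One way to see why scalar ridge can suffice under isotropic costs is through a first-order condition. The derivation below is a sketch under an assumed objective (zero-mean features with covariance $\Sigma$, noise variance $\sigma^2$, an outcome unaffected by manipulation, and a zero intercept); it is not the paper's stated theorem, and $\Sigma$, $\sigma^2$, $\lambda_S$, and $c$ are symbols introduced here for illustration.

```latex
% Assumed strategic MSE of a zero-intercept rule supported on S
\mathrm{MSE}(\theta_S)
  = (\theta - \theta^*)^\top \Sigma\, (\theta - \theta^*) + \sigma^2
  + \bigl(\theta_S^\top (H^{-1})_{SS}\, \theta_S\bigr)^2

% Setting the gradient with respect to \theta_S to zero
\bigl(\Sigma_{SS} + \lambda_S\, (H^{-1})_{SS}\bigr)\, \theta_S
  = (\Sigma\, \theta^*)_S,
\qquad
\lambda_S = 2\, \theta_S^\top (H^{-1})_{SS}\, \theta_S

% Isotropic costs on S, (H^{-1})_{SS} = c^{-1} I, give scalar ridge
\theta_S = \bigl(\Sigma_{SS} + \tfrac{\lambda_S}{c}\, I\bigr)^{-1}
           (\Sigma\, \theta^*)_S
```

In this sketch the penalty level $\lambda_S$ is defined implicitly by the solution itself. When $(H^{-1})_{SS}$ is not a multiple of the identity, the penalty matrix is no longer scalar, which matches the statement that generalized ridge is needed only for heterogeneous costs.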
Policy Implications: Design Principles for Strategic Robustness
Joint Tuning of Feature Selection and Regularization
Empirical and theoretical results invalidate the heuristic of separately ranking features by manipulability or predictive value. The optimal support and regularization level are interdependent; regularization alters preferred supports and vice versa.
Figure 2: Manipulable groups with homogeneous costs can be retained and regularized—joint tuning with ridge yields strategic robustness.
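The interdependence of support and regularization can be explored with a brute-force search over both at once. The sketch below is an illustration for small feature counts only, not the paper's algorithm; it assumes independent zero-mean features, diagonal costs, population (rather than sample) ridge, and the zero-intercept strategic MSE of fitting error plus squared score inflation. The function names and the example numbers are hypothetical.

```python
from itertools import combinations


def ridge_on_support(support, lam, theta_star, var_diag):
    """Population ridge coefficients restricted to `support`.

    With independent features (an assumption), ridge shrinks each retained
    coefficient separately; dropped features get coefficient zero.
    """
    return [v * ts / (v + lam) if j in support else 0.0
            for j, (v, ts) in enumerate(zip(var_diag, theta_star))]


def strategic_mse(theta, theta_star, var_diag, h_diag):
    """Fitting error plus squared score inflation theta^T H^{-1} theta."""
    fit = sum(v * (t - ts) ** 2
              for v, t, ts in zip(var_diag, theta, theta_star))
    shift = sum(t * t / h for t, h in zip(theta, h_diag))
    return fit + shift ** 2


def joint_search(theta_star, var_diag, h_diag, lam_grid):
    """Exhaustively search every support and every ridge level together."""
    d = len(theta_star)
    best = None
    for k in range(d + 1):
        for support in combinations(range(d), k):
            for lam in lam_grid:
                theta = ridge_on_support(support, lam, theta_star, var_diag)
                mse = strategic_mse(theta, theta_star, var_diag, h_diag)
                if best is None or mse < best[0]:
                    best = (mse, support, lam)
    return best


if __name__ == "__main__":
    theta_star = [1.0, 1.0, 0.5]
    var_diag = [1.0, 1.0, 1.0]
    h_diag = [0.5, 5.0, 5.0]                 # feature 0 is cheap to game
    lam_grid = [0.1 * i for i in range(51)]  # 0.0, 0.1, ..., 5.0
    print(joint_search(theta_star, var_diag, h_diag, lam_grid))
```

Because the search covers the full support and the empty support as special cases, its result is never worse than tuning the ridge level alone on all features. The number of supports grows as $2^d$, so this is practical only for a handful of features.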
Homogeneous Manipulable Groups and Proxy Features
Contrary to prevailing policy, groups of highly manipulable features with homogeneous manipulation costs may optimally be retained and aggressively regularized. When cost heterogeneity is pronounced, however, regularization cannot substitute for feature exclusion.
Less manipulable, correlated proxies can replace manipulable features without substantial predictive loss. As manipulability or feature correlation increases, supports optimally switch from direct to proxy features.
Figure 3: Strategic feature selection exploits proxy relationships—optimal supports transition to less manipulable substitutes as correlation increases.
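The switch from a direct feature to a proxy can be reproduced in a toy setting with two candidate features. The sketch below assumes a unit-variance target, unit-variance features, a direct feature equal to the target, a proxy with correlation `rho` to it, and a single-feature zero-intercept rule whose weight is chosen by grid search. The model, the names, and the cost values are illustrative assumptions, not taken from the paper.

```python
def best_single_feature_mse(cov_with_target, cost, t_grid):
    """Best strategic MSE using one unit-variance feature with weight t.

    Assumed toy model: the target has unit variance, the feature has
    covariance `cov_with_target` with it, and manipulation inflates the
    score by t^2 / cost without changing the target.
    """
    return min(1.0 - 2.0 * t * cov_with_target + t * t + (t * t / cost) ** 2
               for t in t_grid)


def preferred_feature(rho, direct_cost, proxy_cost, t_grid):
    """Return 'direct' or 'proxy', whichever attains lower strategic MSE."""
    direct = best_single_feature_mse(1.0, direct_cost, t_grid)
    proxy = best_single_feature_mse(rho, proxy_cost, t_grid)
    return "direct" if direct <= proxy else "proxy"


if __name__ == "__main__":
    t_grid = [i / 1000 for i in range(1501)]   # weights 0.000 ... 1.500
    for rho in (0.1, 0.5, 0.9, 0.95):
        # The direct feature is cheap to manipulate, the proxy expensive.
        print(rho, preferred_feature(rho, 0.5, 50.0, t_grid))
```

In this toy model a weakly correlated proxy loses too much signal to be worth using, while a strongly correlated one recovers most of the signal at a much smaller manipulation penalty.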
Interior Solutions and Retention of Intensely Coded Features
Empirical benchmarks in Medicare Advantage payment confirm the theory: blanket exclusion of diagnosis groups is not necessary for robustness. Retaining intensely coded, predictive Hierarchical Condition Categories (HCCs) under joint support-restricted ridge achieves lower post-manipulation MSE than both full-support models and heuristic exclusion.
Figure 4: Feature selection under optimal regularization retains predictive features—even intensely coded HCCs—without sacrificing strategic robustness.
Algorithmic Solutions
A computational pipeline for joint support and regularization selection is proposed: continuous-weight relaxation, rounding, and exact local refinement. Synthetic benchmarks demonstrate recovery of the exact oracle under combinatorial constraints, validating the algorithmic approach.
Figure 5: Weighted screen-and-refit achieves oracle strategic MSE across synthetic regimes—combining relaxation and local refinement is critical.
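The rounding and refinement stages of such a pipeline can be sketched as follows. The summary does not specify the relaxation or the refinement rule, so this example assumes the relaxed weights are already given, rounds them at a fixed threshold, and uses greedy single-flip local search as a stand-in for the paper's "exact local refinement". The function names, the threshold, and the toy objective are all hypothetical.

```python
def round_weights(weights, threshold=0.5):
    """Round relaxed weights in [0, 1] to a support (set of feature indices)."""
    return frozenset(j for j, w in enumerate(weights) if w >= threshold)


def local_refine(support, n_features, objective):
    """Greedy single-flip local search on the exact objective.

    Repeatedly applies the best add-or-drop move until no single flip
    lowers `objective(support)`; the result is a local optimum.
    Assumes n_features >= 1.
    """
    current = frozenset(support)
    current_val = objective(current)
    while True:
        candidates = [current ^ {j} for j in range(n_features)]
        best = min(candidates, key=objective)
        best_val = objective(best)
        if best_val >= current_val:
            return current, current_val
        current, current_val = best, best_val


if __name__ == "__main__":
    # Hypothetical relaxed weights; the relaxation step itself is not shown.
    weights = [0.9, 0.4, 0.6]
    target = frozenset({1, 2})

    def objective(s):
        # Toy stand-in for the exact strategic MSE of a support.
        return len(s ^ target)

    print(local_refine(round_weights(weights), 3, objective))
```

Single-flip search only guarantees a local optimum in general. In practice the objective would be the exact strategic MSE of each candidate support at its best regularization level.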
Case Study: Healthcare Payments
A simulation calibrated to real Medicare Advantage coding demonstrates the practicality of the framework and algorithm. Feature selection under optimal regularization achieves a 40% reduction in post-manipulation MSE relative to full-support ridge and outperforms prediction-only and cost-only baselines.
Figure 6: Retained features reflect joint signal and manipulability—not all intensely coded groups are dropped; the optimal selection balances these properties.
Robustness Under Cost Uncertainty
Regularization and feature selection inherently reduce exposure to uncertain manipulation directions. Unlike intercept correction (optimal only if costs are precisely known), these levers yield smaller worst-case MSE under cost misspecification. Support restriction and shrinkage provide additional protection against estimation error in manipulation costs.
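A one-feature toy comparison illustrates the contrast between intercept correction and shrinkage. The sketch assumes a unit-variance feature with true coefficient 1, no noise, a scalar manipulation cost known only to lie on a grid, and an intercept-corrected rule that subtracts the score inflation implied by an assumed cost. None of these specifics come from the paper, and the names are illustrative.

```python
def worst_case_mse(mse_given_cost, cost_grid):
    """Largest MSE over a grid of plausible manipulation costs."""
    return max(mse_given_cost(h) for h in cost_grid)


def intercept_corrected_mse(assumed_cost):
    """Unshrunk rule (t = 1) whose intercept subtracts the assumed shift."""
    return lambda true_cost: (1.0 / true_cost - 1.0 / assumed_cost) ** 2


def shrunken_mse(t):
    """Zero-intercept rule with weight t: fitting error plus squared shift."""
    return lambda true_cost: (t - 1.0) ** 2 + (t * t / true_cost) ** 2


def best_robust_shrinkage(cost_grid, t_grid):
    """Weight t minimizing the worst-case MSE of the zero-intercept rule."""
    return min(t_grid,
               key=lambda t: worst_case_mse(shrunken_mse(t), cost_grid))


if __name__ == "__main__":
    cost_grid = [0.25, 0.5, 1.0, 2.0, 4.0]   # plausible true costs
    t_grid = [i / 100 for i in range(101)]   # candidate weights 0.00 ... 1.00
    t = best_robust_shrinkage(cost_grid, t_grid)
    print("intercept-corrected worst case:",
          worst_case_mse(intercept_corrected_mse(1.0), cost_grid))
    print("shrunken worst case:",
          worst_case_mse(shrunken_mse(t), cost_grid), "at t =", t)
```

The intercept-corrected rule has zero error when the assumed cost is exact, but its error grows with the gap between assumed and true score inflation. How the two rules compare depends on how wide the plausible cost range is; with a narrow enough range the intercept correction would win.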
Practical and Theoretical Implications
The rigorous characterization reveals that robust strategic performance in algorithmic decision systems requires joint tuning of feature selection and regularization. Blanket exclusion policies, prevalent in resource allocation regimes, fail to exploit nuanced opportunities for balancing predictive utility and strategic exposure. The framework provides actionable guidance for policymakers and algorithm designers in regulated domains: structure-aware subset selection yields superior robustness and preserves signal.
Theoretically, the fine-grained decomposition offers new insights into manipulation-aware model selection, contrasting sharply with adversarial robustness paradigms focused on exogenous perturbations. The approach bridges institutional constraints with incentive-aware statistical learning.
Future Directions
Further research should extend the analysis to dynamic, multi-shot strategic interactions, explore robust optimization over cost uncertainty, and investigate grouped or per-feature regularization. Understanding endogenous deployment shifts and their impact on risk estimator tuning remains key for real-world deployments.
Conclusion
This work provides a principled, actionable framework for robust algorithmic design in strategic environments, establishing that optimal feature selection must be grounded in the joint structure of predictability and manipulability. The formal characterization and practical algorithm empower policymakers and researchers to advance incentive-aware machine learning beyond legacy heuristics.
References: "Strategic Feature Selection" (2606.18867).