
Optimal Data Reduction under Information-Theoretic Criteria (2508.16123v1)

Published 22 Aug 2025 in math.OC

Abstract: Selecting an optimal subset of features or instances under an information-theoretic criterion has become an effective preprocessing strategy for reducing data complexity while preserving essential information. This study investigates two representative problems within this paradigm: feature selection based on the maximum-relevance minimum-redundancy criterion, and instance selection grounded in the Kullback-Leibler divergence. To address the intrinsic nonconvexities of these problems, we develop polyhedral relaxations that yield exact mixed-integer linear programming formulations, thereby enabling globally optimal data reduction. By leveraging modern optimization techniques, we further design efficient algorithmic implementations capable of solving practically sized instances. Extensive numerical experiments on both real-world and synthetic datasets demonstrate that our method efficiently solves data reduction problems to global optimality, significantly outperforming existing benchmark approaches.
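
For readers unfamiliar with the two criteria named in the abstract, the following is a minimal sketch using standard textbook definitions: the Kullback-Leibler divergence between two discrete distributions, and the difference form of the maximum-relevance minimum-redundancy score. The abstract does not state the paper's exact formulations, so the function names, the input layout (a relevance list and a pairwise redundancy matrix of precomputed mutual information values), and the brute-force search are all assumptions for illustration. The exhaustive search only shows what a globally optimal subset means on a tiny instance; it is not the mixed-integer linear programming approach the paper develops.

```python
import math
from itertools import combinations


def kl_divergence(p, q):
    """D_KL(p || q) in nats for two discrete distributions of equal length."""
    total = 0.0
    for pi, qi in zip(p, q):
        if pi > 0.0:
            if qi <= 0.0:
                # p puts mass where q has none: divergence is infinite.
                return math.inf
            total += pi * math.log(pi / qi)
    return total


def mrmr_score(subset, relevance, redundancy):
    """Mean relevance minus mean pairwise redundancy over the subset.

    relevance[i] is assumed to hold I(feature_i; label) and
    redundancy[i][j] to hold I(feature_i; feature_j), both precomputed.
    """
    k = len(subset)
    rel = sum(relevance[i] for i in subset) / k
    red = sum(redundancy[i][j] for i in subset for j in subset) / (k * k)
    return rel - red


def best_subset(relevance, redundancy, k):
    """Exhaustively find the size-k subset with the highest mRMR score.

    Exponential in the number of features; usable only on toy inputs.
    """
    candidates = combinations(range(len(relevance)), k)
    return max(candidates, key=lambda s: mrmr_score(s, relevance, redundancy))


# Toy data (made up for illustration, not from the paper).
relevance = [0.8, 0.6, 0.1]
redundancy = [
    [1.0, 0.2, 0.0],
    [0.2, 1.0, 0.0],
    [0.0, 0.0, 1.0],
]
chosen = best_subset(relevance, redundancy, 2)
```

The subset score is a ratio-free difference of two averages, which is one common way to write the criterion; other variants (for example a quotient form) exist and may differ from what the paper optimizes.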
