
Minimizing Impurity Partition Under Constraints (1912.13141v1)

Published 31 Dec 2019 in cs.IT, cs.IR, eess.SP, and math.IT

Abstract: Set partitioning is a key component of many algorithms in machine learning, signal processing, and communications. In general, the problem of finding a partition that minimizes a given impurity (loss function) is NP-hard. As such, there is a wealth of literature on approximate algorithms and theoretical analyses of the partitioning problem under different settings. In this paper, we formulate and solve a variant of the partition problem called the minimum impurity partition under constraint (MIPUC). MIPUC finds an optimal partition that minimizes a given loss function under a given concave constraint. MIPUC generalizes the recently proposed deterministic information bottleneck problem, which finds an optimal partition that maximizes the mutual information between the input and the partition output while minimizing the partition output entropy. Our proposed algorithm is based on a novel optimality condition, which allows us to find a locally optimal solution efficiently. Moreover, we show that the optimal partition is a hard partition equivalent to cuts by hyperplanes in the probability space of the posterior probability, which in turn yields a polynomial-time algorithm for finding the globally optimal partition. Both theoretical and numerical results are provided to validate the proposed algorithm.
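To make the deterministic information bottleneck trade-off mentioned in the abstract concrete, here is a minimal sketch that scores hard partitions of a small discrete variable Y by exhaustive search. It is not the algorithm proposed in the paper: the penalized cost `beta * H(Z) - I(X;Z)`, the weight `beta`, the function names, and the toy joint distribution are all assumptions chosen for illustration, and brute force is only feasible for very small alphabets.

```python
import itertools

import numpy as np


def entropy(p):
    """Shannon entropy in bits, ignoring zero-probability entries."""
    p = np.asarray(p, dtype=float).ravel()
    p = p[p > 0]
    return float(-(p * np.log2(p)).sum())


def mutual_information(joint):
    """I(X;Z) in bits for a joint distribution with rows x and columns z."""
    joint = np.asarray(joint, dtype=float)
    return entropy(joint.sum(axis=1)) + entropy(joint.sum(axis=0)) - entropy(joint)


def partition_cost(p_xy, assignment, k, beta):
    """Hypothetical penalized cost beta * H(Z) - I(X;Z) of a hard partition.

    assignment[y] is the cell index z in {0, ..., k-1} assigned to symbol y.
    """
    p_xy = np.asarray(p_xy, dtype=float)
    p_xz = np.zeros((p_xy.shape[0], k))
    for y, z in enumerate(assignment):
        p_xz[:, z] += p_xy[:, y]  # merge the columns of Y that share a cell
    return beta * entropy(p_xz.sum(axis=0)) - mutual_information(p_xz)


def brute_force_partition(p_xy, k, beta):
    """Try every assignment of the symbols of Y to k cells (exponential in |Y|)."""
    n_y = np.asarray(p_xy).shape[1]
    return min(
        itertools.product(range(k), repeat=n_y),
        key=lambda a: partition_cost(p_xy, a, k, beta),
    )


# Toy joint distribution p(x, y): y0 and y1 share one posterior p(x|y),
# y2 and y3 share another.
p_xy = np.array([
    [0.20, 0.20, 0.05, 0.05],
    [0.05, 0.05, 0.20, 0.20],
])

informative = brute_force_partition(p_xy, k=2, beta=0.0)
compressed = brute_force_partition(p_xy, k=2, beta=10.0)
```

With `beta = 0` the search groups symbols that share a posterior, which is consistent with the abstract's description of optimal partitions as cuts in the posterior probability space. With a large `beta` the entropy penalty dominates and every symbol falls into a single cell.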

Citations (9)
