A submodular-supermodular procedure with applications to discriminative structure learning (1207.1404v1)

Published 4 Jul 2012 in cs.LG, cs.DS, and stat.ML

Abstract: In this paper, we present an algorithm for minimizing the difference between two submodular functions using a variational framework which is based on (an extension of) the concave-convex procedure [17]. Because several commonly used metrics in machine learning, like mutual information and conditional mutual information, are submodular, the problem of minimizing the difference of two submodular functions arises naturally in many machine learning applications. Two such applications are learning discriminatively structured graphical models and feature selection under computational complexity constraints. A commonly used metric for measuring discriminative capacity is the EAR measure, which is the difference between two conditional mutual information terms. Feature selection that takes complexity considerations into account also falls into this framework, because both the information that a set of features provides and the cost of computing and using the features can be modeled as submodular functions. This problem is NP-hard, and we give a polynomial time heuristic for it. We also present results on synthetic data to show that classifiers based on discriminative graphical models using this algorithm can significantly outperform classifiers based on generative graphical models.

Citations (169)

Summary

  • The paper introduces a novel submodular-supermodular procedure, a polynomial-time heuristic for minimizing the difference between two submodular functions, which is an NP-hard problem common in machine learning.
  • The algorithm iteratively finds local minima within a variational framework based on the concave-convex procedure, using permutation-derived modular approximations to build a tractable surrogate at each step.
  • Numerical results demonstrate that discriminatively structured graphical models learned with this procedure outperform generative models, showing potential in feature selection and learning under computational constraints.

A Submodular-Supermodular Procedure with Applications to Discriminative Structure Learning: An Academic Overview

The paper by Mukund Narasimhan and Jeff Bilmes introduces an algorithm for minimizing the difference between two submodular functions, leveraging a variational framework based on the concave-convex procedure. The authors formalize this problem, which arises frequently in machine learning applications such as learning discriminatively structured graphical models and feature selection under computational constraints.
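
Stated formally (with notation introduced here for exposition; the overview itself does not fix symbols), the core problem and the defining diminishing-returns property are:

```latex
% Core problem: minimize the difference of two submodular set functions
% defined on a finite ground set V.
\min_{X \subseteq V} \; f(X) - g(X)

% Submodularity (diminishing returns): for all A \subseteq B \subseteq V
% and all v \in V \setminus B,
f(A \cup \{v\}) - f(A) \;\ge\; f(B \cup \{v\}) - f(B)
```

The EAR measure cited in the abstract fits this template, since it is a difference of (conditional) mutual information terms, each of which is submodular in the selected variable set.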

Key Concepts and Algorithmic Development

  • Submodular Functions: The paper underscores the central role of submodular functions in machine learning, where common quantities such as mutual information and conditional mutual information are submodular. These functions capture a diminishing-returns property that makes them well suited to structure learning and feature selection.
  • Submodular-Supermodular Optimization: The research addresses minimizing a function expressed as the difference between two submodular functions, and provides a polynomial-time heuristic for this NP-hard problem. The methodology extends the concave-convex procedure with variational techniques, accommodating computational-complexity constraints alongside discriminative capacity.
  • Algorithm Design: The authors propose a submodular-supermodular procedure that iteratively descends to a local minimum of the difference objective. At each iteration, a permutation of the ground set yields a modular approximation of one of the submodular functions that is tight at the current solution; minimizing the resulting surrogate, which is itself submodular, produces the next iterate (see the sketch after this list).
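
The following is a minimal sketch of this procedure, assuming the standard permutation-based modular lower bound. The names (`ssp`, `modular_lower_bound`) and the brute-force inner minimization are illustrative stand-ins, not the authors' implementation; a real implementation would use a polynomial-time submodular minimizer for the inner step.

```python
from itertools import chain, combinations

def modular_lower_bound(g, perm):
    """Permutation-based modular lower bound of a submodular function g.

    For the chain S_i = {perm[0], ..., perm[i-1]}, each element is assigned
    its marginal gain g(S_i) - g(S_{i-1}). The induced modular function
    h(X) = sum_{v in X} weights[v] satisfies h <= g, with equality on
    every prefix S_i of the permutation.
    """
    weights, prefix, prev = {}, set(), g(frozenset())
    for v in perm:
        prefix.add(v)
        cur = g(frozenset(prefix))
        weights[v] = cur - prev
        prev = cur
    return weights

def ssp(f, g, V, max_iters=100):
    """Locally minimize f(X) - g(X) for submodular f and g on ground set V."""
    X = set()
    all_subsets = [frozenset(s) for s in
                   chain.from_iterable(combinations(V, r) for r in range(len(V) + 1))]
    for _ in range(max_iters):
        # Pick a permutation in which the current set X is a prefix, so the
        # modular bound h is tight at X: h(X) = g(X).
        perm = sorted(V, key=lambda v: v not in X)
        h = modular_lower_bound(g, perm)
        # Minimize the submodular surrogate f(Y) - h(Y). Brute force here
        # stands in for exact polynomial-time submodular minimization.
        Y = min(all_subsets, key=lambda S: f(S) - sum(h[v] for v in S))
        if f(Y) - g(Y) >= f(frozenset(X)) - g(frozenset(X)):
            break  # no strict improvement: a local minimum has been reached
        X = set(Y)
    return X
```

Because h lower-bounds g and is tight at the current iterate, each accepted step cannot increase f - g, so the objective is monotonically non-increasing. A toy instantiation (purely illustrative; coverage and a concave cost stand in for the paper's mutual-information and feature-cost terms):

```python
import math

coverage_sets = {1: {"a", "b"}, 2: {"b", "c"}, 3: {"d"}}

def g_info(S):   # coverage: a submodular "information" stand-in
    return len(set().union(*(coverage_sets[v] for v in S))) if S else 0

def f_cost(S):   # concave in |S|, hence a submodular cost stand-in
    return 2.0 * math.sqrt(len(S))

print(ssp(f_cost, g_info, V=[1, 2, 3]))  # a local minimum of f_cost - g_info
```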

Strong Numerical Results

The paper reports simulation results on synthetic datasets showing that classifiers based on discriminatively structured graphical models learned with the proposed algorithm significantly outperform classifiers based on generative graphical models. The gains are most pronounced when the discriminative structure in the data is not adequately captured by generative learning criteria.

Implications and Future Directions

The authors explore implications for feature selection and for learning discriminative graphical models, settings in which submodularity enables principled trade-offs between computational cost and information gain. They also point to open directions concerning performance bounds, convergence rates, and special classes such as symmetric submodular functions.
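
As one hedged illustration of this trade-off (a formulation consistent with the abstract rather than a formula quoted from the paper), feature selection with a submodular cost c and a submodular information measure can be posed directly in the difference-of-submodular form:

```latex
% S ranges over candidate feature subsets of the ground set V;
% X_S are the selected features and C is the class variable.
% Both terms are modeled as submodular, so the objective is a
% difference of two submodular functions, as required by the procedure.
\min_{S \subseteq V} \; c(S) - I(X_S; C)
```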

Conclusion

The paper provides a foundational treatment of submodular-supermodular optimization within machine learning, presenting a methodology that extends existing concave-convex frameworks. While the algorithm shows promise for improving discriminative learning outcomes, validation on real-world datasets and against refined models would further substantiate its efficacy across tasks. The approach sets a precedent for future developments in AI where structure learning and feature selection under computational constraints are of central concern.