Papers
Topics
Authors
Recent
Search
2000 character limit reached

Learning Hierarchical Interactions at Scale: A Convex Optimization Approach

Published 5 Feb 2019 in stat.ML, cs.LG, math.OC, and stat.CO | (1902.01542v5)

Abstract: In many learning settings, it is beneficial to augment the main features with pairwise interactions. Such interaction models can be often enhanced by performing variable selection under the so-called strong hierarchy constraint: an interaction is non-zero only if its associated main features are non-zero. Existing convex optimization based algorithms face difficulties in handling problems where the number of main features $p \sim 103$ (with total number of features $\sim p2$). In this paper, we study a convex relaxation which enforces strong hierarchy and develop a highly scalable algorithm based on proximal gradient descent. We introduce novel screening rules that allow for solving the complicated proximal problem in parallel. In addition, we introduce a specialized active-set strategy with gradient screening for avoiding costly gradient computations. The framework can handle problems having dense design matrices, with $p = 50,000$ ($\sim 109$ interactions)---instances that are much larger than current state of the art. Experiments on real and synthetic data suggest that our toolkit hierScale outperforms the state of the art in terms of prediction and variable selection and can achieve over a 4900x speed-up.

Citations (18)

Summary

No one has generated a summary of this paper yet.

Paper to Video (Beta)

No one has generated a video about this paper yet.

Whiteboard

No one has generated a whiteboard explanation for this paper yet.

Open Problems

We haven't generated a list of open problems mentioned in this paper yet.

Continue Learning

We haven't generated follow-up questions for this paper yet.

Collections

Sign up for free to add this paper to one or more collections.