Lexicographic and Depth-Sensitive Margins in Homogeneous and Non-Homogeneous Deep Models (1905.07325v1)
Abstract: With an eye toward understanding complexity control in deep learning, we study how infinitesimal regularization or gradient descent optimization leads to margin-maximizing solutions in both homogeneous and non-homogeneous models, extending previous work that considered infinitesimal regularization only in homogeneous models. To this end, we study the limit of loss minimization with a diverging norm constraint (the "constrained path"), relate it to the limit of a "margin path," and characterize the resulting solution. For non-homogeneous ensemble models, whose output is a sum of homogeneous sub-models, we show that this solution discards the shallowest sub-models if they are unnecessary. For homogeneous models, we show convergence to a "lexicographic max-margin solution," and provide conditions under which max-margin solutions are also attained as the limit of unconstrained gradient descent.
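The two paths contrasted in the abstract can be sketched as follows; the notation here (training loss $\mathcal{L}$, model output $f(w; x_n)$ with label $y_n$, norm bound $B$) is an illustrative assumption and may differ from the paper's exact definitions.

```latex
% Constrained path: minimize the loss under a norm bound, then let the bound diverge
w(B) \in \arg\min_{\|w\| \le B} \mathcal{L}(w), \qquad B \to \infty
% Margin path: directly maximize the minimum margin under the same norm bound
\tilde{w}(B) \in \arg\max_{\|w\| \le B} \; \min_{n} \; y_n f(w; x_n)
```

Relating the limits of $w(B)/\|w(B)\|$ and $\tilde{w}(B)/\|\tilde{w}(B)\|$ as $B \to \infty$ is, roughly, the sense in which loss minimization along the constrained path is connected to margin maximization.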
- Mor Shpigel Nacson (10 papers)
- Suriya Gunasekar (34 papers)
- Jason D. Lee (151 papers)
- Nathan Srebro (145 papers)
- Daniel Soudry (76 papers)