Conditional Gradients for the Approximate Vanishing Ideal (2202.03349v16)
Abstract: The vanishing ideal of a set of points $X\subseteq \mathbb{R}^n$ is the set of polynomials that evaluate to $0$ over all points $\mathbf{x} \in X$ and admits an efficient representation by a finite set of polynomials called generators. To accommodate noise in the data set, we introduce the pairwise conditional gradients approximate vanishing ideal algorithm (PCGAVI), which constructs a set of generators of the approximate vanishing ideal. The constructed generators capture polynomial structures in data and give rise to a feature map that can, for example, be used in combination with a linear classifier for supervised learning. In PCGAVI, we construct the set of generators by solving constrained convex optimization problems with the pairwise conditional gradients algorithm. Thus, PCGAVI constructs not only few but also sparse generators, making the corresponding feature transformation robust and compact. Furthermore, we derive several learning guarantees for PCGAVI that make the algorithm theoretically better motivated than related generator-constructing methods.
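The abstract's core mechanism — solving a constrained convex problem with a conditional gradient (Frank–Wolfe) method whose iterates are sparse convex combinations of vertices — can be illustrated with a minimal sketch. This is *not* the paper's PCGAVI (which uses the pairwise variant on generator-construction subproblems); it is vanilla Frank–Wolfe with exact line search for a least-squares objective over an $\ell_1$-ball, a standard setting where the linear minimization oracle returns signed coordinate vertices and therefore induces sparsity. All names and the problem instance below are illustrative assumptions.

```python
import numpy as np

def frank_wolfe_l1(A, b, tau=1.0, steps=100):
    """Vanilla Frank-Wolfe (conditional gradient) with exact line search
    for min_x ||Ax - b||^2 subject to ||x||_1 <= tau.
    Illustrative only -- not the paper's pairwise variant (PCGAVI)."""
    n = A.shape[1]
    x = np.zeros(n)
    for _ in range(steps):
        grad = 2.0 * A.T @ (A @ x - b)
        # Linear minimization oracle over the l1-ball: the minimizer of a
        # linear function over the ball is a signed vertex tau * (+/- e_i),
        # so each iterate is a sparse convex combination of vertices.
        i = int(np.argmax(np.abs(grad)))
        v = np.zeros(n)
        v[i] = -tau * np.sign(grad[i])
        d = v - x
        Ad = A @ d
        denom = Ad @ Ad
        if denom < 1e-18:  # no descent direction left
            break
        # Exact line search for the quadratic objective, clipped to [0, 1]
        # so the iterate stays inside the feasible l1-ball.
        gamma = float(np.clip(-(Ad @ (A @ x - b)) / denom, 0.0, 1.0))
        x = x + gamma * d
    return x

# Usage: recover a sparse vector from noiseless linear measurements.
rng = np.random.default_rng(0)
A = rng.standard_normal((50, 10))
x_true = np.zeros(10)
x_true[0] = 0.5
b = A @ x_true
x = frank_wolfe_l1(A, b, tau=1.0)
```

The projection-free structure is the point: feasibility is maintained by convex combinations with vertices rather than projections, and because each vertex touches a single coordinate, the solution inherits sparsity — the property the abstract invokes when it says PCGAVI yields "few but also sparse" generators.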