- The paper introduces DC-SVM, a novel framework using a divide-and-conquer approach with kernel k-means clustering to efficiently solve large-scale kernel SVM optimization problems.
- Empirical results show that DC-SVM finds exact solutions for large datasets up to seven times faster than LIBSVM, and that its early prediction variant reaches high accuracy more than 100 times faster.
- DC-SVM offers a scalable solution for kernel SVMs, with potential applications and future research avenues in other complex models, distributed learning, and more advanced clustering methods.
A Divide-and-Conquer Approach for Efficient Kernel SVM Optimization
This paper introduces a novel algorithmic framework, the Divide-and-Conquer Solver for Kernel Support Vector Machines (DC-SVM), aimed at the computational bottlenecks of large-scale kernel SVMs, particularly on datasets containing millions of samples. Kernel SVMs are powerful classification tools, but they scale poorly because of intensive computational demands and the substantial memory needed to store dense kernel matrices.
Methodology
DC-SVM employs a divide-and-conquer strategy, partitioning the kernel SVM optimization problem into smaller, manageable subproblems via data clustering. The clustering is performed with a kernel k-means algorithm that minimizes within-cluster variance in feature space and is designed so that support vectors identified in the subproblems are likely to be support vectors of the global problem. This hierarchical clustering accelerates computation by reducing individual subproblems to sizes at which traditional solvers such as LIBSVM can operate effectively without hitting resource constraints.
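To make the clustering step concrete, here is a minimal kernel k-means sketch in plain NumPy. The RBF kernel, the `gamma` value, and the function names are illustrative assumptions, not the paper's implementation:

```python
import numpy as np

def rbf_kernel(X, gamma=0.1):
    # Pairwise squared distances -> RBF Gram matrix (illustrative kernel choice)
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * X @ X.T
    return np.exp(-gamma * d2)

def kernel_kmeans(K, k, n_iter=20, seed=0):
    """Cluster n points, given their Gram matrix K, into k groups by
    minimizing within-cluster variance in feature space."""
    rng = np.random.default_rng(seed)
    n = K.shape[0]
    labels = rng.integers(0, k, size=n)
    diag = np.diag(K)
    for _ in range(n_iter):
        dist = np.full((n, k), np.inf)  # empty clusters keep +inf distance
        for c in range(k):
            mask = labels == c
            m = mask.sum()
            if m == 0:
                continue
            # ||phi(x) - mu_c||^2 = K(x,x) - (2/m) sum_j K(x,j)
            #                       + (1/m^2) sum_{j,j'} K(j,j')
            dist[:, c] = (diag
                          - 2.0 * K[:, mask].sum(axis=1) / m
                          + K[np.ix_(mask, mask)].sum() / m ** 2)
        new_labels = dist.argmin(axis=1)
        if np.array_equal(new_labels, labels):
            break
        labels = new_labels
    return labels
```

For the dataset sizes the paper targets, materializing the full n-by-n Gram matrix is infeasible, so in practice one would run such a routine on a subsample and assign the remaining points to the nearest resulting centroid.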
In the "divide" phase, the data is clustered into smaller subgroups, with subproblems solved independently. The "conquer" phase involves aggregating local solutions to initialize a global coordinate descent solver that achieves convergence rapidly, theoretically due to the closeness of the subproblem solutions to the global solution.
Computational Results
Empirical results demonstrate significant performance gains. For example, on datasets containing half a million samples, DC-SVM finds an exact SVM solution (within a 10⁻⁶ relative error tolerance) seven times faster than leading solvers such as LIBSVM. With an early prediction mechanism, DC-SVM accelerates further, reaching approximately 96% accuracy in only 12 minutes, more than 100 times faster than LIBSVM.
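One way to read the early-prediction idea is to route each test point to its nearest cluster in feature space and answer with that cluster's local model alone, skipping the global solve entirely. The sketch below follows that reading; the helper names and the RBF choice are assumptions, and `models` is a dict of per-cluster classifiers such as the one returned by the previous sketch:

```python
import numpy as np

def rbf_cross(A, B, gamma=0.1):
    # RBF kernel between the rows of A and the rows of B
    d2 = (np.sum(A ** 2, axis=1)[:, None]
          + np.sum(B ** 2, axis=1)[None, :]
          - 2.0 * A @ B.T)
    return np.exp(-gamma * d2)

def early_predict(Xte, Xtr, labels, models, gamma=0.1):
    """Route each test point to the cluster whose feature-space centroid
    is nearest, then predict with that cluster's local SVM only."""
    Kx = rbf_cross(Xte, Xtr, gamma)
    Ktr = rbf_cross(Xtr, Xtr, gamma)
    clusters = np.unique(labels)
    dist = np.empty((Xte.shape[0], len(clusters)))
    for i, c in enumerate(clusters):
        mask = labels == c
        m = mask.sum()
        # ||phi(x) - mu_c||^2 up to the K(x,x) term, constant across clusters
        dist[:, i] = (-2.0 * Kx[:, mask].sum(axis=1) / m
                      + Ktr[np.ix_(mask, mask)].sum() / m ** 2)
    nearest = dist.argmin(axis=1)
    preds = np.empty(Xte.shape[0], dtype=int)  # assumes integer class labels
    for i, c in enumerate(clusters):
        sel = nearest == i
        if sel.any():
            preds[sel] = models[c].predict(Xte[sel])
    return preds
```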
Implications and Future Directions
The implications of DC-SVM are substantial: it provides a scalable approach to kernel SVM optimization that balances computational efficiency with predictive accuracy. This work sets the stage for further advances in scalable methods suited to large datasets. The algorithm's adaptability suggests extensions to other machine learning models burdened by similar computational and memory constraints, and the divide-and-conquer framework may inspire developments in distributed learning and parallel computing environments.
Future research could explore adaptive clustering strategies driven by real-time feedback on solver performance, or extend the approach to non-stationary datasets typical of real-world applications, where the data distribution changes over time. Another avenue is integrating DC-SVM into ensembles with other machine learning techniques to improve performance on tasks such as feature selection or dimensionality reduction.
In summary, this paper makes a compelling case for DC-SVM as an effective strategy for large-scale kernel SVM optimization, combining theoretical rigor with robust empirical results. Future work in AI and machine learning could substantially build upon this framework to tackle scalability challenges inherent in other complex models and data structures.