Papers
Topics
Authors
Recent
Gemini 2.5 Flash
Gemini 2.5 Flash
153 tokens/sec
GPT-4o
7 tokens/sec
Gemini 2.5 Pro Pro
45 tokens/sec
o3 Pro
4 tokens/sec
GPT-4.1 Pro
38 tokens/sec
DeepSeek R1 via Azure Pro
28 tokens/sec
2000 character limit reached

A MapReduce based distributed SVM algorithm for binary classification (1312.4108v1)

Published 15 Dec 2013 in cs.LG and cs.DC

Abstract: Although Support Vector Machine (SVM) algorithm has a high generalization property to classify for unseen examples after training phase and it has small loss value, the algorithm is not suitable for real-life classification and regression problems. SVMs cannot solve hundreds of thousands examples in training dataset. In previous studies on distributed machine learning algorithms, SVM is trained over a costly and preconfigured computer environment. In this research, we present a MapReduce based distributed parallel SVM training algorithm for binary classification problems. This work shows how to distribute optimization problem over cloud computing systems with MapReduce technique. In the second step of this work, we used statistical learning theory to find the predictive hypothesis that minimize our empirical risks from hypothesis spaces that created with reduce function of MapReduce. The results of this research are important for training of big datasets for SVM algorithm based classification problems. We provided that iterative training of split dataset with MapReduce technique; accuracy of the classifier function will converge to global optimal classifier function's accuracy in finite iteration size. The algorithm performance was measured on samples from letter recognition and pen-based recognition of handwritten digits dataset.

Citations (26)

Summary

We haven't generated a summary for this paper yet.