A Linear Time Active Learning Algorithm for Link Classification -- Full Version -- (1301.4767v2)
Published 21 Jan 2013 in cs.LG, cs.SI, and stat.ML
Abstract: We present very efficient active learning algorithms for link classification in signed networks. Our algorithms are motivated by a stochastic model in which edge labels are obtained through perturbations of an initial sign assignment consistent with a two-clustering of the nodes. We provide a theoretical analysis within this model, showing that we can achieve an optimal (to within a constant factor) number of mistakes on any graph G = (V,E) such that |E| = \Omega(|V|^{3/2}) by querying O(|V|^{3/2}) edge labels. More generally, we show an algorithm that achieves optimality to within a factor of O(k) by querying at most order of |V| + (|V|/k)^{3/2} edge labels. The running time of this algorithm is at most of order |E| + |V|\log|V|.
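The label model described in the abstract can be illustrated with a minimal sketch. It is not the paper's algorithm: it only generates edge signs consistent with a two-clustering and then perturbs them. The abstract does not say how the perturbation works, so the sketch assumes each sign is flipped independently with probability p. The function names (`two_cluster_signs`, `perturb`, `query_budget`) are hypothetical, and `query_budget` evaluates the |V| + (|V|/k)^{3/2} expression with every hidden constant set to 1.

```python
import random


def two_cluster_signs(edges, cluster):
    """Sign each edge +1 if its endpoints share a cluster, -1 otherwise."""
    return {(u, v): 1 if cluster[u] == cluster[v] else -1 for (u, v) in edges}


def perturb(signs, p, rng):
    """Flip each sign independently with probability p.

    Assumption: independent flips. The abstract only says the labels are
    "perturbations" of the initial assignment.
    """
    return {e: -s if rng.random() < p else s for e, s in signs.items()}


def query_budget(n, k):
    """Evaluate n + (n / k)^(3/2), ignoring the constants hidden by 'order of'."""
    return n + (n / k) ** 1.5


if __name__ == "__main__":
    edges = [(0, 1), (1, 2), (0, 2), (2, 3)]
    cluster = {0: 0, 1: 0, 2: 1, 3: 1}  # a two-clustering of the nodes
    clean = two_cluster_signs(edges, cluster)
    noisy = perturb(clean, p=0.1, rng=random.Random(0))
    flipped = sum(1 for e in edges if clean[e] != noisy[e])
    print(f"{flipped} of {len(edges)} signs flipped")
    print("budget for |V| = 100, k = 4:", query_budget(100, 4))
```

An active learner in this setting sees only the graph, chooses which edge labels to query, and predicts the rest. Under this model, the mistakes it is judged on come from the flipped signs.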