Simultaneous Feature and Expert Selection within Mixture of Experts (1405.7624v1)
Abstract: A useful strategy to deal with complex classification scenarios is the "divide and conquer" approach. The mixture of experts (MoE) technique makes use of this strategy by jointly training a set of classifiers, or experts, that are specialized in different regions of the input space. A global model, or gate function, complements the experts by learning a function that weights their relevance in different parts of the input space. Local feature selection is an attractive way to improve the specialization of the experts and the gate function, particularly for high-dimensional data. Our main intuition is that particular subsets of dimensions, or subspaces, are usually more appropriate for classifying instances located in different regions of the input space. Accordingly, this work contributes a regularized variant of MoE that incorporates an embedded process for local feature selection using $L_1$ regularization, together with simultaneous expert selection. The experiments are still pending.
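To make the abstract's proposal concrete, the following is a minimal sketch of the kind of regularized objective it describes, assuming a softmax gate with parameters $\nu_k$ and experts with parameters $\theta_k$; the symbols $\lambda_e$, $\lambda_g$, and the exact form of the penalty terms are illustrative assumptions, not taken from the paper:
$$
\max_{\theta,\,\nu}\;
\sum_{n=1}^{N} \log \sum_{k=1}^{K} g_k(x_n;\nu)\, P\!\left(y_n \mid x_n, \theta_k\right)
\;-\; \lambda_e \sum_{k=1}^{K} \lVert \theta_k \rVert_1
\;-\; \lambda_g \sum_{k=1}^{K} \lVert \nu_k \rVert_1 ,
\qquad
g_k(x;\nu) = \frac{\exp(\nu_k^{\top} x)}{\sum_{j=1}^{K} \exp(\nu_j^{\top} x)} .
$$
Under this reading, the $L_1$ penalty on each expert's parameters $\theta_k$ zeroes out individual input dimensions per expert (local feature selection), while the $L_1$ penalty on the gate parameters $\nu_k$ can drive an entire expert's gating weights to zero, effectively pruning that expert (expert selection).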