
Concrete Autoencoders for Differentiable Feature Selection and Reconstruction (1901.09346v2)

Published 27 Jan 2019 in cs.LG and stat.ML

Abstract: We introduce the concrete autoencoder, an end-to-end differentiable method for global feature selection, which efficiently identifies a subset of the most informative features and simultaneously learns a neural network to reconstruct the input data from the selected features. Our method is unsupervised, and is based on using a concrete selector layer as the encoder and using a standard neural network as the decoder. During the training phase, the temperature of the concrete selector layer is gradually decreased, which encourages a user-specified number of discrete features to be learned. During test time, the selected features can be used with the decoder network to reconstruct the remaining input features. We evaluate concrete autoencoders on a variety of datasets, where they significantly outperform state-of-the-art methods for feature selection and data reconstruction. In particular, on a large-scale gene expression dataset, the concrete autoencoder selects a small subset of genes whose expression levels can be used to impute the expression levels of the remaining genes. In doing so, it improves on the current widely-used expert-curated L1000 landmark genes, potentially reducing measurement costs by 20%. The concrete autoencoder can be implemented by adding just a few lines of code to a standard autoencoder.

Citations (213)

Summary

  • The paper introduces a differentiable feature selection method using a concrete selector layer integrated within an autoencoder for effective data reconstruction.
  • It demonstrates significant improvements on datasets such as MNIST and large-scale gene expression data, potentially reducing measurement costs by 20% while maintaining high imputation fidelity.
  • The method outperforms traditional approaches by enabling end-to-end, gradient-based optimization for scalable and accurate feature selection.

Overview of Concrete Autoencoders for Differentiable Feature Selection and Reconstruction

The paper "Concrete Autoencoders for Differentiable Feature Selection and Reconstruction" presents a novel approach for unsupervised feature selection using concrete autoencoders, which enable the simultaneous selection of the most informative features and reconstruction of the original data from these features. The proposed method leverages the Concrete distribution, a continuous relaxation of discrete random variables that allows for differentiation and gradient-based optimization through backpropagation.

Methodology

The core innovation is the integration of a concrete selector layer into an autoencoder architecture trained end to end. The selector layer uses the Concrete distribution to stochastically select a subset of input features during training, and the distribution's temperature is gradually annealed so that the selection becomes discrete by the end of training. Unlike traditional feature selection methods, the entire pipeline is differentiable, so the selected features can be optimized directly for the downstream reconstruction task.
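
A minimal PyTorch sketch of such a selector layer is shown below; the class name, tensor shapes, and the fallback to hard arg-max selection at evaluation time are illustrative assumptions rather than the authors' released code:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ConcreteSelector(nn.Module):
    """Selects k of d_in input features via a Gumbel-softmax (Concrete) relaxation."""

    def __init__(self, d_in, k):
        super().__init__()
        # One logit vector per selected feature; small random init breaks symmetry.
        self.logits = nn.Parameter(0.01 * torch.randn(k, d_in))

    def forward(self, x, temperature):
        if self.training:
            # Sample Gumbel(0, 1) noise and take a tempered softmax:
            # each row of `weights` approaches a one-hot vector as temperature -> 0.
            u = torch.rand_like(self.logits)
            gumbel = -torch.log(-torch.log(u + 1e-20) + 1e-20)
            weights = F.softmax((self.logits + gumbel) / temperature, dim=-1)
        else:
            # At test time, commit to the arg-max (discrete) feature per row.
            weights = F.one_hot(self.logits.argmax(dim=-1),
                                self.logits.size(-1)).float()
        return x @ weights.T  # (batch, k): the k (softly) selected features
```

The full concrete autoencoder then feeds this (batch, k) output into an ordinary decoder network trained with a reconstruction loss, while the temperature passed to the layer is annealed toward zero over the course of training.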

Numerical Results and Implications

The paper provides robust empirical evidence for the efficacy of the concrete autoencoder across a range of datasets, including MNIST and gene expression data. On MNIST, the concrete autoencoder selects a small subset of pixels that maintains high reconstruction fidelity, demonstrating its potential for image data. More notably, on a large-scale gene expression dataset, the method improves upon the widely used, expert-curated L1000 landmark genes, potentially reducing measurement costs by 20% and suggesting practical savings in gene expression profiling ahead of downstream analysis.

Comparison with Existing Methods

Concrete autoencoders outperform other unsupervised feature selection approaches such as UDFS (Unsupervised Discriminative Feature Selection), MCFS (Multi-Cluster Feature Selection), and Principal Feature Analysis (PFA). Across various benchmark datasets, they achieve lower reconstruction error and competitive classification accuracy using the selected features.

Future Directions

The use of concrete autoencoders opens several avenues for theoretical and practical advancement in machine learning, particularly for efficient data handling in high-dimensional spaces. Future work could extend the framework to supervised feature selection, or adapt it to domains where features carry different acquisition costs or constraints. Incorporating domain-specific knowledge or customizing the temperature annealing schedule could further tailor feature selection to specific applications.
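
As one example of schedule customization, the temperature is typically decayed exponentially from a high starting value to a near-zero final value; a minimal sketch follows (the start and end temperatures here are illustrative, not the paper's exact settings):

```python
def concrete_temperature(step, total_steps, t_start=10.0, t_end=0.01):
    """Exponentially decay the selector temperature from t_start to t_end."""
    return t_start * (t_end / t_start) ** (step / total_steps)
```

A slower decay keeps the selection stochastic for longer (more exploration over candidate features), while a faster decay commits to discrete features earlier in training.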

Conclusion

This paper's introduction of concrete autoencoders represents a significant step forward in feature selection, providing a flexible, scalable method that integrates seamlessly with existing neural network frameworks while offering considerable gains in efficiency and performance. The implications are substantial: reduced measurement and storage costs without compromising data quality or downstream analytical performance.