
Dataset Meta-Learning from Kernel Ridge-Regression

Published 30 Oct 2020 in cs.LG and stat.ML (arXiv:2011.00050v3)

Abstract: One of the most fundamental aspects of any machine learning algorithm is the training data used by the algorithm. We introduce the novel concept of $\epsilon$-approximation of datasets, obtaining datasets which are much smaller than or are significant corruptions of the original training data while maintaining similar model performance. We introduce a meta-learning algorithm called Kernel Inducing Points (KIP) for obtaining such remarkable datasets, inspired by the recent developments in the correspondence between infinitely-wide neural networks and kernel ridge-regression (KRR). For KRR tasks, we demonstrate that KIP can compress datasets by one or two orders of magnitude, significantly improving previous dataset distillation and subset selection methods while obtaining state of the art results for MNIST and CIFAR-10 classification. Furthermore, our KIP-learned datasets are transferable to the training of finite-width neural networks even beyond the lazy-training regime, which leads to state of the art results for neural network dataset distillation with potential applications to privacy-preservation.

Citations (207)

Summary

  • The paper's primary contribution is the KIP meta-learning algorithm, which compresses datasets significantly while preserving performance in kernel ridge-regression tasks.
  • KIP achieves test accuracies of 99.3% on MNIST and 66.3% on CIFAR-10 while reducing dataset size by one to two orders of magnitude.
  • The method also shows potential for privacy-preserving data generation, maintaining robust model performance even under high levels of pixel corruption.

Dataset Meta-Learning from Kernel Ridge-Regression

The paper "Dataset Meta-Learning from Kernel Ridge-Regression" presents an innovative approach to dataset optimization, focusing on the concept of ϵ\epsilon-approximation to achieve reduced data sizes while preserving model performance. This study introduces the Kernel Inducing Points (KIP) algorithm, rooted in the principles of Kernel Ridge-Regression (KRR), to optimize and distill datasets effectively.

Core Contribution

The primary innovation of this research is the KIP meta-learning algorithm, which leverages the structural relationship between infinitely-wide neural networks and KRR. Through KIP, the authors demonstrate a significant advancement in dataset distillation methods by achieving data compression of one to two orders of magnitude for KRR tasks on MNIST and CIFAR-10 datasets. The results surpass previous methods both in terms of data efficiency and accuracy. Moreover, KIP-learned datasets maintain robust performance when applied to finite-width neural networks beyond the typical lazy-training regime.
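The core optimization can be sketched in a few lines. The snippet below is a minimal, illustrative JAX implementation of a KIP-style inner objective: it fits KRR on a small learned support set and backpropagates the target-set loss into the support images themselves. An RBF kernel is used here as a stand-in for the infinite-width NTK the paper employs, and all names (rbf_kernel, kip_loss, kip_step) are hypothetical rather than taken from the authors' code.

```python
# Minimal KIP-style sketch (assumptions: RBF kernel instead of the NTK,
# flattened images, one-hot labels; names are illustrative).
import jax
import jax.numpy as jnp

def rbf_kernel(a, b, gamma=0.1):
    # Pairwise RBF kernel between two batches of flattened images.
    sq = jnp.sum(a**2, axis=1)[:, None] + jnp.sum(b**2, axis=1)[None, :] - 2.0 * a @ b.T
    return jnp.exp(-gamma * sq)

def kip_loss(support_x, support_y, target_x, target_y, reg=1e-6):
    # Fit KRR on the learned support set, then evaluate on a batch of real target data.
    k_ss = rbf_kernel(support_x, support_x)
    k_ts = rbf_kernel(target_x, support_x)
    alpha = jnp.linalg.solve(k_ss + reg * jnp.eye(k_ss.shape[0]), support_y)
    preds = k_ts @ alpha
    return jnp.mean((preds - target_y) ** 2)

# Differentiate the KRR loss with respect to the support images (the "inducing points").
grad_fn = jax.jit(jax.grad(kip_loss, argnums=0))

def kip_step(support_x, support_y, target_x, target_y, lr=1e-2):
    # One first-order meta-learning step on the support images.
    return support_x - lr * grad_fn(support_x, support_y, target_x, target_y)
```

In the paper, the labels of the support set can also be learned and the kernel is the neural tangent kernel of a chosen architecture, but the structure of the update is the same.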

Detailed Results

The paper provides compelling numerical evidence for its claims. With only 10,000 learned images, KIP reaches test accuracies of 99.3% on MNIST and 66.3% on CIFAR-10, comparable to what is achieved with the full datasets in traditional setups, demonstrating the method's efficacy in dataset distillation.

Privacy and Corruption

An intriguing application of the method is privacy-preserving data generation. Because KIP learns images rather than selecting real ones, a large fraction of their pixels can be replaced with noise without severely degrading performance, a step toward enhancing data privacy. Empirical tests show that corrupting up to 90% of pixels only moderately affects model performance, while the resulting images reveal far less about the original training data than natural examples would.
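A minimal sketch of how such corruption could be wired into the optimization is shown below, assuming the corrupted pixels are initialized to noise and frozen while only the remaining pixels receive gradient updates; the mask construction and function names are illustrative, not the paper's exact recipe.

```python
# Hedged sketch of pixel corruption for privacy (assumptions: uniform noise,
# per-pixel Bernoulli mask; names are illustrative).
import jax
import jax.numpy as jnp

def make_corruption_mask(key, shape, corrupt_frac=0.9):
    # 1.0 where the pixel is trainable, 0.0 where it is frozen noise.
    return (jax.random.uniform(key, shape) > corrupt_frac).astype(jnp.float32)

def apply_corruption(support_x, noise, mask):
    # Frozen positions always show noise; trainable positions come from support_x.
    return mask * support_x + (1.0 - mask) * noise

def masked_kip_step(support_x, noise, mask, grads, lr=1e-2):
    # Update only the trainable pixels, then re-impose the frozen noise.
    updated = support_x - lr * mask * grads
    return apply_corruption(updated, noise, mask)
```

The gradients here would come from a KIP objective like the kip_loss sketch above, evaluated on the corrupted support images.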

Theoretical Implications and Future Directions

The formulation of KIP as optimization over kernel embeddings provides a robust framework for analyzing dataset distillation. It also opens avenues for further work in dataset meta-learning, particularly where data collection is expensive or constrained by privacy considerations. Future work can focus on scaling KIP to more complex architectures and datasets, which is promising given the current results on relatively simple configurations.

Speculation on AI Development

The techniques described in this paper have potential implications for the future scaling of AI systems. As we move towards increasingly data-centric models, the ability to optimize training datasets efficiently could lead to better resource allocation, faster training phases, and ultimately, more scalable AI deployments. This could further facilitate AI applications in scenarios where data is limited, thus extending the capabilities of AI into new and challenging domains.

Overall, this study makes substantial contributions to machine learning by providing a framework for dataset optimization that effectively balances dataset size against model performance. The use of KRR as a training backbone underlines the potential of kernel methods in practical AI applications, from traditional classification tasks to privacy-conscious learning environments.
