- The paper introduces a compositional kernel search method that automates the discovery of effective kernel structures in Gaussian process regression.
- It performs a discrete search over a space of composite kernels built from a small set of base kernels, scored by marginal likelihood, and achieves lower mean squared error and higher predictive likelihood than standard baselines.
- The method enhances interpretability by decomposing complex signals into additive components and shows promise for broader applications beyond GP regression.
Structure Discovery in Nonparametric Regression through Compositional Kernel Search
This paper addresses the challenge of choosing the structural form of the kernel in nonparametric regression, particularly within Gaussian processes (GPs). The authors propose a method that automates kernel selection through a procedure mimicking scientific discovery, markedly improving the extrapolation capabilities of regression models, especially on time-series data.
Compositional Kernel Search Methodology
The core contribution is the construction of an open-ended space of kernel structures by adding and multiplying a small set of base kernels: squared exponential (SE), periodic (Per), linear (Lin), and rational quadratic (RQ). Because sums and products of kernels are themselves valid kernels, this compositional approach yields a flexible, expressive modeling language that captures diverse structure in data.
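As a concrete illustration of this closure under addition and multiplication, the sketch below builds composite kernels using scikit-learn's kernel classes as stand-ins for the paper's base kernels (RBF for SE, ExpSineSquared for Per, DotProduct for Lin, RationalQuadratic for RQ); this is illustrative code, not the authors' implementation.

```python
# Illustrative only: scikit-learn kernel classes standing in for the
# paper's base kernels. Sums and products of kernels are again kernels,
# so arbitrary composite structures can be written with + and *.
from sklearn.gaussian_process.kernels import (
    RBF,                # squared exponential (SE)
    ExpSineSquared,     # periodic (Per)
    DotProduct,         # linear (Lin)
    RationalQuadratic,  # rational quadratic (RQ)
)

SE, Per, Lin, RQ = RBF(), ExpSineSquared(), DotProduct(), RationalQuadratic()

trend_plus_seasonal = SE + SE * Per  # smooth trend plus a locally periodic component
growing_amplitude = Lin * Per        # periodic signal whose amplitude grows linearly
```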
The authors employ a discrete search over this space, using the GP marginal likelihood (with hyperparameters optimized for each candidate structure) as the selection criterion, akin to successful strategies in equation discovery and unsupervised learning. The search identifies kernels that not only generalize well but also yield interpretable functional decompositions.
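The following is a minimal sketch of one such greedy search, assuming scikit-learn's GaussianProcessRegressor as the fitting backend (its fit optimizes hyperparameters, and the optimized log marginal likelihood serves as the score). The paper's grammar also allows replacing subexpressions; this simplified version only expands the current best structure at the top level.

```python
# A simplified greedy structure search scored by log marginal likelihood.
# Assumes the scikit-learn stand-in kernels from the previous sketch.
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import (
    RBF, ExpSineSquared, DotProduct, RationalQuadratic)

BASE_KERNELS = [RBF(), ExpSineSquared(), DotProduct(), RationalQuadratic()]

def score(kernel, X, y):
    """Fit hyperparameters, then return the optimized log marginal likelihood."""
    gp = GaussianProcessRegressor(kernel=kernel, normalize_y=True).fit(X, y)
    return gp.log_marginal_likelihood_value_

def expand(kernel):
    """One-step expansions: k -> k + B and k -> k * B for every base kernel B."""
    return [kernel + b for b in BASE_KERNELS] + [kernel * b for b in BASE_KERNELS]

def greedy_search(X, y, max_depth=3):
    """Greedily grow a kernel expression while the marginal likelihood improves."""
    scored = [(score(k, X, y), k) for k in BASE_KERNELS]
    best_score, best = max(scored, key=lambda t: t[0])
    for _ in range(max_depth - 1):
        scored = [(score(k, X, y), k) for k in expand(best)]
        cand_score, cand = max(scored, key=lambda t: t[0])
        if cand_score <= best_score:
            break  # no expansion improves the score; stop
        best_score, best = cand_score, cand
    return best
```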
Numerical Outcomes
Empirical results show that the proposed method outperforms several existing kernel learning approaches, including widely used automatic relevance determination (ARD) and generalized additive models. On various benchmark datasets, the compositional kernel search consistently achieved lower mean squared error and higher predictive likelihood. For instance, on the airline passenger data, the learned kernel captures both the long-term trend and the seasonal variation, yielding sensible extrapolations well beyond the observed data.
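To make the airline example concrete, the sketch below fits a hand-built kernel of the kind such a search can discover for a series with an upward trend and seasonality of growing amplitude; the data are a synthetic stand-in, and the structure is illustrative rather than the exact one reported in the paper.

```python
# Illustrative trend-plus-seasonal kernel fit to a synthetic airline-like series.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import (
    RBF, ExpSineSquared, DotProduct, WhiteKernel)

kernel = (
    DotProduct()                                       # long-term linear growth
    + DotProduct() * ExpSineSquared(periodicity=12.0)  # seasonality, growing amplitude
    + RBF(length_scale=50.0)                           # smooth medium-term deviations
    + WhiteKernel()                                    # observation noise
)

rng = np.random.default_rng(0)
X = np.arange(144, dtype=float).reshape(-1, 1)  # months
y = (100 + 2.0 * X.ravel()
     + 0.3 * X.ravel() * np.sin(2 * np.pi * X.ravel() / 12)
     + rng.normal(scale=5.0, size=144))

gp = GaussianProcessRegressor(kernel=kernel, normalize_y=True).fit(X, y)
X_future = np.arange(144, 180, dtype=float).reshape(-1, 1)
mean, std = gp.predict(X_future, return_std=True)  # extrapolation with uncertainty
```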
Implications and Future Directions
The implications of this research are multifaceted:
- Practical Applications: The ability to automatically discover useful kernel structures can democratize nonparametric regression, making it accessible to practitioners without deep expertise in kernel engineering.
- Interpretability: By decomposing signals into interpretable additive components (see the sketch after this list), the method facilitates a deeper understanding of the underlying data-generating processes and supports effective model checking.
- Broader Applications: Although the current focus is on Gaussian processes, the framework could be adapted to other learning scenarios, such as classification or ordinal regression.
- Extension to Other Kernels: Incorporating additional kernel types, such as those capturing negative correlations, could further augment the expressive power of the method.
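On the interpretability point above: for an additive kernel k = k1 + ... + kn, the posterior mean of component i at test inputs X* is K_i(X*, X) (K(X, X) + σ²I)⁻¹ y, so a fitted sum decomposes into meaningful parts. Below is a minimal sketch with hand-set (not optimized) hyperparameters, again using scikit-learn kernels purely for illustration.

```python
# Posterior decomposition of an additive GP into its components.
import numpy as np
from sklearn.gaussian_process.kernels import RBF, ExpSineSquared

def component_posterior_means(components, X, y, X_star, noise=1e-2):
    """Posterior mean of each additive kernel component at test inputs X_star."""
    K = sum(k(X, X) for k in components) + noise * np.eye(len(X))
    alpha = np.linalg.solve(K, y)  # (K + noise*I)^{-1} y, shared by all components
    return [k(X_star, X) @ alpha for k in components]

# Example: separate a smooth trend from a periodic component.
rng = np.random.default_rng(0)
X = np.linspace(0, 10, 100).reshape(-1, 1)
y = 0.5 * X.ravel() + np.sin(3 * X.ravel()) + rng.normal(scale=0.1, size=100)
trend, seasonal = component_posterior_means(
    [RBF(length_scale=3.0), ExpSineSquared(length_scale=1.0, periodicity=2.1)],
    X, y, X)
```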
Conclusion
This paper makes a significant methodological contribution to the field of machine learning by automating kernel structure discovery, thus pushing the boundaries of what kernel-based nonparametric methods can achieve. Future research may explore integrating richer base kernels and extending the approach to a wider array of machine learning problems, enhancing both prediction accuracy and model interpretability.