On Negative Transfer and Structure of Latent Functions in Multi-output Gaussian Processes (2004.02382v1)

Published 6 Apr 2020 in stat.ML and cs.LG

Abstract: The multi-output Gaussian process ($\mathcal{MGP}$) is based on the assumption that outputs share commonalities; however, if this assumption does not hold, negative transfer will lead to decreased performance relative to learning outputs independently or in subsets. In this article, we first define negative transfer in the context of an $\mathcal{MGP}$ and then derive necessary conditions for an $\mathcal{MGP}$ model to avoid negative transfer. Specifically, under the convolution construction, we show that avoiding negative transfer mainly depends on having a sufficient number of latent functions $Q$, regardless of the flexibility of the kernel or inference procedure used. However, a slight increase in $Q$ leads to a large increase in the number of parameters to be estimated. To this end, we propose two latent structures that scale to arbitrarily large datasets, can avoid negative transfer, and allow any kernel or sparse approximations to be used within. These structures also allow regularization, which can provide consistent and automatic selection of related outputs.
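For context, under the process convolution construction referenced in the abstract, each output is typically modeled as a sum of $Q$ shared latent functions convolved with output-specific smoothing kernels. A standard form (notation assumed here, not taken from the paper) is

$$ f_d(\mathbf{x}) = \sum_{q=1}^{Q} \int k_{dq}(\mathbf{x}-\mathbf{u})\, w_q(\mathbf{u})\, d\mathbf{u}, \qquad d = 1,\dots,D, $$

where the $w_q$ are the shared latent processes and the $k_{dq}$ are smoothing kernels. The abstract's claim is that the number of latent functions $Q$, rather than the flexibility of these kernels or the inference procedure, is the main factor in whether negative transfer can be avoided.

As a rough illustration of why $Q$ matters, the minimal sketch below uses the simpler linear model of coregionalization (a degenerate relative of the convolution construction, not the paper's proposed structures); all variable names are illustrative assumptions.

import numpy as np

def rbf(X1, X2, lengthscale=1.0):
    # Squared-exponential covariance between two 1-D input sets.
    d2 = (X1[:, None] - X2[None, :]) ** 2
    return np.exp(-0.5 * d2 / lengthscale**2)

def mgp_covariance(X, A, lengthscale=1.0):
    # Joint covariance of D outputs stacked as [f_1(X); ...; f_D(X)].
    # A is a D x Q mixing matrix, so B = A A^T has rank at most Q:
    # with Q < D, all outputs are forced to share latent structure,
    # which is the mechanism behind negative transfer when the
    # outputs are in fact unrelated.
    B = A @ A.T                    # D x D coregionalization matrix
    K = rbf(X, X, lengthscale)     # N x N input covariance
    return np.kron(B, K)           # (D*N) x (D*N) joint covariance

rng = np.random.default_rng(0)
X = np.linspace(0.0, 1.0, 50)
A = rng.normal(size=(3, 1))        # D = 3 outputs, only Q = 1 latent function
K_joint = mgp_covariance(X, A)
print(np.linalg.matrix_rank(A @ A.T))   # prints 1: outputs fully tied together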
