Multiple output samples per input in a single-output Gaussian process (2306.02719v2)
Abstract: The standard Gaussian Process (GP) considers only a single output sample per input in the training set. Datasets for subjective tasks, such as spoken language assessment, may be annotated with output labels from multiple human raters per input. This paper proposes to generalise the GP to allow for these multiple output samples in the training set, and thus make use of the available output uncertainty information. This differs from a multi-output GP, as all output samples here are from the same task. The output density function is formulated as the joint likelihood of observing all output samples, and latent variables are not repeated, to reduce the computation cost. Test set predictions are inferred as in a standard GP; the difference lies in the optimised hyper-parameters. This is evaluated on speechocean762, showing that the generalisation allows the GP to compute a test set output distribution that is more similar to the collection of reference outputs from the multiple human raters.
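The computational point in the abstract, that latent variables are not repeated, can be made concrete. If every rater score for input x_n shares one latent f_n and, as assumed here, the noise is i.i.d. Gaussian, the joint likelihood of all scores collapses to a standard GP marginal over the per-input sample means with effective noise sigma^2 / R_n, plus a within-input scatter term that matters only when optimising the hyper-parameters. The sketch below is a minimal reconstruction under those assumptions, with an RBF kernel; the function and variable names (`rbf_kernel`, `multi_sample_gp_logml`, and so on) are illustrative, not from the paper.

```python
import numpy as np
from scipy.spatial.distance import cdist


def rbf_kernel(X1, X2, lengthscale, variance):
    """Squared-exponential kernel k(x, x') = variance * exp(-||x - x'||^2 / (2 l^2))."""
    return variance * np.exp(-0.5 * cdist(X1, X2, "sqeuclidean") / lengthscale**2)


def multi_sample_gp_logml(X, Y, lengthscale, variance, noise):
    """Joint log marginal likelihood of all rater scores.

    X : (N, D) array of training inputs.
    Y : list of N 1-D arrays; Y[n] holds the R_n rater scores for input n.

    Because every score for input n shares one latent f_n, the joint
    likelihood reduces to a GP on the per-input sample means with noise
    sigma^2 / R_n, plus a scatter term that depends on sigma^2 and so
    influences the optimised hyper-parameters.
    """
    N = len(Y)
    R = np.array([len(y) for y in Y], dtype=float)    # raters per input
    ybar = np.array([y.mean() for y in Y])            # per-input sample means
    S = sum(((y - y.mean()) ** 2).sum() for y in Y)   # within-input scatter

    C = rbf_kernel(X, X, lengthscale, variance) + np.diag(noise / R)
    L = np.linalg.cholesky(C + 1e-8 * np.eye(N))      # jitter for stability
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, ybar))

    # log N(ybar | 0, C), the standard GP marginal on the sample means
    logml = -0.5 * ybar @ alpha - np.log(np.diag(L)).sum() - 0.5 * N * np.log(2 * np.pi)
    # constant from collapsing the repeated observations onto one latent each
    logml += (-0.5 * S / noise
              - 0.5 * (R.sum() - N) * np.log(2 * np.pi * noise)
              - 0.5 * np.log(R).sum())
    return logml


def multi_sample_gp_predict(X, Y, Xs, lengthscale, variance, noise):
    """Posterior mean and variance at test inputs Xs, as in a standard GP."""
    R = np.array([len(y) for y in Y], dtype=float)
    ybar = np.array([y.mean() for y in Y])
    C = rbf_kernel(X, X, lengthscale, variance) + np.diag(noise / R)
    Ks = rbf_kernel(Xs, X, lengthscale, variance)
    L = np.linalg.cholesky(C + 1e-8 * np.eye(len(Y)))
    V = np.linalg.solve(L, Ks.T)
    mean = Ks @ np.linalg.solve(L.T, np.linalg.solve(L, ybar))
    var = variance - (V ** 2).sum(axis=0) + noise     # predictive variance incl. noise
    return mean, var


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(5, 2))
    Y = [rng.normal(size=rng.integers(2, 5)) for _ in range(5)]  # 2-4 raters per input
    print(multi_sample_gp_logml(X, Y, lengthscale=1.0, variance=1.0, noise=0.1))
    print(multi_sample_gp_predict(X, Y, rng.normal(size=(3, 2)), 1.0, 1.0, 0.1))
```

Collapsing onto the sample means keeps the kernel matrix at N x N rather than (sum_n R_n) x (sum_n R_n), which is where the saving from not repeating latent variables comes from; prediction then follows the standard GP equations, with only the optimised hyper-parameters differing, as the abstract states.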