Automatic design optimization of preference-based subjective evaluation with online learning in crowdsourcing environment (2403.06100v1)
Abstract: Preference-based subjective evaluation is a key method for evaluating generative media reliably. However, the number of pair combinations grows so large that the method is impractical for large-scale evaluation using crowdsourcing. To address this issue, we propose a method that automatically optimizes preference-based subjective evaluation, in terms of both pair combination selection and allocation of evaluation volumes, using online learning in a crowdsourcing environment. We use a preference-based online learning method based on a sorting algorithm to identify the total order of evaluation targets with minimum sample volumes. Our online learning algorithm supports the parallel and asynchronous execution under fixed-budget conditions that crowdsourcing requires. Our experiment on preference-based subjective evaluation of synthetic speech shows that our method successfully optimizes the test, reducing the number of pair combinations from 351 to 83 and allocating an optimal evaluation volume to each pair, ranging from 30 to 663, without compromising evaluation accuracy or wasting budget.
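To make the idea concrete, the following is a minimal sketch, not the authors' algorithm. It assumes the 351 pairs are all unordered pairs of 27 targets (27 × 26 / 2 = 351), simulates listeners with a Bradley-Terry model, and uses a merge sort whose comparator keeps collecting votes until a Hoeffding-style interval excludes 0.5 or a per-pair cap is reached. The names `all_pairs`, `duel`, `compare`, `merge_sort`, and the parameters `delta`, `max_votes`, and `batch` are hypothetical and chosen only for illustration.

```python
import math
import random


def all_pairs(n):
    """Number of unordered pairs among n evaluation targets."""
    return n * (n - 1) // 2


def duel(i, j, strength, rng):
    """Simulated listener vote: True if target i is preferred over target j.

    Assumption: preferences follow a Bradley-Terry model over `strength`.
    """
    p = strength[i] / (strength[i] + strength[j])
    return rng.random() < p


def compare(i, j, strength, rng, delta=0.05, max_votes=500, batch=10):
    """Collect votes in batches until the preference looks decided.

    Stops when the observed win rate differs from 0.5 by more than a
    Hoeffding radius, or when max_votes is reached. The repeated checks
    are not corrected for sequential testing; this is a simplification.
    Returns (i_preferred, votes_used).
    """
    wins = 0
    n = 0
    while n < max_votes:
        for _ in range(batch):
            wins += duel(i, j, strength, rng)
        n += batch
        radius = math.sqrt(math.log(2 / delta) / (2 * n))
        if abs(wins / n - 0.5) > radius:
            break
    return wins / n >= 0.5, n


def merge_sort(items, strength, rng, log):
    """Sort targets best-first using noisy comparisons.

    `log` maps each compared pair to the number of votes it consumed.
    """
    if len(items) <= 1:
        return list(items)
    mid = len(items) // 2
    left = merge_sort(items[:mid], strength, rng, log)
    right = merge_sort(items[mid:], strength, rng, log)
    merged = []
    a = b = 0
    while a < len(left) and b < len(right):
        i, j = left[a], right[b]
        i_wins, votes = compare(i, j, strength, rng)
        log[(i, j)] = votes
        if i_wins:
            merged.append(i)
            a += 1
        else:
            merged.append(j)
            b += 1
    merged.extend(left[a:])
    merged.extend(right[b:])
    return merged


if __name__ == "__main__":
    rng = random.Random(0)
    strength = [1.0 + k for k in range(27)]  # made-up latent qualities
    log = {}
    order = merge_sort(list(range(27)), strength, rng, log)
    print("all pairs:", all_pairs(27))
    print("pairs actually compared:", len(log))
    print("votes per pair: min", min(log.values()), "max", max(log.values()))
```

In this sketch, the sort compares only the pairs it needs to place each target, so it touches far fewer than all 351 pairs, and close pairs consume more votes than clearly separated ones. This mirrors, in a simplified sequential form, the two kinds of saving described in the abstract; it does not model parallel or asynchronous execution.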