Automatic design optimization of preference-based subjective evaluation with online learning in crowdsourcing environment (2403.06100v1)

Published 10 Mar 2024 in cs.HC, cs.CL, cs.LG, eess.AS, and stat.ML

Abstract: Preference-based subjective evaluation is a key method for reliably evaluating generative media. However, the huge number of pair combinations prevents it from being applied to large-scale evaluation using crowdsourcing. To address this issue, we propose a method that automatically optimizes preference-based subjective evaluation, both in the selection of pair combinations and in the allocation of evaluation volume to each pair, using online learning in a crowdsourcing environment. We use a preference-based online learning method built on a sorting algorithm to identify the total order of the evaluation targets with minimal sample volume. Our online learning algorithm supports parallel and asynchronous execution under the fixed-budget conditions required for crowdsourcing. Our experiment on preference-based subjective evaluation of synthetic speech shows that the method successfully optimizes the test: it reduces the pair combinations from 351 to 83 and allocates an evaluation volume to each pair ranging from 30 to 663, without compromising evaluation accuracy or wasting budget.
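
To make the pair-count argument concrete, the following is a minimal sketch, not the authors' algorithm. It assumes 27 evaluation targets (351 is the number of unordered pairs among 27 items, which matches the figure in the abstract, but the abstract itself does not state the target count), a plain merge sort as the sorting algorithm, and a noise-free comparison. In a real listening test each comparison would instead be decided from many crowdsourced votes. The names `count_all_pairs`, `merge_sort_pairs`, `items`, and `prefer` are hypothetical and introduced only for illustration.

```python
import math
import random


def count_all_pairs(n):
    """Number of unordered pairs in an exhaustive preference test."""
    return math.comb(n, 2)


def merge_sort_pairs(items, prefer):
    """Merge-sort `items` and record every distinct pair that was compared.

    `prefer(a, b)` returns True when a is preferred over b. Here it is
    noise-free (an assumption); real tests aggregate noisy listener votes.
    """
    compared = set()

    def merge(left, right):
        out = []
        i = j = 0
        while i < len(left) and j < len(right):
            a, b = left[i], right[j]
            compared.add(frozenset((a, b)))  # this pair needs evaluation
            if prefer(a, b):
                out.append(a)
                i += 1
            else:
                out.append(b)
                j += 1
        out.extend(left[i:])
        out.extend(right[j:])
        return out

    def sort(xs):
        if len(xs) <= 1:
            return list(xs)
        mid = len(xs) // 2
        return merge(sort(xs[:mid]), sort(xs[mid:]))

    return sort(list(items)), compared


# Hypothetical setup: 27 targets whose hidden quality equals their integer id.
items = list(range(27))
random.Random(0).shuffle(items)


def prefer(a, b):
    return a > b  # higher id is preferred (illustrative assumption)


order, compared = merge_sort_pairs(items, prefer)
print("exhaustive pairs:", count_all_pairs(len(items)))
print("pairs touched by merge sort:", len(compared))
```

Merge sort on 27 items needs at most 104 comparisons in the worst case, so a sort-driven design touches far fewer pairs than the exhaustive 351. The sketch does not reproduce the reported 83 pairs, and it does not model the per-pair vote allocation, parallel execution, or fixed budget described in the abstract.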

