AI (r)evolution -- where are we heading? Thoughts about the future of music and sound technologies in the era of deep learning (2310.18320v1)
Abstract: AI technologies such as deep learning are evolving very quickly, bringing many changes to our everyday lives. To explore the future impact and potential of AI in the field of music and sound technologies, a doctoral day was held between Queen Mary University of London (QMUL, UK) and Sciences et Technologies de la Musique et du Son (STMS, France). Prompt questions about current trends in AI and music were generated by academics from QMUL and STMS. Students from the two institutions then debated these questions. This report presents a summary of the student debates on the topics of Data, Impact, and the Environment; Responsible Innovation and Creative Practice; Creativity and Bias; and From Tools to the Singularity. The students represent the future generation of AI and music researchers. The academics represent the incumbent establishment. The student debates reported here capture visions, dreams, concerns, uncertainties, and contentious issues for the future of AI and music as the establishment is rightfully challenged by the next generation.