2000 character limit reached
Creative Beam Search: LLM-as-a-Judge For Improving Response Generation (2405.00099v4)
Published 30 Apr 2024 in cs.AI, cs.CL, cs.HC, and cs.LG
Abstract: LLMs are revolutionizing several areas, including artificial creativity. However, the process of generation in machines profoundly diverges from that observed in humans. In particular, machine generation is characterized by a lack of intentionality and an underlying creative process. We propose a method called Creative Beam Search that uses Diverse Beam Search and LLM-as-a-Judge to perform response generation and response validation. The results of a qualitative experiment show how our approach can provide better output than standard sampling techniques. We also show that the response validation step is a necessary complement to the response generation step.
- 2019. Gradio: Hassle-free sharing and testing of ML models in the wild. arXiv:1906.02569 [cs.LG].
- Amabile, T. M. 1983. The social psychology of creativity: A componential conceptualization. Journal of Personality and Social Psychology 45(2):357–376.
- 2022. Constitutional AI: Harmlessness from AI Feedback. arXiv:2212.08073 [cs.CL].
- 2020. Bridging generative deep learning and computational creativity. In Proceedings of the 11th International Conference on Computational Creativity (ICCC’20).
- 2021. On the opportunities and risks of foundation models. arXiv:2108.07258 [cs.LG].
- 2023. Quality-Diversity through AI feedback. arXiv:2310.13032 [cs.CL].
- 2020. Language models are few-shot learners. In Advances in Neural Information Processing Systems (NIPS’20).
- 2020. Language GANs falling short. In Proceedings of the 8th International Conference on Learning Representations (ICLR’20).
- 2024. Self-play fine-tuning converts weak language models to strong language models. arXiv:2401.01335 [cs.LG].
- 2023. Can large language models be an alternative to human evaluations? In Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (ACL’23).
- 2017. Deep reinforcement learning from human preferences. In Advances in Neural Information Processing Systems (NeurIPS’17).
- 2023. Quality diversity through human feedback. In Proceedings of the NeurIPS’23 ALOE Workshop.
- 2023. On the creativity of large language models. arXiv:2304.00008 [cs.AI].
- 2024. Creativity and machine learning. ACM Computing Surveys. Accepted for Publication. To Appear.
- 2023. Pushing GPT’s creativity to its limits: Alternative Uses and Torrance Tests. In Proceedings of the 14th International Conference on Computational Creativity (ICCC’23).
- 2017. Lexically constrained decoding for sequence generation using grid beam search. In Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (ACL’17).
- 2020. The curious case of neural text degeneration. In Proceedings of the 8th International Conference on Learning Representations (ICLR’20).
- 2023. RLAIF: Scaling reinforcement learning from human feedback with AI feedback. arXiv:2309.00267 [cs.CL].
- 2023. Is AI art another industrial revolution in the making? In Proceedings of the AAAI’23 Creative AI Across Modalities Workshop.
- OpenAI. 2023. GPT-4 Technical Report. arXiv:2303.08774 [cs.CL].
- 2018. Analyzing uncertainty in neural machine translation. In Proceedings of the 35th International Conference on Machine Learning (ICML’18).
- 2023. Leveraging human preferences to master poetry. In Proceedings of the AAAI’23 Workshop on Creative AI Across Modalities.
- 2023a. Bits of Grass: Does GPT already know how to write like Whitman? In Proceedings of the 14th International Conference on Computational Creativity (ICCC’23).
- 2023b. On the power of special-purpose GPT models to create and evaluate new poetry in old styles. In Proc. of the 14th International Conference on Computational Creativity (ICCC’23).
- Shanahan, M. 2024. Talking about large language models. Communications of the ACM 67(2):68–79.
- 2022. Putting GPT-3’s creativity to the (Alternative Uses) Test. In Proceedings of the 13th International Conference on Computational Creativity (ICCC’22).
- 2023. Brainstorm, then select: a generative language model improves its creativity score. In Proceedings of the AAAI’23 Workshop on Creative AI Across Modalities.
- 2023. Llama 2: Open foundation and fine-tuned chat models. arXiv:2307.09288 [cs.CL].
- 2018. Diverse beam search for improved description of complex scenes. Proceedings of the 32nd AAAI Conference on Artificial Intelligence (AAAI’18).
- 2023. Large language models are not fair evaluators. arXiv:2305.17926 [cs.CL].
- 2022. Taxonomy of risks posed by language models. In Proceedings of the 2022 ACM Conference on Fairness, Accountability, and Transparency (FAccT’22).
- 2023. Self-evaluation guided beam search for reasoning. In Proceedings of the 37th Conference on Neural Information Processing Systems (NIPS’23).
- 2024. Self-rewarding language models. arXiv:2401.10020 [cs.CL].
- 2023. Judging LLM-as-a-judge with MT-bench and chatbot arena. In Proceedings of the 37th Conference on Neural Information Processing Systems Datasets and Benchmarks Track (NIPS’23).
- Giorgio Franceschelli (11 papers)
- Mirco Musolesi (81 papers)