AI Safety in Generative AI Large Language Models: A Survey (2407.18369v1)

Published 6 Jul 2024 in cs.CY and cs.CL

Abstract: Large language models (LLMs) such as ChatGPT that exhibit generative AI capabilities are facing accelerated adoption and innovation. The increased presence of Generative AI (GAI) inevitably raises concerns about the risks and safety associated with these models. This article provides an up-to-date survey of recent trends in AI safety research on GAI-LLMs from a computer scientist's perspective: specific and technical. In this survey, we explore the background and motivation for the identified harms and risks in the context of LLMs as generative language models; our survey differentiates itself by emphasising the need for unified theories of the distinct safety challenges in the research, development, and applications of LLMs. We start our discussion with a concise introduction to the workings of LLMs, supported by relevant literature. Then we discuss earlier research that has pointed out the fundamental constraints of generative models, or the lack of understanding thereof (e.g., performance and safety trade-offs as LLMs scale in number of parameters). We cover LLM alignment, examining the various approaches, competing methods, and present challenges associated with aligning LLMs with human preferences. By highlighting the gaps in the literature and possible implementation oversights, we aim to provide a comprehensive analysis that offers insights for addressing AI safety in LLMs and encourages the development of aligned and secure models. We conclude our survey by discussing future directions of LLMs for AI safety, offering insights into ongoing research in this critical area.

Authors (5)

Jaymari Chua (3 papers)
Yun Li (154 papers)
Shiyi Yang (8 papers)
Chen Wang (599 papers)
Lina Yao (194 papers)

Citations (8)
