Signed-Prompt: A New Approach to Prevent Prompt Injection Attacks Against LLM-Integrated Applications (2401.07612v1)

Published 15 Jan 2024 in cs.CR and cs.AI

Abstract: The critical challenge of prompt injection attacks in LLMs integrated applications, a growing concern in the AI field. Such attacks, which manipulate LLMs through natural language inputs, pose a significant threat to the security of these applications. Traditional defense strategies, including output and input filtering, as well as delimiter use, have proven inadequate. This paper introduces the 'Signed-Prompt' method as a novel solution. The study involves signing sensitive instructions within command segments by authorized users, enabling the LLM to discern trusted instruction sources. The paper presents a comprehensive analysis of prompt injection attack patterns, followed by a detailed explanation of the Signed-Prompt concept, including its basic architecture and implementation through both prompt engineering and fine-tuning of LLMs. Experiments demonstrate the effectiveness of the Signed-Prompt method, showing substantial resistance to various types of prompt injection attacks, thus validating its potential as a robust defense strategy in AI security.

PDF Abstract

Summarize Bookmark Chat (Pro)

Authors (1)

Xuchen Suo (1 paper)

Citations (18)

View on Semantic Scholar

Tweets

https://twitter.com/SarHaidar/status/1877368337396597138

Signed-Prompt: A New Approach to Prevent Prompt Injection Attacks Against LLM-Integrated Applications (2401.07612v1)

Related Papers

Tweets