
KGPA: Robustness Evaluation for Large Language Models via Cross-Domain Knowledge Graphs (2406.10802v1)

Published 16 Jun 2024 in cs.CL and cs.AI

Abstract: Existing frameworks for assessing the robustness of LLMs depend heavily on specific benchmarks, which increases costs and, due to dataset limitations, fails to evaluate the performance of LLMs in professional domains. This paper proposes a framework that systematically evaluates the robustness of LLMs under adversarial attack scenarios by leveraging knowledge graphs (KGs). The framework generates original prompts from the triplets of knowledge graphs, creates adversarial prompts by poisoning them, and assesses the robustness of LLMs through the results of these adversarial attacks. We systematically evaluate the effectiveness of this framework and its modules. Experiments show that the adversarial robustness of the ChatGPT family ranks as GPT-4-turbo > GPT-4o > GPT-3.5-turbo, and that the robustness of LLMs is influenced by the professional domains in which they operate.
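The pipeline described in the abstract can be sketched as follows. This is a minimal illustration, not the paper's actual implementation: the prompt template, the character-substitution poisoning, and the success metric are all simplified stand-ins for the framework's components.

```python
def triplet_to_prompt(head, relation, tail):
    """Turn a KG triplet into a question prompt (illustrative template only)."""
    prompt = f"In the context of {relation}, what is associated with {head}?"
    return prompt, tail  # the tail entity serves as the reference answer

def poison_prompt(prompt, substitutions):
    """Create an adversarial prompt via word substitutions.

    A simple stand-in for the paper's poisoning step.
    """
    for original, replacement in substitutions.items():
        prompt = prompt.replace(original, replacement)
    return prompt

def attack_success_rate(model, triplets, substitutions):
    """Fraction of prompts the model answers correctly when clean
    but incorrectly after poisoning (lower = more robust)."""
    successes, valid = 0, 0
    for head, relation, tail in triplets:
        clean, answer = triplet_to_prompt(head, relation, tail)
        if model(clean) != answer:
            continue  # only count prompts the model originally gets right
        valid += 1
        if model(poison_prompt(clean, substitutions)) != answer:
            successes += 1
    return successes / valid if valid else 0.0
```

A toy model that matches answers by keyword would let `attack_success_rate` measure how often a small perturbation (e.g. replacing an entity name with a misspelled variant) flips a previously correct answer.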

Authors (6)
  1. Aihua Pei (3 papers)
  2. Zehua Yang (4 papers)
  3. Shunan Zhu (4 papers)
  4. Ruoxi Cheng (9 papers)
  5. Ju Jia (4 papers)
  6. Lina Wang (29 papers)
