Papers
Topics
Authors
Recent
Gemini 2.5 Flash
Gemini 2.5 Flash
102 tokens/sec
GPT-4o
59 tokens/sec
Gemini 2.5 Pro Pro
43 tokens/sec
o3 Pro
6 tokens/sec
GPT-4.1 Pro
50 tokens/sec
DeepSeek R1 via Azure Pro
28 tokens/sec
2000 character limit reached

Defense of Adversarial Ranking Attack in Text Retrieval: Benchmark and Baseline via Detection (2307.16816v1)

Published 31 Jul 2023 in cs.IR and cs.CL

Abstract: Neural ranking models (NRMs) have undergone significant development and have become integral components of information retrieval (IR) systems. Unfortunately, recent research has unveiled the vulnerability of NRMs to adversarial document manipulations, potentially exploited by malicious search engine optimization practitioners. While progress in adversarial attack strategies aids in identifying the potential weaknesses of NRMs before their deployment, the defensive measures against such attacks, like the detection of adversarial documents, remain inadequately explored. To mitigate this gap, this paper establishes a benchmark dataset to facilitate the investigation of adversarial ranking defense and introduces two types of detection tasks for adversarial documents. A comprehensive investigation of the performance of several detection baselines is conducted, which involve examining the spamicity, perplexity, and linguistic acceptability, and utilizing supervised classifiers. Experimental results demonstrate that a supervised classifier can effectively mitigate known attacks, but it performs poorly against unseen attacks. Furthermore, such classifier should avoid using query text to prevent learning the classification on relevance, as it might lead to the inadvertent discarding of relevant documents.

User Edit Pencil Streamline Icon: https://streamlinehq.com
Authors (4)
  1. Xuanang Chen (14 papers)
  2. Ben He (37 papers)
  3. Le Sun (111 papers)
  4. Yingfei Sun (29 papers)
Citations (5)