
ORCD: Opposing Reasoning-based Clickbait Detection

Updated 24 January 2026
  • The paper introduces ORCD, a neural framework that employs a novel two-stage SORG method to generate dual (agree/disagree) rationales from LLMs for robust clickbait detection.
  • It uses a tri-encoder BERT architecture combining title-aware and title-free learners with contrastive and classification losses to integrate auxiliary context.
  • Empirical evaluations show ORCD significantly improves accuracy and F1 scores on benchmark datasets, outperforming traditional and prompting-based baselines.

Opposing Reasoning-based Clickbait Detection (ORCD) is a neural framework for automatic clickbait identification that draws on contrastive reasoning elicited from LLMs through a two-stage prompting method. Unlike prior detection systems that depend on direct judgments or pattern learning over labeled data, ORCD exploits LLM "sycophancy" (the tendency to generate reasoning that aligns with a target stance) to produce mutually opposing rationales for and against the plausibility of a news headline. The model integrates these rationales as auxiliary context, using a multi-view BERT encoding strategy and contrastive learning objectives guided by soft labels derived from LLM-generated credibility ratings. In the reported evaluations, ORCD achieves state-of-the-art results on three benchmark datasets, surpassing both prompting-based and traditional supervised baselines (Zhang et al., 17 Jan 2026).

1. Self-renewal Opposing-stance Reasoning Generation (SORG)

At the core of the ORCD pipeline is the Self-renewal Opposing-stance Reasoning Generation (SORG) framework, which systematizes the elicitation of two high-quality, stance-specific rationales for each candidate headline: one advocating agreement and one advocating disagreement. SORG operates in two major stages:

  • Stage I: Initial Rating. The pipeline issues an initial prompt $\pounds_I(x)$ to an LLM instructing it to rate the "public agreement" credibility of headline $x$ on a $[0,100]$ scale, returning both a scalar rating $V_1$ and an explanation $R_1$. If $V_1$ falls outside the tunable interval $[\alpha, 100-\alpha]$ (typically $\alpha=30$), recursive re-rating prompts $\pounds_{Ir}$ are issued to calibrate the score within bounds, up to $M$ iterations ($M=3$).
  • Stage II: Self-Renewal Opposing-stance Reasoning. SORG prompts the LLM for an "agree" rationale $R_A$ (targeting $V_A > V_1$) and a "disagree" rationale $R_D$ (targeting $V_D < V_1$), each accompanied by a new credibility score. A series of polarity and margin checks enforce the constraints $V_A \geq 50+\gamma$ and $(V_A - V_1) \geq \beta$ for the agree stance, and $V_D \leq 50-\gamma$ and $(V_1 - V_D) \geq \beta$ for the disagree stance, with typical values $\beta=10$ and $\gamma=5$. If a check fails, critique and regeneration sub-prompts iteratively refine the rationales until both stances pass all checks or the maximum of $M$ rounds is exhausted. Core prompts are engineered to focus rationales on four aspects: common sense, logic, completeness, and objectivity.

This process ensures the generation of robust, contrastive agreement/disagreement rationales on each headline, each labeled with a continuously valued, LLM-internal credibility score.
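The Stage I bound and the Stage II polarity and margin checks can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: `SorgParams`, `renew_stance`, and the `generate` callable (a stand-in for the LLM call) are hypothetical names, and returning `None` when the rounds run out is an assumption, since the text does not say what happens in that case.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Tuple


@dataclass(frozen=True)
class SorgParams:
    alpha: float = 30.0   # Stage I interval is [alpha, 100 - alpha]
    beta: float = 10.0    # minimum margin between V_1 and each stance score
    gamma: float = 5.0    # minimum distance of a stance score from the midpoint 50
    max_rounds: int = 3   # M, the cap on regeneration rounds


def initial_rating_ok(v1: float, p: SorgParams) -> bool:
    """Stage I: V_1 must lie inside [alpha, 100 - alpha]."""
    return p.alpha <= v1 <= 100.0 - p.alpha


def agree_ok(v_a: float, v1: float, p: SorgParams) -> bool:
    """Stage II, agree stance: V_A >= 50 + gamma and V_A - V_1 >= beta."""
    return v_a >= 50.0 + p.gamma and (v_a - v1) >= p.beta


def disagree_ok(v_d: float, v1: float, p: SorgParams) -> bool:
    """Stage II, disagree stance: V_D <= 50 - gamma and V_1 - V_D >= beta."""
    return v_d <= 50.0 - p.gamma and (v1 - v_d) >= p.beta


# Hypothetical LLM interface: (headline, stance, v1, critique) -> (score, rationale).
StanceFn = Callable[[str, str, float, Optional[str]], Tuple[float, str]]


def renew_stance(headline: str, stance: str, v1: float, generate: StanceFn,
                 p: SorgParams = SorgParams()) -> Optional[Tuple[float, str]]:
    """Regenerate one stance until its checks pass or max_rounds is reached."""
    check = agree_ok if stance == "agree" else disagree_ok
    critique: Optional[str] = None
    for _ in range(p.max_rounds):
        score, rationale = generate(headline, stance, v1, critique)
        if check(score, v1, p):
            return score, rationale
        # Feed the failure back as a critique for the next attempt.
        critique = f"Score {score} failed the {stance} polarity/margin checks."
    return None  # assumption: the caller decides how to handle exhausted rounds
```

With the default parameters, a Stage I rating of 50 is in bounds, and an agree score of 88 passes both checks against it, whereas an agree score of 54 fails the polarity check regardless of margin.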

2. Model Architecture and Representation

ORCD employs a tri-encoder BERT architecture, consisting of three independently parameterized encoders (e.g., bert-base-uncased):

  • Title Encoder: Processes the input headline $x$ to produce sequence features $F_x \in \mathbb{R}^{L_x \times d}$.
  • Agree-Rationale Encoder: Encodes $R_A \rightarrow F_y$.
  • Disagree-Rationale Encoder: Encodes $R_D \rightarrow F_z$.

Two complementary reasoning branches are constructed:

  • Title-aware Reasoning Learner: Cross-attention modules align title and rationale representations, producing context-sensitive vector pairs $(f_{[x|y]}, f_{[y|x]})$ and $(f_{[x|z]}, f_{[z|x]})$ for the agree and disagree rationales, respectively, as well as a pure title embedding $f_x$.
  • Title-free Reasoning Learner: Independently summarizes the agreement and disagreement rationales as $f_y$ and $f_z$ without direct title infusion.

The seven vectors $[f_x, f_y, f_z, f_{[x|y]}, f_{[y|x]}, f_{[x|z]}, f_{[z|x]}]$ are concatenated and passed through a multilayer perceptron (MLP), yielding class logits for the clickbait and non-clickbait categories.
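The fusion step can be illustrated with a small, dependency-free Python sketch. It only shows the concatenation and the resulting input width; the single linear layer stands in for the MLP, whose depth and width are not given here, and the random vectors stand in for real encoder outputs. The names `concat_views` and `linear` are hypothetical.

```python
import random
from typing import List, Sequence

D = 768        # hidden size per vector, as reported for the encoders
NUM_VIEWS = 7  # f_x, f_y, f_z, f_[x|y], f_[y|x], f_[x|z], f_[z|x]


def concat_views(views: Sequence[Sequence[float]]) -> List[float]:
    """Concatenate the seven d-dimensional vectors into one feature vector."""
    if len(views) != NUM_VIEWS:
        raise ValueError(f"expected {NUM_VIEWS} vectors, got {len(views)}")
    fused: List[float] = []
    for v in views:
        if len(v) != D:
            raise ValueError(f"expected dimension {D}, got {len(v)}")
        fused.extend(v)
    return fused


def linear(x: Sequence[float], weight: Sequence[Sequence[float]],
           bias: Sequence[float]) -> List[float]:
    """One affine layer: each output is a dot product with x plus a bias."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weight, bias)]


rng = random.Random(0)
# Random placeholders for the seven encoder/learner outputs.
views = [[rng.uniform(-1.0, 1.0) for _ in range(D)] for _ in range(NUM_VIEWS)]
f_final = concat_views(views)

# Assumption: a single linear layer as a stand-in for the MLP classifier head.
weight = [[rng.gauss(0.0, 0.02) for _ in range(len(f_final))] for _ in range(2)]
bias = [0.0, 0.0]
logits = linear(f_final, weight, bias)

print(len(f_final), len(logits))  # 5376 2
```

With $d=768$, the concatenated input to the classifier has $7 \times 768 = 5376$ dimensions, and the head emits two logits.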

3. Loss Functions and Training Objectives

Model training uses both contrastive and categorical objectives, with LLM-generated soft labels as supervision. Given the LLM-generated scores $V_A$ and $V_D$ (normalized as $s^+ = V_A/100$ and $s^- = V_D/100$), the losses are constructed as follows:

  • Title-aware Contrastive Loss:

$$L_{tat} = s^{+} \cdot (1 - \cos(f_x, f_{[x|y]})), \quad L_{tat}^{-} = \max(0, \cos(f_x, f_{[x|z]}) - s^{-})$$

Similar expressions define the reasoning-title branch.

  • Title-free Loss:

$$L_{tf} = s^{+} \cdot (1 - \cos(f_x, f_y)) + \max(0, \cos(f_x, f_z) - s^{-})$$

  • Total Contrastive Loss:

$$\mathcal{L}_{contrast} = L_{ta} + L_{tf}$$

  • Classification Loss:

$$\mathcal{L}_{clf} = -\sum_{c \in \{\text{clickbait},\,\text{non}\}} y_c \log \mathrm{softmax}_c(\mathrm{MLP}(f_{\text{final}}))$$

The total training loss is $\mathcal{L} = \mathcal{L}_{contrast} + \lambda \mathcal{L}_{clf}$ with $\lambda=1$.

The soft-label scores $s^+, s^-$ directly modulate the contrastive weights; no further temperature scaling is applied, as normalization by $100$ proved stable in practice. The SORG parameters enforce that agreement and disagreement scores are well separated by absolute value and polarity.
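The loss terms in this section can be written out in plain Python to make their behavior concrete. This is a sketch on single examples using lists of floats rather than batched tensors; the function names are hypothetical, and because the text does not spell out how the title-aware terms are combined into $L_{ta}$, `title_aware_terms` simply returns the two stated terms separately.

```python
import math
from typing import Sequence, Tuple


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_u * norm_v)


def pull_term(s_pos: float, anchor: Sequence[float],
              positive: Sequence[float]) -> float:
    """s^+ * (1 - cos(anchor, positive)): pulls the agree view toward the title."""
    return s_pos * (1.0 - cosine(anchor, positive))


def push_term(s_neg: float, anchor: Sequence[float],
              negative: Sequence[float]) -> float:
    """max(0, cos(anchor, negative) - s^-): penalizes similarity above s^-."""
    return max(0.0, cosine(anchor, negative) - s_neg)


def title_aware_terms(f_x, f_xy, f_xz, s_pos: float, s_neg: float) -> Tuple[float, float]:
    """The two title-aware terms L_tat and L_tat^- as given in the text."""
    return pull_term(s_pos, f_x, f_xy), push_term(s_neg, f_x, f_xz)


def title_free_loss(f_x, f_y, f_z, s_pos: float, s_neg: float) -> float:
    """L_tf: pull term on f_y plus push term on f_z."""
    return pull_term(s_pos, f_x, f_y) + push_term(s_neg, f_x, f_z)


def cross_entropy(logits: Sequence[float], target_index: int) -> float:
    """-log softmax(logits)[target_index], the one-hot case of L_clf."""
    m = max(logits)  # subtract the max for numerical stability
    log_z = m + math.log(sum(math.exp(l - m) for l in logits))
    return log_z - logits[target_index]


def total_loss(l_contrast: float, l_clf: float, lam: float = 1.0) -> float:
    """L = L_contrast + lambda * L_clf, with lambda = 1 by default."""
    return l_contrast + lam * l_clf
```

One property worth noting: the push term is zero whenever the cosine similarity is at or below $s^-$, so a low disagree score tolerates little similarity between the title and the disagree rationale, while a higher one tolerates more.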

4. Hyperparameters, Implementation, and Learning Dynamics

The system is implemented using bert-base-uncased encoders, the AdamW optimizer (learning rate $3 \times 10^{-5}$, L2 regularization $1 \times 10^{-5}$), dropout of $0.3$ before classification, batch size $8$, $50$ epochs, $8$ cross-attention heads, and hidden dimension $d=768$ per vector. Margin-based contrastive loss terms are enforced via the $\max(0, \ldots)$ construction.

The SORG prompting system uses $\alpha=30$ for thresholding, $\beta=10$ for the rationale margin, $\gamma=5$ for polarity separation, and $M=2$ to $3$ self-renewal iterations per rationale, ensuring consistent cue separation and robust rationale quality.

5. Empirical Results and Comparative Evaluation

ORCD was benchmarked on three public datasets: DL-Clickbait (DLC), CD-Clickbait (CDC), and NC-Clickbait (NC). Metrics include accuracy, macro-F1, and clickbait-F1. On DLC, ORCD achieved $94.45\%$ accuracy ($+1.8\%$ over the previous best), macro-F1 of $93.76\%$, and clickbait-F1 of $95.83\%$. On CDC, it reached accuracy $87.51\%$, macro-F1 $87.45\%$, and clickbait-F1 $88.32\%$; on NC, accuracy $73.84\%$, macro-F1 $73.75\%$, and clickbait-F1 $75.32\%$.
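For readers less familiar with the three metrics, the sketch below shows how they are commonly computed for a binary task. It assumes that clickbait-F1 means the F1 score of the clickbait (positive) class and that macro-F1 is the unweighted mean of the two per-class F1 scores; the function names and the toy labels are illustrative only and are not taken from the benchmark data.

```python
from typing import Dict, Sequence


def f1_from_counts(tp: int, fp: int, fn: int) -> float:
    """F1 = 2*TP / (2*TP + FP + FN); defined as 0 when the denominator is 0."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def evaluate(y_true: Sequence[int], y_pred: Sequence[int],
             positive: int = 1) -> Dict[str, float]:
    """Accuracy, macro-F1, and positive-class F1 for binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)

    f1_pos = f1_from_counts(tp, fp, fn)
    # For the negative class the roles swap: its TP is tn, its FP is fn, its FN is fp.
    f1_neg = f1_from_counts(tn, fn, fp)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "macro_f1": (f1_pos + f1_neg) / 2,
        "clickbait_f1": f1_pos,
    }


# Toy example: 1 = clickbait, 0 = non-clickbait.
scores = evaluate([1, 1, 1, 1, 0, 0], [1, 1, 1, 0, 0, 1])
print(scores["macro_f1"], scores["clickbait_f1"])  # 0.625 0.75
```

In the toy example the clickbait class is the majority, so clickbait-F1 sits above macro-F1, the same ordering seen in the reported numbers.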

ORCD consistently outperformed zero/few-shot GPT-4o, fine-tuned variants of BERT, RoBERTa, BART, and specialized models such as MCDM, CVM, MUSER, SheepDog, and NRFE-D (Zhang et al., 17 Jan 2026).

6. Ablation Studies and Sycophancy Analysis

Several ablations were performed to isolate component contributions:

| Component disabled | Accuracy Δ on DLC |
|---|---|
| Title-free learner | -3.1% |
| Title-aware learner | -2.7% |
| Soft-labeling ($V_A$, $V_D$) | -1.5% |
| Both title-free learner and soft-labeling | -5.2% |

Increasing the number of SORG self-renewal rounds (sycophancy depth) monotonically improved model F1: $93.7\%$ (1 round), $94.6\%$ (2 rounds), $95.3\%$ (3 rounds). This suggests that iterative, contrastively supervised stance sampling cultivates finer reasoning representations and enhances discriminative capacity.

Case studies reveal that rationales generated via SORG for mutually opposing stances often highlight subtle stylistic and semantic nuances—such as emotional priming, logical incompleteness, or context gaps—that direct, single-pass LLM judgment fails to disentangle.

7. Representative Example and Nuanced Model Output

For a highly clickbaity headline, such as “Her Baby Is Asleep. Now Watch What The Puppy Does… Oh My God! (VIDEO),” SORG generated:

  • Agree rationale: “[This scenario is plausible (pets often surprise with small children), logical, complete, and minimally manipulative.]” ($V_A=88$)
  • Disagree rationale: “[The title exaggerates: ‘Oh My God!’ is emotional bait, logic leap with no detail, incomplete context.]” ($V_D=22$)
  • Direct LLM judgment (w/o SORG): “Highly clickbaity due to vague suspense and emotional cues.”

These paired rationales allow ORCD to attend not only to the superficial framing but also to deeper, context-sensitive cues often missed by direct supervision. A plausible implication is that ORCD learns to encode and contrast both manipulative and legitimate explanatory patterns, improving resilience to adversarial headline construction.
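As a worked illustration, the two example scores can be plugged into the normalization and the title-free loss from Section 3. The Stage I rating $V_1$ for this headline is not reported, so only the polarity checks (with $\gamma=5$) can be verified here, not the margin checks.

$$s^{+} = 88/100 = 0.88, \qquad s^{-} = 22/100 = 0.22$$

$$V_A = 88 \geq 50 + 5 = 55, \qquad V_D = 22 \leq 50 - 5 = 45$$

$$L_{tf} = 0.88 \cdot (1 - \cos(f_x, f_y)) + \max(0, \cos(f_x, f_z) - 0.22)$$

Under these values, the disagree term contributes to the loss only when the cosine similarity between the title and the disagree rationale exceeds $0.22$.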


For further methodology and empirical details, see "Acting Flatterers via LLMs Sycophancy: Combating Clickbait with LLMs Opposing-Stance Reasoning" (Zhang et al., 17 Jan 2026).
