
Tailoring Vaccine Messaging with Common-Ground Opinions (2405.10861v2)

Published 17 May 2024 in cs.CL, cs.AI, and cs.CY

Abstract: One way to personalize chatbot interactions is by establishing common ground with the intended reader. A domain where establishing mutual understanding could be particularly impactful is vaccine concerns and misinformation. Vaccine interventions are forms of messaging which aim to answer concerns expressed about vaccination. Tailoring responses in this domain is difficult, since opinions often have seemingly little ideological overlap. We define the task of tailoring vaccine interventions to a Common-Ground Opinion (CGO). Tailoring responses to a CGO involves meaningfully improving the answer by relating it to an opinion or belief the reader holds. In this paper we introduce TAILOR-CGO, a dataset for evaluating how well responses are tailored to provided CGOs. We benchmark several major LLMs on this task, finding GPT-4-Turbo performs significantly better than others. We also build automatic evaluation metrics, including an efficient and accurate BERT model that outperforms finetuned LLMs, investigate how to successfully tailor vaccine messaging to CGOs, and provide actionable recommendations from this investigation. Code and model weights: https://github.com/rickardstureborg/tailor-cgo Dataset: https://huggingface.co/datasets/DukeNLP/tailor-cgo

Summary

  • The paper introduces a novel evaluation framework and the TAILOR-CGO dataset of 22,400 responses for assessing personalized vaccine messaging.
  • It benchmarks various LLMs and finds that GPT-4-Turbo outperforms others in integrating opinions authentically into vaccine responses.
  • It demonstrates that using conducive opinion topics enhances message quality, providing actionable insights for improving public health communication.

Tailoring Vaccine Messaging to Common-Ground Opinions: An Evaluation Framework and Dataset for LLMs

Introduction

The paper addresses a pertinent problem in vaccine communication: personalizing responses by tailoring them to common-ground opinions (CGOs), i.e., opinions or beliefs the intended reader already holds. Through a novel evaluation framework, the research investigates how well several LLMs can generate vaccine responses that build on these shared beliefs.

Task Definition and Dataset Creation

The paper introduces a task for generating vaccine-related responses tailored to specific CGOs. A successful response is defined by its ability to (a schematic rubric for these criteria is sketched after the list):

  1. Address the vaccine concern comprehensively.
  2. Integrate the provided opinion.
  3. Accept the opinion authentically.
  4. Link the opinion meaningfully to the concern.
  5. Strengthen the response by incorporating the opinion.
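
As a concrete illustration, the sketch below encodes these five criteria as a yes/no rubric over a (concern, CGO, response) triple. This is a minimal sketch, not the authors' evaluation code: the class and function names are hypothetical, and the exact rubric wording used in the paper may differ.

```python
from dataclasses import dataclass

# The five criteria above, phrased as yes/no rubric questions.
CRITERIA = [
    "Does the response address the vaccine concern comprehensively?",
    "Does the response integrate the provided opinion?",
    "Does the response accept the opinion authentically?",
    "Does the response link the opinion meaningfully to the concern?",
    "Is the response strengthened by incorporating the opinion?",
]

@dataclass
class TailoringInstance:
    concern: str   # the vaccine concern to be answered
    cgo: str       # the common-ground opinion the reader holds
    response: str  # the candidate tailored response

def build_rubric_prompt(inst: TailoringInstance) -> str:
    """Assemble a prompt asking an evaluator (human or LLM) to check
    each criterion for one (concern, CGO, response) triple."""
    questions = "\n".join(f"{i + 1}. {q}" for i, q in enumerate(CRITERIA))
    return (
        f"Vaccine concern: {inst.concern}\n"
        f"Common-ground opinion: {inst.cgo}\n"
        f"Response: {inst.response}\n\n"
        f"Answer yes or no for each criterion:\n{questions}"
    )
```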

To support this task, the authors created the TAILOR-CGO dataset, comprising 22,400 responses generated by six different LLMs. The dataset enables evaluation of how well models tailor messages to diverse opinion types.
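
The dataset is hosted on the Hugging Face Hub under the repo id given in the abstract. A minimal loading sketch follows; the available splits and column names are not stated here, so the code prints the schema rather than assuming field names.

```python
from datasets import load_dataset

# Repo id taken from the paper's abstract. If the repo defines multiple
# configurations, pass the config name as a second argument.
ds = load_dataset("DukeNLP/tailor-cgo")
print(ds)  # lists available splits and their column names
```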

Model Evaluation and Findings

The paper benchmarks several LLMs, including Llama-2, Vicuna, WizardLM, GPT-3.5, GPT-4, and GPT-4-Turbo, and finds that GPT-4-Turbo produces the best-tailored responses. Performance varies considerably across models, and tailoring quality generally improves with model capability.
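
To make the generation side concrete, here is a hedged sketch of prompting one of the benchmarked models (GPT-4-Turbo, via the OpenAI client) to produce a tailored response. The prompt wording is a paraphrase of the task definition, not the authors' actual template.

```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

def generate_tailored_response(concern: str, cgo: str) -> str:
    # Prompt wording is illustrative, not the paper's exact template.
    prompt = (
        f"A reader has this vaccine concern: {concern}\n"
        f"The reader also holds this opinion: {cgo}\n"
        "Write a response that addresses the concern and meaningfully "
        "relates it to the reader's opinion, without misrepresenting it."
    )
    out = client.chat.completions.create(
        model="gpt-4-turbo",
        messages=[{"role": "user", "content": prompt}],
    )
    return out.choices[0].message.content
```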

Annotation and Automatic Evaluation

Relative (pairwise) scoring was adopted over absolute scoring because it yielded higher inter-annotator agreement. The annotated data was then used to build automatic evaluators based on GPT-4-Turbo, BERT, and Llama-2. The fine-tuned BERT model performed best, showing that a small distilled model can provide accurate, cost-effective automatic evaluation.
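
Below is a minimal sketch of such a relative (pairwise) evaluator, assuming a standard BERT regression head trained with a margin ranking loss. The input template, base checkpoint, margin, and loss are assumptions for illustration, not the authors' exact training setup.

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

tok = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=1)  # scalar "tailoring" score

def score(concern: str, cgo: str, response: str) -> torch.Tensor:
    # Input template is illustrative, not the paper's exact format.
    text = f"Concern: {concern} Opinion: {cgo} Response: {response}"
    enc = tok(text, truncation=True, max_length=512, return_tensors="pt")
    return model(**enc).logits.squeeze(-1)

# One pairwise training step: the human-preferred response of a pair
# should receive the higher score (placeholders stand in for real data).
concern = "Are vaccine ingredients safe?"
cgo = "I trust advice from my family doctor."
better, worse = "<preferred response>", "<other response>"

loss = torch.nn.MarginRankingLoss(margin=0.1)(
    score(concern, cgo, better),
    score(concern, cgo, worse),
    torch.ones(1),
)
loss.backward()  # in practice, wrap this in an optimizer loop
```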

Analysis of Tailoring Strategies

The research explores which opinions are most effective for tailoring vaccine messages. Topics such as 'self-perception' yielded higher-quality responses, while polarized topics like 'religion' and 'race' were less effective. This nuanced analysis underscores the importance of strategic opinion selection for impactful public health messaging.
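
The comparison behind this finding amounts to aggregating tailoring scores by opinion topic, sketched below on toy data; the column names and values are assumptions, not the paper's actual records.

```python
import pandas as pd

# Toy scores; in practice these would come from the automatic evaluator.
df = pd.DataFrame({
    "cgo_topic": ["self-perception", "self-perception", "religion", "race"],
    "score": [0.82, 0.76, 0.55, 0.51],
})

# Mean tailoring score per opinion topic, best first.
print(df.groupby("cgo_topic")["score"].mean().sort_values(ascending=False))
```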

Practical Implications

The paper's findings have significant implications for public health professionals:

  1. Leveraging LLMs can enhance personalized communication in vaccine advocacy.
  2. Identifying and utilizing conducive opinion topics can improve message reception and engagement.
  3. Employing automatic evaluation metrics can streamline the assessment process, facilitating scalable deployment of tailored messaging.

Future Directions

Future research could explore:

  • Advanced methods for identifying audience-specific opinions.
  • More intricate tasks for LLMs to generate nuanced and persuasive vaccine communication.
  • Further optimization of automatic evaluators to enhance performance and reliability.

Conclusion

The paper provides a comprehensive framework for tailoring vaccine messaging using LLMs, highlighting the efficacy of current models and the importance of strategic opinion selection. It opens avenues for practical application in public health and lays the groundwork for future advancements in personalized AI-driven communication.

Overall, the research advances our understanding of the interplay between AI and human communication, offering valuable insights into how sophisticated models can be harnessed to address critical societal challenges.
