Ask the experts: sourcing high-quality datasets for nutritional counselling through Human-AI collaboration (2401.08420v1)

Published 16 Jan 2024 in cs.CL

Abstract: LLMs, with their flexible generation abilities, can be powerful data sources in domains with few or no available corpora. However, problems like hallucinations and biases limit such applications. In this case study, we pick nutrition counselling, a domain lacking any public resource, and show that high-quality datasets can be gathered by combining LLMs, crowd-workers and nutrition experts. We first crowd-source and cluster a novel dataset of diet-related issues, then work with experts to prompt ChatGPT into producing related supportive text. Finally, we let the experts evaluate the safety of the generated text. We release HAI-coaching, the first expert-annotated nutrition counselling dataset containing ~2.4K dietary struggles from crowd workers, and ~97K related supportive texts generated by ChatGPT. Extensive analysis shows that ChatGPT, while producing highly fluent and human-like text, also manifests harmful behaviours, especially in sensitive topics like mental health, making it unsuitable for unsupervised use.

Introduction

Large language models (LLMs) have made it possible to generate synthetic data in many domains, including those with few or no available corpora. These applications carry risks, however: hallucinations and biases are especially problematic in sensitive areas such as healthcare. This case study shows how human expertise and LLM capabilities can be combined to generate and evaluate high-quality datasets for nutrition counseling, a domain that lacks any public resource. It introduces HAI-Coaching, a dataset for nutrition counseling created through Human-AI (HAI) collaboration.

Methodology

The approach combines nutrition experts, crowd workers, and an LLM, specifically ChatGPT. First, crowd workers described their personal dietary struggles, which were then clustered into categories with expert help. ChatGPT was then prompted with these clustered struggles to produce supportive texts of four predefined types: reflective listening, comfort statements, reframing of perspective, and practical suggestions. Experts contributed to the iterative prompt engineering and evaluated the safety of the generated text.
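The steps above can be pictured as a small data pipeline. The following is only a minimal sketch under stated assumptions: the class names, the `build_candidates` helper, and the prompt templates are hypothetical and are not taken from the paper, whose expert-designed prompts are not reproduced here. The LLM call and the expert review are left as empty fields.

```python
from dataclasses import dataclass
from typing import List, Optional

# The four supportive-text types named in the summary.
CATEGORIES = ("reflection", "comfort", "reframing", "suggestion")

# Hypothetical templates, for illustration only (NOT the authors' prompts).
PROMPT_TEMPLATES = {
    "reflection": "Restate this dietary struggle to show understanding: {struggle}",
    "comfort": "Write a comforting statement for this dietary struggle: {struggle}",
    "reframing": "Offer a more positive perspective on this dietary struggle: {struggle}",
    "suggestion": "Give one practical suggestion for this dietary struggle: {struggle}",
}


@dataclass
class Struggle:
    text: str     # free text written by a crowd worker
    cluster: str  # label assigned during expert-assisted clustering


@dataclass
class Candidate:
    struggle: Struggle
    category: str
    prompt: str
    generated_text: Optional[str] = None  # to be filled by the LLM step (omitted)
    expert_safe: Optional[bool] = None    # to be filled by expert review


def build_candidates(struggles: List[Struggle]) -> List[Candidate]:
    """Create one prompt per (struggle, category) pair."""
    return [
        Candidate(s, c, PROMPT_TEMPLATES[c].format(struggle=s.text))
        for s in struggles
        for c in CATEGORIES
    ]


if __name__ == "__main__":
    # Made-up example input, not an item from the dataset.
    demo = [Struggle("I snack late at night when I am stressed.", "emotional eating")]
    for cand in build_candidates(demo):
        print(cand.category, "->", cand.prompt)
```

Keeping `generated_text` and `expert_safe` as separate optional fields mirrors the division of labour described above: generation and safety judgement are distinct steps, and a text is not considered usable until an expert label is present.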

Results

The released HAI-Coaching dataset comprises approximately 2,400 dietary struggles sourced from crowd workers and about 97,000 supportive texts generated by ChatGPT, each evaluated by nutrition experts. Extensive analysis indicates that, although ChatGPT produces highly fluent and human-like text, it also generates harmful content, including the reinforcement of dangerous stereotypes, especially on sensitive topics such as mental health. The results call for caution and suggest that LLMs are not yet suitable for unsupervised use in sensitive domains.
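To illustrate how expert safety labels could be summarised by topic, here is a minimal sketch. The record layout (pairs of cluster label and safety flag) and the function name `unsafe_rate_by_cluster` are assumptions made for this example, and the sample records are invented; they are not figures from the dataset.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def unsafe_rate_by_cluster(records: Iterable[Tuple[str, bool]]) -> Dict[str, float]:
    """Return the fraction of texts flagged unsafe for each cluster.

    Each record is a (cluster_label, is_safe) pair; this layout is an
    assumption for illustration, not the dataset's actual schema.
    """
    totals = defaultdict(int)
    unsafe = defaultdict(int)
    for cluster, is_safe in records:
        totals[cluster] += 1
        if not is_safe:
            unsafe[cluster] += 1
    return {cluster: unsafe[cluster] / totals[cluster] for cluster in totals}


if __name__ == "__main__":
    # Invented toy records, NOT results from HAI-Coaching.
    toy_records = [
        ("mental health", False),
        ("mental health", True),
        ("time management", True),
        ("time management", True),
    ]
    print(unsafe_rate_by_cluster(toy_records))
```

A per-cluster breakdown of this kind is one way to check whether unsafe generations concentrate in particular topics, which is the pattern the analysis reports for sensitive areas.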

Conclusion

The case study underscores both the strengths and the limitations of LLMs for generating datasets in the sensitive field of healthcare. LLMs such as ChatGPT can generate human-like text under expert guidance and evaluation, but responsibility for safety, particularly in domains like nutrition counseling, cannot yet be fully entrusted to AI. The release of HAI-Coaching offers a foundation for future research and applications, and emphasizes the need for human expertise to guide and evaluate AI-generated content.

Authors (5)

Simone Balloccu (8 papers)
Ehud Reiter (31 papers)
Vivek Kumar (62 papers)
Diego Reforgiato Recupero (18 papers)
Daniele Riboni (6 papers)

Tweets

https://twitter.com/simoneballoccu/status/1747649757567172649

https://twitter.com/ufal_cuni/status/1843676749630189604

https://twitter.com/EhudReiter/status/1749778126098362381