Comparing Traditional and LLM-based Search for Consumer Choice: A Randomized Experiment (2307.03744v2)

Published 7 Jul 2023 in cs.HC

Abstract: Recent advances in the development of LLMs are rapidly changing how online applications function. LLM-based search tools, for instance, offer a natural language interface that can accommodate complex queries and provide detailed, direct responses. At the same time, there have been concerns about the veracity of the information provided by LLM-based tools due to potential mistakes or fabrications that can arise in algorithmically generated text. In a set of online experiments we investigate how LLM-based search changes people's behavior relative to traditional search, and what can be done to mitigate overreliance on LLM-based output. Participants in our experiments were asked to solve a series of decision tasks that involved researching and comparing different products, and were randomly assigned to do so with either an LLM-based search tool or a traditional search engine. In our first experiment, we find that participants using the LLM-based tool were able to complete their tasks more quickly, using fewer but more complex queries than those who used traditional search. Moreover, these participants reported a more satisfying experience with the LLM-based search tool. When the information presented by the LLM was reliable, participants using the tool made decisions with a comparable level of accuracy to those using traditional search; however, we observed overreliance on incorrect information when the LLM erred. Our second experiment further investigated this issue by randomly assigning some users to see a simple color-coded highlighting scheme to alert them to potentially incorrect or misleading information in the LLM responses. Overall we find that this confidence-based highlighting substantially increases the rate at which users spot incorrect information, improving the accuracy of their overall decisions while leaving most other measures unaffected.

An Analytical Overview of "Comparing Traditional and LLM-based Search for Consumer Choice: A Randomized Experiment"

The paper "Comparing Traditional and LLM-based Search for Consumer Choice: A Randomized Experiment" offers an empirical examination of how LLM-based search tools affect consumer decision-making compared to traditional search engines. The research is grounded in recent advances in AI, specifically the ways LLMs are transforming search engine functionality, which in turn shapes consumer behavior online.

Experimental Design and Methodology

The paper takes a controlled experimental approach to evaluate differences in user interaction between traditional search engines and an LLM-based search tool. Participants faced a decision-making scenario in which they assumed the role of a consumer purchasing an SUV based on specific numeric attributes. A randomized design assigned each participant to use either a traditional search engine interface or an LLM-powered tool built on GPT-3.5, and the authors measured variables such as task completion time, query complexity, and decision accuracy.
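The randomized assignment at the heart of this design can be sketched as a balanced random draw. The condition labels and participant IDs below are illustrative assumptions, not details taken from the paper:

```python
import random

def assign_conditions(participant_ids,
                      conditions=("traditional_search", "llm_search"),
                      seed=0):
    """Randomly assign participants to conditions in (near-)equal proportions.

    Shuffling a balanced list of condition labels guarantees the groups
    differ in size by at most one, unlike independent coin flips.
    """
    rng = random.Random(seed)
    labels = [conditions[i % len(conditions)] for i in range(len(participant_ids))]
    rng.shuffle(labels)
    return dict(zip(participant_ids, labels))

assignment = assign_conditions([f"p{i}" for i in range(10)])
```

Balanced shuffling (rather than a per-participant coin flip) keeps the two arms of the experiment equally sized, which improves statistical power for the between-group comparisons the paper reports.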

Key Findings

  1. Efficiency and User Experience:
    • The experiments showed that users of the LLM-based tool completed their tasks substantially faster, about 50% faster than traditional search users. This efficiency gain came from issuing fewer but more complex queries, suggesting that LLMs support more direct and nuanced interactions.
    • Users also reported greater satisfaction with the LLM interface, as reflected in higher user experience ratings.
  2. Accuracy Concerns:
    • While routine tasks showed no significant accuracy difference relative to traditional search, challenging tasks exposed users' susceptibility to inaccuracies propagated by the LLM. This was particularly evident when the tool presented factually incorrect data, reflecting the known tendency of LLMs to produce plausible but incorrect information (often termed "hallucinations").
  3. Overreliance and Error Detection Mitigation:
    • The second experiment explored error-detection strategies by adding confidence-based, color-coded highlights that alert users to potentially unreliable information. This intervention improved decision accuracy by prompting users to verify uncertain outputs: the cues substantially raised the rate at which users spotted incorrect information without negatively affecting the perceived reliability of the LLM results.
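The highlighting intervention described above can be sketched as a function that color-codes spans of an LLM response by confidence. The thresholds, colors, and span structure here are assumptions for illustration, not the paper's actual implementation:

```python
def highlight_response(spans, low=0.5, high=0.8):
    """Wrap LLM answer spans in HTML marks color-coded by model confidence.

    spans: list of (text, confidence) pairs, with confidence in [0, 1].
    Low-confidence spans are flagged red, mid-confidence yellow,
    and high-confidence text is left unmarked.
    """
    out = []
    for text, conf in spans:
        if conf < low:
            out.append(f'<mark style="background:#f8b4b4">{text}</mark>')
        elif conf < high:
            out.append(f'<mark style="background:#fde68a">{text}</mark>')
        else:
            out.append(text)
    return "".join(out)

html = highlight_response([
    ("The SUV has a towing capacity of ", 0.95),
    ("5,000 lbs", 0.42),
    (".", 0.95),
])
```

The design choice is that only uncertain spans draw attention; leaving high-confidence text unmarked avoids alarm fatigue, which is plausibly why the paper's cues improved error detection without hurting overall satisfaction.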

Theoretical and Practical Implications

The findings have substantial implications, both theoretically and practically. Theoretically, the research contributes to the evolving discourse on human-AI interaction, particularly concerning the adeptness of LLMs in nuanced information retrieval. Practically, the paper suggests avenues for enhancing search technologies by integrating mechanisms that convey response uncertainty effectively, thus addressing the critical issue of overdependence on AI-provided information.

Future Research Directions

Although the paper clearly demonstrates a step forward for LLM-based search, it leaves room for several future research directions. First, exploring alternative methods of signaling confidence could further refine the interaction design. Second, extending the scope beyond the simplified decision tasks employed here could yield insights into consumer behavior across more complex decision processes and varied product categories.

In conclusion, the paper by Spatharioti et al. provides a comprehensive inquiry into the potential and pitfalls of LLM-driven search tools. This work underscores the promise of LLMs in enhancing decision-making processes while also highlighting the necessity for innovation in communicating uncertainties to mitigate erroneous trust in AI outputs. Such research is vital for steering the future of AI in search technologies and their applications in consumer environments.

Authors (4)
  1. Sofia Eleni Spatharioti
  2. David M. Rothschild
  3. Daniel G. Goldstein
  4. Jake M. Hofman
Citations (34)