Random Silicon Sampling: Simulating Human Sub-Population Opinion Using a Large Language Model Based on Group-Level Demographic Information (2402.18144v1)

Published 28 Feb 2024 in cs.AI and cs.CY

Abstract: LLMs exhibit societal biases associated with demographic information, including race, gender, and others. Endowing such LLMs with personalities based on demographic data can enable them to generate opinions that align with those of humans. Building on this idea, we propose "random silicon sampling," a method to emulate the opinions of a human sub-population. Our study analyzed 1) whether an LLM can generate survey responses that correspond to a human group based solely on its demographic distribution and 2) the applicability of our methodology across various demographic subgroups and thematic questions. Through random silicon sampling, using only group-level demographic information, we found that LLMs can generate response distributions remarkably similar to actual U.S. public opinion polls. Moreover, we found that the replicability of LLMs varies depending on the demographic group and the topic of the question, which can be attributed to inherent societal biases in the models. Our findings demonstrate the feasibility of mirroring a group's opinion using only its demographic distribution and elucidate the effect of social biases in LLMs on such simulations.
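
The abstract's recipe (sample synthetic respondents from group-level demographic marginals, prompt an LLM as each respondent, aggregate the answers) can be illustrated with a short sketch. The code below is not the authors' implementation: the demographic marginals are made up, attributes are assumed independent for simplicity, and `ask_llm` stands in for whatever chat-completion client is available.

```python
import random
from collections import Counter

# Hypothetical group-level marginals (illustrative only; the paper conditions
# on actual U.S. survey demographics, which are not reproduced here).
DEMOGRAPHIC_DISTRIBUTION = {
    "gender": {"male": 0.48, "female": 0.52},
    "race": {"white": 0.70, "black": 0.12, "hispanic": 0.12, "asian": 0.06},
    "age_group": {"18-29": 0.20, "30-44": 0.25, "45-64": 0.33, "65+": 0.22},
}

def sample_persona(distribution):
    """Draw one synthetic respondent by sampling each attribute independently
    from its group-level marginal distribution."""
    return {
        attr: random.choices(list(probs), weights=list(probs.values()))[0]
        for attr, probs in distribution.items()
    }

def build_prompt(persona, question, options):
    """Turn a sampled persona into a first-person survey prompt."""
    profile = ", ".join(f"{k}: {v}" for k, v in persona.items())
    return (
        f"You are a survey respondent with the following profile: {profile}.\n"
        f"Question: {question}\n"
        f"Answer with exactly one of: {', '.join(options)}."
    )

def silicon_sample(ask_llm, question, options, n=1000):
    """Aggregate n simulated responses into a response distribution.
    `ask_llm` is a placeholder for any prompt -> answer-string function."""
    counts = Counter()
    for _ in range(n):
        persona = sample_persona(DEMOGRAPHIC_DISTRIBUTION)
        answer = ask_llm(build_prompt(persona, question, options)).strip()
        if answer in options:
            counts[answer] += 1
    total = sum(counts.values()) or 1
    return {opt: counts[opt] / total for opt in options}

# Example usage, with any LLM client wrapped as ask_llm:
# dist = silicon_sample(ask_llm, "Do you approve of the policy?",
#                       ["Approve", "Disapprove"], n=200)
```

The resulting distribution can then be compared against the corresponding human poll results, which is the evaluation the paper reports.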

Authors (7)
  1. Seungjong Sun (2 papers)
  2. Eungu Lee (2 papers)
  3. Dongyan Nan (2 papers)
  4. Xiangying Zhao (1 paper)
  5. Wonbyung Lee (2 papers)
  6. Bernard J. Jansen (8 papers)
  7. Jang Hyun Kim (2 papers)
Citations (13)
