
Directed Diversity: Leveraging Language Embedding Distances for Collective Creativity in Crowd Ideation (2101.06030v1)

Published 15 Jan 2021 in cs.HC

Abstract: Crowdsourcing can collect many diverse ideas by prompting ideators individually, but this can generate redundant ideas. Prior methods reduce redundancy by presenting peers' ideas or peer-proposed prompts, but these require much human coordination. We introduce Directed Diversity, an automatic prompt selection approach that leverages language model embedding distances to maximize diversity. Ideators can be directed towards diverse prompts and away from prior ideas, thus improving their collective creativity. Since there are diverse metrics of diversity, we present a Diversity Prompting Evaluation Framework consolidating metrics from several research disciplines to analyze along the ideation chain - prompt selection, prompt creativity, prompt-ideation mediation, and ideation creativity. Using this framework, we evaluated Directed Diversity in a simulation study and four user studies for the use case of crowdsourcing motivational messages to encourage physical activity. We show that automated diverse prompting can variously improve collective creativity across many nuanced metrics of diversity.

Below is a detailed summary of the paper "Directed Diversity: Leveraging Language Embedding Distances for Collective Creativity in Crowd Ideation," broken down into key sections and focusing on its core concepts, methods, and findings.

Core Problem and Motivation

The paper addresses a fundamental challenge in crowdsourcing creative tasks (ideation): redundancy. While crowdsourcing can generate a large quantity of ideas, many of those ideas are often very similar, limiting the overall diversity and therefore the value of the collective output. Traditional methods to improve diversity (e.g., showing participants other people's ideas) can be helpful, but they also risk cognitive fixation (people getting stuck on similar ideas) or require significant manual effort to manage and organize. The authors aim to develop a more scalable and automated approach.

Proposed Solution: Directed Diversity

The core contribution is a technique called "Directed Diversity," which automatically selects prompts to guide crowdworkers (ideators) toward generating more diverse ideas. It's based on these key ideas:

  1. Phrase Extraction: Instead of relying on free-form ideation or manually created prompts, the system starts by extracting relevant phrases from a corpus of text related to the ideation domain (in this case, motivational messages for physical activity). They used news articles and online forum posts, tokenized them, extracted noun phrases, verb phrases, and prepositional phrases, and then filtered them for length and quality.
  2. Phrase Embedding (using Universal Sentence Encoder - USE): This is the crucial step. Each extracted phrase is converted into a high-dimensional vector representation using a pre-trained sentence-embedding language model, the Universal Sentence Encoder (USE). USE places semantically similar phrases close together in a vector space and dissimilar phrases farther apart, so the distance between these vectors serves as a measure of semantic difference. Crucially, they use angular distance because USE embeddings are unit vectors.
  3. Phrase Selection (Maximizing Diversity): This is where the "directed" part comes in. The system doesn't just randomly pick phrases. It uses algorithms to select a subset of phrases that are maximally diverse based on their distances in the embedding space. They specifically use the "Remote-MST" diversity formulation (related to Minimum Spanning Trees), which aims to find a set of points where the sum of the edge weights in a minimum spanning tree connecting them is maximized. They use a greedy algorithm based on hierarchical clustering to approximate this NP-hard problem.
  4. Directing Towards and Away: The system can do two things:
    • Directing Towards: Select phrases that are far apart from each other, encouraging exploration of diverse areas of the idea space.
    • Directing Away: Select phrases that are far apart from existing ideas (submitted previously), actively reducing redundancy. This involves a thresholding step to exclude phrases too close to prior submissions.
  5. Prompting with Groups of Phrases: The system can also group multiple related phrases together into a single prompt. This is done by finding nearest neighbors in the embedding space for a selected "seed" phrase. The embedding of a multi-phrase prompt is the average of the individual phrase embeddings.
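Steps 2 through 5 above can be sketched in a few lines of Python. This is an illustrative stand-in, not the authors' implementation: it uses a simple greedy farthest-point (maximin) heuristic in place of the paper's hierarchical-clustering approximation of Remote-MST, random unit vectors stand in for real USE embeddings, and the function and parameter names (`select_diverse`, `min_dist`, etc.) are invented for this sketch.

```python
import numpy as np

def angular_distance(u, v):
    """Angular distance between unit vectors, normalized to [0, 1]."""
    return np.arccos(np.clip(np.dot(u, v), -1.0, 1.0)) / np.pi

def select_diverse(embeddings, k, prior=None, min_dist=0.0):
    """Greedily pick k mutually distant phrases (farthest-point heuristic).

    `prior` holds embeddings of previously submitted ideas; candidates
    closer than `min_dist` to any prior idea are excluded ("directing
    away"). Phrases mutually far apart realize "directing towards".
    """
    candidates = list(range(len(embeddings)))
    if prior is not None and len(prior):
        candidates = [i for i in candidates
                      if min(angular_distance(embeddings[i], p)
                             for p in prior) >= min_dist]
    selected = [candidates[0]]
    while len(selected) < k and len(selected) < len(candidates):
        # Pick the candidate farthest from everything chosen so far.
        best = max((i for i in candidates if i not in selected),
                   key=lambda i: min(angular_distance(embeddings[i],
                                                      embeddings[j])
                                     for j in selected))
        selected.append(best)
    return selected

def prompt_embedding(embeddings, indices):
    """Multi-phrase prompt embedding: mean of phrase embeddings, renormalized."""
    m = np.mean(embeddings[list(indices)], axis=0)
    return m / np.linalg.norm(m)
```

In the paper, the seed phrases for multi-phrase prompts are grouped with their nearest neighbors in embedding space; the averaging in `prompt_embedding` mirrors how the paper represents such grouped prompts as a single vector.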

Diversity Prompting Evaluation Framework

The authors don't just propose a technique; they also develop a comprehensive framework for evaluating it (and similar techniques). This is a significant contribution in itself. The framework considers the entire "ideation chain":

  1. Prompt Selection: The algorithm used to choose prompts.
  2. Prompt Creativity: How diverse, understandable, and relevant the selected prompts are. This is assessed both computationally (using embedding distances) and subjectively (through user ratings).
  3. Prompt-Ideation Mediation: How the prompts affect the ideation process itself. This includes measures of effort (time taken), ease of ideation, and how much the ideators actually use the prompt content (prompt adoption).
  4. Ideation Creativity: How diverse and creative the resulting ideas are. This is also assessed both computationally and subjectively, including through thematic analysis.

Key Constructs and Measures

The framework uses a wide range of measures, organized into constructs derived from factor analysis:

  • Embedding-based Diversity Metrics:
    • Individual: Mean Pairwise Distance, Minimum Pairwise Distance, Sparseness.
    • Collective: Remote-Clique, Chamfer Distance, MST Dispersion, Span, Sparseness, Entropy.
  • Thematic Category Metrics: Flexibility (number of unique categories/themes) and Originality (how rare each category/theme is). These are derived from manual coding of the ideas.
  • Perceived Creativity: Ideators rate prompts for unexpectedness, understandability, relevance, and quality. Validators rate ideations for quality, informativeness, helpfulness, and (for collections of ideations) unrepetitiveness.
  • Ideation Effort: Fluency (inverse of ideation time) and Ease (self-reported).
  • Prompt Adoption: Prompt Recall, Prompt Precision, and Prompt-Ideation Distance.
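As a rough illustration, the collective embedding-based metrics listed above can all be computed from a matrix of pairwise angular distances. The formulas below follow the common usage of these metric names (remote-clique, Chamfer distance, MST dispersion, span) and may differ in detail from the paper's exact definitions; the function names are ours, not the paper's.

```python
import numpy as np

def pairwise_dists(E):
    """Pairwise angular distances between rows of unit-vector matrix E."""
    cos = np.clip(E @ E.T, -1.0, 1.0)
    return np.arccos(cos) / np.pi

def remote_clique(E):
    """Mean pairwise distance over the set (larger = more spread out)."""
    D = pairwise_dists(E)
    return D[np.triu_indices(len(E), k=1)].mean()

def chamfer(E):
    """Mean distance from each point to its nearest neighbor in the set."""
    D = pairwise_dists(E)
    np.fill_diagonal(D, np.inf)
    return D.min(axis=1).mean()

def mst_dispersion(E):
    """Total edge weight of a minimum spanning tree (Prim's algorithm)."""
    D = pairwise_dists(E)
    n = len(E)
    in_tree, total = {0}, 0.0
    while len(in_tree) < n:
        w, j = min((D[i, j], j) for i in in_tree
                   for j in range(n) if j not in in_tree)
        total += w
        in_tree.add(j)
    return total

def span(E):
    """Maximum pairwise distance (diameter of the idea set)."""
    return pairwise_dists(E).max()
```

Higher values of each metric indicate a more diverse collection; comparing several of them side by side is exactly the kind of nuanced analysis the framework calls for, since the metrics can disagree.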

Experiments and Results

The authors conducted a series of studies:

  1. Characterization Simulation Study: This was a computational study (no human participants) to test the prompt selection algorithm itself, varying parameters like prompt count and prompt size. It confirmed that Directed Diversity could select more diverse prompts than random selection, especially for smaller numbers of prompts.
  2. Ideation User Study: This involved human participants generating motivational messages under different prompting conditions (None, Random, Directed, with 1 or 3 phrases per prompt). Key findings:
    • Manipulation Check: Directed prompts were perceived as more unexpected, but also less understandable and of slightly lower quality than random prompts.
    • Mediation: Directed prompts were harder to use (lower ease, less adoption), but prompt diversity did positively influence ideation diversity.
    • Ideation Diversity: Directed prompts led to higher ideation diversity (both computationally and thematically) compared to random or no prompts, without compromising ideation quality.
  3. Validation User Studies (Individual and Collective): These studies used independent crowdworkers to rate the quality and diversity of the ideas generated in the Ideation Study. They used different rating methods (individual ratings, ranking groups of ideas, and pairwise comparisons). Key findings:
    • Ideations from Directed prompts were rated as more different and less repetitive than those from Random or None.
    • Prompted ideations (both Directed and Random) were rated as more informative and helpful than unprompted ideations.
    • There was no significant difference in overall ideation quality between conditions.

Key Conclusions and Discussion

  • Directed Diversity Works: The technique successfully improves the diversity of crowdsourced ideas, even though it makes the ideation task slightly harder for individuals.
  • Importance of Comprehensive Evaluation: The framework highlights the need for nuanced measures of diversity and a mechanistic understanding of how prompting affects ideation. Different diversity metrics can yield different results.
  • Generalizability: The approach is generalizable to other domains and ideation tasks beyond text, as long as concepts can be represented as vectors. Different embedding models and diversity-maximization algorithms could be used.
  • Future Work: Improving prompt understandability and relevance, exploring domain-specific models, and validating the framework with other ideation support methods.

In Simple Terms

The paper presents a way to make crowdsourcing more creative by automatically suggesting diverse starting points (prompts) for ideas. It uses a language embedding model to capture the meaning of phrases and to pick ones that differ both from each other and from ideas that have already been submitted. This approach works, even though it makes the task somewhat harder for the people generating the ideas. The evaluation framework is also a major, generalizable contribution: it allows different ideation-support techniques to be rigorously evaluated in a comparable fashion.

Authors (5)
  1. Samuel Rhys Cox (10 papers)
  2. Yunlong Wang (91 papers)
  3. Ashraf Abdul (3 papers)
  4. Christian von der Weth (8 papers)
  5. Brian Y. Lim (14 papers)
Citations (26)