
Vernacular? I Barely Know Her: Challenges with Style Control and Stereotyping (2406.12679v1)

Published 18 Jun 2024 in cs.CL

Abstract: LLMs are increasingly being used in educational and learning applications. Research has demonstrated that controlling for style, to fit the needs of the learner, fosters increased understanding, promotes inclusion, and helps with knowledge distillation. To understand the capabilities and limitations of contemporary LLMs in style control, we evaluated five state-of-the-art models: GPT-3.5, GPT-4, GPT-4o, Llama-3, and Mistral-instruct-7B across two style control tasks. We observed significant inconsistencies in the first task, with model performances averaging between 5th and 8th grade reading levels for tasks intended for first-graders, and standard deviations up to 27.6. For our second task, we observed a statistically significant improvement in performance from 0.02 to 0.26. However, we find that even without stereotypes in reference texts, LLMs often generated culturally insensitive content during their tasks. We provide a thorough analysis and discussion of the results.

Authors (6)
  1. Ankit Aich
  2. Tingting Liu
  3. Salvatore Giorgi
  4. Kelsey Isman
  5. Lyle Ungar
  6. Brenda Curtis