Neuron-based Personality Trait Induction in Large Language Models (2410.12327v1)

Published 16 Oct 2024 in cs.CL

Abstract: LLMs have become increasingly proficient at simulating various personality traits, an important capability for supporting related applications (e.g., role-playing). To further improve this capacity, we present a neuron-based approach for personality trait induction in LLMs, with three major technical contributions. First, we construct PersonalityBench, a large-scale dataset for identifying and evaluating personality traits in LLMs. This dataset is grounded in the Big Five personality traits from psychology and is designed to assess how well LLMs can generate text that exhibits a specific personality trait. Second, by leveraging PersonalityBench, we propose an efficient method for identifying personality-related neurons within LLMs by examining the opposite aspects of a given trait. Third, we develop a simple yet effective induction method that manipulates the values of these identified personality-related neurons. This method enables fine-grained control over the traits exhibited by LLMs without training or modifying model parameters. Extensive experiments validate the efficacy of our neuron identification and trait induction methods. Notably, our approach achieves performance comparable to that of fine-tuned models, offering a more efficient and flexible solution for personality trait induction in LLMs. We provide access to all the mentioned resources at https://github.com/RUCAIBox/NPTI.
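
The abstract does not give the exact identification or induction procedure, so the following is only a minimal sketch of the general idea under stated assumptions, not the authors' implementation. It assumes that neuron activations have already been collected for prompts eliciting the two opposite aspects of one trait, that a neuron counts as "active" when its activation is positive, and that induction adds a fixed offset to the selected neurons. The function names `select_trait_neurons` and `induce_trait` and the parameters `delta` and `gamma` are hypothetical.

```python
import numpy as np


def select_trait_neurons(acts_pos, acts_neg, delta=0.3):
    """Pick neurons whose firing rate differs between opposite trait aspects.

    acts_pos, acts_neg: arrays of shape (num_samples, num_neurons) holding
    activations recorded for each aspect (assumed input format).
    delta: hypothetical threshold on the difference in firing rates.
    """
    # Fraction of samples in which each neuron is active (activation > 0).
    rate_pos = (acts_pos > 0).mean(axis=0)
    rate_neg = (acts_neg > 0).mean(axis=0)
    diff = rate_pos - rate_neg
    pos_idx = np.flatnonzero(diff >= delta)   # favour the target aspect
    neg_idx = np.flatnonzero(diff <= -delta)  # favour the opposite aspect
    return pos_idx, neg_idx


def induce_trait(hidden, pos_idx, neg_idx, gamma=1.0):
    """Return a copy of `hidden` shifted toward the target aspect.

    Adds `gamma` to neurons favouring the target aspect and clamps neurons
    favouring the opposite aspect to be non-positive. Model weights are
    never touched; only activation values are edited.
    """
    out = np.array(hidden, dtype=float, copy=True)
    out[..., pos_idx] += gamma
    out[..., neg_idx] = np.minimum(out[..., neg_idx], 0.0)
    return out
```

In a real model these edits would be applied to intermediate activations during the forward pass (for example through a framework hook), which is what lets the trait be adjusted at inference time without training; the size of `gamma` would then act as the control knob for how strongly the trait is expressed.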