- The paper establishes a benchmark that quantifies AI support for human agency across six key dimensions.
- It employs simulated queries and generative models to assess metrics such as clarifying questions and avoidance of value manipulation.
- Results reveal varied performance among AI systems, urging developers to enhance ethical and user-centric decision-making.
HumanAgencyBench: Scalable Evaluation of Human Agency Support in AI Assistants
Introduction
The paper introduces "HumanAgencyBench" (HAB), a robust benchmark for evaluating AI assistants in supporting human agency across six defined dimensions. These dimensions address critical areas where AI interactions potentially impact user autonomy: asking clarifying questions, avoiding value manipulation, correcting misinformation, deferring important decisions, encouraging learning, and maintaining social boundaries. Lacking strong scores across these dimensions indicates concerns with disempowerment in AI systems, aligning with ongoing debates about AI's role in decision-making and autonomy.
Methodology
HumanAgencyBench employs a large number of simulated user queries, validated by generative models, to assess interactions with AI assistants. The structured tests focus on measuring responses according to each dimension of agency. AI models from various developers are put through this system, producing scores via a rigorous evaluation rubric. This methodological approach allows for scalable and reproducible assessment of how AI systems manage user data and decisions.
Evaluation Dimensions
- Ask Clarifying Questions
- AI systems are evaluated on their ability to probe missing or ambiguous information before providing advice. A lack of clarifying questions suggests a potential compromise in the user’s decision-making process.
- Avoid Value Manipulation
- This dimension examines whether AI systems steer users away from their explicitly stated values. Notably, models from Meta and xAI scored better on this dimension, reflecting a focus on respecting user values.
- Correct Misinformation
- This targets the ability of AI systems to identify and rectify false information shared by users, important for maintaining epistemic agency. Scores indicate a general tendency not to engage with misinformation actively.
- Defer Important Decisions
- AI’s role in making significant life decisions is crucial, with higher scores marking systems that advise consultation with humans on consequential matters. Anthropic models showed notable strength here.
- Encourage Learning
- This dimension assesses AI’s role as an educational tool, supporting users to arrive at answers independently. The xAI's Grok model scored the highest, emphasizing guided learning.
- Maintain Social Boundaries
- This evaluates AI systems' maintenance of appropriate boundaries in personal and professional interactions. Anthropic's models excelled, highlighting adherence to well-defined interaction limits.
Results
Empirical results from the paper highlight low to moderate agency support overall, with significant variance across developers and specific dimensions. Anthropic's Claude models generally demonstrated higher agency support scores across several dimensions, whereas Meta's models performed unexpectedly well in avoiding value manipulation. These insights underscore the complexity of designing AI systems that actively support user autonomy without overreliance or manipulation.
Implications and Future Directions
The paper’s contributions challenge AI developers to prioritize robust safety and alignment targets that uphold human agency. This aligns with broader efforts to ensure that AI systems responsibly augment user capabilities rather than diminish them. HAB provides a foundational framework for expanding the evaluation of AI systems across additional ethical and social dimensions, promoting more aligned and user-centric AI developments.
Conclusion
HumanAgencyBench sets a high standard for evaluating AI agency support, highlighting significant differences across AI systems and developer strategies. By operationalizing distinct dimensions of agency, this paper lays crucial groundwork for advancing AI that respects and enhances user autonomy, pushing the field toward more nuanced and ethically considerate capabilities. As AI continues to evolve, maintaining a rigorous focus on agency-supporting features becomes paramount to safeguarding human control and decision-making in the digital age.