Towards Socially and Morally Aware RL agent: Reward Design With LLM (2401.12459v2)
Abstract: When we design and deploy a Reinforcement Learning (RL) agent, the reward function motivates the agent to achieve an objective. An incorrect or incomplete specification of that objective can result in behavior that does not align with human values: the agent may fail to adhere to social and moral norms, which are ambiguous and context dependent, and produce undesired outcomes such as negative side effects and unsafe exploration. Previous work has manually defined reward functions to avoid negative side effects, relied on human oversight for safe exploration, or used foundation models as planning tools. This work studies how Large Language Models' (LLMs') understanding of morality and social norms can be leveraged to augment safe-exploration RL methods. It evaluates LLM outputs against human feedback and demonstrates the capability of LLMs to serve as direct reward signals.
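To make the idea of an LLM acting as a direct reward signal concrete, the sketch below (not the paper's implementation) queries an LLM about whether a described action violates social or moral norms and converts the answer into a scalar penalty that can be added to the environment reward. It assumes the OpenAI Python SDK (>= 1.0); the model name, prompt wording, and reward scale are illustrative assumptions, not choices made by the paper.

```python
# Minimal sketch: using an LLM's judgment of social/moral norms as an
# auxiliary reward signal during RL training.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def moral_reward(state_description: str, action_description: str) -> float:
    """Ask the LLM whether an action is socially/morally acceptable in context,
    and map its yes/no answer to a scalar penalty (0 if acceptable, -1 if not)."""
    prompt = (
        "You are judging an agent's behavior against everyday social and moral norms.\n"
        f"Situation: {state_description}\n"
        f"Proposed action: {action_description}\n"
        "Answer with a single word: YES if the action is acceptable, NO otherwise."
    )
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # illustrative model choice
        messages=[{"role": "user", "content": prompt}],
        temperature=0.0,
    )
    answer = response.choices[0].message.content.strip().upper()
    return 0.0 if answer.startswith("YES") else -1.0  # penalize norm violations

# During training, the LLM signal would be combined with the task reward, e.g.:
# shaped_reward = env_reward + norm_weight * moral_reward(state_text, action_text)
```

The combination weight (here a hypothetical `norm_weight`) trades off task progress against norm compliance; how that trade-off is set is a design choice outside this sketch.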
- Be considerate: Avoiding negative side effects in reinforcement learning. In Proceedings of the 21st International Conference on Autonomous Agents and Multiagent Systems, AAMAS ’22, pages 18–26, Richland, SC, 2022. International Foundation for Autonomous Agents and Multiagent Systems. ISBN 9781450392136.
- Concrete problems in AI safety. CoRR, abs/1606.06565, 2016. URL http://arxiv.org/abs/1606.06565.
- Social Norms. In E. N. Zalta and U. Nodelman, editors, The Stanford Encyclopedia of Philosophy. Metaphysics Research Lab, Stanford University, Winter 2023 edition, 2023.
- Cooperative inverse reinforcement learning, 2016.
- Mastering diverse domains through world models, 2023.
- Aligning AI with shared human values. In International Conference on Learning Representations, 2021a. URL https://openreview.net/forum?id=dNy_RKzJacY.
- What would Jiminy Cricket do? Towards agents that behave morally. CoRR, abs/2110.13136, 2021b. URL https://arxiv.org/abs/2110.13136.
- Reward design with language models. In The Eleventh International Conference on Learning Representations, 2023. URL https://openreview.net/forum?id=10uNUgI5Kl.
- Trial without error: Towards safe reinforcement learning via human intervention. In Proceedings of the 17th International Conference on Autonomous Agents and MultiAgent Systems, AAMAS ’18, pages 2067–2069, Richland, SC, 2018. International Foundation for Autonomous Agents and Multiagent Systems.
- Reinforcement learning: An introduction. MIT Press, 2018.
- Read and reap the rewards: Learning to play Atari with the help of instruction manuals. In Workshop on Reincarnating Reinforcement Learning at ICLR 2023, 2023. URL https://openreview.net/forum?id=I_GUngvVNz.