
DSAC: Distributional Soft Actor Critic for Risk-Sensitive Reinforcement Learning (2004.14547v2)

Published 30 Apr 2020 in cs.LG and cs.AI

Abstract: In this paper, we present a new reinforcement learning (RL) algorithm called Distributional Soft Actor Critic (DSAC), which exploits the distributional information of accumulated rewards to achieve better performance. Seamlessly integrating SAC (which uses entropy to encourage exploration) with a principled distributional view of the underlying objective, DSAC takes into consideration the randomness in both action and rewards, and beats the state-of-the-art baselines in several continuous control benchmarks. Moreover, with the distributional information of rewards, we propose a unified framework for risk-sensitive learning, one that goes beyond maximizing only expected accumulated rewards. Under this framework we discuss three specific risk-related metrics: percentile, mean-variance and distorted expectation. Our extensive experiments demonstrate that with distribution modeling in RL, the agent performs better for both risk-averse and risk-seeking control tasks.
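The abstract names three risk-related metrics built from the return distribution: percentile, mean-variance, and distorted expectation. As a hedged illustration only (not the paper's implementation; function names and the CVaR distortion choice are assumptions), here is how each metric can be computed from a finite sample of accumulated returns:

```python
# Illustrative sketch of the three risk metrics discussed in the DSAC
# abstract, computed over a sampled return distribution. Names and the
# specific distortion function are illustrative, not from the paper.

def percentile(returns, q):
    """q-th percentile of the return samples (a value-at-risk style metric)."""
    xs = sorted(returns)
    idx = min(int(q * len(xs)), len(xs) - 1)
    return xs[idx]

def mean_variance(returns, beta):
    """Mean minus beta times variance: penalizes return variability."""
    n = len(returns)
    mean = sum(returns) / n
    var = sum((r - mean) ** 2 for r in returns) / n
    return mean - beta * var

def distorted_expectation(returns, distortion):
    """Expectation under a distorted CDF: reweights sorted samples by
    the increments of a distortion function g over quantile levels."""
    xs = sorted(returns)
    n = len(xs)
    total = 0.0
    for i, x in enumerate(xs):
        # weight for the i-th order statistic: g((i+1)/n) - g(i/n)
        total += (distortion((i + 1) / n) - distortion(i / n)) * x
    return total

# Example distortion: CVaR at level alpha uses g(tau) = min(tau / alpha, 1),
# which averages the worst alpha-fraction of returns (risk-averse).
alpha = 0.25
cvar = lambda tau: min(tau / alpha, 1.0)

samples = [-2.0, -1.0, 0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
print(percentile(samples, 0.25))             # lower-quartile return
print(mean_variance(samples, beta=0.1))      # variance-penalized mean
print(distorted_expectation(samples, cvar))  # mean of the worst 25%
```

A risk-seeking variant would instead use a distortion that overweights the upper quantiles, e.g. a convex g; the same `distorted_expectation` routine applies unchanged.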

Authors (5)
  1. Xiaoteng Ma (24 papers)
  2. Li Xia (25 papers)
  3. Zhengyuan Zhou (60 papers)
  4. Jun Yang (357 papers)
  5. Qianchuan Zhao (28 papers)
Citations (16)