
Evaluating Agents without Rewards (2012.11538v2)

Published 21 Dec 2020 in cs.LG, cs.AI, and cs.RO

Abstract: Reinforcement learning has enabled agents to solve challenging tasks in unknown environments. However, manually crafting reward functions can be time consuming, expensive, and prone to human error. Competing objectives have been proposed for agents to learn without external supervision, but it has been unclear how well they reflect task rewards or human behavior. To accelerate the development of intrinsic objectives, we retrospectively compute potential objectives on pre-collected datasets of agent behavior, rather than optimizing them online, and compare them by analyzing their correlations. We study input entropy, information gain, and empowerment across seven agents, three Atari games, and the 3D game Minecraft. We find that all three intrinsic objectives correlate more strongly with a human behavior similarity metric than with task reward. Moreover, input entropy and information gain correlate more strongly with human similarity than task reward does, suggesting the use of intrinsic objectives for designing agents that behave similarly to human players.


Summary

  • The paper introduces intrinsic objectives—input entropy, information gain, and empowerment—to evaluate agent behavior without relying on traditional extrinsic rewards.
  • It employs a retrospective analysis on pre-collected datasets from environments like Atari and Minecraft to correlate intrinsic behaviors with human-like performance.
  • The study finds that input entropy and information gain strongly correlate with human behavior while empowerment reveals unique exploratory dynamics.

Evaluating Agents without Rewards

The paper "Evaluating Agents without Rewards" presents a paper on reinforcement learning (RL) agents with a focus on intrinsic motivation and the evaluation of agent behavior without relying solely on extrinsic rewards. The authors target a crucial challenge in RL: designing reward functions that are informative and cost-effective, both in terms of time and resource expenditure. They propose and evaluate intrinsic objectives as an alternative for understanding agent behavior in tasks, particularly when extrinsic rewards are sparse or unavailable.

Objectives and Methodology

The paper investigates three intrinsic objectives: input entropy, information gain, and empowerment, across several environments, including three Atari games (Breakout, Seaquest, Montezuma's Revenge) and Minecraft. These objectives are intended to capture human-like behavior better than conventional task rewards. The authors do not optimize these objectives online; rather, they compute them retrospectively on pre-collected datasets of agent behavior. This methodological choice allows intrinsic objectives to be compared through correlation analysis, expediting evaluation without training new agents for each candidate objective.

Intrinsic Objectives:

  • Input Entropy: Encourages agents to seek out rare sensory inputs, indicating exploration.
  • Information Gain: Rewards behaviors that help the agent better predict or understand its environment.
  • Empowerment: Measures how much influence an agent has over its environment.
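The paper computes these objectives retrospectively with learned models over pre-collected behavior data. Purely as a minimal sketch of the count-based flavor of the first objective, the snippet below estimates input entropy by discretizing observations and taking the entropy of the empirical visitation distribution. The function name `input_entropy`, the bin count, and the synthetic dataset are assumptions made here for illustration, not the paper's implementation; information gain and empowerment would additionally require learned dynamics or policy models.

```python
import numpy as np

def input_entropy(observations, bins=32):
    """Count-based estimate of input entropy over a pre-collected dataset.

    `observations` holds one sensory input per row (e.g. flattened,
    downsampled game frames scaled to [0, 1)). Each input is discretized
    into per-dimension bins, and the entropy of the resulting visitation
    distribution over distinct discretized inputs is returned in nats.
    """
    quantized = np.floor(np.clip(observations, 0.0, 1.0 - 1e-9) * bins).astype(np.int64)
    # Empirical distribution over distinct discretized observations.
    _, counts = np.unique(quantized, axis=0, return_counts=True)
    probs = counts / counts.sum()
    return float(-(probs * np.log(probs)).sum())

# Example on a hypothetical dataset: 10,000 steps of 64-dimensional inputs.
rng = np.random.default_rng(0)
dataset = rng.random((10_000, 64))
print(f"input entropy ~ {input_entropy(dataset):.3f} nats")
```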

The paper uses these intrinsic objectives to correlate agent behaviors with task rewards and with a measure of similarity to human behavior across the defined environments.
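The comparison can be pictured as follows: given per-episode (or per-agent) scores for an intrinsic objective, the task reward, and the human similarity metric, one asks how strongly the objective tracks each. The snippet below is an illustration only, using Spearman rank correlation on made-up numbers; the paper's exact correlation measure and data are not reproduced here.

```python
import numpy as np
from scipy.stats import spearmanr

# Hypothetical per-episode scores for one environment (illustrative values).
task_reward      = np.array([12.0,  3.5, 40.0,  8.0, 25.0, 31.0])
human_similarity = np.array([0.41, 0.22, 0.74, 0.30, 0.66, 0.69])
entropy_score    = np.array([2.1,  1.3,  3.8,  1.9,  3.2,  3.5])

# Rank correlation of the intrinsic objective with task reward vs. with
# the human-similarity metric, mirroring the paper's comparison.
rho_reward, _ = spearmanr(entropy_score, task_reward)
rho_human, _ = spearmanr(entropy_score, human_similarity)
print(f"entropy vs. reward: {rho_reward:.2f}, entropy vs. human: {rho_human:.2f}")
```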

Key Findings and Analysis

The empirical analysis provides several insights into the relative efficacy of intrinsic motivation strategies:

  • Correlation with Human Behavior: All intrinsic objectives demonstrated higher correlations with a human behavior similarity metric than with extrinsic task rewards. Notably, input entropy and information gain correlated more strongly with human-like behavior than task reward itself did, suggesting their potential utility in developing agents that behave more like human players.
  • Input Entropy and Information Gain: These objectives exhibited high correlation with each other, indicating that they might share behavioral attributes beneficial for exploration, distinct from empowerment.
  • Empowerment's Unique Role: Empowerment, although less correlated with the other two intrinsic objectives, suggests a separate behavioral trait that could complement input entropy and information gain in multi-objective exploration methods.

Implications and Future Directions

The findings emphasize the potential for intrinsic objectives as a tool for designing agents in environments where human-like exploration and behavior are desired beyond task-specific successes. The retrospective methodology of comparing objectives opens avenues for more efficient research in developing effective intrinsic motivations without the prohibitive costs associated with frequent agent retraining.

Future research can build on these findings by exploring more sophisticated preprocessing techniques, such as neural-network-based representations that yield better semantic similarity measures. Collecting a larger scope of human data for the behavior similarity metric would also enable more precise evaluations. There is further potential in combining intrinsic and extrinsic objectives to address environments with complex dynamics and sparse rewards.

In summary, the paper advances the understanding of intrinsic motivation's role in RL and proposes an effective framework for assessing agent behavior without sole reliance on extrinsic rewards. This work is foundational for the pursuit of RL models that align more closely with the adaptability and exploratory traits observed in human behavior.
