Scalability of prior unsupervised RL approaches in complex, high-dimensional environments

Determine whether pure exploration-based unsupervised reinforcement learning methods (such as Random Network Distillation, APT, and Plan2Explore) and mutual information-based unsupervised skill discovery methods (such as DIAYN, DADS, and CIC) scale to complex environments with high intrinsic dimensionality, and establish the conditions under which they do or do not scale, or evidence either way.

Background

The paper divides prior unsupervised RL methods into pure exploration approaches and unsupervised skill discovery approaches, and notes that both have proven effective on existing benchmarks. The authors question, however, whether these methods scale to environments with large state spaces or high intrinsic dimensionality, where covering the entire state space or fully capturing the dynamics may be infeasible. This unresolved question motivates METRA, which aims to address scalability by covering only a compact latent space that is metrically connected to the original state space.

References

While these approaches have been shown to be effective in several unsupervised RL benchmarks, it is not entirely clear whether such methods can indeed be scalable to complex environments with high intrinsic dimensionality.

METRA: Scalable Unsupervised RL with Metric-Aware Abstraction (2310.08887 - Park et al., 2023) in Section 1 (Introduction)