
Which Features are Best for Successor Features?

Published 15 Feb 2025 in cs.LG, math.OC, and stat.ML (arXiv:2502.10790v1)

Abstract: In reinforcement learning, universal successor features (SFs) are a way to provide zero-shot adaptation to new tasks at test time: they provide optimal policies for all downstream reward functions lying in the linear span of a set of base features. But it is unclear what constitutes a good set of base features that could be useful for a wide set of downstream tasks beyond their linear span. Laplacian eigenfunctions (the eigenfunctions of $\Delta+\Delta^\ast$ with $\Delta$ the Laplacian operator of some reference policy and $\Delta^\ast$ that of the time-reversed dynamics) have been argued to play a role, and offer good empirical performance. Here, for the first time, we identify the optimal base features based on an objective criterion of downstream performance, in a non-tautological way without assuming the downstream tasks are linear in the features. We do this for three generic classes of downstream tasks: reaching a random goal state, dense random Gaussian rewards, and random "scattered" sparse rewards. The features yielding optimal expected downstream performance turn out to be the same for these three task families. They do not coincide with Laplacian eigenfunctions in general, though they can be expressed from $\Delta$: in the simplest case (deterministic environment and decay factor $\gamma$ close to $1$), they are the eigenfunctions of $\Delta^{-1}+(\Delta^{-1})^\ast$. We obtain these results under an assumption of large behavior cloning regularization with respect to a reference policy, a setting often used for offline RL. Along the way, we get new insights into KL-regularized natural policy gradient, and into the lack of SF information in the norm of Bellman gaps.
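
As a rough, illustrative sketch (not code or notation from the paper), the snippet below compares the two feature bases named in the abstract on a toy problem: the Laplacian eigenfunctions, taken as eigenvectors of $\Delta+\Delta^\ast$, versus the eigenvectors of $\Delta^{-1}+(\Delta^{-1})^\ast$. Everything concrete here is an assumption made for illustration: the convention $\Delta = I - \gamma P$ for a reference policy's transition matrix $P$, the value $\gamma = 0.99$, the hand-picked stochastic $P$ (the paper's "simplest case" is a deterministic environment), and the use of a plain transpose in place of the time-reversed adjoint (exact only under a uniform stationary distribution).

```python
# Illustrative sketch only; toy dynamics, not from the paper.
# Assumptions: Delta = I - gamma*P, gamma close to 1, and the adjoint
# Delta* approximated by the plain transpose (exact only for a uniform
# stationary distribution).
import numpy as np

gamma = 0.99
P = np.array([              # toy 3-state reference-policy transition matrix
    [0.9, 0.1, 0.0],
    [0.0, 0.5, 0.5],
    [0.3, 0.0, 0.7],
])

Delta = np.eye(3) - gamma * P      # (discounted) Laplacian of the policy
Delta_inv = np.linalg.inv(Delta)   # resolvent / successor-like operator

# Laplacian eigenfunctions: eigenvectors of Delta + Delta^T.
_, lap_vecs = np.linalg.eigh(Delta + Delta.T)

# Feature basis from the abstract's "simplest case":
# eigenvectors of Delta^{-1} + (Delta^{-1})^T.
_, sf_vecs = np.linalg.eigh(Delta_inv + Delta_inv.T)

# Absolute cosine overlaps between the two orthonormal bases; a
# permutation-of-identity pattern would mean they coincide, anything
# else indicates they differ for this toy (non-reversible) P.
print(np.round(np.abs(lap_vecs.T @ sf_vecs), 3))
```

This only shows how the two operators are formed and diagonalized under the stated conventions; the paper's actual claim about which basis is optimal rests on its large-behavior-cloning-regularization setting, not on this toy computation.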

