A Survey of Multi-Objective Sequential Decision-Making (1402.0590v1)

Published 4 Feb 2014 in cs.AI

Abstract: Sequential decision-making problems with multiple objectives arise naturally in practice and pose unique challenges for research in decision-theoretic planning and learning, which has largely focused on single-objective settings. This article surveys algorithms designed for sequential decision-making problems with multiple objectives. Though there is a growing body of literature on this subject, little of it makes explicit under what circumstances special methods are needed to solve multi-objective problems. Therefore, we identify three distinct scenarios in which converting such a problem to a single-objective one is impossible, infeasible, or undesirable. Furthermore, we propose a taxonomy that classifies multi-objective methods according to the applicable scenario, the nature of the scalarization function (which projects multi-objective values to scalar ones), and the type of policies considered. We show how these factors determine the nature of an optimal solution, which can be a single policy, a convex hull, or a Pareto front. Using this taxonomy, we survey the literature on multi-objective methods for planning and learning. Finally, we discuss key applications of such methods and outline opportunities for future work.

Summary

The paper presents a taxonomy that categorizes methods by scenario: unknown weights, decision support, and known weights with a nonlinear scalarization.
It evaluates both planning and learning paradigms, discussing the use of dynamic and linear programming for linear scalarization and contrasting model-free with model-based approaches.
It identifies practical applications in fields such as environmental management, finance, and industrial control while pointing to future research in many-objective decision-making.

A Survey of Multi-Objective Sequential Decision-Making

The paper "A Survey of Multi-Objective Sequential Decision-Making" examines algorithms for sequential decision-making problems with multiple objectives. Such problems arise in many practical domains and pose challenges that single-objective settings do not.

Key Contributions and Taxonomy

The authors identify the scenarios in which converting a multi-objective problem to a single-objective one is impossible, infeasible, or undesirable. They outline three:

Unknown Weights Scenario: In this context, weights that determine the scalarization of multiple objectives into a single objective are not known prior to the learning or planning phase.
Decision Support Scenario: Here, the scalarization function is not clearly defined due to factors such as “fuzzy” user preferences or committee-based decision processes.
Known Weights Scenario: The weights are known in advance, but the scalarization function may be nonlinear, which makes a direct conversion to a single-objective problem intractable or inappropriate.
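To make the role of the weights concrete, here is a minimal Python sketch of linear scalarization. The policy names and value vectors are hypothetical, chosen only for illustration; they do not come from the paper. The sketch shows why unknown weights matter: the best policy changes as the weights change.

```python
from typing import Dict, Sequence


def linear_scalarize(value: Sequence[float], weights: Sequence[float]) -> float:
    """Weighted sum of a value vector: f(V, w) = w . V."""
    if len(value) != len(weights):
        raise ValueError("value and weights must have the same length")
    return sum(w * v for w, v in zip(weights, value))


def best_policy(values: Dict[str, Sequence[float]], weights: Sequence[float]) -> str:
    """Name of the policy with the highest scalarized value."""
    return max(values, key=lambda name: linear_scalarize(values[name], weights))


# Hypothetical value vectors (objective 1, objective 2) for three policies.
policy_values = {"A": (10.0, 0.0), "B": (6.0, 6.0), "C": (0.0, 10.0)}

print(best_policy(policy_values, (0.9, 0.1)))  # A
print(best_policy(policy_values, (0.5, 0.5)))  # B
print(best_policy(policy_values, (0.1, 0.9)))  # C
```

If the weights were fixed and known before planning, only one of these policies would ever need to be computed.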

To navigate these scenarios, the paper presents a taxonomy that classifies methods by the type of scalarization function (linear vs. strictly monotonically increasing), whether a single policy or multiple policies are needed, and whether stochastic policies are allowed or only deterministic ones. This classification clarifies what counts as an optimal solution under each set of assumptions, ranging from a single deterministic policy to a coverage set such as a convex coverage set or the Pareto front.
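As an illustration of the Pareto front as a solution concept, the following Python sketch filters a set of value vectors down to the non-dominated ones. The candidate vectors are hypothetical, and the quadratic-time filter is a simple choice for clarity, not a method taken from the survey.

```python
from typing import List, Sequence, Tuple

Vector = Tuple[float, ...]


def dominates(a: Vector, b: Vector) -> bool:
    """True if a is at least as good as b everywhere and strictly better somewhere."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def pareto_front(vectors: Sequence[Vector]) -> List[Vector]:
    """Keep every vector that no other vector dominates (O(n^2) comparison)."""
    return [v for v in vectors if not any(dominates(u, v) for u in vectors)]


# Hypothetical value vectors of five candidate policies (higher is better).
candidates = [(10.0, 0.0), (6.0, 6.0), (4.0, 5.0), (3.0, 7.0), (0.0, 10.0)]

print(pareto_front(candidates))
# [(10.0, 0.0), (6.0, 6.0), (3.0, 7.0), (0.0, 10.0)]
```

Here (4.0, 5.0) is dropped because (6.0, 6.0) is better on both objectives; every remaining vector is the best choice under some monotonically increasing scalarization.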

Computational Implications

The discussion of scalarization functions emphasizes that the solution concept should follow from the assumptions made about the scalarization function. For linear scalarization with unknown weights, a convex coverage set (CCS) is sufficient, and it can be built from deterministic stationary policies alone.
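The following Python sketch shows what membership in a CCS means for the two-objective case. It assumes weights of the form (w1, 1 - w1) with w1 in [0, 1] and keeps a value vector if it is optimal for at least one such weight. The vectors and function names are hypothetical, and vectors that only tie with others are kept, so the result is a valid but not necessarily minimal coverage set.

```python
from typing import List, Optional, Sequence, Tuple

Vector = Tuple[float, float]


def optimal_weight_interval(v: Vector, others: Sequence[Vector]) -> Optional[Tuple[float, float]]:
    """Interval of w1 in [0, 1] for which v is at least as good as every other vector."""
    lo, hi = 0.0, 1.0
    for u in others:
        d1, d2 = v[0] - u[0], v[1] - u[1]
        # v beats u when d1*w1 + d2*(1 - w1) >= 0, i.e. a*w1 >= b.
        a, b = d1 - d2, -d2
        if a > 0:
            lo = max(lo, b / a)
        elif a < 0:
            hi = min(hi, b / a)
        elif b > 0:
            return None  # u is better than v for every weight
    return (lo, hi) if lo <= hi else None


def convex_coverage_set(vectors: Sequence[Vector]) -> List[Vector]:
    """Vectors that maximize the weighted sum for at least one weight."""
    result = []
    for v in vectors:
        others = [u for u in vectors if u != v]
        if optimal_weight_interval(v, others) is not None:
            result.append(v)
    return result


# Hypothetical value vectors; none of them dominates another.
candidates = [(10.0, 0.0), (6.0, 6.0), (3.0, 7.0), (0.0, 10.0)]

print(convex_coverage_set(candidates))
# [(10.0, 0.0), (6.0, 6.0), (0.0, 10.0)]
```

The vector (3.0, 7.0) is Pareto-optimal but is never the best choice under any linear weighting, so it is left out. This is why a CCS can be smaller than the Pareto front.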

The work further discusses the challenges in planning and learning under these frameworks. Planning methods are differentiated based on whether the scalarization function is linear or monotonically increasing, with techniques ranging from dynamic programming to linear programming.
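For the linear case with a fixed weight vector, one simple approach is to scalarize the vector rewards and run ordinary value iteration. The Python sketch below does this on a tiny hypothetical MDP; the state numbering, action names, rewards, and function names are assumptions made for illustration, not an algorithm reproduced from the survey.

```python
from typing import Dict, List, Sequence, Tuple

# Each outcome is (probability, next_state, reward_vector).
Outcome = Tuple[float, int, Tuple[float, ...]]
MDP = Dict[int, Dict[str, List[Outcome]]]


def q_value(outcomes: List[Outcome], values: Dict[int, float],
            weights: Sequence[float], gamma: float) -> float:
    """Expected scalarized reward plus discounted value of the next state."""
    return sum(
        p * (sum(w * r for w, r in zip(weights, reward)) + gamma * values[nxt])
        for p, nxt, reward in outcomes
    )


def scalarized_value_iteration(mdp: MDP, weights: Sequence[float],
                               gamma: float = 0.9, tol: float = 1e-8):
    """Value iteration on the single-objective MDP induced by the weights."""
    values = {s: 0.0 for s in mdp}
    while True:
        delta = 0.0
        for s, actions in mdp.items():
            best = max(q_value(o, values, weights, gamma) for o in actions.values())
            delta = max(delta, abs(best - values[s]))
            values[s] = best
        if delta < tol:
            break
    policy = {
        s: max(actions, key=lambda a: q_value(actions[a], values, weights, gamma))
        for s, actions in mdp.items()
    }
    return values, policy


# Hypothetical MDP: from state 0, go to a state that pays objective 1 or objective 2.
toy_mdp: MDP = {
    0: {"left": [(1.0, 1, (1.0, 0.0))], "right": [(1.0, 2, (0.0, 1.0))]},
    1: {"stay": [(1.0, 1, (1.0, 0.0))]},
    2: {"stay": [(1.0, 2, (0.0, 1.0))]},
}

_, policy_a = scalarized_value_iteration(toy_mdp, (0.8, 0.2))
_, policy_b = scalarized_value_iteration(toy_mdp, (0.2, 0.8))
print(policy_a[0], policy_b[0])  # left right
```

Running the solver once per weight vector gives one policy per weight; methods that handle unknown weights must instead cover all weights at once.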

Learning Paradigms

The survey also assesses approaches to multi-objective reinforcement learning (MORL). It covers model-based and model-free techniques, whose relative advantages depend on whether a model of the decision-making problem is available. Most existing research is model-free, which leaves room for model-based methods that could reduce sample complexity and computational cost.
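As a rough illustration of the model-free style, the Python sketch below keeps one Q-value per objective and picks the greedy next action by the weighted sum of those values. This is one simple variant, assumed here for illustration; the table layout, function names, and parameter values are hypothetical and not taken from the survey.

```python
from typing import List, Sequence

# q[state][action][objective]
QTable = List[List[List[float]]]


def dot(a: Sequence[float], b: Sequence[float]) -> float:
    return sum(x * y for x, y in zip(a, b))


def greedy_action(q: QTable, state: int, weights: Sequence[float]) -> int:
    """Action whose Q-vector has the highest weighted sum in this state."""
    return max(range(len(q[state])), key=lambda a: dot(q[state][a], weights))


def q_update(q: QTable, state: int, action: int, reward: Sequence[float],
             next_state: int, weights: Sequence[float],
             alpha: float = 0.5, gamma: float = 0.9) -> None:
    """One Q-learning step applied to every objective, in place."""
    a_next = greedy_action(q, next_state, weights)
    for i, r in enumerate(reward):
        target = r + gamma * q[next_state][a_next][i]
        q[state][action][i] += alpha * (target - q[state][action][i])


# Hypothetical table: 2 states, 2 actions, 2 objectives.
q = [
    [[0.0, 0.0], [0.0, 0.0]],
    [[1.0, 0.0], [0.0, 2.0]],
]
q_update(q, state=0, action=0, reward=(1.0, 0.0), next_state=1, weights=(0.5, 0.5))
print(q[0][0])  # approximately [0.5, 0.9]
```

Storing the Q-values as vectors, rather than scalarizing the reward up front, keeps the per-objective estimates available if the weights change later.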

Applications and Future Directions

Practical applications span a diverse set of fields including environmental management, financial markets, and industrial control. The paper identifies areas like many-objective decision-making and expectation of scalarized return (ESR) formulations as promising avenues for future research. These complex, real-world applications demonstrate the utility and pressing need for advanced multi-objective decision-making algorithms.

Conclusion

The survey by Roijers et al. is a substantial contribution to the field of multi-objective decision-making. By laying out a clear taxonomy and the scenarios that motivate it, the survey gives researchers a basis for choosing and developing methods suited to a given application, and a foundation for future theoretical and practical work on multi-objective sequential decision-making.