When does the compositional OVI stopping criterion work well?

Establish whether the compositional optimistic value iteration stopping criterion performs well when the likelihood of reaching all individual open MDPs is high, and delineate the conditions under which this advantage holds.

Background

The optimistic global stopping criterion adapts optimistic value iteration (OVI) to the compositional setting. The authors observe experimentally that this criterion appears favorable when all component open MDPs are likely to be reached.

A theoretical explanation of this phenomenon would help practitioners select termination strategies based on model characteristics.
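
For orientation, the OVI idea that the criterion builds on can be sketched on a single, flat Markov chain. This is a minimal illustration under stated assumptions, not the compositional criterion from the paper: it handles no nondeterminism and no open MDPs, and the names `bellman`, `ovi_reachability`, `verify_steps`, and `max_rounds` are hypothetical. The sketch iterates from below, optimistically guesses an upper bound from the current lower bound, and accepts the guess only once one Bellman update no longer increases it at any state. Such a vector is an inductive upper bound, so it lies above the least fixed point, which is the true reachability probability.

```python
def bellman(values, transitions, target):
    """Apply one Bellman update for reachability in a Markov chain."""
    return [
        1.0 if s == target else sum(p * values[t] for t, p in transitions[s])
        for s in range(len(values))
    ]


def ovi_reachability(transitions, target, eps=1e-6, verify_steps=10, max_rounds=50):
    """Return (lower, upper) bounds on the probability of reaching `target`.

    Simplified OVI-style loop; all parameter names are illustrative assumptions.
    """
    n = len(transitions)
    lower = [1.0 if s == target else 0.0 for s in range(n)]
    tol = eps
    for _ in range(max_rounds):
        # Phase 1: value iteration from below until the step size drops under tol.
        while True:
            nxt = bellman(lower, transitions, target)
            step = max(abs(a - b) for a, b in zip(nxt, lower))
            lower = nxt
            if step < tol:
                break
        # Phase 2: optimistic guess, a relative eps above the lower bound.
        upper = [min(1.0, v * (1.0 + eps)) for v in lower]
        # Phase 3: accept the guess only if it is inductive (update <= guess).
        for _ in range(verify_steps):
            image = bellman(upper, transitions, target)
            if all(b <= u for b, u in zip(image, upper)):
                return lower, upper
            # Otherwise keep only the decreases and check again.
            upper = [min(b, u) for b, u in zip(image, upper)]
        # Guess could not be verified: iterate the lower bound further, then retry.
        tol /= 2.0
    raise RuntimeError("no inductive upper bound found")


# Toy chain (made up for illustration): state 2 is the target, state 3 is a sink.
transitions = [
    [(1, 0.5), (3, 0.5)],
    [(2, 0.5), (0, 0.5)],
    [(2, 1.0)],
    [(3, 1.0)],
]
lower, upper = ovi_reachability(transitions, target=2)
```

In this toy chain the exact reachability probabilities from states 0 and 1 are 1/3 and 2/3, so the returned bounds should bracket those values. How the guess-and-verify step interacts with the reachability of individual components is exactly what the open question asks about; the sketch does not address it.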

References

Based on the experiments, we conjecture that the compositional OVI stopping criterion works well when the likelihood of reaching all individual open MDPs is high, such as can be seen in the ChainsLoop500-Dice4 model.

— Compositional Value Iteration with Pareto Caching (2405.10099 - Watanabe et al., 2024) in Section 6, Discussion (Interpretation of Results)
