Utility Maximizing Sequential Sensing Over a Finite Horizon (1705.05960v1)
Abstract: We consider the problem of optimally utilizing $N$ resources, each in an unknown binary state. The state of each resource can be inferred from state-dependent noisy measurements. Depending on its state, utilizing a resource results in either a reward or a penalty per unit time. The objective is to design a sequential strategy that governs the sensing and exploitation decisions at each time so as to maximize the expected utility (i.e., total reward minus total penalty and sensing cost) over a finite horizon $L$. We formulate the problem as a Partially Observable Markov Decision Process (POMDP) and show that the optimal strategy consists of two time-varying thresholds for each resource together with a rule for optimally selecting which resource to sense. Since a full characterization of the optimal strategy is generally intractable, we develop a low-complexity policy that is shown by simulations to offer near-optimal performance. This problem finds applications in opportunistic spectrum access, marketing strategies, and other sequential resource allocation problems.
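To make the threshold structure described above concrete, here is a minimal Python sketch of the single-resource case. It is not the paper's algorithm: it assumes a symmetric measurement channel that flips the true state with probability `eps`, uses fixed thresholds `lo`/`hi` (the paper's optimal thresholds are time-varying, and the full problem also involves selecting among $N$ resources), and all parameter values and function names are hypothetical.

```python
import random

GOOD, BAD = 1, 0

def bayes_update(belief, obs, eps):
    """Posterior P(state = GOOD) after one noisy binary observation.

    Assumes a symmetric channel flipping the true state with
    probability eps; the paper allows general state-dependent noise.
    """
    like_good = 1 - eps if obs == GOOD else eps
    like_bad = eps if obs == GOOD else 1 - eps
    num = like_good * belief
    return num / (num + like_bad * (1 - belief))

def decide(belief, lo, hi):
    """Two-threshold rule on the belief: exploit above hi, abandon
    below lo, keep sensing in between. Thresholds are fixed here
    for illustration only.
    """
    if belief >= hi:
        return "exploit"
    if belief <= lo:
        return "abandon"
    return "sense"

def run_single_resource(true_state, horizon, eps=0.2, cost=0.1,
                        reward=1.0, penalty=1.0, lo=0.3, hi=0.8,
                        prior=0.5, rng=random):
    """Simulate sensing/exploiting one resource over a finite horizon."""
    belief, utility = prior, 0.0
    for t in range(horizon):
        action = decide(belief, lo, hi)
        if action == "sense":
            utility -= cost  # pay the sensing cost
            flip = rng.random() < eps
            obs = true_state if not flip else 1 - true_state
            belief = bayes_update(belief, obs, eps)
        elif action == "exploit":
            # Commit: collect reward/penalty for the remaining slots.
            rate = reward if true_state == GOOD else -penalty
            utility += rate * (horizon - t)
            break
        else:  # abandon the resource
            break
    return utility

if __name__ == "__main__":
    random.seed(0)
    runs = [run_single_resource(random.choice([GOOD, BAD]), horizon=20)
            for _ in range(10_000)]
    print("average utility:", sum(runs) / len(runs))
```

The sketch illustrates why a threshold structure is natural: the belief is a sufficient statistic for the POMDP, and once it is confident enough in either direction, further sensing only incurs cost without changing the eventual decision.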