Constrained Thompson Sampling for Real-Time Electricity Pricing with Grid Reliability Constraints
Abstract: We consider the problem of an aggregator attempting to learn customers' load flexibility models while implementing a load shaping program by means of broadcasting daily dispatch signals. We adopt a multi-armed bandit formulation to account for the stochastic and unknown nature of customers' responses to dispatch signals. We propose a constrained Thompson sampling heuristic, Con-TS-RTP, that accounts for various possible aggregator objectives (e.g., to reduce demand at peak hours, integrate more intermittent renewable generation, track a desired daily load profile, etc) and takes into account the operational constraints of a distribution system to avoid potential grid failures as a result of uncertainty in the customers' response. We provide a discussion on the regret bounds for our algorithm as well as a discussion on the operational reliability of the distribution system's constraints being upheld throughout the learning process.
Paper Prompts
Sign up for free to create and run prompts on this paper using GPT-5.
Top Community Prompts
Collections
Sign up for free to add this paper to one or more collections.