Least Squares Temporal Difference Actor-Critic Methods with Applications to Robot Motion Control (1108.4698v2)

Published 23 Aug 2011 in cs.RO, cs.SY, and math.OC

Abstract: We consider the problem of finding a control policy for a Markov Decision Process (MDP) to maximize the probability of reaching some states while avoiding some other states. This problem is motivated by applications in robotics, where such problems naturally arise when probabilistic models of robot motion are required to satisfy temporal logic task specifications. We transform this problem into a Stochastic Shortest Path (SSP) problem and develop a new approximate dynamic programming algorithm to solve it. This algorithm is of the actor-critic type and uses a least-squares temporal difference learning method. It operates on sample paths of the system and optimizes the policy within a pre-specified class parameterized by a parsimonious set of parameters. We show its convergence to a policy corresponding to a stationary point in the parameter space. Simulation results confirm the effectiveness of the proposed solution.
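
To make the objective in the abstract concrete, the following minimal sketch evaluates the reach-avoid probability of a fixed, parameterized policy on a toy problem. It is not the paper's algorithm: it uses exact iterative policy evaluation with a known model, whereas the paper works from sample paths with a least-squares temporal difference critic. The five-state chain, the `success` probability, the sigmoid policy, and the function name `reach_probability` are all hypothetical choices made for illustration.

```python
import math


def reach_probability(theta, n_states=5, success=0.8, sweeps=10_000, tol=1e-12):
    """Probability of reaching the goal before the unsafe state, per start state.

    Hypothetical setup (not from the paper): states 0..n_states-1 on a line,
    state 0 is unsafe, state n_states-1 is the goal, both absorbing.
    The policy picks "right" with probability sigmoid(theta), else "left";
    the chosen move succeeds with probability `success`, otherwise the
    robot moves in the opposite direction.
    """
    choose_right = 1.0 / (1.0 + math.exp(-theta))
    # Net probability of actually moving right under this policy.
    p_right = choose_right * success + (1.0 - choose_right) * (1.0 - success)
    p_left = 1.0 - p_right

    goal = n_states - 1
    p = [0.0] * n_states
    p[goal] = 1.0  # reaching the goal counts as success; unsafe state stays 0

    # Gauss-Seidel sweeps on p[i] = p_left * p[i-1] + p_right * p[i+1].
    for _ in range(sweeps):
        delta = 0.0
        for i in range(1, goal):
            new = p_left * p[i - 1] + p_right * p[i + 1]
            delta = max(delta, abs(new - p[i]))
            p[i] = new
        if delta < tol:
            break
    return p


if __name__ == "__main__":
    for theta in (0.0, 2.0):
        print(theta, reach_probability(theta))
```

In this sketch the single parameter `theta` plays the role of the policy parameters; an actor-critic method would adjust it along an estimated gradient of the reach probability instead of evaluating it exactly as done here.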

Authors (6)
  1. Reza Moazzez Estanjini (1 paper)
  2. Xu Chu Ding (11 papers)
  3. Morteza Lahijanian (59 papers)
  4. Jing Wang (740 papers)
  5. Calin A. Belta (16 papers)
  6. Ioannis Ch. Paschalidis (66 papers)
Citations (3)