
Generating Realistic Arm Movements in Reinforcement Learning: A Quantitative Comparison of Reward Terms and Task Requirements (2402.13949v2)

Published 21 Feb 2024 in cs.RO

Abstract: Mimicking human-like arm movement characteristics requires considering three factors during control policy synthesis: (a) the chosen task requirements, (b) the inclusion of noise during movement execution, and (c) the chosen optimality principles. Previous studies showed that, when these factors (a-c) are considered individually, it is possible to synthesize arm movements that either kinematically match experimental data or reproduce the stereotypical triphasic muscle activation pattern. However, to date there has been no quantitative comparison of how realistic the arm movements generated by each factor are, nor of whether a partial or full combination of all factors yields arm movements with human-like kinematic characteristics and a triphasic muscle pattern. To investigate this, we used reinforcement learning to learn a control policy for a musculoskeletal arm model, aiming to discern which combination of factors (a-c) results in realistic arm movements according to four frequently reported stereotypical characteristics. Our findings indicate that incorporating velocity and acceleration requirements into the reaching task, employing reward terms that encourage minimizing mechanical work, hand jerk, and control effort, and including noise during movement together lead to the emergence of realistic human arm movements in reinforcement learning. We expect these insights to help better predict desired arm movements and corrective forces in wearable assistive devices in the future.
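The abstract's central claim rests on a composite design: task requirements on end-point position, velocity, and acceleration; reward terms penalizing mechanical work, hand jerk, and control effort; and noise injected into the muscle commands during execution. The minimal Python sketch below illustrates how such a per-step reward and signal-dependent execution noise could be combined; the function names, weights, and noise model are illustrative assumptions, since the abstract does not specify the paper's exact formulation.

```python
import numpy as np

# Weights for the composite reward; placeholder values, not taken from the paper.
W_TASK, W_WORK, W_JERK, W_EFFORT = 1.0, 0.1, 0.05, 0.05


def composite_reward(hand_pos, hand_vel, hand_acc, hand_jerk,
                     target_pos, joint_torques, joint_vels, muscle_act):
    """Per-step reward combining (a) task requirements on end-point position,
    velocity, and acceleration with (c) optimality terms that penalize
    mechanical work, hand jerk, and control effort."""
    # (a) Task requirements: reach the target and arrive with low
    # end-point velocity and acceleration.
    task_error = (np.linalg.norm(hand_pos - target_pos)
                  + np.linalg.norm(hand_vel)
                  + np.linalg.norm(hand_acc))
    # (c) Optimality principles.
    work = np.sum(np.abs(joint_torques * joint_vels))  # mechanical work rate
    jerk = np.linalg.norm(hand_jerk)                   # hand jerk magnitude
    effort = np.sum(muscle_act ** 2)                   # control effort
    return -(W_TASK * task_error + W_WORK * work
             + W_JERK * jerk + W_EFFORT * effort)


def add_execution_noise(muscle_act, rng, noise_scale=0.1):
    """(b) Signal-dependent execution noise: the perturbation scales with the
    magnitude of the control signal (cf. Harris & Wolpert, 1998)."""
    noisy = muscle_act + rng.normal(0.0, noise_scale * np.abs(muscle_act))
    return np.clip(noisy, 0.0, 1.0)
```

In a training loop, the policy's muscle activations would be passed through add_execution_noise before being applied to the simulated arm, and composite_reward would be evaluated on the resulting state; the weights would need tuning for any concrete musculoskeletal model.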

