
MARLUI: Multi-Agent Reinforcement Learning for Adaptive UIs (2209.12660v3)

Published 26 Sep 2022 in cs.HC

Abstract: Adaptive user interfaces (UIs) automatically change an interface to better support users' tasks. Recently, machine learning techniques have enabled the transition to more powerful and complex adaptive UIs. However, a core challenge for adaptive UIs is their reliance on high-quality user data that has to be collected offline for each task. We formulate UI adaptation as a multi-agent reinforcement learning problem to overcome this challenge. In our formulation, a user agent mimics a real user and learns to interact with a UI. Simultaneously, an interface agent learns UI adaptations to maximize the user agent's performance. The interface agent learns the task structure from the user agent's behavior and, based on that, can support the user agent in completing its task. Our method produces adaptation policies that are learned in simulation only and, therefore, does not need real user data. Our experiments show that the learned policies generalize to real users and achieve performance on par with data-driven supervised learning baselines.
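The two-agent formulation in the abstract can be illustrated with a deliberately tiny sketch. This is not the authors' implementation: the environment, the tabular value updates, and every name below (`train_toy_marl`, `q_user`, `q_iface`, the "signal" action) are assumptions made purely for illustration. A simulated user agent holds a hidden goal and emits one observable interaction; an interface agent sees only that interaction and chooses which item to surface. Both agents receive the same reward when the surfaced item matches the goal, so the interface agent can only succeed by learning task structure from the user agent's behavior, and no real user data is involved.

```python
import random


def train_toy_marl(num_goals=3, episodes=5000, alpha=0.1, epsilon=0.1, seed=0):
    """Toy cooperative two-agent loop (hypothetical, not the paper's method).

    q_user[goal][signal]: user agent's value of emitting `signal` for `goal`.
    q_iface[signal][item]: interface agent's value of surfacing `item`
    after observing `signal`.
    """
    rng = random.Random(seed)
    q_user = [[0.0] * num_goals for _ in range(num_goals)]
    q_iface = [[0.0] * num_goals for _ in range(num_goals)]

    def pick(q_row, explore):
        # Epsilon-greedy choice with random tie-breaking.
        if explore and rng.random() < epsilon:
            return rng.randrange(len(q_row))
        best = max(q_row)
        return rng.choice([i for i, v in enumerate(q_row) if v == best])

    for _ in range(episodes):
        goal = rng.randrange(num_goals)        # hidden from the interface agent
        signal = pick(q_user[goal], True)      # user agent interacts with the UI
        item = pick(q_iface[signal], True)     # interface agent adapts the UI
        reward = 1.0 if item == goal else 0.0  # shared task reward
        # Both agents update simultaneously from the same simulated episode.
        q_user[goal][signal] += alpha * (reward - q_user[goal][signal])
        q_iface[signal][item] += alpha * (reward - q_iface[signal][item])

    # Greedy evaluation: fraction of goals for which the surfaced item is correct.
    hits = 0
    for goal in range(num_goals):
        signal = pick(q_user[goal], False)
        item = pick(q_iface[signal], False)
        hits += int(item == goal)
    return hits / num_goals, q_user, q_iface


if __name__ == "__main__":
    rate, _, _ = train_toy_marl()
    print(f"greedy success rate: {rate:.2f}")
```

Because both learners adapt at the same time, a toy like this can settle on a suboptimal convention, so the printed success rate is not guaranteed to reach 1.0. Any real system would also replace the lookup tables with learned policies over a much richer UI state.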

Authors (5)
  1. Thomas Langerak
  2. Sammy Christen
  3. Mert Albaba
  4. Christoph Gebhardt
  5. Otmar Hilliges