FARPLS: A Feature-Augmented Robot Trajectory Preference Labeling System to Assist Human Labelers' Preference Elicitation (2403.06267v1)
Abstract: Preference-based learning aims to align robot task objectives with human values. One of the most common methods of inferring human preferences is pairwise comparison of robot task trajectories. Traditional comparison-based preference labeling systems seldom help labelers digest and identify critical differences between complex trajectories recorded in videos. Our formative study (N = 12) suggests that, because of partial observations, individuals may overlook non-salient task features and establish biased preference criteria during preference elicitation. In addition, they may experience mental fatigue when given many pairs to compare, causing their label quality to deteriorate. To mitigate these issues, we propose FARPLS, a Feature-Augmented Robot trajectory Preference Labeling System. FARPLS highlights potential outliers in a wide variety of task features that matter to humans and extracts the corresponding video keyframes for easy review and comparison. It also dynamically adjusts the labeling order according to users' familiarity, the difficulty of each trajectory pair, and the level of disagreement. At the same time, the system monitors labelers' consistency and provides feedback on labeling progress to keep labelers engaged. A between-subjects study (N = 42, 105 pairs of robot pick-and-place trajectories per person) shows that FARPLS helps users establish preference criteria more easily and notice more relevant details in the presented trajectories than the conventional interface does. FARPLS also improves labeling consistency and engagement, mitigating challenges in preference elicitation without significantly raising cognitive load.
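The abstract does not say how FARPLS decides which feature values count as potential outliers. As a rough illustration only, the sketch below assumes a plain z-score threshold over one per-trajectory feature; the function name `flag_outliers`, the threshold of 2.0, and the peak-speed values are all hypothetical and are not taken from the paper.

```python
from statistics import mean, pstdev


def flag_outliers(values, threshold=2.0):
    """Return indices of values whose absolute z-score exceeds `threshold`.

    Assumption: a z-score rule stands in for whatever outlier
    criterion the real system uses.
    """
    mu = mean(values)
    sigma = pstdev(values)  # population standard deviation
    if sigma == 0:
        # All values identical: nothing can stand out.
        return []
    return [i for i, v in enumerate(values) if abs(v - mu) / sigma > threshold]


# Hypothetical feature: peak gripper speed (m/s) for eight trajectories.
peak_speed = [0.50, 0.52, 0.49, 0.51, 0.50, 0.48, 1.20, 0.51]
outlier_indices = flag_outliers(peak_speed)
print(outlier_indices)  # index 6 is the trajectory with the unusually high speed
```

A labeling interface could use such indices to decide which trajectories' keyframes to surface first, though the real system may rely on a different criterion entirely.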