MarkupLens: An AI-Powered Tool to Support Designers in Video-Based Analysis at Scale (2403.05201v1)

Published 8 Mar 2024 in cs.HC

Abstract: Video-Based Design (VBD) is a design methodology that utilizes video as a primary tool for understanding user interactions, prototyping, and conducting research to enhance the design process. AI can be instrumental in video-based design by analyzing and interpreting visual data from videos to enhance user interaction, automate design processes, and improve product functionality. In this study, we explore how AI can enhance professional video-based design with a State-of-the-Art (SOTA) deep learning model. We developed a prototype annotation platform (MarkupLens) and conducted a between-subjects eye-tracking study with 36 designers, annotating videos with three levels of AI assistance. Our findings indicate that MarkupLens improved design annotation quality and productivity. Additionally, it reduced the cognitive load that designers exhibited and enhanced their User Experience (UX). We believe that designer-AI collaboration can greatly enhance the process of eliciting insights in video-based design.
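The abstract does not specify how MarkupLens surfaces model output at its three assistance levels, but the idea can be sketched minimally: a detector runs over the video, and the annotation UI exposes none, some, or all of its predictions depending on the assistance level. The `Detection` class, `stub_detector`, `assist` function, and the 0.8 confidence threshold below are all illustrative assumptions, not the paper's implementation.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class Detection:
    frame: int          # video frame index
    label: str          # predicted object class
    confidence: float   # model confidence in [0, 1]

def stub_detector(num_frames: int) -> List[Detection]:
    """Stand-in for a deep learning detector run over a video;
    alternates high- and low-confidence predictions for illustration."""
    return [
        Detection(frame=i, label="person",
                  confidence=0.9 if i % 2 == 0 else 0.4)
        for i in range(num_frames)
    ]

def assist(detections: List[Detection], level: str) -> List[Detection]:
    """Filter model output by assistance level:
    'none'    -> designer annotates unaided,
    'suggest' -> show only high-confidence detections as suggestions,
    'full'    -> surface everything the model produced."""
    if level == "none":
        return []
    if level == "suggest":
        return [d for d in detections if d.confidence >= 0.8]
    return list(detections)

dets = stub_detector(6)
for level in ("none", "suggest", "full"):
    print(level, len(assist(dets, level)))
```

Under this sketch, increasing the assistance level monotonically grows the set of pre-annotations the designer reviews rather than creates from scratch, which is one plausible mechanism for the productivity and cognitive-load effects the study reports.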

