Unblind Text Inputs: Predicting Hint-text of Text Input in Mobile Apps via LLM (2404.02706v1)
Abstract: Mobile apps have become indispensable for accessing and participating in various environments, especially for low-vision users. Users with visual impairments can use screen readers to read the content of each screen and understand what needs to be operated. Screen readers rely on the hint-text attribute of a text input component to tell visually impaired users what to fill in. Unfortunately, in our analysis of 4,501 Android apps with text inputs, over 76% are missing hint-text. These issues are mostly caused by developers' lack of awareness of the needs of visually impaired users. To address this, we developed an LLM-based hint-text generation model called HintDroid, which analyzes the GUI information of input components and uses in-context learning to generate the hint-text. To ensure the quality of the generated hint-text, we also designed a feedback-based inspection mechanism that adjusts it further. Automated experiments show high BLEU scores, and a user study confirms the tool's usefulness. HintDroid can help not only visually impaired individuals but also other users understand the requirements of input components. HintDroid demo video: https://youtu.be/FWgfcctRbfI.
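To make the in-context learning idea concrete, here is a minimal sketch of how GUI information about a text input might be assembled into a few-shot prompt. It is an illustration only, not HintDroid's actual prompt format: the `InputComponent` fields, the `describe` and `build_prompt` helpers, and the wording of the instruction line are all assumptions introduced here, and no LLM is called.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class InputComponent:
    """Hypothetical summary of the GUI information around one text input."""
    app_name: str
    activity_name: str
    nearby_labels: List[str]
    resource_id: str


def describe(component: InputComponent) -> str:
    """Flatten the GUI information into one line of text for the prompt."""
    labels = ", ".join(component.nearby_labels) or "none"
    return (
        f"App: {component.app_name}; Page: {component.activity_name}; "
        f"Nearby text: {labels}; Input id: {component.resource_id}"
    )


def build_prompt(
    examples: List[Tuple[InputComponent, str]], target: InputComponent
) -> str:
    """Build a few-shot prompt: solved examples first, then the open query."""
    parts = ["Suggest a short hint-text for the text input described below."]
    for component, hint in examples:
        # Each demonstration pairs a component description with a known hint.
        parts.append(f"{describe(component)}\nHint-text: {hint}")
    # The target is left unanswered so the model completes the last line.
    parts.append(f"{describe(target)}\nHint-text:")
    return "\n\n".join(parts)


if __name__ == "__main__":
    # Made-up demonstration and query, for illustration only.
    demo = InputComponent("FlightApp", "SearchActivity", ["From", "To"], "et_from")
    query = InputComponent("BudgetApp", "AddExpenseActivity", ["Amount", "Save"], "et_amount")
    print(build_prompt([(demo, "Enter departure city")], query))
```

The resulting string would be sent to a language model, whose completion of the final `Hint-text:` line is the candidate hint. A feedback-based inspection step, as the abstract describes, would then check and adjust that candidate.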
Authors: Zhe Liu, Chunyang Chen, Junjie Wang, Mengzhuo Chen, Boyu Wu, Yuekai Huang, Jun Hu, Qing Wang