Screen Recognition: Creating Accessibility Metadata for Mobile Applications from Pixels (2101.04893v1)

Published 13 Jan 2021 in cs.HC

Abstract: Many accessibility features available on mobile platforms require applications (apps) to provide complete and accurate metadata describing user interface (UI) components. Unfortunately, many apps do not provide sufficient metadata for accessibility features to work as expected. In this paper, we explore inferring accessibility metadata for mobile apps from their pixels, as the visual interfaces often best reflect an app's full functionality. We trained a robust, fast, memory-efficient, on-device model to detect UI elements using a dataset of 77,637 screens (from 4,068 iPhone apps) that we collected and annotated. To further improve UI detections and add semantic information, we introduced heuristics (e.g., UI grouping and ordering) and additional models (e.g., recognize UI content, state, interactivity). We built Screen Recognition to generate accessibility metadata to augment iOS VoiceOver. In a study with 9 screen reader users, we validated that our approach improves the accessibility of existing mobile apps, enabling even previously inaccessible apps to be used.

Authors (12)
  1. Xiaoyi Zhang
  2. Lilian de Greef
  3. Amanda Swearngin
  4. Samuel White
  5. Kyle Murray
  6. Lisa Yu
  7. Qi Shan
  8. Jeffrey Nichols
  9. Jason Wu
  10. Chris Fleizach
  11. Aaron Everitt
  12. Jeffrey P. Bigham
Citations (143)