MUD: Towards a Large-Scale and Noise-Filtered UI Dataset for Modern Style UI Modeling (2405.07090v1)

Published 11 May 2024 in cs.HC

Abstract: The importance of computational modeling of mobile user interfaces (UIs) is undeniable. However, such modeling requires a high-quality UI dataset. Existing datasets are often outdated, having been collected years ago, and are frequently noisy, with mismatches in their visual representation. This presents challenges in modeling UI understanding in the wild. This paper introduces a novel approach to automatically mine UI data from Android apps, leveraging LLMs to mimic human-like exploration. To ensure dataset quality, we employ best practices in UI noise filtering and incorporate human annotation as a final validation step. Our results demonstrate the effectiveness of LLM-enhanced app exploration in mining more meaningful UIs, resulting in a large dataset, MUD, of 18k human-annotated UIs from 3.3k apps. We highlight the usefulness of MUD in two common UI modeling tasks, element detection and UI retrieval, showcasing its potential to establish a foundation for future research into high-quality, modern UIs.

Authors (5)
  1. Sidong Feng (19 papers)
  2. Suyu Ma (5 papers)
  3. Han Wang (418 papers)
  4. David Kong (2 papers)
  5. Chunyang Chen (86 papers)
Citations (3)