
Usability in Interactive Systems

Updated 1 May 2026
  • Usability is the measure of a system's ability to help users achieve specified goals effectively, efficiently, and satisfactorily in a given context.
  • It encompasses key dimensions like effectiveness, efficiency, and satisfaction, using metrics such as task completion rate and time efficiency to quantify performance.
  • Evaluation strategies range from expert heuristic reviews and static artifact inspections to empirical user studies and automated analyses with multimodal LLMs.

Usability is a key quality attribute of interactive systems, encompassing the degree to which a product enables specified users to achieve specified goals effectively, efficiently, and satisfactorily within a specified context of use. Precision in definition, rigorous methodologies, and awareness of the broader social, organizational, and technical landscape are fundamental for modern usability research and practice.

1. Formal Characterization: The Usability Quintuple

A central contemporary formulation defines the type of usability under investigation as a 5-tuple $U = (M,\,P,\,U_s,\,G,\,C)$, where:

  • $M \in \mathrm{LEVEL}$: Level of usability metrics—either internal (static artifact assessment, e.g., code-based accessibility checking), external (inspection of a running system without real users, e.g., expert heuristic review), or in use (empirical study involving real users executing tasks). This three-tiered taxonomy directly mirrors ISO/IEC 25010 quality measurement levels.
  • $P \subseteq \mathrm{PRODUCT}$: The product or system under consideration (e.g., desktop application, mobile application, online shop).
  • $U_s \subseteq \mathrm{USERS}$: The target users or participant profile, capturing demographics, expertise, and special needs.
  • $G \subseteq \mathrm{GOALS}$: The users’ goals, partitioned into do-goals (task/pragmatic; e.g., “complete checkout”) and be-goals (hedonic; e.g., “feel satisfied”).
  • $C \subseteq \mathrm{CONTEXT}$: The context of use, which must specify physical setting, device, environment, and experimental setup.

This quintuple expresses usability as a member of the Cartesian product

$$\mathrm{usability} \in \mathrm{LEVEL} \times \mathrm{PRODUCT} \times \mathrm{USERS} \times \mathrm{GOALS} \times \mathrm{CONTEXT}.$$

Specifying concrete values for all coordinates fully disambiguates which usability attribute is evaluated, transcending ambiguous or contextless uses of the term and grounding research findings in reproducible methodology. This approach generalizes the ISO 9241-11 provision that usability always pertains to “specified users, specified goals, specified context, on a product” (Speicher, 2015).
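As a concrete (if simplified) sketch, the quintuple can be encoded as a record type. The enumeration values, field names, and example instance below are illustrative choices, not notation from the cited work:

```python
from dataclasses import dataclass
from enum import Enum


class Level(Enum):
    """The three measurement levels (M), mirroring ISO/IEC 25010."""
    INTERNAL = "internal"   # static artifact assessment
    EXTERNAL = "external"   # inspection of a running system, no real users
    IN_USE = "in_use"       # empirical study with real users


@dataclass(frozen=True)
class Usability:
    """One point in LEVEL x PRODUCT x USERS x GOALS x CONTEXT."""
    level: Level
    product: str                 # P: system under consideration
    users: frozenset[str]        # U_s: participant profile descriptors
    do_goals: frozenset[str]     # G: pragmatic task goals
    be_goals: frozenset[str]     # G: hedonic goals
    context: str                 # C: physical setting, device, setup


# Fully specifying every coordinate pins down which usability
# attribute is under evaluation:
u = Usability(
    level=Level.IN_USE,
    product="online shop",
    users=frozenset({"novice", "age 18-35"}),
    do_goals=frozenset({"complete checkout"}),
    be_goals=frozenset({"feel satisfied"}),
    context="field study, smartphone, ambient noise",
)
```

Making the record immutable (`frozen=True`) reflects that a usability claim is tied to one fixed configuration of the five coordinates.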

2. Dimensions and Metrics: Effectiveness, Efficiency, and Satisfaction

Consistent with ISO 9241-11, three principal axes—effectiveness, efficiency, and satisfaction—structure the empirical measurement of usability:

  • Effectiveness: The accuracy and completeness with which users achieve goals ($\text{Effectiveness} = \frac{\text{Number of tasks completed successfully}}{\text{Total tasks assigned}}$).
  • Efficiency: Resources used in achieving effectiveness, commonly measured as time per successful task ($\text{Efficiency} = \frac{\text{Total time}}{\text{Number of successful tasks}}$), but also encompassing user effort, error recovery time, and multitasking factors.
  • Satisfaction: Users’ subjective experience, comfort, and acceptability, typically measured via standardized questionnaires (e.g., mean Likert score per item, $\text{Sat} = \frac{\sum_i s_i}{N}$ for $N$ items).
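A minimal sketch of how the three canonical metrics might be computed from session logs. The session record layout (`success`, `seconds`, `sat`) is an assumption for illustration:

```python
def usability_metrics(sessions):
    """Compute effectiveness, efficiency, and satisfaction from a list of
    task sessions: dicts with 'success' (bool), 'seconds' (float), and a
    per-session Likert satisfaction rating 'sat' (1..5)."""
    total = len(sessions)
    successes = [s for s in sessions if s["success"]]
    # Effectiveness: tasks completed successfully / tasks assigned.
    effectiveness = len(successes) / total
    # Efficiency: total time spent on successful tasks per successful task.
    efficiency = (sum(s["seconds"] for s in successes) / len(successes)
                  if successes else float("inf"))
    # Satisfaction: mean Likert rating across all sessions.
    satisfaction = sum(s["sat"] for s in sessions) / total
    return effectiveness, efficiency, satisfaction


sessions = [
    {"success": True,  "seconds": 40.0, "sat": 4},
    {"success": True,  "seconds": 50.0, "sat": 5},
    {"success": False, "seconds": 90.0, "sat": 2},
    {"success": True,  "seconds": 30.0, "sat": 4},
]
eff, time_per_task, sat = usability_metrics(sessions)
# eff = 3/4 = 0.75; time_per_task = 120/3 = 40.0 s; sat = 15/4 = 3.75
```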

Usability subcharacteristics frequently include learnability, memorability, error tolerance (rate, severity, recoverability), and cognitive load, with operationalization tailored to platform and context (Weichbroth, 16 Feb 2025). In mobile application usability (PACMAD+3 model), factors such as cognitive load, errors, learnability, operability, understandability, and memorability are recognized in addition to the canonical triad of effectiveness, efficiency, and satisfaction (Weichbroth, 16 Feb 2025).

3. Methodologies and Evaluation Strategies

3.1. Approaches and Metrics

Method choices span the internal/external/in-use hierarchy (Speicher, 2015):

  • Internal: Code or static artifact inspection for accessibility, consistency, or conformity.
  • External: Expert-based heuristic evaluation (e.g., using Nielsen’s 10 heuristics), cognitive walkthroughs, and guideline checklists.
  • In Use: Empirical user studies involving task performance, direct observation, video/screen capture, and post-hoc surveys (e.g., System Usability Scale (SUS): $\mathrm{SUS} = 2.5 \sum_{i=1}^{10} c_i$, where $c_i$ is the item-adjusted score per the standard rubric (Zermeño et al., 16 Jul 2025, Islam et al., 2020, Vincenti et al., 2017)).
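The standard SUS scoring rubric (odd-numbered, positively worded items contribute $s_i - 1$; even-numbered items contribute $5 - s_i$; the sum is scaled by 2.5 to a 0–100 range) can be sketched as:

```python
def sus_score(responses):
    """System Usability Scale: `responses` is a list of ten Likert ratings
    (1..5) for the standard SUS items, in order. Returns a 0..100 score."""
    if len(responses) != 10:
        raise ValueError("SUS requires exactly 10 item responses")
    adjusted = [
        (r - 1) if i % 2 == 0 else (5 - r)  # items 1,3,... positively worded
        for i, r in enumerate(responses)
    ]
    return 2.5 * sum(adjusted)


# A uniformly neutral response (all 3s) adjusts to 2 per item: 2.5 * 20 = 50.0
print(sus_score([3] * 10))  # -> 50.0
```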

Heuristic evaluation, pioneered by Nielsen, remains fundamental, but the advent of specialized domains has led to procedural methodologies for domain-specific heuristic development (e.g., PROMETHEUS, an eight-stage process including domain characterization, retrieval and prioritization of candidate heuristics, detailed description, experimental validation, and iterative refinement, with quantitative quality indicators for problem uniqueness, dispersion, severity, and specificity (Jimenez et al., 2018)).

3.2. Context in Usability Evaluation

Physical and social context are recognized as first-class components influencing both outcomes and replicability (Trivedi et al., 2012):

  • Physical context: Laboratory vs. field, ambient factors, device constraints, light, noise, and mobility.
  • Social context: Presence of facilitators, observers, paired vs. solo use, and participant relationships.

Empirical evidence demonstrates that physical and social environment can confound or amplify usability issues, with laboratory testing offering control and replicability but field studies revealing real-world interaction effects (Trivedi et al., 2012).

4. Automation and Scaling: Multimodal LLMs and Crowdsourcing

Recent advances in multimodal LLMs (MLLMs) have enabled automated, heuristic-based usability evaluation and recommendation pipelines:

  • Models ingest textual descriptions, task definitions, and visual artifacts (screen recordings or screenshots).
  • Externally guided by explicit prompt templates, these systems operationalize Nielsen’s heuristics (or domain-specific criteria), returning structured issue descriptions, recommendations, and severity rankings (Lubos et al., 28 Apr 2026, Lubos et al., 18 Nov 2025, Lubos et al., 22 Aug 2025).
  • Severity is ranked via latent scoring (considering impact, frequency, and remediation effort).
  • Human-in-the-loop studies find these approaches yield clarity, plausibility, and substantial completeness—furnishing low-effort, actionable outputs for both non-experts and distributed teams.
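One way such a pipeline might be wired up is sketched below. The prompt wording, JSON issue schema, and severity scale are illustrative assumptions, not any specific system's interface:

```python
import json
from dataclasses import dataclass

# Hypothetical prompt template operationalizing Nielsen's heuristics.
PROMPT_TEMPLATE = """You are a usability expert applying Nielsen's 10 heuristics.
Task under evaluation: {task}
For each usability issue found in the attached screenshots, return a JSON list
of objects with keys: "heuristic", "description", "recommendation", and
"severity" (an integer weighing impact, frequency, and remediation effort)."""


@dataclass
class Issue:
    heuristic: str
    description: str
    recommendation: str
    severity: int


def parse_issues(raw_json: str) -> list[Issue]:
    """Validate the model's structured output and rank it most severe first."""
    issues = [Issue(**item) for item in json.loads(raw_json)]
    return sorted(issues, key=lambda i: i.severity, reverse=True)


# Example of the structured response shape such a pipeline expects:
raw = (
    '[{"heuristic": "Visibility of system status", '
    '"description": "No progress indicator during checkout", '
    '"recommendation": "Add a step indicator", "severity": 3}]'
)
issues = parse_issues(raw)
```

Keeping the output schema machine-checkable is what makes the severity ranking and human-in-the-loop review steps tractable at scale.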

Crowdsourcing platforms enable early, large-scale usability evaluation but typically sacrifice the depth and control of closed-lab testing, highlighting a trade-off between speed/diversity and fine-grained observational fidelity (Liu et al., 2012).

5. Domain-Specific and Contextual Variants

In open source software projects, usability is a recurring but underrepresented concern, primarily discussed through issue tracking systems and most often focused on efficiency and aesthetic/minimalist design, with significantly higher visual artifact use (screenshots) in usability issues compared to non-usability issues (Sanei et al., 2023).

In scientific software, usability takes on additional dimensions: scriptability, batch operation support, precise configuration, reproducibility, and the capacity to accommodate evolving research requirements—often exceeding conventional graphical or ergonomics-focused concerns (Queiroz et al., 2017).

Specialized evaluation methodologies, such as the linguistic decision-making (LDM-WUE) approach for web usability A/B testing, aggregate multiple test criteria (SUS, NPS, empirical usability tasks, accessibility scores) using fuzzy 2-tuple models and multi-criteria decision analysis (FAHP–TOPSIS) to produce role- and persona-specific, interpretable rankings of alternatives (Zermeño et al., 16 Jul 2025).
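A plain (non-fuzzy) TOPSIS step, the kind of multi-criteria ranking underlying the FAHP–TOPSIS aggregation described above, can be sketched as follows. The criteria, scores, and weights are invented for illustration, and all criteria are treated as benefit criteria (higher is better):

```python
import math


def topsis(matrix, weights):
    """Rank alternatives by TOPSIS closeness to the ideal solution.
    `matrix[i][j]` is alternative i's score on benefit criterion j."""
    n_alt, n_crit = len(matrix), len(matrix[0])
    # Vector-normalize each criterion column, then apply criterion weights.
    norms = [math.sqrt(sum(matrix[i][j] ** 2 for i in range(n_alt)))
             for j in range(n_crit)]
    v = [[weights[j] * matrix[i][j] / norms[j] for j in range(n_crit)]
         for i in range(n_alt)]
    # Ideal and anti-ideal points, then relative closeness per alternative.
    ideal = [max(v[i][j] for i in range(n_alt)) for j in range(n_crit)]
    anti = [min(v[i][j] for i in range(n_alt)) for j in range(n_crit)]
    return [math.dist(row, anti) / (math.dist(row, ideal) + math.dist(row, anti))
            for row in v]


# Two design variants scored on SUS, NPS, task success rate, accessibility:
scores = [[72.0, 30.0, 0.85, 0.90],
          [81.0, 45.0, 0.80, 0.95]]
weights = [0.4, 0.2, 0.3, 0.1]
c = topsis(scores, weights)  # the second variant attains the higher closeness
```

The fuzzy 2-tuple variants replace the crisp scores with linguistic assessments, but the ideal/anti-ideal distance logic is the same.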

6. Theoretical Models and Standard Alignment

Comprehensive models situate usability as a two-dimensional map from system properties to user activities (Winter et al., 2016):

  • Facts: Properties of the system (e.g., input/output channels, dialogue management, user characteristics).
  • Activities: User actions (formulate goal, execute, evaluate).
  • Impacts: Each fact/attribute pair exerts a positive or negative influence on an activity/attribute pair, i.e., a signed relation

$$\mathrm{impact}\colon \mathrm{FACTS} \times \mathrm{ACTIVITIES} \to \{+,\,-\}.$$

This structure enables rigorous isolation of gaps and contradictions in standards (e.g., ISO 15005), supports automatic guideline generation, and provides a testable basis for future quantitative modeling of usability impacts (Winter et al., 2016).
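The fact/activity impact relation can be sketched as a signed map; the concrete facts, activities, and signs below are illustrative assumptions, not entries from the cited model:

```python
# Signed impacts: (fact attribute, activity attribute) -> "+" or "-"
impacts: dict[tuple[str, str], str] = {
    ("output channel: audible feedback", "evaluate: perceive system state"): "+",
    ("input channel: small touch targets", "execute: select action"): "-",
    ("dialogue management: deep menu nesting", "formulate goal: locate function"): "-",
}


def influences_on(activity: str) -> list[tuple[str, str]]:
    """All (fact, sign) pairs bearing on a given activity attribute —
    a query useful for isolating gaps or contradictions in a guideline set."""
    return [(fact, sign) for (fact, act), sign in impacts.items()
            if act == activity]
```

An activity attribute with no incoming impacts, or with contradictory signs from related facts, is exactly the kind of gap the model is meant to surface.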

Standard frameworks—ISO 9241-11, ISO/IEC 25010—are closely mirrored in these formulations. The usability quintuple directly instantiates the ISO logic of “product, user, goal, context” with empirical and practical traceability (Speicher, 2015).

7. Socio-Political and Organizational Dimensions

Usability engineering is not neutral; traditional task-centric processes often prioritize organizational or commercial imperatives, subordinating user autonomy and awareness (Sayeed, 2019). The design of user interfaces can entrench patterns of use and reinforce limited user models, creating a feedback loop (the “politics of amnesia”) in which users lose agency and critical knowledge as the interface both shapes and reflects constructed needs.

Refined methodologies suggest reconstructing the user as an active participant, supporting transparency, command-based models (“describe & command” rather than “see & point”), and interfaces that reward expertise—breaking the cycle of enforced invisibility and restoring negotiation power between user and system (Sayeed, 2019).


