- The paper presents PAD, a keyboard-driven interaction paradigm that combines inline preview, candidate cycling, and timing-based acceptance to minimize hand motion.
- The system achieves substantial motion reduction, saving an average of 600 px per replaced click, and matches trackpad task speeds when AI predictions are highly accurate.
- The authors demonstrate PAD’s promise for enhanced accessibility and legacy hardware deployment through an open-source, browser-based implementation without external APIs.
Preview-Accept-Discard (PAD): A Predictive Low-Motion Interaction Paradigm for Ergonomic GUI Control
Introduction
The paper presents the Preview-Accept-Discard (PAD) interaction paradigm, a predictive, keyboard-driven approach to graphical user interface (GUI) control that aims to reduce repetitive strain injury (RSI) risk by shifting input effort away from conventional pointing devices (mice, trackpads) to the keyboard. PAD leverages AI-assisted target prediction, explicit preview, and timing-based acceptance/rejection via chorded input sequences. The central motivation is both ergonomic and cognitive: PAD seeks to minimize fine-motor control demands while preserving user agency and lowering shortcut memorization burdens.
RSI prevalence among computer users remains unacceptably high, and conventional ergonomic redesigns have not fully mitigated hand motion demands. The paper builds upon hybrid models in HCI theory, specifically those that formalize GUI selection as the joint outcome of decision cost (Hick's Law) and pointing cost (Fitts's Law). Prior systems (e.g., One-Press [12], ModeKeys [6], AimKeys [6], and KeyMap [24]) each contributed discrete advances (preview mechanisms, keyboard-centric navigation, and guided shortcut learning), but none synthesized predictive target ranking, bounded cycling, and release-timing acceptance in a manner suitable for standard keyboards.
PAD fills this gap by integrating inline preview, candidate cycling via the spacebar, and acceptance semantics through key-release timing, operationalized entirely on commodity hardware.
System Architecture and Interaction Grammar
PAD adopts a minimalist interaction grammar based on simultaneous and sequential key holds/releases:
- Holding Z+X initiates PAD mode, previewing the highest-ranked prediction.
- Pressing the spacebar cycles through up to 6 ranked candidates, analogous to top-N accuracy metrics in spell checkers (typically N=3).
- Releasing Z and X simultaneously (within a 170 ms window) commits the selected prediction; releasing them sequentially discards it.
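The grammar above can be read as a small state machine. The following TypeScript sketch is illustrative only and is not the authors' implementation: the class name `PadChord`, its methods, and the wrap-around cycling behavior are assumptions, while the Z+X chord, spacebar cycling, the 6-candidate cap, and the 170 ms window come from the description. Key names follow the values of `KeyboardEvent.key`; wiring to real DOM events is omitted.

```typescript
// Hypothetical sketch of the PAD grammar as a state machine.
type PadState = "idle" | "preview" | "accepted" | "discarded";

const RELEASE_WINDOW_MS = 170; // simultaneous-release window from the text
const MAX_CANDIDATES = 6;      // upper bound on cycled candidates

class PadChord {
  state: PadState = "idle";
  index = 0; // index of the currently previewed candidate
  private readonly candidateCount: number;
  private held = new Set<string>();            // chord keys currently down
  private firstReleaseAt: number | null = null; // time of first chord-key release

  constructor(candidateCount: number) {
    this.candidateCount = Math.max(1, Math.min(candidateCount, MAX_CANDIDATES));
  }

  keyDown(key: string): void {
    if (key === "z" || key === "x") {
      this.held.add(key);
      if (this.held.size === 2) {
        if (this.state !== "preview") {
          // Both chord keys held: enter PAD mode on the top-ranked prediction.
          this.state = "preview";
          this.index = 0;
        }
        this.firstReleaseAt = null;
      }
    } else if (key === " " && this.state === "preview" && this.firstReleaseAt === null) {
      // Assumption: cycling wraps around after the last candidate.
      this.index = (this.index + 1) % this.candidateCount;
    }
  }

  keyUp(key: string, timeMs: number): void {
    if ((key !== "z" && key !== "x") || !this.held.has(key)) return;
    this.held.delete(key);
    if (this.state !== "preview") return;
    if (this.firstReleaseAt === null) {
      this.firstReleaseAt = timeMs; // first key up: start timing
    } else {
      // Second key up: the gap between releases decides the outcome.
      const gap = timeMs - this.firstReleaseAt;
      this.state = gap <= RELEASE_WINDOW_MS ? "accepted" : "discarded";
    }
  }
}
```

In this sketch the accept/discard decision is made when the second chord key is released; a real implementation might instead discard as soon as the window expires with one key still held.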
Predicted GUI targets are visually highlighted using animated curved chords for preview, with transitions designed to maintain perceptual continuity and agency. The selection logic generalizes to any system where AI can rank actionable DOM elements.
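As a rough illustration of ranking actionable elements, the sketch below filters and orders candidate targets by a model score and keeps the top six. The `Target` shape, the `enabled` flag, and the function name `rankTargets` are hypothetical; the summary does not describe how the prototype represents or scores targets.

```typescript
// Hypothetical candidate record; field names are assumptions for illustration.
interface Target {
  id: string;       // element identifier
  score: number;    // model-assigned likelihood, higher is better
  enabled: boolean; // whether the element can currently be activated
}

// Keep only actionable targets, order by descending score, cap at `limit`.
function rankTargets(targets: Target[], limit = 6): Target[] {
  return targets
    .filter((t) => t.enabled)
    .sort((a, b) => b.score - a.score)
    .slice(0, limit);
}
```

The first element of the returned list would be the prediction shown when PAD mode starts, and the remainder would be reachable by cycling.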
Implementation Highlights:
- All input processing occurs client-side; no external APIs required.
- The system is open-sourced and browser-based (React Native), enabling dissemination, replication, and deployment on legacy hardware.
- Chord animations follow principles of feed-forward and transparency, supporting discoverability and explicit control.
Evaluation Methodology and Results
Task Domains
- Email Client Mockup: Within-subjects study employing a reply-and-send sequence with deterministic predictions.
- ISO 9241-9 Keyboard-Prediction Task: Standardized pointing throughput assessment across varied top-N accuracies.
Key Results
- Motion Reduction: PAD effectively eliminates pointer travel for the clicks it replaces; relative to trackpad use, participants saved an average of 3,000 px per five accepted chord suggestions (about 600 px per replaced click).
- Task-Completion Time: PAD matches trackpad performance when top-1 prediction accuracy is at spell-checker levels (≥90%).
- Agency: Participants reported high perceived control and situational awareness due to explicit acceptance/rejection and preview feedback.
- Error Rate: PAD with ideal top-3 accuracy yields lower missed target rates than trackpad operation.
- Learning Curve: Mastery required ~2 minutes; onboarding progressively introduced PAD grammar.
Notably, PAD rarely exceeds the speed of a trackpad except under near-perfect AI prediction. The cognitive load of verifying and cycling suggestions offsets some of the ergonomic savings unless model accuracy is very high.
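The motion-reduction figure reported above is a simple ratio of pointer travel to replaced clicks. The sketch below shows one way such a measure could be computed from sampled pointer positions; the summary does not state how travel was measured, so the Euclidean path-length approach and the names `pathLength` and `savedPerClick` are assumptions.

```typescript
interface Point {
  x: number;
  y: number;
}

// Total Euclidean length, in pixels, of a sampled pointer trajectory.
function pathLength(samples: Point[]): number {
  let total = 0;
  let prev: Point | undefined;
  for (const p of samples) {
    if (prev) total += Math.hypot(p.x - prev.x, p.y - prev.y);
    prev = p;
  }
  return total;
}

// Average pointer travel saved per click replaced by an accepted suggestion.
function savedPerClick(travelPx: number, replacedClicks: number): number {
  if (replacedClicks <= 0) throw new RangeError("replacedClicks must be positive");
  return travelPx / replacedClicks;
}
```

With the reported numbers, `savedPerClick(3000, 5)` gives the 600 px per replaced click quoted in the results.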
Trade-offs and Implementation Considerations
| Approach | Motion Reduction | Speed Gain | Cognitive Demand | User Agency | Input Device Requirement |
| --- | --- | --- | --- | --- | --- |
| Conventional Trackpad | None | Baseline | Baseline | High | Trackpad, Mouse |
| PAD @ High Accuracy | Substantial | Marginal | Slight Increase | High | Keyboard |
| PAD @ Low Accuracy | Substantial | Reduced | Higher | High | Keyboard |
- Optimal Use-Case: PAD excels in domains with predictable, repetitive UI actions and high-quality model prediction (e.g., email clients, productivity suites).
- Limitations: The prototype used deterministic, hard-coded predictions. Performance and acceptance may degrade with lower predictive accuracy. Release-timing acceptance may be inaccessible for some users (e.g., those with limited dexterity).
- Scalability: PAD is hardware-agnostic and low-latency, suitable for web applications and low-cost devices.
Implications and Future Directions
Practical Implications
- RSI Mitigation: PAD provides a pathway to substantially reduce hand motion, a critical factor in RSI etiology. Its deployment can extend the operational lifespan of aging devices with degraded pointing hardware, promoting sustainable computing practices.
- Accessibility: Reduces dependency on mice/trackpads for users with tremor, limited dexterity, or chronic strain.
- Inclusion: Facilitates device usability in resource-limited settings where repair or replacement is nontrivial.
Theoretical Contributions
- PAD exemplifies a hybrid Hick-Fitts model of predictive interaction, confirming that cognitive verification and motor effort co-exist even under ideal AI performance.
- The research highlights the necessity of explicit agency-preservation mechanisms in human-AI interfaces, reinforcing best practices in feed-forward UX and transparency.
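For reference, the two laws behind the hybrid model have standard textbook forms, shown below with Fitts's Law in its Shannon formulation. The additive combination in the last line is only the simplest reading of a "joint outcome" of decision and pointing cost; it is an assumption, and the paper's exact formulation is not reproduced here. The symbols are likewise assumed: $n$ is the number of equally likely choices, $D$ the distance to the target, $W$ the target width, and $a_H, b_H, a_F, b_F$ empirically fitted constants.

```latex
\begin{align}
T_{\text{decide}} &= a_H + b_H \log_2(n + 1) \\
T_{\text{point}}  &= a_F + b_F \log_2\!\left(\frac{D}{W} + 1\right) \\
T_{\text{select}} &= T_{\text{decide}} + T_{\text{point}}
\end{align}
```

Under this reading, a predictive keyboard technique mainly reduces the pointing term, while the decision term remains because the user still has to verify each previewed candidate.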
Future Research Directions
- Dataset Synthesis: Collection and training of point-and-click prediction models at the HTML DOM level.
- Longitudinal Outcomes: Clinical validation of fatigue reduction and RSI risk mitigation via EMG and motion-tracking.
- Adaptive Acceptance: Calibration and personalization of chord release-timing based on user-specific motor profiles.
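One possible shape for the adaptive-acceptance idea is sketched below: the release window is set from a percentile of a user's observed gaps between chord-key releases during intended simultaneous releases, then clamped to a safe range. This is a speculative illustration of a future direction, not something the paper reports; the function name, the 95th-percentile choice, and the 80 to 400 ms clamp are all assumptions, with only the 170 ms default taken from the prototype.

```typescript
// Hypothetical calibration of the release window from per-user samples.
// gapsMs: observed gaps (ms) between the two key releases when the user
// intended a simultaneous release.
function calibrateReleaseWindow(
  gapsMs: number[],
  percentile = 0.95, // assumed coverage target
  minMs = 80,        // assumed lower clamp
  maxMs = 400,       // assumed upper clamp
  fallbackMs = 170,  // prototype default from the text
): number {
  if (gapsMs.length === 0) return fallbackMs;
  const sorted = [...gapsMs].sort((a, b) => a - b);
  // Nearest-rank percentile, kept within array bounds.
  const rank = Math.ceil(percentile * sorted.length) - 1;
  const idx = Math.min(sorted.length - 1, Math.max(0, rank));
  const value = sorted[idx] ?? fallbackMs;
  return Math.min(maxMs, Math.max(minMs, value));
}
```

A wider window makes acceptance easier for users with slower or less synchronized releases, at the cost of making an intentional discard require a longer pause between releases.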
Conclusion
The PAD paradigm demonstrates the feasibility and value of shifting GUI control from motor to cognitive domains via predictive, preview-driven keyboard shortcuts. By operationalizing interaction as a sequence of preview, selection, and timing-based confirmation, PAD reduces hand motion with minimal loss in performance, elevates user agency, and reduces shortcut memorization burdens. The approach is deployable on legacy hardware, offers advantages for accessibility and sustainability, and opens new avenues for AI-assisted ergonomics. Further work is required to validate long-term benefits and optimize real-world adoption in diverse user populations.