- The paper introduces a two-stage federated learning method that trains models on-device to boost query click-through rates while preserving user privacy.
- The approach employs logistic regression with historical and temporal features to address challenges like diurnal variability and population skew in training.
- Results indicate marked improvements in CTR, though live metrics diverged from offline expectations due to environmental and device conditions; further gains are anticipated through enhancements like LSTM-based featurization.
Applied Federated Learning: Improving Google Keyboard Query Suggestions
This paper presents a case study of federated learning (FL) implemented at commercial scale, specifically for enhancing the query suggestion feature of the Google Keyboard (Gboard). The authors focus on leveraging FL to improve user experience and privacy simultaneously.
Gboard, a virtual keyboard for mobile devices, presents a unique opportunity for on-device training due to its extensive user base exceeding 1 billion installations as of 2018. The challenge lies in developing systems that respect user privacy while maintaining low latency in query suggestion dynamics. This paper addresses these challenges by utilizing FL to train models directly on user devices, ensuring sensitive data never leaves the user's device.
Federated Learning in Context
FL is a distributed ML approach in which model training is decentralized. Traditionally, raw user data is aggregated on central servers; FL instead computes model updates on each device and aggregates only those updates centrally. This design is advantageous for privacy-sensitive applications and for large-scale data sets that are cumbersome to collect centrally.
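The aggregation step described above can be sketched as federated averaging: each client sends back only a weight delta computed locally, and the server combines deltas weighted by each client's example count. This is a minimal illustration of the idea, not the paper's production system; the toy values are invented for the example.

```python
import numpy as np

def federated_averaging(global_weights, client_updates):
    """Combine on-device model updates into a new global model.

    client_updates: list of (num_examples, weight_delta) pairs. Only the
    delta leaves each device; raw training data never does.
    """
    total_examples = sum(n for n, _ in client_updates)
    # Weight each client's delta by its share of the training examples.
    avg_delta = sum(n * delta for n, delta in client_updates) / total_examples
    return global_weights + avg_delta

# Toy round: three clients with different data volumes (hypothetical values).
w = np.zeros(4)
updates = [
    (10, np.ones(4)),        # small client, delta +1
    (30, np.full(4, 2.0)),   # medium client, delta +2
    (60, np.full(4, -1.0)),  # large client, delta -1
]
w = federated_averaging(w, updates)  # weighted average of the deltas
```

The example-count weighting matters in practice: under the diurnal availability patterns discussed later, the mix of participating clients shifts from round to round, and the weighting keeps each round's update proportional to the data actually seen.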
Implementation Overview
The paper centers on a two-stage model approach. First, a baseline model trained using traditional server-based methods generates query suggestions. Second, an FL-trained triggering model filters these suggestions to improve query click-through rate (CTR). The triggering model uses logistic regression over features such as historical user interactions and temporal data.
Training Observations
Several insights emerged from federated training experimentation, including:
- Diurnal Variability: Training largely occurs when user devices meet specific requirements (e.g., charging, idle, and on a Wi-Fi network), leading to varying model training speeds.
- Population Skew: Differences in training and deployment populations due to geographic and device constraints impacted model performance during live tests.
Figures within the paper, such as those depicting training progression and evaluation loss, highlight significant variability during training, especially between peak and off-peak hours.
Practical and Theoretical Implications
Deployments resulted in marked improvements in CTR, although divergences between expected and actual outcomes were noted; the discrepancy is attributed to environmental conditions, device specifications, and the success rate of client training rounds. Later iterations achieved additional gains, especially after incorporating an LSTM for featurization, indicating further potential for model enhancement under FL.
Future Developments
The exploration in this paper suggests multiple pathways for advancing FL. Relaxing the environmental conditions required for on-device training and addressing sources of population skew may yield more accurate models. As the framework matures, the gap between expected and live metrics is expected to narrow, paving the way for broader application across other domains within AI.
The paper serves as a comprehensive account of FL applied at scale, illustrating both the possibilities and the complexities involved in its implementation. The continued iteration and refinement suggest a promising future for FL-trained models, particularly in privacy-critical applications like Gboard.