Enabling On-Device Training of Speech Recognition Models with Federated Dropout (2110.03634v1)
Abstract: Federated learning can be used to train machine learning models at the edge on local data that never leave devices, providing privacy by default. However, it introduces challenges related to the communication and computation costs borne by clients' devices. These costs are strongly correlated with the size of the model being trained, and are significant for state-of-the-art automatic speech recognition models. We propose using federated dropout to reduce the size of client models while training a full-size model server-side. We provide empirical evidence of the effectiveness of federated dropout, and propose a novel approach to vary the dropout rate applied at each layer. Furthermore, we find that federated dropout enables a set of smaller sub-models within the larger model to independently achieve low word error rates, making it easier to dynamically adjust the size of the model deployed for inference.
- Dhruv Guliani
- Lillian Zhou
- Changwan Ryu
- Tien-Ju Yang
- Harry Zhang
- Yonghui Xiao
- Giovanni Motta
- Francoise Beaufays
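The federated dropout scheme the abstract describes — the server maintains a full-size model but sends each client only a dropped-out sub-model, with a per-layer dropout rate — can be sketched as follows. This is a minimal NumPy illustration, not the paper's implementation; the function names, the two-layer model, and the specific keep fractions are all hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_submodel(weights, keep_fracs):
    """Server side: for each layer, pick which output units a client
    will train (federated dropout), and extract the smaller sub-matrices.
    keep_fracs varies per layer, as the paper's per-layer dropout does."""
    masks, subs = [], []
    for w, frac in zip(weights, keep_fracs):
        n_out = w.shape[0]
        keep = rng.choice(n_out, size=max(1, int(frac * n_out)), replace=False)
        keep.sort()
        masks.append(keep)
        subs.append(w[keep, :].copy())
    return masks, subs

def merge_update(weights, masks, client_subs):
    """Server side: scatter the client's sub-model update back into the
    corresponding rows of the full-size model."""
    for w, keep, sub in zip(weights, masks, client_subs):
        w[keep, :] = sub
    return weights

# Hypothetical full server-side model: two dense layers.
full = [rng.normal(size=(8, 4)), rng.normal(size=(6, 8))]

# Illustrative per-layer keep fractions (1 - dropout rate).
keep_fracs = [0.5, 0.75]
masks, subs = sample_submodel(full, keep_fracs)

# Stand-in for a client's local training step on the smaller sub-model.
client_subs = [s - 0.01 * np.sign(s) for s in subs]

full = merge_update(full, masks, client_subs)
```

The client only ever downloads, trains, and uploads the sub-matrices, so its communication and computation costs shrink with the dropout rate, while the server still aggregates updates into the full-size model.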