Population Expansion for Training Language Models with Private Federated Learning (2307.07477v1)
Abstract: Federated learning (FL) combined with differential privacy (DP) enables ML training across distributed devices with a formal privacy guarantee. With a large population of devices, FL with DP produces a performant model in a timely manner. However, for applications with a smaller population, not only does the model utility degrade, since the impact of the DP noise is inversely proportional to the population size, but the training latency also increases, since waiting for enough clients to become available from a smaller pool is slower. In this work, we thus propose expanding the population using domain adaptation techniques to speed up training and improve the final model quality when training with small populations. We empirically demonstrate that our techniques can improve the utility by 13% to 30% on real-world language modeling datasets.
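The abstract's claim that the impact of DP noise shrinks as the population grows can be seen in a standard Gaussian-mechanism DP-FedAvg round. The sketch below is illustrative only and not code from the paper; the function name `dp_fedavg_round` and its parameters (`clip_norm`, `noise_multiplier`) are assumptions chosen to mirror the usual DP-FedAvg formulation.

```python
import numpy as np

def dp_fedavg_round(client_updates, clip_norm=1.0, noise_multiplier=1.0, rng=None):
    """One Gaussian-mechanism DP-FedAvg round (illustrative sketch only).

    The noise added to the *sum* of clipped updates has a fixed std of
    noise_multiplier * clip_norm, so after averaging over n participating
    clients the effective noise std is noise_multiplier * clip_norm / n:
    the noise impact is inversely proportional to the population size.
    """
    rng = rng or np.random.default_rng()
    # Clip each client's update to bound its sensitivity.
    clipped = []
    for u in client_updates:
        norm = np.linalg.norm(u)
        clipped.append(u * min(1.0, clip_norm / max(norm, 1e-12)))
    n = len(clipped)
    # Add calibrated Gaussian noise to the sum, then average.
    noisy_sum = np.sum(clipped, axis=0) + rng.normal(
        scale=noise_multiplier * clip_norm, size=clipped[0].shape
    )
    return noisy_sum / n  # effective noise std: noise_multiplier * clip_norm / n

# With 100 clients the averaged update is roughly 10x less noisy than with 10,
# which is why small populations hurt utility under the same DP budget.
```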
- Tatsuki Koga (6 papers)
- Congzheng Song (23 papers)
- Martin Pelikan (9 papers)
- Mona Chitnis (5 papers)