
Differentially Private Language Models Benefit from Public Pre-training

Published 13 Sep 2020 in cs.LG, cs.CL, and cs.CR (arXiv:2009.05886v2)

Abstract: Language modeling is a keystone task in natural language processing. When training a language model on sensitive information, differential privacy (DP) allows us to quantify the degree to which our private data is protected. However, training algorithms which enforce differential privacy often lead to degradation in model quality. We study the feasibility of learning a language model which is simultaneously high-quality and privacy preserving by tuning a public base model on a private corpus. We find that DP fine-tuning boosts the performance of language models in the private domain, making the training of such models possible.
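The recipe the abstract describes, public pre-training followed by DP fine-tuning on a private corpus, is typically implemented with DP-SGD (Abadi et al., 2016): clip each example's gradient, add Gaussian noise, then take an ordinary optimizer step. Below is a minimal sketch of one such fine-tuning step in PyTorch. The HuggingFace-style model interface, the clipping norm, and the noise multiplier are illustrative assumptions, not details taken from this paper.

```python
import torch

def dp_sgd_step(model, optimizer, batch, max_grad_norm=1.0, noise_multiplier=1.0):
    """One DP-SGD step: per-example clipping, Gaussian noise, averaged update.
    Hyperparameter values here are placeholders, not the paper's settings."""
    params = [p for p in model.parameters() if p.requires_grad]
    summed = [torch.zeros_like(p) for p in params]

    # Microbatches of size 1 yield per-example gradients (simple, not fast).
    for input_ids in batch:
        optimizer.zero_grad()
        # Assumes a HuggingFace-style causal LM whose output carries .loss.
        loss = model(input_ids.unsqueeze(0), labels=input_ids.unsqueeze(0)).loss
        loss.backward()
        # Clip this example's gradient to L2 norm at most max_grad_norm.
        total_norm = torch.sqrt(sum(p.grad.norm() ** 2 for p in params))
        scale = min(1.0, max_grad_norm / (float(total_norm) + 1e-6))
        for s, p in zip(summed, params):
            s.add_(p.grad, alpha=scale)

    # Noise calibrated to the clipping bound is what yields the DP guarantee,
    # once the cumulative (epsilon, delta) cost is tracked by an accountant.
    for s, p in zip(summed, params):
        p.grad = (s + torch.randn_like(s) * noise_multiplier * max_grad_norm) / len(batch)
    optimizer.step()
```

In practice one would use a per-sample gradient library such as Opacus rather than size-1 microbatches, and track the total privacy loss across steps with a moments accountant; the sketch only shows the clip-noise-average mechanics.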

Citations: 50
