Can Copyright be Reduced to Privacy? (2305.14822v2)
Abstract: There is a growing concern that generative AI models will generate outputs closely resembling the copyrighted materials on which they are trained. This worry has intensified as the quality and complexity of generative models have improved immensely and as extensive datasets containing copyrighted material have become more widely available. Researchers are actively exploring strategies to mitigate the risk of generating infringing samples, and a recent line of work suggests employing techniques such as differential privacy and other forms of algorithmic stability to provide guarantees against infringing copying. In this work, we examine whether such algorithmic stability techniques are suitable to ensure the responsible use of generative models without inadvertently violating copyright laws. We argue that while these techniques aim to verify the presence of identifiable information in datasets, and are thus privacy-oriented, copyright law aims to promote the use of original works for the benefit of society as a whole, provided that no unlicensed use of protected expression occurs. These fundamental differences between privacy and copyright must not be overlooked. In particular, we demonstrate that while algorithmic stability may be perceived as a practical tool to detect copying, such copying does not necessarily constitute copyright infringement. Therefore, if adopted as a standard for detecting and establishing copyright infringement, algorithmic stability may undermine the intended objectives of copyright law.
- Niva Elkin-Koren
- Uri Hacohen
- Roi Livni
- Shay Moran