Contextual Confidence and Generative AI (2311.01193v2)
Abstract: Generative AI models perturb the foundations of effective human communication. They present new challenges to contextual confidence, disrupting participants' ability to identify the authentic context of communication and their ability to protect communication from reuse and recombination outside its intended context. In this paper, we describe strategies--tools, technologies and policies--that aim to stabilize communication in the face of these challenges. The strategies we discuss fall into two broad categories. Containment strategies aim to reassert context in environments where it is currently threatened--a reaction to the context-free expectations and norms established by the internet. Mobilization strategies, by contrast, view the rise of generative AI as an opportunity to proactively set new and higher expectations around privacy and authenticity in mediated communication.