Responsible Reporting for Frontier AI Development (2404.02675v1)
Published 3 Apr 2024 in cs.CY and cs.AI
Abstract: Mitigating the risks from frontier AI systems requires up-to-date and reliable information about those systems. Organizations that develop and deploy frontier systems have significant access to such information. By reporting safety-critical information to actors in government, industry, and civil society, these organizations could improve visibility into new and emerging risks posed by frontier systems. Equipped with this information, developers could make better-informed decisions on risk management, while policymakers could design more targeted and robust regulatory infrastructure. We outline the key features of responsible reporting and propose mechanisms for implementing them in practice.
Authors: Noam Kolt, Markus Anderljung, Joslyn Barnhart, Asher Brass, Kevin Esvelt, Gillian K. Hadfield, Lennart Heim, Mikel Rodriguez, Jonas B. Sandbrink, Thomas Woodside