Increased Compute Efficiency and the Diffusion of AI Capabilities (2311.15377v2)
Abstract: Training advanced AI models requires large investments in computational resources, or compute. Yet, as hardware innovation reduces the price of compute and algorithmic advances make its use more efficient, the cost of training an AI model to a given performance falls over time, a concept we describe as increasing compute efficiency. We find that while an access effect increases the number of actors who can train models to a given performance over time, a performance effect simultaneously increases the performance available to each actor. This potentially enables large compute investors to pioneer new capabilities, maintaining a performance advantage even as capabilities diffuse. Since large compute investors tend to develop new capabilities first, it will be particularly important that they share information about their AI models, evaluate them for emerging risks, and, more generally, make responsible development and release decisions. Further, as compute efficiency increases, governments will need to prepare for a world where dangerous AI capabilities are widely available, for instance by developing defenses against harmful AI models or by actively intervening in the diffusion of particularly dangerous capabilities.
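The access effect and the performance effect described in the abstract can be illustrated with a toy calculation. The sketch below is not the paper's model: the growth rate, the initial training cost, the actor budgets, and all function names are hypothetical values chosen only to make the two effects visible.

```python
# Toy illustration of the access effect and the performance effect.
# Every number below is an assumption made for illustration, not a figure
# taken from the paper.

EFFICIENCY_GROWTH_PER_YEAR = 2.0  # assumed: cost of a fixed performance halves yearly
INITIAL_COST_USD = 100_000_000    # assumed year-0 cost of reaching the target performance
BUDGETS_USD = {                   # hypothetical actors and their fixed annual budgets
    "large": 1_000_000_000,
    "medium": 10_000_000,
    "small": 100_000,
}


def cost_to_reach_target(year):
    """Cost of training to the fixed target performance in a given year."""
    return INITIAL_COST_USD / EFFICIENCY_GROWTH_PER_YEAR ** year


def actors_with_access(year):
    """Access effect: actors whose budget covers the target performance."""
    cost = cost_to_reach_target(year)
    return [name for name, budget in BUDGETS_USD.items() if budget >= cost]


def effective_budget(budget_usd, year):
    """Performance effect: what a fixed budget buys, in year-0 dollar terms."""
    return budget_usd * EFFICIENCY_GROWTH_PER_YEAR ** year


if __name__ == "__main__":
    for year in (0, 4, 10):
        large = effective_budget(BUDGETS_USD["large"], year)
        small = effective_budget(BUDGETS_USD["small"], year)
        print(year, actors_with_access(year), large / small)
```

Under these assumptions, more actors clear the fixed threshold as the years pass, while the ratio between the large and small actors' effective budgets stays constant. This mirrors the abstract's point that capabilities can diffuse even as large compute investors keep a performance advantage.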
Authors: Konstantin Pilz, Lennart Heim, Nicholas Brown