
Predictability and Surprise in Large Generative Models (2202.07785v2)

Published 15 Feb 2022 in cs.CY

Abstract: Large-scale pre-training has recently emerged as a technique for creating capable, general purpose, generative models such as GPT-3, Megatron-Turing NLG, Gopher, and many others. In this paper, we highlight a counterintuitive property of such models and discuss the policy implications of this property. Namely, these generative models have an unusual combination of predictable loss on a broad training distribution (as embodied in their "scaling laws"), and unpredictable specific capabilities, inputs, and outputs. We believe that the high-level predictability and appearance of useful capabilities drives rapid development of such models, while the unpredictable qualities make it difficult to anticipate the consequences of model deployment. We go through examples of how this combination can lead to socially harmful behavior with examples from the literature and real world observations, and we also perform two novel experiments to illustrate our point about harms from unpredictability. Furthermore, we analyze how these conflicting properties combine to give model developers various motivations for deploying these models, and challenges that can hinder deployment. We conclude with a list of possible interventions the AI community may take to increase the chance of these models having a beneficial impact. We intend this paper to be useful to policymakers who want to understand and regulate AI systems, technologists who care about the potential policy impact of their work, and academics who want to analyze, critique, and potentially develop large generative models.

An Analysis of Predictability and Surprise in Large Generative Models

The paper, "Predictability and Surprise in Large Generative Models," examines large-scale pre-training for generative models and articulates a central paradox: model behavior is predictable in aggregate, as captured by scaling laws, yet specific capabilities and outputs emerge unpredictably. This tension is particularly pertinent for large models such as GPT-3 and Gopher, where the balance between predictability and surprise is critical to understanding their behavior and potential impacts.

Key Observations

The researchers identify four distinctive features of large generative models, which together encapsulate the paradox of predictability and unpredictability:

  1. Smooth General Capability Scaling: The paper emphasizes that the general performance of these models improves predictably with increased scale across model parameters, data size, and computational resources. This predictability is encapsulated in empirical scaling laws that facilitate systematic planning and resource allocation for model development.
  2. Abrupt Specific Capability Scaling: Despite the overarching predictability, certain capabilities emerge unexpectedly and abruptly at larger scales, demonstrating rapid improvement in specific tasks that were not anticipated. This phenomenon presents a challenge in foreseeing the full extent of a model's capabilities.
  3. Open-Ended Inputs and Domains: The models accept an open-ended range of inputs and problem domains, so competencies can remain undiscovered until a specific prompt elicits them.
  4. Open-Ended Outputs: Even for a fixed task or topic, individual outputs remain unpredictable, complicating efforts to control and anticipate model behavior and posing additional risks.
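The interplay between points 1 and 2 can be illustrated with a small numerical sketch. Note that all numbers here are synthetic (an assumed power law and a hypothetical 10-step task), not figures from the paper:

```python
import numpy as np

# --- Smooth general capability scaling ---------------------------------
# Scaling laws posit loss L(C) ~ a * C**(-b), a straight line in log-log
# space. Synthetic measurements with assumed a=2.5, b=0.05.
compute = np.array([1e18, 1e19, 1e20, 1e21, 1e22])  # training FLOPs
loss = 2.5 * compute ** -0.05

# Fit the exponent by linear regression on logs, then extrapolate to a
# compute budget 10x beyond the largest observed run.
slope, intercept = np.polyfit(np.log(compute), np.log(loss), 1)
predicted_loss = np.exp(intercept) * 1e23 ** slope
print(f"fitted exponent: {slope:.3f}")  # recovers the assumed -0.050
print(f"predicted loss at 1e23 FLOPs: {predicted_loss:.4f}")

# --- Abrupt specific capability scaling --------------------------------
# A smooth trend in per-step success can still produce an abrupt jump in
# end-to-end accuracy: a hypothetical 10-step task succeeds only if every
# step does, so task accuracy is per_step**10.
per_step = np.array([0.50, 0.70, 0.85, 0.95, 0.99])  # improves smoothly
task_accuracy = per_step ** 10
print(np.round(task_accuracy, 3))  # near zero, then a sharp rise
```

The second half shows one common intuition for "emergence": a metric composed of many smoothly improving sub-skills can look flat for most of the scaling curve and then rise sharply.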

Implications and Challenges

The capabilities and behaviors of large generative models outlined in this paper have critical implications for both AI development and policy-making. Predictable scaling offers a clear incentive for investment, since it reduces risk by making the returns on additional scale foreseeable. However, the unpredictable emergence of specific capabilities—both beneficial and harmful—complicates deployment scenarios, accentuating the need for careful consideration and management of these systems.

The open-ended nature of these models suggests that potential societal impacts, both positive and negative, may remain undiscovered until certain interactions or deployments occur. This unpredictability is increasingly relevant given the rapid proliferation of these models in industrial applications.

Future Directions

The authors identify several areas where research and policy interventions could mitigate risks and harness the potential benefits of large generative models. Measures include improving accessibility and resources for academic and public sector research, enhancing methodological frameworks for model evaluation and "red teaming" to identify latent hazards, and developing governance structures to align industrial practices with broader societal interests.

Conclusion

This paper underscores the dual nature of predictability and surprise in large generative models—the foundation of their rapid adoption and the core of the challenge they present. As these models continue to evolve, a coordinated approach encompassing technical, regulatory, and ethical dimensions will be crucial to maximizing their benefits and minimizing their risks. Such efforts will facilitate responsible deployment and management of AI technologies, aligning their development with public interest.

Authors (30)
  1. Deep Ganguli
  2. Danny Hernandez
  3. Liane Lovitt
  4. Nova DasSarma
  5. Tom Henighan
  6. Andy Jones
  7. Nicholas Joseph
  8. Jackson Kernion
  9. Ben Mann
  10. Amanda Askell
  11. Yuntao Bai
  12. Anna Chen
  13. Tom Conerly
  14. Dawn Drain
  15. Nelson Elhage
  16. Sheer El Showk
  17. Stanislav Fort
  18. Zac Hatfield-Dodds
  19. Scott Johnston
  20. Shauna Kravec
Citations (231)