
Annotated History of Modern AI and Deep Learning (2212.11279v2)

Published 21 Dec 2022 in cs.NE

Abstract: Machine learning is the science of credit assignment: finding patterns in observations that predict the consequences of actions and help to improve future performance. Credit assignment is also required for human understanding of how the world works, not only for individuals navigating daily life, but also for academic professionals like historians who interpret the present in light of past events. Here I focus on the history of modern AI which is dominated by artificial neural networks (NNs) and deep learning, both conceptually closer to the old field of cybernetics than to what's been called AI since 1956 (e.g., expert systems and logic programming). A modern history of AI will emphasize breakthroughs outside of the focus of traditional AI text books, in particular, mathematical foundations of today's NNs such as the chain rule (1676), the first NNs (linear regression, circa 1800), and the first working deep learners (1965-). From the perspective of 2022, I provide a timeline of the -- in hindsight -- most important relevant events in the history of NNs, deep learning, AI, computer science, and mathematics in general, crediting those who laid foundations of the field. The text contains numerous hyperlinks to relevant overview sites from my AI Blog. It supplements my previous deep learning survey (2015) which provides hundreds of additional references. Finally, to round it off, I'll put things in a broader historic context spanning the time since the Big Bang until when the universe will be many times older than it is now.

Overview of "Annotated History of Modern AI and Deep Learning"

This technical report by Jürgen Schmidhuber provides a detailed annotated history of AI and deep neural networks (DNNs). It focuses specifically on the evolution of DNNs, tracing their roots back to early mathematical concepts and developments outside the mainstream narratives often presented in AI literature. The document attempts to clarify key historical milestones and correct widespread misconceptions.

Historical Progression and Key Contributions

The report begins by framing the history of AI, noting that a contemporary interpretation would heavily emphasize the contributions from the domains of DNNs and machine learning. Schmidhuber continues by documenting the significant milestones in DNN research, starting from the 17th-century discovery of the chain rule—fundamental for backpropagation algorithms used in training neural networks—to 19th-century linear regression models, which he identifies as the first neural networks.
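To make the "linear regression as the first neural network" claim concrete: a linear model fitted by least squares is exactly a single neuron with linear activation, trained to minimize squared error. The short sketch below (an illustration, not code from the paper; the data and function names are invented for this example) recovers the parameters of a noise-free linear relationship via the normal-equations solver.

```python
import numpy as np

def fit_linear(X, y):
    """Solve the least-squares problem min ||Xb - y||^2, i.e. train a
    single linear 'neuron' in the circa-1800 Gauss/Legendre sense."""
    Xb = np.hstack([X, np.ones((X.shape[0], 1))])  # append bias column
    coef, *_ = np.linalg.lstsq(Xb, y, rcond=None)
    return coef  # weights followed by intercept

# Noise-free data generated from y = 2x + 1: least squares recovers it.
X = np.array([[0.0], [1.0], [2.0], [3.0]])
y = np.array([1.0, 3.0, 5.0, 7.0])
w = fit_linear(X, y)
```

Adding a nonlinearity and stacking such units yields the multilayer networks whose history the rest of the report traces.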

Major Developments

  1. Foundation of Backpropagation: The paper cites the pivotal role of the chain rule, formalized by Leibniz, as essential to the development of backpropagation in neural networks—a technique critical for training multilayer perceptrons employed ubiquitously today.
  2. Early Neural Network Models: The timeline highlights key early contributions such as those by Alexey Ivakhnenko, who developed the first functional deep learning models in the 1960s, and Seppo Linnainmaa’s work on backpropagation in 1970.
  3. Recurrent and Convolutional Neural Networks: Schmidhuber credits the early formation of recurrent neural networks (RNNs) to works in the 1920s and continues to address how these networks evolved to become the adaptive systems known today. He also points to the importance of convolutional neural networks (CNNs), outlining their inception in the 1970s and their subsequent advancements leading to breakthroughs in computer vision.
  4. Long Short-Term Memory (LSTM): LSTM networks are identified as a pivotal development addressing vanishing gradient issues in RNNs, achieving significant success in sequence prediction tasks such as language modeling and speech recognition.
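The central role the report assigns to the chain rule can be sketched directly: backpropagation is reverse-mode application of the chain rule through a network's layers. The toy example below (an assumption-laden illustration, not from the paper; the tiny two-parameter network is invented here) computes gradients analytically via the chain rule and checks them against finite differences.

```python
import numpy as np

def forward(w1, w2, x):
    """Two-layer scalar network: y = w2 * tanh(w1 * x)."""
    h = np.tanh(w1 * x)
    return w2 * h

def grads(w1, w2, x):
    """Gradients of y w.r.t. w1 and w2, by the chain rule in reverse."""
    h = np.tanh(w1 * x)
    dy_dw2 = h                       # outer layer: y = w2 * h
    dh_dw1 = (1.0 - h ** 2) * x      # tanh'(u) = 1 - tanh(u)^2, u = w1*x
    dy_dw1 = w2 * dh_dw1             # chain rule: dy/dw1 = dy/dh * dh/dw1
    return dy_dw1, dy_dw2

w1, w2, x = 0.5, -1.5, 2.0
g1, g2 = grads(w1, w2, x)

# Finite-difference check of the analytic chain-rule gradients.
eps = 1e-6
fd1 = (forward(w1 + eps, w2, x) - forward(w1 - eps, w2, x)) / (2 * eps)
fd2 = (forward(w1, w2 + eps, x) - forward(w1, w2 - eps, x)) / (2 * eps)
```

The same reverse-mode bookkeeping, generalized to arbitrary computation graphs, is what Linnainmaa's 1970 algorithm provides and what modern autodiff frameworks implement.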

Addressing Historical Misattributions

Schmidhuber makes a notable effort to correct misconceptions around the history of AI, particularly in relation to deep learning. Many foundational figures and their contributions, which predate commonly cited turning points in AI history, are revisited. He critiques historical accounts that omit or downplay the contributions of earlier researchers, arguing for a more inclusive acknowledgment of pioneers who laid the groundwork for modern breakthroughs.

Implications and Future Directions

The implications of this detailed historical account are multifaceted. Practically, the acknowledgment of these foundational contributions is crucial not only for fair academic recognition but also for fostering an understanding of how current methods can be built upon to further AI capabilities. Theoretically, a comprehensive historical perspective can lead to richer insights into the scientific intuitions driving these innovations.

Anticipated Developments

Moving forward, the discussion anticipates that advances in hardware and AI theory will continue to shape the trajectory of DNNs, and it stresses the importance of recognizing hardware advancements and theoretical innovations as equal contributors to the evolution of AI systems.

This technical report serves as a critical resource, offering a meticulously detailed and well-researched historical perspective, invaluable to those within the AI research community for understanding the foundational aspects that have informed present-day AI and DNN methodologies.

Authors (1)
  1. Juergen Schmidhuber (32 papers)
Citations (18)