- The paper introduces a novel approach linking stochastic gradient descent in neural networks to the acquisition of physical principles observed in children.
- The authors employ a Recurrent State Space Model trained on 100,000 video sequences and evaluate surprise via KL divergence to measure rule learning.
- Findings reveal that simpler physical rules are learned earlier, mirroring developmental stages in infant cognition and supporting the stochastic optimization hypothesis.
The paper investigates whether children's cognitive development trajectories are comparable to the learning processes of artificial systems, specifically testing the proposition that cognitive development can be understood as a form of stochastic optimization. To do so, the authors train a generative neural network model with stochastic gradient descent and compare its learning trajectory against empirical findings from developmental psychology on children's understanding of physical laws.
Core Investigation
1. Hypothesis:
The central hypothesis posits that cognitive development can be conceptualized as a stochastic optimization process, paralleling the learning dynamics of artificial models.
2. Model Employed:
The authors use a Recurrent State Space Model (RSSM), a sequential extension of the Variational Autoencoder (VAE) designed to model and predict high-dimensional visual input (here, video sequences of physical events).
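The paper's actual architecture is not reproduced here, but the core RSSM structure it describes pairs a deterministic recurrent state with a stochastic latent that has both a prior (from memory alone) and a posterior (after seeing the current frame). A minimal single-step sketch in numpy, with hypothetical weight names and toy dimensions chosen for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)

H, Z, X = 8, 4, 16  # deterministic state, stochastic latent, observation dims

# Hypothetical, randomly initialized weights for one RSSM step
W_h = rng.normal(scale=0.1, size=(H, H + Z))        # recurrent transition
W_prior = rng.normal(scale=0.1, size=(2 * Z, H))    # prior p(z_t | h_t)
W_post = rng.normal(scale=0.1, size=(2 * Z, H + X)) # posterior q(z_t | h_t, x_t)

def rssm_step(h_prev, z_prev, x_t):
    """One step of a toy Recurrent State Space Model.

    h_t is deterministic memory; z_t is a stochastic latent sampled
    from a diagonal-Gaussian posterior via the reparameterization trick."""
    h_t = np.tanh(W_h @ np.concatenate([h_prev, z_prev]))
    prior_mu, prior_logvar = np.split(W_prior @ h_t, 2)
    post_mu, post_logvar = np.split(W_post @ np.concatenate([h_t, x_t]), 2)
    z_t = post_mu + np.exp(0.5 * post_logvar) * rng.normal(size=Z)
    return h_t, z_t, (prior_mu, prior_logvar), (post_mu, post_logvar)

h, z = np.zeros(H), np.zeros(Z)
for _ in range(3):          # roll the model over a short dummy sequence
    x = rng.normal(size=X)  # stand-in for an encoded video frame
    h, z, prior, post = rssm_step(h, z, x)
```

The gap between the prior and posterior distributions at each step is what later supplies the model's surprise signal.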
3. Developmental Context:
Grounded in developmental psychology, the paper uses the well-documented progression in infants' understanding of physical support events, as characterized by Baillargeon and others, to benchmark the artificial system's learning trajectory.
Methodology
1. Data Generation:
To model infants' comprehension of support events, the researchers generated a dataset comprising 100,000 video sequences depicting colored block stacks varying across multiple visual and structural parameters, utilizing the Unity game engine.
2. Rule Acquisition:
The paper delineates a progressive acquisition of four key physical principles by infants, increasing in complexity from mere contact detection to the incorporation of shape considerations in stability judgments.
3. Surprise Measurement:
The model's understanding of each rule is evaluated with a violation-of-expectation method: surprise is quantified as the Kullback-Leibler (KL) divergence between the model's predicted (prior) and updated (posterior) state distributions, and the difference in surprise between expected and unexpected sequences serves as a proxy for rule acquisition.
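For diagonal-Gaussian priors and posteriors, the KL surprise has a closed form. A minimal sketch of this measurement, with hypothetical function names (the paper's exact aggregation over frames is an assumption here):

```python
import numpy as np

def gaussian_kl(mu_q, logvar_q, mu_p, logvar_p):
    """KL(q || p) between diagonal Gaussians, summed over latent dims."""
    var_q, var_p = np.exp(logvar_q), np.exp(logvar_p)
    return 0.5 * np.sum(logvar_p - logvar_q + (var_q + (mu_q - mu_p) ** 2) / var_p - 1.0)

def sequence_surprise(posteriors, priors):
    """Total surprise over a sequence: per-frame KL(posterior || prior),
    where each element is a (mu, logvar) pair."""
    return sum(gaussian_kl(*q, *p) for q, p in zip(posteriors, priors))

# A rule counts as acquired when rule-violating ("unexpected") sequences
# are reliably more surprising than rule-consistent ("expected") ones.
```

When the posterior matches the prior exactly, the surprise is zero; any mismatch between what the model predicted and what it observed yields a positive value.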
Findings
1. Validation and Performance:
The RSSM demonstrated the capability to predict physical events accurately in sequences that adhere to established physical rules, and the measured surprise responses confirmed effective rule acquisition by the end of training.
2. Learning Trajectory:
The research finds that the learning trajectory of the artificial model aligns with developmental stages in children: simpler rules are acquired earlier in training than complex ones, mirroring empirical observations of human cognitive development.
Discussion and Future Directions
1. Limitations and Proposals:
While the paper convincingly draws parallels between stochastic optimization in artificial models and children's developmental learning, it acknowledges that children do not encounter vast numbers of structured learning events but rather generalize knowledge from real-world observations. Future work should extend this approach using rich, longitudinal datasets, such as SAYCam, to model more authentic, everyday learning environments.
2. Extended Hypotheses:
The paper suggests extending the stochastic optimization hypothesis to frame development as a gradual increase in model complexity, allowing models to meet growing cognitive demands over time. β-VAE objectives, which weight the KL term in the loss, could offer a natural way to formalize such capacity changes.
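One way this could be operationalized (an assumption, not a method from the paper) is a β-VAE loss whose KL weight is annealed over training, so that latent capacity, and hence effective model complexity, grows as learning proceeds. A minimal sketch with hypothetical function names and schedule values:

```python
def beta_vae_loss(recon_nll, kl, beta):
    """beta-VAE objective: reconstruction term plus a weighted KL term.
    beta > 1 restricts latent capacity; beta = 1 recovers the standard VAE."""
    return recon_nll + beta * kl

def beta_schedule(step, total_steps, beta_start=4.0, beta_end=1.0):
    """Linearly anneal beta from a high value (low capacity) toward 1,
    letting the model's effective complexity increase over training."""
    t = min(step / total_steps, 1.0)
    return beta_start + t * (beta_end - beta_start)
```

Under this schedule, early training penalizes informative latents heavily (a "simpler" model), and the penalty relaxes over time, loosely analogous to the developmental complexity increase the authors propose.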
In sum, the paper contributes to a nuanced understanding of cognitive development by aligning computational learning trajectories with empirical developmental psychology, emphasizing the dynamic structure of cognitive processes and providing innovative insights for both cognitive psychology and AI.