
Causality in the human niche: lessons for machine learning (2506.13803v1)

Published 13 Jun 2025 in cs.AI and cs.LG

Abstract: Humans interpret the world around them in terms of cause and effect and communicate their understanding of the world to each other in causal terms. These causal aspects of human cognition are thought to underlie humans' ability to generalize and learn efficiently in new domains, an area where current machine learning systems are weak. Building human-like causal competency into machine learning systems may facilitate the construction of effective and interpretable AI. Indeed, the machine learning community has been importing ideas on causality formalized by the Structural Causal Model (SCM) framework, which provides a rigorous formal language for many aspects of causality and has led to significant advances. However, the SCM framework fails to capture some salient aspects of human causal cognition and has likewise not yet led to advances in machine learning in certain critical areas where humans excel. We contend that the problem of causality in the "human niche" -- for a social, autonomous, and goal-driven agent sensing and acting in the world in which humans live -- is quite different from the kind of causality captured by SCMs. For example, everyday objects come in similar types that have similar causal properties, and so humans readily generalize knowledge of one type of object (cups) to another related type (bowls) by drawing causal analogies between objects with similar properties, but such analogies are at best awkward to express in SCMs. We explore how such causal capabilities are adaptive in, and motivated by, the human niche. By better appreciating properties of human causal cognition and, crucially, how those properties are adaptive in the niche in which humans live, we hope that future work at the intersection of machine learning and causality will leverage more human-like inductive biases to create more capable, controllable, and interpretable systems.

Authors (2)
  1. Richard D. Lange (6 papers)
  2. Konrad P. Kording (30 papers)

Summary

This paper, "Causality in the human niche: lessons for machine learning" (Lange et al., 13 Jun 2025), argues that to build more capable, controllable, and interpretable AI systems, ML needs to incorporate inductive biases inspired by human causal cognition. The authors contend that human causal reasoning is uniquely adapted to the "human niche"—the specific environment, constraints, and goals humans operate within. Current ML approaches, particularly those based on Structural Causal Models (SCMs), often fail to capture these nuanced aspects of human-like causality, limiting their effectiveness in real-world scenarios where humans excel, such as generalization and efficient learning in new domains.

The core of the paper (Section 2) explores how various properties of the human niche have shaped human causal cognition, offering lessons for ML.

Human causal cognition is adapted to the environment humans inhabit:

  • Agency: Humans learn by intervening. ML systems could benefit from more active learning paradigms in which agents experiment to discover causal relationships rather than relying solely on passive observational data; interventions simplify causal learning because they cut through confounding. A minimal sketch of such intervention-driven learning appears after this list.
  • Complexity: The world is immensely complex. Humans use "good-enough," approximate models, leveraging vast knowledge bases, schemas, and concepts. ML should move beyond seeking a single, perfectly identifiable causal model towards developing systems that can build and utilize pragmatic, actionable models. Causal relations are seen as one part of a broader knowledge base, not the entirety of it.
  • Open-endedness: The environment is full of novelty. Humans adapt through curiosity, hypothesis-driven exploration, and causal induction—the ability to instantiate new causal models on the fly in novel situations using "causal theories" (models of causal models). ML needs to develop capabilities for zero-shot causal reasoning, perhaps by incorporating ontological understanding (how objects and properties relate) to form analogies.
  • Confounding: Unobserved factors are ubiquitous. Humans address this with epistemic humility (understanding models are revisable) and by performing experiments. ML systems could benefit from representing uncertainty over causal models and actively experimenting to disambiguate them.
  • Human-designed environment: Much of our environment is built by and for humans, often with simple, sparse causal interfaces (e.g., a light switch). ML systems operating in these environments have a stronger incentive to adopt human-like causal inductive biases.
  • Other agents: Humans learn by observing and interacting with other agents, inferring their intentions and goals. ML can leverage inverse reinforcement learning (IRL) and action understanding, but may need richer representations of goals than simple reward functions provide.
  • Hierarchical structure: The world has structure in space and time. Humans use coarse-graining, abstracting details to manage complexity and make predictions at various scales. ML systems could benefit from multi-level modeling where different levels of abstraction inform each other.
  • Ontological structure: Humans organize knowledge based on object properties and their similarities/differences, enabling causal analogies and generalization. ML can leverage this for more effective causal induction by building systems that understand and use object properties.
  • Sparse interactions: Relevant causal interactions are typically sparse. Humans focus on a few entities at a time. ML can use sparsity as an inductive bias, particularly in object-centric models and by incorporating attention mechanisms. The paper notes how even symmetric physical interactions (like billiard balls colliding) are often interpreted asymmetrically by humans due to a sparsity prior (most things are stationary).
  • Stream of data: Humans experience a continuous, ever-changing stream of data. This necessitates lifelong learning and the ability to apply causal principles to novel situations (causal induction). Episodic memory and counterfactual reasoning about past events aid this.
  • Arrow of time: Temporal order and simultaneity are strong cues for causality. While SCMs often lack explicit temporal dynamics (or handle them through cumbersome "unrolling"), humans reason causally across many timescales. ML could benefit from more explicit and flexible temporal modeling; the second sketch after this list illustrates the unrolling issue.
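
As a concrete illustration of the Agency and Confounding points above, here is a minimal sketch, in Python, of an agent that keeps a posterior over candidate causal structures and uses interventions to disambiguate them. The hypotheses, probabilities, and variable names are illustrative assumptions, not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Candidate causal hypotheses over binary variables A and B (illustrative).
# Each hypothesis predicts P(B=1 | do(A=a)). Passive observation alone cannot
# distinguish "A causes B" from a confounded correlation, but interventions can.
HYPOTHESES = {
    "A->B":        lambda a: 0.9 if a == 1 else 0.1,  # A causes B
    "confounded":  lambda a: 0.5,  # hidden common cause; do(A) has no effect on B
    "no relation": lambda a: 0.5,  # no causal link in either direction
}

def true_world(a):
    """Ground-truth simulator (A really does cause B); used only to generate data."""
    return int(rng.random() < (0.9 if a == 1 else 0.1))

# Epistemic humility: keep a posterior over models instead of committing to one.
posterior = {name: 1.0 / len(HYPOTHESES) for name in HYPOTHESES}

for trial in range(20):
    a = trial % 2          # alternate the interventions do(A=0) and do(A=1)
    b = true_world(a)      # observe the effect variable
    # Bayes rule: weight each hypothesis by how well it predicted the outcome.
    for name, predict in HYPOTHESES.items():
        p_b = predict(a)
        posterior[name] *= p_b if b == 1 else (1.0 - p_b)
    z = sum(posterior.values())
    posterior = {name: p / z for name, p in posterior.items()}

# Mass concentrates on "A->B". Note that "confounded" and "no relation" predict
# identically under intervention; separating those two would additionally
# require observational (non-interventional) data.
print(posterior)
```

The example also shows the value of representing uncertainty explicitly: the agent never commits to a single graph, and the residual ambiguity tells it which kind of data to gather next.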
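
The Arrow of time point notes that SCMs typically handle dynamics by unrolling variables over time. The toy below, assuming simple linear dynamics chosen only for illustration, shows why this is awkward: two mutually coupled quantities form a cycle as a static graph, but become an acyclic, time-indexed model once unrolled, at the cost of fixing a discretization and horizon in advance.

```python
import numpy as np

rng = np.random.default_rng(1)
T = 50  # unrolling horizon, fixed in advance

# X and Y influence each other, so a static graph would contain the cycle X <-> Y.
# Unrolled over time, the model is a DAG over the copies X_0, Y_0, X_1, Y_1, ...
X = np.zeros(T)
Y = np.zeros(T)
for t in range(T - 1):
    X[t + 1] = 0.8 * X[t] + 0.3 * Y[t] + 0.1 * rng.standard_normal()   # X_{t+1} <- X_t, Y_t
    Y[t + 1] = -0.3 * X[t] + 0.8 * Y[t] + 0.1 * rng.standard_normal()  # Y_{t+1} <- X_t, Y_t

# An intervention at a single moment, e.g. do(X_10 = 2.0), is easy to express in
# the unrolled model, but the graph grows with T and every query has to commit
# to a timescale up front, which is the cumbersome part the summary refers to.
```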

Human causal cognition is adapted to the constraints humans face:

  • Partial observations: Humans cope with missing or noisy data by simplifying (Occam's Razor), representing uncertainty over models, and engaging in curiosity-driven exploration to reveal missing information. ML can adopt these strategies, for instance by using GFlowNets to sample from posterior distributions over DAGs or by implementing intrinsic motivation for exploration.
  • Costly interventions: When direct experimentation is infeasible, humans use mental simulation and causal induction (analogies from related domains).
  • Limited attention/working memory: Humans manage this by coarse-graining, focusing on sparse (often dyadic) interactions, and attending to relevant properties based on the current situation.
  • Slow reasoning: Humans amortize slow, deliberate System-2 reasoning by learning fast System-1 pattern-recognition shortcuts. ML can apply amortization to infer latent variables, model structure, or even learn causal theories; a small amortization sketch follows this list.
  • All models are wrong: Humans use models that are useful for prediction, mechanistic insight, or balancing error with simplicity. ML should move beyond identifiability as the sole goal, balancing prediction, inference, and simplicity. Counterfactual reasoning about past events is crucial when predictive models are imperfect.
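
A minimal sketch of the amortization idea from the Slow reasoning point: a slow, brute-force posterior computation plays the role of System 2, and a cheap function fit to its outputs plays the role of System 1. The Gaussian toy model and the polynomial fit are assumptions chosen for brevity, not the paper's proposal.

```python
import numpy as np

rng = np.random.default_rng(2)

# "System 2": slow, deliberate inference. Here, the posterior mean of a latent
# cause z given an observation x is computed by brute-force integration over a
# grid, for a toy model with z ~ N(0, 1) and x ~ N(z, noise_sd^2).
z_grid = np.linspace(-5.0, 5.0, 2001)
prior = np.exp(-0.5 * z_grid**2)  # unnormalized N(0, 1) prior over z

def slow_posterior_mean(x, noise_sd=0.5):
    likelihood = np.exp(-0.5 * ((x - z_grid) / noise_sd) ** 2)
    post = prior * likelihood
    return float(np.sum(z_grid * post) / np.sum(post))

# "System 1": amortize the slow computation by fitting a cheap function
# (a cubic polynomial) to (x, slow answer) pairs generated offline.
xs = rng.uniform(-4.0, 4.0, size=200)
targets = np.array([slow_posterior_mean(x) for x in xs])
fast_posterior_mean = np.poly1d(np.polyfit(xs, targets, deg=3))

x_new = 1.7
print(slow_posterior_mean(x_new), fast_posterior_mean(x_new))  # nearly identical,
# but the amortized answer costs a handful of multiplications instead of a sweep.
```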

Human causal cognition is adapted to the goals humans pursue:

  • Compositional and coarse-grained goals: Human goals are often abstract and decomposable. ML systems, especially in IRL, could benefit from representing rewards compositionally over abstract states. Compositional hypotheses can also drive exploration.
  • Contextual and changing goals: Goals shift based on context. Humans dynamically retrieve relevant causal information and instantiate models at appropriate granularities. ML systems need context- and goal-dependent world models and rich ontologies.
  • Intrinsic value of understanding: Humans are driven by curiosity and the desire to reduce uncertainty about the world (its state, future, and dynamics). ML can implement this via intrinsic rewards for information gain or rapid uncertainty reduction; a sketch of such an information-gain reward follows this list.
  • Social dynamics (demonstration, cooperation, competition): Humans learn by observing others, inferring intent and knowledge. This suggests the need for "theory of mind" capabilities in ML to avoid confounded causal learning in social contexts.
  • Social dynamics (language): Language transmits causal knowledge through instruction, explanation, and narrative. ML can leverage large language corpora but also explore how narratives structure causal thought and integrate multiple levels of explanation.
  • Social dynamics (credit and blame): Humans use counterfactual reasoning to assign responsibility, which is key for learning and social norms.
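
To make the information-gain idea from the Intrinsic value of understanding point concrete, here is a small sketch in which the intrinsic reward for an action is the drop in entropy of the agent's belief about an unknown cause-effect probability. The button-and-light scenario and all numbers are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(3)

# Discretized belief over an unknown Bernoulli parameter theta, e.g. how often
# pressing an unfamiliar button turns on a light.
thetas = np.linspace(0.01, 0.99, 99)
belief = np.ones_like(thetas) / len(thetas)  # start maximally uncertain

def entropy(p):
    return -np.sum(p * np.log(p + 1e-12))

true_theta = 0.7  # the world's actual (unknown to the agent) probability

for step in range(10):
    h_before = entropy(belief)
    outcome = rng.random() < true_theta            # "press the button", observe
    likelihood = thetas if outcome else (1 - thetas)
    belief = belief * likelihood                   # Bayesian belief update
    belief /= belief.sum()
    intrinsic_reward = h_before - entropy(belief)  # information gained this step
    print(f"step {step}: intrinsic reward = {intrinsic_reward:.3f}")

# The reward shrinks as uncertainty is resolved, so a curiosity-driven agent
# naturally shifts its attention to whatever it currently understands least.
```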

Section 3 provides a critical review of SCMs, highlighting their assumptions and how they can be problematic in the "human niche":

  • Tabular data: SCMs often assume pre-defined, measured variables, which is unlike "raw data" settings where representation learning is needed.
  • Reichenbach principle: Not all relations are parsimoniously described as directed causal ones.
  • Directed Acyclic Graphs (DAGs): Awkward for bidirectional influences or non-causal relations. Less expressive than alternatives like causal programs. Figure 2 illustrates how simple physical systems (billiard balls, a ring of dominoes) can challenge a single DAG representation, as the intuitive causal structure can depend on observer frame or the intervention itself.
  • Discovery of "true" model: The "best" model is often goal- and constraint-dependent, not a single "true" underlying model.
  • Independent exogenous noise & Deterministic functions: These assumptions (equivalent to having no unobserved confounders and attributing all uncertainty to noise) restrict which systems can be modeled and demand high observational fidelity. Counterfactuals, the authors argue, primarily need unobserved variables, not necessarily independent exogenous noise or deterministic mechanisms; the counterfactual sketch after this list shows where the unobserved terms enter.
  • Interventionally Independent Causal Mechanisms (I-ICM): Assumes interventions are local and modular, which may not hold in many real-world systems.
  • Statistically Independent Causal Mechanisms (S-ICM): Assumes mechanisms are drawn independently, contradicting how humans generalize causal knowledge (e.g., similar objects have similar causal properties). Abandoning S-ICM could speed up learning.
  • Data-generating process assumptions: Learning from interventions is easier than from observational data. The i.i.d. assumption is often violated, but non-i.i.d. data (e.g., paired interventional data) can provide powerful learning signals.
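
The point about exogenous noise and determinism turns on what counterfactual queries actually need. Below is the standard three-step SCM counterfactual procedure (abduction, action, prediction) on a toy linear model; the structural equations and numbers are illustrative, but the procedure makes visible that the carried-over quantity is an unobserved term inferred from data, which is the ingredient the authors emphasize.

```python
# Toy structural causal model (illustrative):
#   X = U_x
#   Y = 2 * X + U_y
# Observed: X = 1, Y = 3.  Query: what would Y have been had X been 0?

x_obs, y_obs = 1.0, 3.0

# 1. Abduction: infer the unobserved exogenous terms consistent with the data.
u_x = x_obs
u_y = y_obs - 2.0 * x_obs   # = 1.0

# 2. Action: replace the structural equation for X with the intervention do(X = 0).
x_cf = 0.0

# 3. Prediction: propagate the inferred unobserved terms through the modified model.
y_cf = 2.0 * x_cf + u_y
print(y_cf)  # 1.0; the counterfactual answer hinges on carrying u_y forward,
             # i.e. on an unobserved variable inferred during abduction.
```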

The paper concludes by reiterating that bridging the gap between current ML causality and human causal cognition can lead to more capable AI. Key recurring themes for future ML work include: coarse-graining and abstraction, sparsity, zero-shot causal induction via theories and ontology, curiosity and hypothesis-driven exploration, uncertainty representation and amortized inference, and epistemic humility coupled with experimentation. The authors emphasize that causal relations are just one part of human understanding, advocating for a broader ecological approach to inspire AI development.
