Exploratory Causal Inference in SAEnce

Published 15 Oct 2025 in cs.LG and cs.AI | (2510.14073v1)

Abstract: Randomized Controlled Trials are one of the pillars of science; nevertheless, they rely on hand-crafted hypotheses and expensive analysis. Such constraints prevent causal effect estimation at scale, potentially anchoring on popular yet incomplete hypotheses. We propose to discover the unknown effects of a treatment directly from data. For this, we turn unstructured data from a trial into meaningful representations via pretrained foundation models and interpret them via a sparse autoencoder. However, discovering significant causal effects at the neural level is not trivial due to multiple-testing issues and effects entanglement. To address these challenges, we introduce Neural Effect Search, a novel recursive procedure solving both issues by progressive stratification. After assessing the robustness of our algorithm on semi-synthetic experiments, we showcase, in the context of experimental ecology, the first successful unsupervised causal effect identification on a real-world scientific trial.


Authors (3)

Summary

The paper introduces a method using Sparse Autoencoders and Neural Effect Search to identify statistically significant treatment effects.
It employs foundation models to extract semantic features and counteract effect entanglement in high-dimensional data.
Empirical results show NES outperforms standard corrections in both semi-synthetic benchmarks and real-world ecological trials.

Exploratory Causal Inference in SAEnce

This paper introduces an approach for identifying statistically significant causal effects in exploratory experiments using Sparse Autoencoders (SAEs) and Neural Effect Search (NES). It targets two obstacles that arise when testing for effects at the neural level: multiple testing and effect entanglement. The method is validated empirically on semi-synthetic benchmarks and on a real-world trial in experimental ecology.

Framework Overview

The framework turns data from a randomized controlled trial into meaningful representations using foundation models (FMs) and SAEs, then analyzes the resulting neural representations to discover treatment effects. A central challenge is the "Paradox of Exploratory Causal Inference": as test power increases, every neuron entangled with an affected outcome is flagged as significant, regardless of what that neuron mainly represents.

Figure 1: The Paradox of Exploratory Causal Inference: Increasing test power makes any outcome-entangled code significant, independent of its main interpretation.
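
The paradox can be illustrated with a short power calculation. The sketch below is not taken from the paper: it assumes a two-sample z-test with unit-variance neuron activations, and the shift sizes `main_shift` and `leak_shift` are hypothetical values chosen only for illustration. It compares a neuron that mainly encodes the affected outcome with a neuron that carries a small leaked share of the same effect.

```python
from math import sqrt
from statistics import NormalDist


def power_two_sided(delta, sigma, n_per_arm, alpha=0.05):
    """Approximate power of a two-sided, two-sample z-test for a mean shift `delta`."""
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    # Standardized effect: shift divided by the standard error of the difference in means.
    z_eff = delta / (sigma * sqrt(2 / n_per_arm))
    return std_normal.cdf(z_eff - z_crit) + std_normal.cdf(-z_eff - z_crit)


main_shift = 1.0   # hypothetical shift in the neuron that represents the outcome
leak_shift = 0.05  # hypothetical leaked shift in a neuron that mainly encodes something else

for n in (100, 10_000, 1_000_000):
    main_power = power_two_sided(main_shift, 1.0, n)
    leak_power = power_two_sided(leak_shift, 1.0, n)
    print(f"n per arm = {n:>9}: main neuron {main_power:.3f}, entangled neuron {leak_power:.3f}")
```

Under these assumptions the entangled neuron is rarely flagged in a small trial, but with enough samples it is flagged almost always, even though the effect is not its main interpretation.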

Treatment Effect Estimation

Treatment effect estimation relies on standard causal inference principles and focuses on discrete outcome variables. The authors use FMs to extract semantically meaningful features from high-dimensional data, then encode those features into a sparse, interpretable dictionary with SAEs.

Key to this process is the assumption that each coordinate of the SAE code functions as a separate measurement channel. However, entangled effects can produce polysemantic neurons, which create false positives when testing for significance.
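
As a rough illustration of treating each code coordinate as its own measurement channel, the sketch below runs a per-neuron difference-in-means test on a tiny hand-made dataset. It is an assumption-laden example, not the paper's estimator: the helper names `neuron_effect` and `scan_neurons` are hypothetical, the p-value uses a normal approximation, and the data are invented.

```python
from math import sqrt
from statistics import NormalDist, mean, variance


def neuron_effect(values, treatment):
    """Difference in means (treated minus control) with a normal-approximation p-value."""
    treated = [v for v, t in zip(values, treatment) if t == 1]
    control = [v for v, t in zip(values, treatment) if t == 0]
    effect = mean(treated) - mean(control)
    se = sqrt(variance(treated) / len(treated) + variance(control) / len(control))
    if se == 0:
        # Degenerate case: no variation within either arm.
        return effect, (1.0 if effect == 0 else 0.0)
    z = effect / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return effect, p_value


def scan_neurons(codes, treatment):
    """Test every column of `codes` (rows are samples, columns are neurons)."""
    n_neurons = len(codes[0])
    return [neuron_effect([row[j] for row in codes], treatment) for j in range(n_neurons)]


# Invented toy codes: neuron 0 shifts under treatment, neuron 1 does not.
treatment = [0, 0, 0, 0, 1, 1, 1, 1]
codes = [
    [0.0, 0.5], [0.1, 0.4], [0.2, 0.6], [0.1, 0.5],
    [1.0, 0.5], [1.1, 0.4], [0.9, 0.6], [1.2, 0.5],
]
results = scan_neurons(codes, treatment)
for j, (effect, p_value) in enumerate(results):
    print(f"neuron {j}: effect = {effect:.3f}, p = {p_value:.3g}")
```

A scan like this tests every neuron marginally, so it inherits both problems discussed in this summary: many simultaneous tests and false positives from entangled neurons.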

Neural Effect Search

To address the entanglement and polysemanticity issues, the authors propose NES, a recursive procedure. NES stratifies the testing data, progressively identifying the most representative neuron of each effect, and correcting for entanglement bias through iterative refinement.
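
The idea of progressive stratification can be sketched with a simplified greedy loop. This is not the authors' NES algorithm: it is a loose illustration that assumes a stratified z-test, a Bonferroni threshold over the remaining neurons, and strata defined by whether already-selected neurons are active. The function names and the toy data are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def stratified_z(column, treatment, strata):
    """Combine within-stratum differences in means into a single z statistic."""
    groups = {}
    for x, t, s in zip(column, treatment, strata):
        groups.setdefault(s, ([], []))[t].append(x)  # index 0: control, 1: treated
    num, var = 0.0, 0.0
    for control, treated in groups.values():
        if len(control) < 2 or len(treated) < 2:
            continue  # stratum too small to compare the two arms
        m0, m1 = sum(control) / len(control), sum(treated) / len(treated)
        v0 = sum((x - m0) ** 2 for x in control) / (len(control) - 1)
        v1 = sum((x - m1) ** 2 for x in treated) / (len(treated) - 1)
        w = len(control) + len(treated)  # weight each stratum by its size
        num += w * (m1 - m0)
        var += w ** 2 * (v1 / len(treated) + v0 / len(control))
    return num / sqrt(var) if var > 0 else 0.0


def greedy_stratified_search(codes, treatment, alpha=0.05):
    """Repeatedly pick the most significant neuron, then stratify on its activation."""
    n_neurons = len(codes[0])
    selected = []
    strata = [()] * len(codes)  # start with a single stratum
    while len(selected) < n_neurons:
        remaining = [j for j in range(n_neurons) if j not in selected]
        z_crit = NormalDist().inv_cdf(1 - alpha / (2 * len(remaining)))  # Bonferroni
        scores = {j: abs(stratified_z([r[j] for r in codes], treatment, strata))
                  for j in remaining}
        best = max(scores, key=scores.get)
        if scores[best] < z_crit:
            break  # nothing is significant once selected neurons are held fixed
        selected.append(best)
        strata = [s + (row[best] > 0,) for s, row in zip(strata, codes)]
    return selected


# Invented toy data: neuron 0 carries the effect, neuron 1 leaks it, neuron 2 is noise.
codes, treatment = [], []
for t in (0, 1):
    for i in range(10):
        active = 1.0 if i < (8 if t == 1 else 2) else 0.0
        for k in range(4):
            codes.append([active, 0.3 * active + 0.2 * (k % 2), float(k // 2)])
            treatment.append(t)

marginal_z_leak = stratified_z([r[1] for r in codes], treatment, [()] * len(codes))
selected = greedy_stratified_search(codes, treatment)
print("marginal z of the entangled neuron:", round(marginal_z_leak, 2))
print("selected neurons:", selected)
```

In this toy setup the entangled neuron is significant when tested marginally, but its apparent effect vanishes once the data are stratified on the neuron that actually carries the effect, so only that neuron is selected.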

The paper proves NES's consistency and convergence properties, showing it can outperform standard correction methods like Bonferroni in maintaining precision and recall across varying sample sizes or treatment effect magnitudes.

Experimental Results

Semi-Synthetic Benchmark

Experiments on the CelebA dataset show NES maintaining high precision and recall, disentangling effects even in high-power settings where conventional corrections collapse under entanglement.

Figure 2: Semi-synthetic benchmark. NES achieves the best trade-off in precision, recall, and IoU, while avoiding the collapse seen with standard corrections.
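
For reference, precision, recall, and IoU over sets of selected neurons can be computed as in the sketch below. How the paper matches neurons to ground-truth effects is not described in this summary, so the set-based comparison and the helper name `selection_metrics` are assumptions made for illustration.

```python
def selection_metrics(predicted, truth):
    """Precision, recall, and intersection-over-union between two sets of neuron indices."""
    predicted, truth = set(predicted), set(truth)
    hits = len(predicted & truth)
    precision = hits / len(predicted) if predicted else 0.0
    recall = hits / len(truth) if truth else 0.0
    union = len(predicted | truth)
    iou = hits / union if union else 1.0  # two empty sets agree perfectly
    return precision, recall, iou


# Hypothetical case: one true effect neuron, and a method that also flags three entangled ones.
collapsed = selection_metrics(predicted=[0, 1, 2, 3], truth=[0])
exact = selection_metrics(predicted=[0], truth=[0])
print("flags everything entangled:", collapsed)
print("flags only the true neuron:", exact)
```

A method that flags every entangled neuron keeps recall at 1 but loses precision and IoU, which is the failure pattern the benchmark attributes to standard corrections in high-power settings.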

Real-World Application

In a real-world ecology trial, NES identified two significant effects, grooming and a background effect, which align with previous expert annotations. This supports the method's potential for aiding hypothesis generation in experimental science.

Figure 3: Exploratory Causal Inference for Experimental Ecology. NES retrieves two significant effects that align with the literature.

Conclusions and Implications

The method provides a robust, data-driven approach for discovering causal relationships without prior knowledge of affected outcomes. Its application could significantly enhance exploratory research efficiency, allowing automatic hypothesis surfacing from complex datasets.

Future work may focus on extending these methods to multi-modal data and improving the disentanglement of continuous variables within SAEs. These advancements could further impact AI-driven exploratory data analysis, advancing both the theoretical understanding of causal representations and their practical deployment in scientific research.
