
Infinite Latent Events Model (ILEM)

Updated 6 January 2026
  • Infinite Latent Events Model (ILEM) is a nonparametric hierarchical Bayesian framework that infers an infinite set of latent binary events and their causal interdependencies in time series data.
  • It leverages a hierarchy of Dirichlet processes combined with a noisy-OR transition function to dynamically build infinite-dimensional Bayesian networks without predefining the number of events.
  • Empirical results demonstrate ILEM's effectiveness in applications such as sound factorization, network topology identification, and video game analysis, showcasing its scalability and precise causal inference.

The Infinite Latent Events Model (ILEM) is a nonparametric hierarchical Bayesian framework developed for structure discovery in discrete time-series data. It allows inference of a countably infinite set of latent binary events, their activations over time, and their causal interdependencies. By leveraging a hierarchy of Dirichlet processes (DPs) and a noisy-OR transition function, ILEM constructs infinite-dimensional dynamic Bayesian networks without requiring prespecification of the number of latent factors or causal links. This formulation facilitates principled structure learning in settings where both the event set and the interactions among events are unknown, with demonstrated efficacy in domains such as sound factorization, network topology identification, and complex video game environments (Wingate et al., 2012).

1. Observed Data and Latent Variable Representation

ILEM models time-series observations $Y = \{y_1, \dots, y_T\}$, with each $y_t \in \mathbb{R}^D$ (or $\{0,1\}^D$ for binary images), as resulting from the activation of an infinite set of latent events. These activations are expressed as binary vectors $x_t \in \{0,1\}^\infty$, denoting which latent events are active at time $t$. Collectively, the latent event matrix $X \in \{0,1\}^{T \times \infty}$ provides a temporal map of event activations. In parallel, actual cause counts $C_{t,i,j} \in \mathbb{N}$ are recorded, indicating the number of times event $i$ at time $t-1$ induces event $j$ at time $t$. This explicit representation supports direct modeling of temporally indexed causal structure.
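As a concrete illustration, the finite slice of this representation that an implementation actually stores can be sketched as follows. This is a minimal sketch with hypothetical variable names; `parent_totals` is an illustrative helper for summing cause counts, not part of the published model:

```python
import numpy as np

# Only finitely many events are ever instantiated over a finite dataset,
# so the "infinite" activation matrix X is stored with K+ realized columns.
T, K_plus = 5, 3                      # time steps, realized events
X = np.zeros((T, K_plus), dtype=int)  # X[t, j] = 1 iff event j active at t
X[0, 0] = 1
X[1, [0, 1]] = 1

# Cause counts C[t, i, j]: how many times event i at t-1 caused j at t.
# A sparse dict keyed by (t, i, j) mirrors the model's bookkeeping.
C = {(1, 0, 0): 1, (1, 0, 1): 1}

def parent_totals(C, i):
    """Total causes attributed to parent i, per child, summed over time."""
    totals = {}
    for (t, p, j), n in C.items():
        if p == i:
            totals[j] = totals.get(j, 0) + n
    return totals

print(parent_totals(C, 0))  # {0: 1, 1: 1}
```

These per-parent totals are exactly the sufficient statistics the CRP predictive rule consumes later in the model.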

2. Nonparametric Priors and Generative Process

ILEM imposes nonparametric priors on event occurrence and transitions using a hierarchical DP construction. The binary activation matrix $X$ can alternatively be viewed under an Indian Buffet Process prior. In stick-breaking notation, the global event palette is defined by weights $\beta \sim \mathrm{GEM}(\gamma)$. For each event $i$ (including a distinguished "background" event), transition probabilities $\pi_i \sim \mathrm{DP}(\alpha, \beta)$ specify how parent $i$ generates children. A separate background DP $\pi_0 \sim \mathrm{DP}(\alpha, \beta)$ produces activations from an always-on source.

The noisy-OR transition function emerges by marginalizing out the per-parent Poisson child counts:

$$P(x_{t,j} = 1 \mid x_{t-1}) = 1 - \exp\!\Big(-\lambda_0 \pi_{0j} - \lambda \sum_i x_{t-1,i}\, \pi_{ij}\Big),$$

where $\pi_{ij}$ is the probability that parent $i$ selects child $j$, and $\lambda$, $\lambda_0$ are the Poisson rates for active parents and the background. Thus, the probability of event $j$ firing at time $t$ is "or-combined" over all parents active at $t-1$.
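The noisy-OR combination can be sketched numerically as follows, using the notation above; the specific transition matrix, rates, and background weights are illustrative values, not quantities from the paper:

```python
import numpy as np

def noisy_or_prob(x_prev, pi, lam, pi_bg, lam_bg):
    """P(x_{t,j}=1 | x_{t-1}) = 1 - exp(-lam_bg*pi_bg[j] - lam*sum_i x_prev[i]*pi[i,j])."""
    rate = lam_bg * pi_bg + lam * (x_prev @ pi)
    return 1.0 - np.exp(-rate)

pi = np.array([[0.7, 0.3],   # pi[i, j]: parent i's chance of picking child j
               [0.1, 0.9]])
pi_bg = np.array([0.5, 0.5])
x_prev = np.array([1, 0])    # only event 0 is active at t-1
p = noisy_or_prob(x_prev, pi, lam=1.0, pi_bg=pi_bg, lam_bg=0.2)
# p[0] = 1 - exp(-(0.2*0.5 + 1.0*0.7)) = 1 - exp(-0.8)
```

Note the characteristic noisy-OR behavior: adding more active parents can only increase (never decrease) each firing probability, since the exponent's rate is a sum of nonnegative contributions.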

The generative process proceeds by sampling children from active parents via Poisson($\lambda$) draws, children from the background via Poisson($\lambda_0$) draws, and assigning activations via child cause counts. Each parent $i$'s child is selected probabilistically: existing children are favored in proportion to their prior cause counts, while innovations allow the dynamic introduction of new events through DP mechanisms.

Observation likelihood is typically modeled using a linear-Gaussian mapping:

$$y_t = W x_t + \varepsilon_t, \qquad \varepsilon_t \sim \mathcal{N}(0, \sigma^2 I),$$

with a matrix-Gaussian prior over the weight matrix $W$. Marginalization over $W$ yields a closed-form expression for $p(Y \mid X)$.
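Assuming the linear-Gaussian observation model with an i.i.d. Gaussian prior $W_{dk} \sim \mathcal{N}(0, \sigma_w^2)$ (a simple special case of a matrix-Gaussian prior), each column of $Y$ is marginally $\mathcal{N}(0, \sigma_w^2 X X^\top + \sigma^2 I)$ once $W$ is integrated out. The collapsed likelihood can then be sketched as:

```python
import numpy as np

def log_marginal_likelihood(Y, X, sigma_w=1.0, sigma=0.1):
    """Collapsed log p(Y | X): y_t = W x_t + eps with W integrated out.
    Each column of Y (one observed dimension over time) is jointly
    Gaussian with covariance sigma_w^2 * X X^T + sigma^2 * I."""
    T = X.shape[0]
    K = sigma_w**2 * (X @ X.T) + sigma**2 * np.eye(T)
    _, logdet = np.linalg.slogdet(K)
    K_inv = np.linalg.inv(K)
    ll = 0.0
    for d in range(Y.shape[1]):
        y = Y[:, d]
        ll += -0.5 * (T * np.log(2 * np.pi) + logdet + y @ K_inv @ y)
    return ll

X = np.array([[1, 0], [1, 1], [0, 1]], dtype=float)
Y = np.array([[0.5], [1.0], [0.2]])
ll = log_marginal_likelihood(Y, X)
```

Because the likelihood depends on $X$ only through the Gram structure $X X^\top$, flipping a single activation changes this covariance by a low-rank perturbation, which is what makes the fast updates discussed later possible.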

3. Dynamic Causal Link Structure

ILEM eschews a fixed adjacency matrix, instead maintaining cause counts $C_{t,i,j}$ at each timepoint. Parent event $i$ tallies total causes $C_{\cdot,i,j} = \sum_t C_{t,i,j}$, guiding subsequent child selection via the Chinese Restaurant Process (CRP) predictive rule:

  • An existing child $j$ is selected with probability proportional to its accumulated cause count $C_{\cdot,i,j}$.
  • A novel child is selected with probability proportional to the concentration parameter $\alpha$.

If a novel child is selected, global popularity $\sum_i C_{\cdot,i,j}$ determines whether an existing event across all parents is reused, or a genuinely new event is introduced (probability proportional to $\gamma$). This dynamic causal link construction yields a sparse, emergent network; its size and topology remain unconstrained a priori and adapt to the data.
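The two-level selection rule can be sketched as follows. This is a minimal sketch: the count bookkeeping and integer event identifiers are hypothetical simplifications of the hierarchical DP machinery:

```python
import numpy as np

rng = np.random.default_rng(1)

def draw_child(parent_counts, alpha, global_counts, gamma, rng):
    """CRP draw for one parent: reuse one of its children in proportion
    to accumulated cause counts, or innovate with weight alpha; an
    innovation then picks an existing event by global popularity or a
    brand-new event with weight gamma (two-level hierarchical DP)."""
    events = list(parent_counts)
    probs = np.array([parent_counts[j] for j in events] + [alpha], dtype=float)
    choice = rng.choice(len(probs), p=probs / probs.sum())
    if choice < len(events):
        return events[choice]                  # existing child of this parent
    # Innovation: fall back to the global level.
    g_events = list(global_counts)
    g_probs = np.array([global_counts[j] for j in g_events] + [gamma], dtype=float)
    g_choice = rng.choice(len(g_probs), p=g_probs / g_probs.sum())
    if g_choice < len(g_events):
        return g_events[g_choice]              # reuse an event from another parent
    return max(events + g_events, default=-1) + 1  # genuinely new event id

child = draw_child({0: 5, 1: 2}, 1.0, {0: 5, 1: 4}, 1.0, rng)
```

The rich-get-richer weighting at both levels is what produces the sparse, emergent network the text describes: frequently used links are reinforced, while $\alpha$ and $\gamma$ control how readily new links and new events appear.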

4. Bayesian Inference and Algorithmic Details

Posterior inference in ILEM is conducted via Markov Chain Monte Carlo (MCMC), specifically collapsed Gibbs sampling over the activations $X$ and cause counts $C$ while marginalizing the weights $W$ and transition distributions $\pi$. CRP likelihoods and Poisson distributions govern conditional updates. For scalability, $C$ may be binarized to record solely the presence or absence of a link, without multiplicity.

Metropolis-Hastings moves refine mixing:

  • Event-rewrite proposals: relabel an active event, updating all associated cause-count entries $C_{t,i,j}$.
  • Parent-swap proposals: reassign an event's parentage from one active parent to another.
  • Annealing the temperature during burn-in avoids poor local minima.

Likelihood computations leverage rank-one updates for the Gram matrix $X^\top X$ and its inverse/determinant under the linear-Gaussian model. Initialization can employ nonnegative matrix factorization (NMF). Only a finite set of events is active at any finite $T$ due to finite expected Poisson draws, with slice sampling or truncation bounding the number of instantiated events for implementation.
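The rank-one machinery referenced here can be illustrated generically with the Sherman-Morrison identity, which refreshes an inverse in $O(n^2)$ after a rank-one change rather than re-inverting in $O(n^3)$. This is the kind of update applied when a single activation flip perturbs the Gram matrix; the matrices below are illustrative, not the model's exact bookkeeping:

```python
import numpy as np

def sherman_morrison(A_inv, u, v):
    """Inverse of (A + u v^T), given A_inv, via the Sherman-Morrison
    identity: A_inv - (A_inv u)(v^T A_inv) / (1 + v^T A_inv u)."""
    Au = A_inv @ u
    vA = v @ A_inv
    return A_inv - np.outer(Au, vA) / (1.0 + v @ Au)

A = 2.0 * np.eye(3)
u = np.array([1.0, 0.0, 1.0])
v = np.array([0.0, 1.0, 1.0])
fast = sherman_morrison(np.linalg.inv(A), u, v)
slow = np.linalg.inv(A + np.outer(u, v))
print(np.allclose(fast, slow))  # True
```

A matching determinant update, $\det(A + uv^\top) = (1 + v^\top A^{-1} u)\det(A)$, keeps the Gaussian likelihood's log-determinant term equally cheap to maintain.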

5. Scalability and Infinite-Dimensional Analysis

ILEM’s framework supports a countably infinite event space. In practice, only a finite number $K^+$ of events are realized over any finite dataset, with new events introduced via DP innovation at parent draws (with weight $\alpha$) and globally from the base measure (with weight $\gamma$). The hierarchical DP architecture and Poisson child draws ensure computational tractability. For exact inference, slice sampling (Walker, 2007) allows adaptive control of the active event set's cardinality.

6. Empirical Demonstrations and Application Domains

ILEM has been empirically validated across several domains:

  • Causal soundscape: 52 time-steps of 8 kHz sound clips (events: frog, cricket, elephant); ILEM recovered 10 prototypes and directed causal links.
  • Space Invaders: 400 frames of 15×15 binary images (alien, turret, bullet, explosion); inferred prototypes match the game sprites, and causal chains are recovered.
  • Network topology ("SysAdmin"): 400 time-steps of machine crash patterns over an unknown sparse graph; high link recovery rates (12/12, 19/20, 40/42).

In the network topology context, ILEM outperformed one-step co-occurrence statistics, inferred hidden nodes such as "rogue" machines, and scaled to larger networks with nearly perfect link recovery on sparse random graphs.

7. Strengths, Limitations, and Prospective Extensions

ILEM simultaneously learns the cardinality and sequence of latent events, the factored causal link structure, and the observation model prototypes. Its nonparametric Bayesian foundation obviates manual specification of event dimensionality and ad hoc regularization. Noisy-OR semantics provide interpretable, excitatory causal relationships.

Identified limitations include potentially slow MCMC mixing for long sequences and large event sets, reliance on excitatory (OR-like) links, sensitivity to hyperparameters ($\alpha$, $\gamma$, $\lambda$), and fixed observation models. Extensions under active consideration are:

  • Semi-Markov/duration modeling for event persistence.
  • General conditional-probability tables beyond noisy-OR, including inhibitory effects.
  • Variational or slice-sampling inference for improved scalability.
  • Hierarchical/deep architectures allowing nested sub-events.
  • Nonlinear or deep network observation models (e.g., variational autoencoders).

The ILEM unifies nonparametric structure discovery for time-series causal networks, adaptively learning sparse, interpretable event-to-event causality and event prototypes without requiring explicit enumeration of latent dimensions, and achieves state-of-the-art causal inference in multiple domains (Wingate et al., 2012).

