Dependent Dirichlet Processes (DDP)
- Dependent Dirichlet Processes (DDPs) are Bayesian nonparametric priors for collections of dependent probability measures indexed by covariates such as time or space.
- They leverage mechanisms such as Fleming–Viot diffusions and hidden Markov models to capture decaying correlations and evolving partition structures over time.
- DDPs enable precise predictive inference and adaptive clustering, as demonstrated by improved performance in synthetic experiments and clinical applications.
A Dependent Dirichlet Process (DDP) is a Bayesian nonparametric construct used to model collections of random probability measures indexed by covariates (such as time, space, or group), where each marginal law remains a Dirichlet process but the measures are dependent in a controlled fashion. DDPs provide a framework for flexible inference on data that exhibit complex dependency, allow for partial exchangeability, and facilitate predictive modeling in temporally or spatially structured data, density regression, and latent structure discovery.
1. Temporal Dependence via Fleming–Viot Diffusions
A central mechanism for constructing temporally dependent Dirichlet processes is the use of Fleming–Viot (FV) diffusions. Rather than a fixed stick-breaking construction, FV diffusions define a continuous-time Markov process on the space of probability measures. In this framework:
- At any time point $t$, the random measure $X_t$ is marginally distributed as a Dirichlet process with base measure $\alpha = \theta P_0$ (total mass $\theta$, baseline distribution $P_0$).
- The dynamics are governed by a latent death process $D_t$, modeling the extinction of atoms: as time increases, more of the atoms present at time $0$ are lost by time $t$, reducing the correlation between $X_0$ and $X_t$.
- New atoms are introduced from the baseline $P_0$ at a rate governed by the total mass $\theta$; if an atom's weight reaches zero, it is removed.
This diffusion construction allows the correlation between random measures to decay as the time separation grows, matching time-series intuition and supporting temporally adaptive prior modeling (Ascolani et al., 2020).
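To make the role of the death process concrete, the following minimal Python sketch simulates how many of the initial atoms survive up to time $t$. It assumes the pure-death rates $n(n-1+\theta)/2$ of the Fleming–Viot dual process (a standard choice, but an assumption here, not taken from the text); the function name and the Monte Carlo summary are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

def simulate_death_process(n0, theta, t, rng):
    """Number of the n0 initial atoms still surviving at time t.

    Assumes pure-death rates lambda_n = n * (n - 1 + theta) / 2 (the
    block-counting rates of the Fleming-Viot dual); a sketch, not the
    authors' implementation.
    """
    n, clock = n0, 0.0
    while n > 0:
        clock += rng.exponential(2.0 / (n * (n - 1 + theta)))
        if clock > t:
            break
        n -= 1
    return n

# The mean surviving fraction decays as t grows, mirroring the decaying
# correlation between the random measures at time 0 and time t.
theta, n0 = 1.0, 20
for t in (0.01, 0.1, 0.5, 1.0):
    sims = [simulate_death_process(n0, theta, t, rng) for _ in range(2000)]
    print(f"t = {t:>4}: mean surviving fraction ~ {np.mean(sims) / n0:.2f}")
```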
2. Hidden Markov Model Formulation
The dynamic dependent process can be formulated as a hidden Markov model (HMM):
- The hidden state $X_t$ (the random probability measure) evolves according to the Fleming–Viot transition kernel, itself a mixture of Dirichlet process laws indexed by the surviving atoms.
- Observations $Y_{t,1}, \dots, Y_{t,n_t}$ are conditionally i.i.d. draws from $X_t$ at each observation time $t$.
- This perspective allows tractable forward-filtering and backward-sampling strategies, as the propagation of the (posterior) mixing measure can be computed analytically given the FV transition (Ascolani et al., 2020).
The HMM framing introduces auxiliary latent states such as the death process, supporting closed-form predictive updates.
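Concretely, the transition law of the latent death process, which drives the hidden state, can be approximated by Monte Carlo. The sketch below reuses `simulate_death_process` from the previous snippet; the paper works with exact, closed-form transition probabilities, so this simulation-based estimate is purely illustrative.

```python
# Reuses simulate_death_process (and numpy/rng) from the sketch above.
def death_transition_probs(n, theta, t, n_sims, rng):
    """Monte Carlo estimate of P(D_t = k | D_0 = n) for k = 0, ..., n:
    the transition law of the latent death process in the HMM."""
    counts = np.zeros(n + 1)
    for _ in range(n_sims):
        counts[simulate_death_process(n, theta, t, rng)] += 1
    return counts / n_sims

# Starting from 6 surviving past draws, distribution of survivors at t = 0.2:
probs = death_transition_probs(6, theta=1.0, t=0.2, n_sims=10_000, rng=rng)
print(dict(enumerate(np.round(probs, 3))))
```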
3. Time-Dependent Predictive Distribution
The predictive distribution for a future observation at time $t$, conditional on past data, can be derived explicitly as a mixture of time-dependent Pólya urn schemes:

$$\mathbb{P}(Y_{t,1} \in \mathrm{d}y \mid \text{past data}) \;=\; \sum_{\mathbf{m}} w_{\mathbf{m}} \left[ \frac{\theta}{\theta + |\mathbf{m}|}\, P_0(\mathrm{d}y) \;+\; \frac{|\mathbf{m}|}{\theta + |\mathbf{m}|}\, \hat{P}_{\mathbf{m}}(\mathrm{d}y) \right],$$

where the $w_{\mathbf{m}}$ are weights induced by the death process, $|\mathbf{m}|$ is the total number of surviving past draws, $P_0$ is the baseline distribution, and $\hat{P}_{\mathbf{m}}$ is the empirical distribution based on the surviving atoms (with multiplicities $\mathbf{m}$).
For simultaneous prediction of multiple new observations, a recursion yields

$$\mathbb{P}(Y_{t,n+1} \in \mathrm{d}y \mid Y_{t,1:n}, \text{past data}) \;=\; \sum_{\mathbf{m}} w'_{\mathbf{m}} \left[ \frac{\theta}{\theta + |\mathbf{m}| + n}\, P_0(\mathrm{d}y) + \frac{|\mathbf{m}|}{\theta + |\mathbf{m}| + n}\, \hat{P}_{\mathbf{m}}(\mathrm{d}y) + \frac{n}{\theta + |\mathbf{m}| + n}\, \hat{Q}_n(\mathrm{d}y) \right],$$

where $\hat{Q}_n$ is the empirical measure of the $n$ previously predicted values and the $w'_{\mathbf{m}}$ are the correspondingly updated mixture weights (Ascolani et al., 2020).
This time-dependent predictive formula captures the impact of both the baseline distribution and surviving historic observations, modulated by the death process.
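As a concrete illustration of the single-observation formula, here is a minimal sketch that samples one future value from the mixture of Pólya urns, assuming the mixture weights $w_{\mathbf{m}}$ have already been computed; all names (`sample_predictive`, `base_sampler`) are illustrative.

```python
import numpy as np

rng = np.random.default_rng(1)

def sample_predictive(weights, atoms, theta, base_sampler, rng):
    """One draw from the time-dependent Polya urn mixture.

    weights: dict mapping surviving-multiplicity tuples m to w_m
    atoms:   distinct past values y*_1, ..., y*_K, aligned with each m
    base_sampler: callable drawing a single value from the baseline P_0
    """
    configs = list(weights)
    w = np.array([weights[c] for c in configs], dtype=float)
    pick = configs[rng.choice(len(configs), p=w / w.sum())]
    m = np.array(pick, dtype=float)
    if rng.random() < theta / (theta + m.sum()):
        return base_sampler(rng)                       # fresh value from P_0
    # otherwise reuse a surviving atom, proportionally to its multiplicity
    return atoms[rng.choice(len(atoms), p=m / m.sum())]

# Example: two surviving configurations over three distinct past atoms.
y = sample_predictive({(3, 2, 1): 0.7, (1, 1, 0): 0.3},
                      atoms=[0.0, 1.5, 3.0], theta=1.0,
                      base_sampler=lambda r: r.normal(), rng=rng)
print(y)
```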
4. Partition Structure: The “Chinese Restaurant with Conveyor Belt” Metaphor
The latent partitioning of observations under these models is described by a generalization of the Chinese Restaurant Process (CRP):
- At time $t$, the metaphor introduces a “conveyor belt” displaying a time-dependent set of “dishes” (previous draws that survive under the death process).
- A new observation can either pick a dish from the conveyor belt (with weight equal to its surviving multiplicity $m_j$), select a new dish from the baseline $P_0$ (with weight $\theta$), or (from the second new customer onward) join a dish already chosen by previous new customers (with weight equal to the number of those customers at that dish).
- As the conveyor belt empties (i.e., all historic atoms are lost), the process reduces to the standard CRP associated with a DP (Ascolani et al., 2020).
This metaphor precisely represents the time-varying, partially exchangeable clustering structure of the data, where dependence on the past is controlled by the stochastic death process.
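The seating dynamics can be simulated directly. The sketch below implements the metaphor under the weights just described (belt dishes weighted by surviving multiplicity, a fresh dish by $\theta$, previously opened dishes by their counts); the function name and return format are illustrative.

```python
import numpy as np

rng = np.random.default_rng(2)

def conveyor_belt_crp(surviving, theta, n_new, rng):
    """Seat n_new customers in the CRP-with-conveyor-belt scheme.

    surviving: multiplicities of the dishes still on the belt.
    Returns one label per customer: ('belt', j) or ('fresh', j).
    With an empty belt, this reduces to the standard CRP.
    """
    m = np.array(surviving, dtype=float)   # fixed belt multiplicities
    extra = np.zeros_like(m)               # new customers at belt dishes
    fresh = []                             # counts at freshly opened dishes
    labels = []
    for _ in range(n_new):
        w = np.concatenate([m + extra, [theta], fresh])
        i = int(rng.choice(len(w), p=w / w.sum()))
        if i < len(m):                     # pick a dish off the belt
            extra[i] += 1
            labels.append(("belt", i))
        elif i == len(m):                  # open a brand-new dish from P_0
            fresh.append(1.0)
            labels.append(("fresh", len(fresh) - 1))
        else:                              # join an earlier new customer
            j = i - len(m) - 1
            fresh[j] += 1
            labels.append(("fresh", j))
    return labels

print(conveyor_belt_crp([3, 2, 1], theta=1.0, n_new=5, rng=rng))
```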
5. Posterior and Predictive Sampling Algorithms
Two sampling strategies are presented for inference under FV-DDP models:
(a) Exact Sampling Algorithm
- Sample a latent vector $\mathbf{m}$ of surviving multiplicities with probabilities $w_{\mathbf{m}}$.
- For the next prediction:
  - With probability $\theta/(\theta + |\mathbf{m}| + n)$, sample a new value from the baseline $P_0$;
  - With probability $|\mathbf{m}|/(\theta + |\mathbf{m}| + n)$, sample from the empirical distribution $\hat{P}_{\mathbf{m}}$ of the surviving atoms;
  - With probability $n/(\theta + |\mathbf{m}| + n)$, sample from the empirical measure $\hat{Q}_n$ of the $n$ values already predicted.
- Update the mixture weights by multiplying each previous weight $w_{\mathbf{m}}$ by the predictive probability of the new outcome under $\mathbf{m}$, then renormalizing (sketched below).
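A minimal sketch of this reweighting step, for the case where the new outcome reused a past atom $y^*_j$ (draws from $P_0$ or from earlier predictions would use the corresponding branch probability instead); names are illustrative.

```python
import numpy as np

def update_weights_after_old_atom(weights, theta, j, n_prev):
    """Multiply each configuration weight w_m by the predictive
    probability that the urn under m produces past atom y*_j, given
    n_prev values already predicted, then renormalize."""
    new_w = {}
    for cfg, w in weights.items():
        m = np.array(cfg, dtype=float)
        p = m[j] / (theta + m.sum() + n_prev)   # branch probability of atom j
        new_w[cfg] = w * p
    z = sum(new_w.values())
    return {cfg: w / z for cfg, w in new_w.items()}

print(update_weights_after_old_atom({(3, 2, 1): 0.7, (1, 1, 0): 0.3},
                                    theta=1.0, j=0, n_prev=0))
```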
(b) Approximate Sampling Algorithm
- Employ a Monte Carlo method: simulate the death process in summary form (e.g., as a one-dimensional process tracking only the total number of surviving atoms), then sample the partition configuration from a multivariate hypergeometric distribution (sketched below).
- Suitable when the number of past observations is large, where exact enumeration of the mixture over configurations becomes computationally prohibitive (Ascolani et al., 2020).
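A minimal sketch of this shortcut, reusing `simulate_death_process` from the first snippet (and therefore the same assumed death rates):

```python
# Reuses simulate_death_process (and numpy/rng) from the first sketch.
def approximate_surviving_config(multiplicities, theta, t, rng):
    """Approximate sampler: track only the one-dimensional total number
    of survivors, then allocate which past draws survived with a single
    multivariate hypergeometric draw."""
    m = np.asarray(multiplicities)
    k = simulate_death_process(int(m.sum()), theta, t, rng)   # 1-D summary
    return rng.multivariate_hypergeometric(m, k)              # surviving multiplicities

print(approximate_surviving_config([3, 2, 1], theta=1.0, t=0.5, rng=rng))
```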
These procedures yield the (potentially random) partitions and the corresponding predictive probabilities efficiently in both low and high-dimensional settings.
6. Empirical Applications
The FV-DDP framework has been validated on both synthetic and real data:
- Synthetic: for mixtures of translated Poisson distributions whose parameters shift over time, FV-DDP yields sharper and more accurate predictive credible intervals (as measured by normalized distances) than time-dependent stick-breaking models, especially in capturing regime switches.
- Real-World: Applied to temporally indexed clinical scores (e.g., Karnofsky score in lymphoma studies), the method models changes in the distribution as new data accumulate and older data “age out.” The resulting temporal predictive distribution reflects clinical trends, such as improvement in scores, aligning with established benchmarks like the Kaplan–Meier estimate (Ascolani et al., 2020).
7. Significance and Future Prospects
The FV-DDP approach:
- Delivers explicit, closed-form predictive distributions in models where dependence among random measures is temporal and mediated by a tractable latent Markov process (the death of atoms).
- Encodes a tractable dependence structure that blends historic and new observations in a nonparametric Bayesian context (including urn-based partition generation).
- Provides a flexible platform for further developments in time- or covariate-dependent Bayesian nonparametric inference, particularly in domains where explicit predictive distributions are required for sequential data, adaptive clustering, or regime-switching phenomena.
These theoretical and computational advances enable multivariate, time-evolving nonparametric inference in a principled and practical manner, with rigorously characterized predictive and partition structures (Ascolani et al., 2020).