Causality-Consistent PH Model
- The causality-consistent PH model is a joint lifetime framework that uses time-inhomogeneous multivariate phase-type distributions to mediate dependency via a shared latent initial state.
- It provides closed-form joint survival and density functions through cumulative sub-intensity matrices and explicit time-inhomogeneity, enhancing causal interpretations.
- A custom EM-type estimation approach integrates multinomial regression for covariate-driven initial distributions while efficiently handling right-censored data.
The causality-consistent PH model is a joint lifetime modeling framework based on time-inhomogeneous multivariate phase-type (PH) distributions, specifically the multivariate inhomogeneous PH (mIPH) class as formulated by Albrecher, Bladt, and Müller (Albrecher et al., 2022). This construction provides a direct causal interpretation of dependence in bivariate survival data, with all association mediated exclusively through a shared latent initial state. The mIPH approach contrasts with copula-based methods by imposing time-order and explicit mediation via covariates, ensuring full causal consistency.
1. Model Specification and State-Space Structure
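The causality-consistent PH model operates on a common state-space $E = \{1, \dots, p, p+1\}$, where states $1$ to $p$ are transient and $p+1$ is absorbing. The observed random vector $(X_1, X_2)$, interpreted as a pair of lifetimes, is constructed from the absorption times of two time-inhomogeneous pure-jump Markov chains $\{J_t^{(1)}\}_{t \ge 0}$ and $\{J_t^{(2)}\}_{t \ge 0}$. Both chains commence from the same random initial state $J_0 = J_0^{(1)} = J_0^{(2)}$, with the initial distribution $\boldsymbol{\pi} = (\pi_1, \dots, \pi_p)$ potentially varying across subjects.
For each chain $i \in \{1, 2\}$, the instantaneous evolution is governed by a sub-intensity matrix $\mathbf{T}_i(t)$, encoding transition rates between phases, while the exit vector $\mathbf{t}_i(t) = -\mathbf{T}_i(t)\,\mathbf{1}$ gives the rates of absorption into state $p+1$. The complete generator takes the form:
$$\boldsymbol{\Lambda}_i(t) = \begin{pmatrix} \mathbf{T}_i(t) & \mathbf{t}_i(t) \\ \mathbf{0} & 0 \end{pmatrix},$$
with lifetimes defined as first hitting times of the absorbing state:
$$X_i = \inf\{t \ge 0 : J_t^{(i)} = p+1\}, \qquad i = 1, 2.$$
This construction yields a joint law for $(X_1, X_2)$ in the mIPH class, where both marginals are inhomogeneous phase-type (IPH) distributions.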
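The construction above can be illustrated by simulation. The following is a minimal sketch, not the authors' implementation: it assumes the scalar factorisation of the sub-intensity matrices with a Gompertz-type intensity $e^{\beta t}$, so that each inhomogeneous absorption time is a deterministic time change of a homogeneous one. The function names and the example parameters are hypothetical.

```python
import math
import random

def sample_absorption_time(T, start, rng):
    """Absorption time of a time-homogeneous chain with sub-intensity matrix T."""
    p = len(T)
    state, clock = start, 0.0
    while True:
        rate = -T[state][state]          # total rate of leaving the current phase
        clock += rng.expovariate(rate)   # exponential holding time
        # Jump to phase k with probability T[state][k] / rate;
        # the remaining probability mass is absorption.
        u, acc, nxt = rng.random() * rate, 0.0, None
        for k in range(p):
            if k != state:
                acc += T[state][k]
                if u < acc:
                    nxt = k
                    break
        if nxt is None:
            return clock
        state = nxt

def gompertz_time_change(y, beta):
    """Solve (exp(beta * x) - 1) / beta = y for x (assumed Gompertz intensity)."""
    return math.log1p(beta * y) / beta

def sample_pair(pi, T1, T2, beta1, beta2, rng):
    """Draw (J0, X1, X2): one shared initial phase, then two independent chains."""
    j0 = rng.choices(range(len(pi)), weights=pi)[0]
    y1 = sample_absorption_time(T1, j0, rng)
    y2 = sample_absorption_time(T2, j0, rng)
    return j0, gompertz_time_change(y1, beta1), gompertz_time_change(y2, beta2)

# Hypothetical two-phase example (phases are 0-indexed in code).
rng = random.Random(2022)
pi = [0.7, 0.3]
T1 = [[-0.5, 0.4], [0.0, -1.0]]
T2 = [[-0.6, 0.3], [0.0, -1.2]]
print(sample_pair(pi, T1, T2, 0.08, 0.10, rng))
```

The only coupling between the two lifetimes is the single draw of `j0`; after that draw the two calls to `sample_absorption_time` share no information.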
2. Joint Survival Functions and Density Formulation
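The model admits closed-form expressions for the joint survival and density functions, conditional on the latent starting state. For $J_0 = j$, the joint survival probability over times $x_1, x_2 \ge 0$ is:
$$\mathbb{P}(X_1 > x_1, X_2 > x_2 \mid J_0 = j) = \prod_{i=1}^{2} \mathbf{e}_j^\top \exp\big(\mathbf{M}_i(x_i)\big)\,\mathbf{1},$$
where $\mathbf{M}_i(x) = \int_0^x \mathbf{T}_i(s)\,ds$ is the cumulative sub-intensity matrix, $\mathbf{e}_j$ is the $j$th unit vector, and $\mathbf{1}$ is a column vector of ones. The matrix-exponential form requires the matrices $\mathbf{T}_i(s)$ to commute across time points, which holds under the scalar factorisation described in Section 5.
The joint density for observed lifetimes is:
$$f(x_1, x_2 \mid J_0 = j) = \prod_{i=1}^{2} \mathbf{e}_j^\top \exp\big(\mathbf{M}_i(x_i)\big)\,\mathbf{t}_i(x_i),$$
with $\mathbf{t}_i(x) = -\mathbf{T}_i(x)\,\mathbf{1}$ encoding exit rates from each phase. Unconditional versions follow by averaging over $j$ with weights $\pi_j = \mathbb{P}(J_0 = j)$.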
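The closed-form expressions can be evaluated numerically with a matrix exponential. The sketch below is illustrative only: it assumes sub-intensity matrices of the form (scalar intensity) × (constant matrix), so the cumulative sub-intensity matrix is the constant matrix times a scalar cumulative intensity. The small pure-Python `mat_exp` helper (scaling and squaring with a truncated Taylor series) is a stand-in for a library routine, and all names and parameter values are hypothetical.

```python
import math

def mat_mul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def mat_exp(A, squarings=10, terms=18):
    """exp(A) via scaling and squaring; adequate for small, moderate-norm matrices."""
    n = len(A)
    scaled = [[a / 2 ** squarings for a in row] for row in A]
    result = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms):
        term = [[v / k for v in row] for row in mat_mul(term, scaled)]
        result = [[r + t for r, t in zip(rr, tr)] for rr, tr in zip(result, term)]
    for _ in range(squarings):
        result = mat_mul(result, result)
    return result

def survival_given_start(T, cum_intensity, j):
    """e_j' exp(T * cum_intensity) 1: survival of one lifetime given start in phase j."""
    P = mat_exp([[t * cum_intensity for t in row] for row in T])
    return sum(P[j])

def density_given_start(T, cum_intensity, intensity, j):
    """e_j' exp(T * cum_intensity) t(x), with t(x) = intensity * (-T 1)."""
    P = mat_exp([[t * cum_intensity for t in row] for row in T])
    exit_rates = [-sum(row) for row in T]
    return intensity * sum(P[j][k] * exit_rates[k] for k in range(len(T)))

def joint_survival(pi, T1, T2, cum1, cum2):
    """Unconditional joint survival: mix the conditional product over the start phase."""
    return sum(
        pi_j * survival_given_start(T1, cum1, j) * survival_given_start(T2, cum2, j)
        for j, pi_j in enumerate(pi)
    )

# Hypothetical example with an assumed Gompertz-type cumulative intensity.
pi = [0.7, 0.3]
T1 = [[-0.5, 0.4], [0.0, -1.0]]
T2 = [[-0.6, 0.3], [0.0, -1.2]]
beta = 0.08
cum = lambda x: (math.exp(beta * x) - 1.0) / beta
print(joint_survival(pi, T1, T2, cum(5.0), cum(8.0)))
```

Passing the cumulative intensity as a plain number keeps the functions agnostic about the inhomogeneity function; the time-homogeneous case is recovered by passing the time itself.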
3. Causal Mediation and Independence Structure
Causal consistency in the mIPH model is enforced by mediating all dependence via the single latent starting state $J_0$. The model defines a sharp causal structure:
- Covariates (such as ages at issue, health indicators) determine the initial distribution through multinomial logistic regression.
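- Conditional on $J_0$, the two chains and their lifetimes evolve independently, and hazards at future times depend only on elapsed time and $J_0$, not on the partner's observed outcome.
A formal statement:
$$X_1 \perp\!\!\!\perp X_2 \mid J_0.$$
There is no edge $X_1 \to X_2$ nor $X_2 \to X_1$, and updating survivor information simply refines the posterior over $J_0$, adhering to a pure causal mediation paradigm. For conditional hazard estimation:
$$h_1(t \mid X_2 = x_2) = \sum_{j=1}^{p} \mathbb{P}(J_0 = j \mid X_1 > t,\, X_2 = x_2)\; h_1(t \mid J_0 = j),$$
where $h_1(t \mid J_0 = j)$ is the hazard of $X_1$ given a start in phase $j$. Thus, knowing $X_2$ does not directly influence $X_1$ apart from its effect on the latent distribution.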
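The posterior update behind the conditional hazard can be sketched in a few lines. This is an illustrative toy, not the paper's code: it takes the per-phase survival and density values as inputs, and the demo uses a hypothetical two-state model in which each latent state leads directly to absorption at a constant rate (a diagonal sub-intensity matrix).

```python
import math

def posterior_latent(pi, surv1_t, dens2_x2):
    """P(J0 = j | X1 > t, X2 = x2), proportional to pi_j * S1(t | j) * f2(x2 | j)."""
    weights = [p * s * f for p, s, f in zip(pi, surv1_t, dens2_x2)]
    total = sum(weights)
    return [w / total for w in weights]

def conditional_hazard(pi, surv1_t, dens1_t, dens2_x2):
    """Hazard of X1 at t given X2 = x2: posterior-weighted mix of per-state hazards."""
    post = posterior_latent(pi, surv1_t, dens2_x2)
    return sum(q * f / s for q, f, s in zip(post, dens1_t, surv1_t))

# Hypothetical toy model: state 0 is "robust", state 1 is "frail".
pi = [0.5, 0.5]
rates1 = [0.02, 0.10]   # per-state exit rates of lifetime 1
rates2 = [0.03, 0.12]   # per-state exit rates of lifetime 2
t = 1.0
surv1 = [math.exp(-r * t) for r in rates1]
dens1 = [r * math.exp(-r * t) for r in rates1]

def dens2(x2):
    return [r * math.exp(-r * x2) for r in rates2]

hazard_partner_died_early = conditional_hazard(pi, surv1, dens1, dens2(5.0))
hazard_partner_died_late = conditional_hazard(pi, surv1, dens1, dens2(60.0))
print(hazard_partner_died_early, hazard_partner_died_late)
```

In this toy setting an early death of the partner shifts posterior mass toward the frail state and raises the survivor's hazard, even though the partner's outcome never enters the per-state hazards themselves.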
4. Parameter Estimation: EM-Type and Covariate Modeling
Maximum likelihood estimation in the causality-consistent PH model proceeds via a custom EM algorithm (ERMI), which accommodates right-censored observations and covariate-driven heterogeneity:
- E-step: Compute posterior weights and sufficient statistics for transitions and sojourn times.
- M-step: Update phase transition matrix via estimated transition counts and total occupation times.
- R-step: Update multinomial regression coefficients governing covariate influence on starting probabilities.
- I-step: Optimize the time-inhomogeneity functions controlling phase progressions.
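Two of the steps above can be sketched compactly: the E-step posterior weights over the latent initial state, and an R-step that fits a multinomial logistic regression to those weights. The sketch assumes a softmax parameterisation with the first state as reference category and uses plain gradient ascent; the source does not specify the optimiser, and all function names are hypothetical.

```python
import math

def softmax_probs(z, gamma):
    """Initial-state probabilities pi_j(z), proportional to exp(gamma_j . z)."""
    scores = [sum(g * v for g, v in zip(row, z)) for row in gamma]
    top = max(scores)                       # subtract the max for numerical stability
    expd = [math.exp(s - top) for s in scores]
    total = sum(expd)
    return [e / total for e in expd]

def e_step_weights(pi, lik_given_state):
    """Posterior P(J0 = j | data), proportional to pi_j * L(data | J0 = j)."""
    joint = [p * l for p, l in zip(pi, lik_given_state)]
    total = sum(joint)
    return [x / total for x in joint]

def r_step(gamma, covariates, weights, lr=0.5, iters=200):
    """Gradient ascent on sum_n sum_j w_nj * log pi_j(z_n).

    Row 0 of gamma is the reference category and is never updated.
    """
    gamma = [row[:] for row in gamma]
    n = len(covariates)
    for _ in range(iters):
        grad = [[0.0] * len(row) for row in gamma]
        for z, w in zip(covariates, weights):
            pi = softmax_probs(z, gamma)
            for j in range(len(gamma)):
                for k, v in enumerate(z):
                    grad[j][k] += (w[j] - pi[j]) * v
        for j in range(1, len(gamma)):
            for k in range(len(gamma[j])):
                gamma[j][k] += lr * grad[j][k] / n
    return gamma

# Hypothetical usage: intercept-only covariates and identical posterior weights.
covariates = [[1.0]] * 4
weights = [[0.2, 0.8]] * 4
gamma_hat = r_step([[0.0], [0.0]], covariates, weights)
print(softmax_probs([1.0], gamma_hat))
```

With intercept-only covariates the fitted initial distribution should approach the average posterior weights, which gives a quick sanity check on the R-step.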
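The observed-data likelihood for individual $n$ is a mixture over latent states:
$$L_n = \sum_{j=1}^{p} \pi_j(\mathbf{z}_n) \prod_{i=1}^{2} \Big[\mathbf{e}_j^\top \exp\big(\mathbf{M}_i(x_{n,i})\big)\,\mathbf{t}_i(x_{n,i})\Big]^{\delta_{n,i}} \Big[\mathbf{e}_j^\top \exp\big(\mathbf{M}_i(x_{n,i})\big)\,\mathbf{1}\Big]^{1-\delta_{n,i}},$$
where $x_{n,i}$ is the observed or censored time of lifetime $i$, $\delta_{n,i}$ is the censoring indicator ($1$ if the death is observed, $0$ if right-censored), $\pi_j(\mathbf{z}_n)$ is the initial-state probability given covariates $\mathbf{z}_n$, and $\mathbf{M}_i$ and $\mathbf{t}_i$ are the cumulative sub-intensity matrix and exit-rate vector of chain $i$.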
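The mixture likelihood with right-censoring can be sketched as follows. The per-phase density and survival values are assumed to be precomputed, and the convention that an indicator of 1 marks an observed death is an assumption of this sketch; the function names are hypothetical.

```python
import math

def individual_likelihood(pi, dens, surv, delta):
    """Likelihood contribution of one pair of lifetimes.

    dens[i][j], surv[i][j]: density / survival of lifetime i at its recorded
    time, given start in phase j. delta[i] is 1 if lifetime i is observed and
    0 if it is right-censored.
    """
    total = 0.0
    for j, pi_j in enumerate(pi):
        contrib = pi_j
        for i in range(2):
            # Observed deaths contribute a density, censored ones a survival.
            contrib *= dens[i][j] if delta[i] == 1 else surv[i][j]
        total += contrib
    return total

def log_likelihood(records):
    """Sum of log-contributions over (pi, dens, surv, delta) records."""
    return sum(math.log(individual_likelihood(*rec)) for rec in records)

# Hypothetical record: lifetime 1 observed, lifetime 2 censored.
pi = [0.5, 0.5]
dens = [[2.0, 4.0], [3.0, 5.0]]
surv = [[0.5, 0.25], [0.1, 0.2]]
print(individual_likelihood(pi, dens, surv, [1, 0]))
```

Because `pi` is passed per record, covariate-dependent initial distributions fit in without changing the function.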
5. Time-Inhomogeneity: Flexibility and Parsimony
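A central feature of the mIPH model is scalar time-inhomogeneity, enabling significant flexibility and parameter parsimony. Commonly, the sub-intensity matrices factor as $\mathbf{T}_i(t) = \lambda_i(t)\,\mathbf{T}_i$, with scalar functions $\lambda_i(t) > 0$ and constant matrices $\mathbf{T}_i$, reducing the dimensionality of parameter estimation. Notably, the matrix-Gompertz specification $\lambda_i(t) = e^{\beta_i t}$ yields a survival matrix, with $\mathbf{M}_i(x) = \int_0^x \mathbf{T}_i(s)\,ds$ the cumulative sub-intensity matrix:
$$\exp\big(\mathbf{M}_i(x)\big) = \exp\!\left(\mathbf{T}_i\,\frac{e^{\beta_i x} - 1}{\beta_i}\right).$$
This structure allows for high-quality fits with few phases, as illustrated with ten phases for the joint-lifetime data of Frees et al. (Albrecher et al., 2022).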
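As a short worked step, the matrix-Gompertz survival matrix follows directly from the scalar factorisation. The parameterisation $\lambda(t) = e^{\beta t}$ with $\beta > 0$ is assumed here for illustration, and the chain index is dropped:

$$\mathbf{M}(x) = \int_0^x e^{\beta s}\,ds\;\mathbf{T} = \frac{e^{\beta x} - 1}{\beta}\,\mathbf{T}, \qquad \mathbb{P}(X > x \mid J_0 = j) = \mathbf{e}_j^\top \exp\!\left(\mathbf{T}\,\frac{e^{\beta x} - 1}{\beta}\right)\mathbf{1}.$$

With a single phase, $p = 1$ and $\mathbf{T} = (-\mu)$, this reduces to the classical Gompertz survival function $\exp\!\big(-\mu\,(e^{\beta x} - 1)/\beta\big)$ with hazard $\mu e^{\beta x}$. Letting $\beta \to 0$ gives $(e^{\beta x} - 1)/\beta \to x$ and recovers the time-homogeneous phase-type form $\mathbf{e}_j^\top \exp(\mathbf{T} x)\,\mathbf{1}$.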
6. Practical Implications and Interpretability
In practical applications, the causality-consistent PH model provides a mechanism for inferring associations between paired lifetimes (e.g., spouses, insured lives) through a latent health state determined by explicit covariates. Conditioning on one partner's survival time updates knowledge of the shared latent state and thereby shifts the hazard of the other, which is a direct encoding of causal mediation. The model eschews copula construction, sidestepping issues of non-identifiability and arbitrary dependence structures. All model properties correspond precisely to the directed-graph mechanism:
- Dependence between the two lifetimes arises entirely from the shared initial state.
- Time-inhomogeneity admits model parsimony and robustness in fitting.
- Covariates enter exclusively via the multinomial latent-state initialization.
7. Summary Table of Key Components
| Component | Mathematical Notation | Description |
|---|---|---|
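| State-space | $E = \{1, \dots, p, p+1\}$ | Phases + absorbing state |
| Initial latent state | $J_0 \sim \boldsymbol{\pi}$ | Covariate-driven start |
| Sub-intensity matrix | $\mathbf{T}_i(t)$ | Time-dependent rates |
| Joint survival function | $\mathbb{P}(X_1 > x_1, X_2 > x_2 \mid J_0 = j)$ | Closed-form formula |
| Inhomogeneity specification | $\lambda_i(t)$ | Time scaling factor |
| Multinomial regression | $\boldsymbol{\pi}$ via covariates $\mathbf{z}$ | Covariate mapping |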
| EM estimation steps | E, M, R, I | Posterior + parameter fit |
This architecture provides a fully causally interpretable ageing mechanism for joint lifetimes, with directed associations and closed-form formulas throughout (Albrecher et al., 2022). The mIPH construction enables exact mediation modeling, right-censoring handling, and interpretable dependence consistent with real-world causal processes.