
Bivariate Pseudo-Poisson Model

Updated 19 November 2025
  • The bivariate pseudo-Poisson model is a joint distribution in which one variable is exactly Poisson and the second follows a Poisson regression on the first, so one margin is equi-dispersed while the other can be over-dispersed.
  • It employs linear and nonlinear conditional rate models with closed-form moments and maximum likelihood estimation to capture dependency in count data.
  • Extensions include symmetric Poisson conditional models and robust goodness-of-fit tests, with applications in epidemiology, traffic safety, and health statistics.

The bivariate pseudo-Poisson model is a class of joint distributions for two non-negative integer-valued random variables $(X_1, X_2)$ in which one marginal distribution is exactly Poisson and the conditional distribution of the second variable given the first is a Poisson regression, with a rate that may be affine or nonlinear in the first variable. This construction models bivariate count data in which one margin is equi-dispersed while the other is over-dispersed, and it gives an explicit, tractable dependence structure driven by the conditional mean specification (Arnold et al., 2020, Lakhani, 18 Nov 2025, Veeranna et al., 2023).

1. Model Specification and Joint Distributions

Linear Pseudo-Poisson Model

Let $X_1, X_2$ be non-negative integer-valued random variables. The canonical three-parameter linear pseudo-Poisson model is defined by:

  • $X_1 \sim \mathrm{Poisson}(\lambda_1)$, $\lambda_1 > 0$
  • $X_2\mid X_1 = x_1 \sim \mathrm{Poisson}(\lambda_2 + \lambda_3 x_1)$, $\lambda_2 \ge 0$, $\lambda_3 \ge 0$

The joint probability mass function (pmf) is

$$P(X_1 = x_1, X_2 = x_2) = e^{-\lambda_1} \frac{\lambda_1^{x_1}}{x_1!} \, e^{-(\lambda_2+\lambda_3 x_1)} \frac{(\lambda_2+\lambda_3 x_1)^{x_2}}{x_2!} \quad\text{for } x_1, x_2 = 0, 1, 2, \ldots$$

The natural parameter space is $\{\lambda_1 > 0,\, \lambda_2 \ge 0,\, \lambda_3 \ge 0\}$. Independence is recovered when $\lambda_3 = 0$ (Arnold et al., 2020, Veeranna et al., 2023).
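The joint pmf is simply the product of the Poisson marginal of $X_1$ and the Poisson conditional of $X_2$ given $X_1$. The following is a minimal sketch of that product using only the Python standard library; the function names and the parameter values are illustrative assumptions, not taken from the cited papers.

```python
import math


def poisson_pmf(k, rate):
    """Poisson pmf at integer k >= 0; a rate of 0 puts all mass on k = 0."""
    if rate == 0:
        return 1.0 if k == 0 else 0.0
    return math.exp(-rate + k * math.log(rate) - math.lgamma(k + 1))


def linear_pseudo_poisson_pmf(x1, x2, lam1, lam2, lam3):
    """Joint pmf P(X1 = x1, X2 = x2) = P(X1 = x1) * P(X2 = x2 | X1 = x1)."""
    return poisson_pmf(x1, lam1) * poisson_pmf(x2, lam2 + lam3 * x1)


# Illustrative parameter values (an assumption for this example).
lam1, lam2, lam3 = 2.0, 1.0, 0.5

# Sanity check: the pmf should sum to (numerically) one over a large grid.
total = sum(
    linear_pseudo_poisson_pmf(i, j, lam1, lam2, lam3)
    for i in range(40)
    for j in range(80)
)
print(round(total, 6))
```

Working on the log scale with `math.lgamma` avoids overflow in the factorials for larger counts.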

Nonlinear Conditional Rate Extensions

The model can be extended by specifying the Poisson conditional rate via a nonlinear, bounded regression function $F(x_1; \theta)$, yielding (Lakhani, 18 Nov 2025):

$$X_2 \mid X_1 = x_1 \sim \mathrm{Poisson}\left(\delta+\beta F(x_1;\theta)\right),\quad \delta\ge 0,\ \beta\neq 0$$

where typically $F(x_1; \theta)$ is non-decreasing, with $F(0;\theta)=0$ and $\lim_{x_1\to\infty} F(x_1; \theta) = 1$. Examples include the exponential kernel $F(x_1;\gamma) = 1-e^{-\gamma x_1}$ and the Lomax kernel $F(x_1; \gamma', \eta)=1-\left(\tfrac{\gamma'}{x_1+\gamma'}\right)^{\eta}$.
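To make the two kernels concrete, the sketch below evaluates them and the resulting conditional rate. The function names are hypothetical, and the explicit check that the rate stays non-negative is an assumption added here (a Poisson rate cannot be negative, which matters when the slope parameter is negative); the source does not spell out this constraint.

```python
import math


def exponential_kernel(x1, gamma):
    """F(x1; gamma) = 1 - exp(-gamma * x1), for gamma > 0."""
    return 1.0 - math.exp(-gamma * x1)


def lomax_kernel(x1, gamma_prime, eta):
    """F(x1; gamma', eta) = 1 - (gamma' / (x1 + gamma'))**eta, for gamma', eta > 0."""
    return 1.0 - (gamma_prime / (x1 + gamma_prime)) ** eta


def conditional_rate(x1, delta, beta, kernel, *kernel_args):
    """Conditional Poisson rate delta + beta * F(x1; theta)."""
    rate = delta + beta * kernel(x1, *kernel_args)
    if rate < 0:
        # Assumption made in this sketch: reject parameter values that
        # would give a negative Poisson rate.
        raise ValueError("conditional Poisson rate must be non-negative")
    return rate


# Negative slope: the conditional mean of X2 decreases from delta toward delta + beta.
rates = [conditional_rate(x, 25.0, -20.0, exponential_kernel, 0.5) for x in range(6)]
print([round(r, 3) for r in rates])
```

Both kernels start at 0 and increase toward 1, so the conditional rate moves monotonically from the intercept toward the intercept plus the slope.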

2. Marginal, Conditional, and Generating Functions

  • Marginal of $X_1$: By construction, $X_1 \sim \mathrm{Poisson}(\lambda_1)$ (Arnold et al., 2020, Lakhani, 18 Nov 2025).
  • Conditional of $X_2 \mid X_1$: $X_2\mid X_1 = x_1 \sim \mathrm{Poisson}(\lambda_2+\lambda_3 x_1)$ (linear), or more generally $\mathrm{Poisson}(\lambda_2(x_1))$ (nonlinear) (Arnold et al., 2020, Lakhani, 18 Nov 2025).
  • Marginal of $X_2$: $X_2$ is a Poisson mixture. In the linear model it is the convolution of a $\mathrm{Poisson}(\lambda_2)$ variable and a Neyman Type A variable (purely Neyman Type A when $\lambda_2 = 0$), with probability generating function (pgf)

$$G_{X_2}(t) = \exp\{\lambda_2 (t-1)\} \cdot \exp\{\lambda_1 (e^{\lambda_3(t-1)}-1)\}$$

Hence $\mathrm{Var}(X_2) > \mathrm{E}(X_2)$ when $\lambda_3 > 0$ (Arnold et al., 2020).

  • Joint pgf:

$$G(t_1, t_2) = \exp\Big\{\lambda_1(t_1 e^{\lambda_3(t_2-1)} - 1) + \lambda_2(t_2 - 1)\Big\}$$

Mixed moments can be obtained via differentiation (Arnold et al., 2020, Veeranna et al., 2023).
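The joint pgf of the linear model follows by conditioning on $X_1$. The short derivation below is a sketch that uses only the model definition and the standard Poisson pgf $\mathrm{E}[t^{N}] = \exp\{\lambda(t-1)\}$ for $N \sim \mathrm{Poisson}(\lambda)$:

```latex
\begin{aligned}
G(t_1, t_2)
  &= \mathrm{E}\big[t_1^{X_1}\,\mathrm{E}[t_2^{X_2} \mid X_1]\big] \\
  &= \mathrm{E}\big[t_1^{X_1} \exp\{(\lambda_2 + \lambda_3 X_1)(t_2 - 1)\}\big] \\
  &= e^{\lambda_2 (t_2 - 1)}\,\mathrm{E}\big[\big(t_1 e^{\lambda_3 (t_2 - 1)}\big)^{X_1}\big] \\
  &= \exp\big\{\lambda_1\big(t_1 e^{\lambda_3 (t_2 - 1)} - 1\big) + \lambda_2 (t_2 - 1)\big\}.
\end{aligned}
```

Setting $t_1 = 1$ recovers the marginal pgf of $X_2$, and setting $t_2 = 1$ recovers the Poisson pgf of $X_1$.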

3. Moments, Dispersion, and Dependence

For the linear model, the first and second moments and the covariance are available in closed form:

$$\mathrm{E}[X_1]=\lambda_1,\quad \mathrm{Var}[X_1]=\lambda_1$$

$$\mathrm{E}[X_2]=\lambda_2+\lambda_3\lambda_1$$

$$\mathrm{Var}[X_2]=\lambda_2+\lambda_3\lambda_1+\lambda_3^2\lambda_1$$

$$\mathrm{Cov}(X_1, X_2) = \lambda_3 \lambda_1$$

$$\rho = \frac{\lambda_3\lambda_1}{\sqrt{\lambda_1(\lambda_2 + \lambda_3\lambda_1+\lambda_3^2\lambda_1)}}$$

Thus, $X_2$ is equi-dispersed for $\lambda_3=0$ and over-dispersed otherwise.
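The closed-form moments of the linear model are easy to evaluate directly. The helper below is an illustrative sketch (the function name and the returned dictionary keys are assumptions); it is applied to the Health and Retirement Study estimates quoted in this article to show how mild the implied over-dispersion of $X_2$ is.

```python
import math


def linear_moments(lam1, lam2, lam3):
    """Closed-form moments of the linear pseudo-Poisson model."""
    mean2 = lam2 + lam3 * lam1
    var2 = mean2 + lam3 ** 2 * lam1   # Poisson part plus mixing part
    cov = lam3 * lam1
    rho = cov / math.sqrt(lam1 * var2)
    return {
        "mean1": lam1, "var1": lam1,
        "mean2": mean2, "var2": var2,
        "cov": cov, "rho": rho,
        "dispersion2": var2 / mean2,  # equals 1 only when lam3 == 0
    }


# Estimates reported for the Health and Retirement Study example.
hrs = linear_moments(2.643, 0.688, 0.031)
print(round(hrs["dispersion2"], 4), round(hrs["rho"], 4))
```

Because the slope estimate is small, the variance-to-mean ratio of $X_2$ sits only slightly above 1, consistent with the description of that margin as slightly over-dispersed.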

For nonlinear conditional rate models, the sign of the correlation is determined by the sign of $\beta$. The exponential kernel, for instance, yields

$$\mathrm{Cov}(X_1, X_2) = \alpha\beta(1 - \nu)\mu^{\nu-1}$$

with $\mu = e^{\alpha}$ and $\nu = e^{-\gamma}$, where $\alpha$ denotes the Poisson mean of $X_1$ in this parameterization, playing the role of $\lambda_1$ (Lakhani, 18 Nov 2025).
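The covariance formula can be turned into a correlation once the variance of $X_2$ is known. The sketch below assumes $X_1 \sim \mathrm{Poisson}(\alpha)$ and the exponential kernel; the variance expression is derived here from the law of total variance and the Poisson pgf rather than quoted from the cited paper, and the function name is hypothetical.

```python
import math


def exponential_kernel_moments(alpha, beta, gamma, delta):
    """Mean, variance, covariance and correlation for the exponential-kernel model.

    Assumes X1 ~ Poisson(alpha) and
    X2 | X1 = x1 ~ Poisson(delta + beta * (1 - exp(-gamma * x1))).
    """
    nu = math.exp(-gamma)
    m1 = math.exp(alpha * (nu - 1.0))       # E[nu**X1] = mu**(nu - 1)
    m2 = math.exp(alpha * (nu ** 2 - 1.0))  # E[nu**(2 * X1)]
    mean2 = delta + beta * (1.0 - m1)
    # Law of total variance: E[Var(X2 | X1)] + Var(E[X2 | X1]).
    var2 = mean2 + beta ** 2 * (m2 - m1 ** 2)
    cov = alpha * beta * (1.0 - nu) * m1
    rho = cov / math.sqrt(alpha * var2)
    return mean2, var2, cov, rho


mean2, var2, cov, rho = exponential_kernel_moments(alpha=5.0, beta=-20.0, gamma=0.5, delta=25.0)
print(round(cov, 3), round(rho, 3))
```

With a negative slope the covariance and correlation come out negative, and flipping the sign of the slope flips the sign of both.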

The generalized dispersion index (Kokonendji–Puig) for $Z=(X,Y)^\top$ can be written

$$\mathrm{GDI}(Z) = \frac{\sqrt{\mathrm{E}[Z]}^\top\,\mathrm{Cov}(Z)\,\sqrt{\mathrm{E}[Z]}}{\mathrm{E}[Z]^\top\mathrm{E}[Z]}$$

where $\sqrt{\mathrm{E}[Z]}$ is the elementwise square root of the mean vector, with $\mathrm{GDI}(Z) > 1$ indicating overdispersion (Veeranna et al., 2023).
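As an illustration of the index, the sketch below evaluates the GDI for a two-dimensional mean vector and covariance matrix and applies it to the linear pseudo-Poisson moments. The function name and the parameter values are assumptions chosen for the example.

```python
import math


def gdi_bivariate(mean, cov):
    """Generalized dispersion index for a 2-vector.

    mean: (m1, m2) with positive entries
    cov:  ((v11, v12), (v12, v22))
    """
    root = [math.sqrt(m) for m in mean]  # elementwise square root of the mean
    numerator = sum(root[i] * cov[i][j] * root[j] for i in range(2) for j in range(2))
    denominator = sum(m * m for m in mean)
    return numerator / denominator


def linear_model_gdi(lam1, lam2, lam3):
    """GDI implied by the closed-form moments of the linear model."""
    mean2 = lam2 + lam3 * lam1
    var2 = mean2 + lam3 ** 2 * lam1
    cov12 = lam3 * lam1
    return gdi_bivariate((lam1, mean2), ((lam1, cov12), (cov12, var2)))


print(linear_model_gdi(2.0, 1.0, 0.0))  # independent Poisson margins
print(linear_model_gdi(2.0, 1.0, 0.5))  # positive slope
```

For independent Poisson margins the numerator and denominator coincide, so the index is 1 up to rounding; a positive slope pushes it above 1.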

4. Parameter Estimation and Inference

Parameter estimation procedures for the pseudo-Poisson model are direct:

  • For the linear form, the log-likelihood for i.i.d. observations $\{(x_{1i}, x_{2i})\}$ is, up to an additive constant that does not depend on the parameters,

$$\ell(\lambda_1, \lambda_2, \lambda_3) = \sum_i \big[ -\lambda_1 + x_{1i}\ln \lambda_1 - (\lambda_2+\lambda_3 x_{1i}) + x_{2i}\ln(\lambda_2+\lambda_3 x_{1i}) \big]$$

The estimator for $\lambda_1$ is explicit, $\hat{\lambda}_1=\bar{X}_1$, while $\lambda_2$ and $\lambda_3$ are obtained by solving the score equations numerically (e.g., by Newton–Raphson) (Arnold et al., 2020, Veeranna et al., 2023).

For nonlinear pseudo-Poisson models, estimation is also via MLE, with parameter-specific score equations. The estimator for $\alpha$ remains $\hat{\alpha} = n^{-1}\sum_i x_{1i}$ (Lakhani, 18 Nov 2025).
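A minimal sketch of the estimation procedure for the linear model follows, using only the standard library. It takes the sample mean for $\lambda_1$ and runs a damped Newton–Raphson iteration on the conditional part of the log-likelihood for $(\lambda_2, \lambda_3)$. The step-halving safeguard, the starting values, the simple Poisson sampler, and all function names are assumptions of this sketch rather than the procedure used in the cited papers; it also assumes non-degenerate data with an interior maximum.

```python
import math
import random


def sample_poisson(rate, rng):
    """Draw one Poisson variate by Knuth's multiplication method (small rates)."""
    limit = math.exp(-rate)
    k, prod = 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k


def simulate(n, lam1, lam2, lam3, seed=0):
    """Simulate n pairs (x1, x2) from the linear pseudo-Poisson model."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        x1 = sample_poisson(lam1, rng)
        data.append((x1, sample_poisson(lam2 + lam3 * x1, rng)))
    return data


def loglik_conditional(data, lam2, lam3):
    """Conditional log-likelihood in (lam2, lam3), dropping the -ln(x2!) constants."""
    return sum(-(lam2 + lam3 * x1) + x2 * math.log(lam2 + lam3 * x1) for x1, x2 in data)


def fit_linear_pseudo_poisson(data, max_iter=100, tol=1e-10):
    n = len(data)
    mean1 = sum(x1 for x1, _ in data) / n
    mean2 = sum(x2 for _, x2 in data) / n
    lam1 = mean1                                      # explicit MLE
    lam2, lam3 = mean2 / 2.0, mean2 / (2.0 * mean1)   # crude interior start
    for _ in range(max_iter):
        g2 = g3 = a = b = c = 0.0
        for x1, x2 in data:
            r = lam2 + lam3 * x1
            g2 += x2 / r - 1.0                        # score for lam2
            g3 += x1 * (x2 / r - 1.0)                 # score for lam3
            w = x2 / (r * r)                          # negative Hessian weights
            a += w
            b += w * x1
            c += w * x1 * x1
        det = a * c - b * b
        if det <= 0:
            break
        d2 = (c * g2 - b * g3) / det                  # Newton direction
        d3 = (a * g3 - b * g2) / det
        old = loglik_conditional(data, lam2, lam3)
        step = 1.0
        while step > 1e-8:                            # halve until feasible and non-decreasing
            new2, new3 = lam2 + step * d2, lam3 + step * d3
            if new2 > 0 and new3 > 0 and loglik_conditional(data, new2, new3) >= old:
                break
            step /= 2.0
        else:
            break                                     # no acceptable step found
        lam2, lam3 = new2, new3
        if abs(step * d2) + abs(step * d3) < tol:
            break
    return lam1, lam2, lam3


estimates = fit_linear_pseudo_poisson(simulate(3000, 3.0, 1.0, 0.5, seed=1))
print([round(e, 3) for e in estimates])
```

The conditional log-likelihood is concave in $(\lambda_2, \lambda_3)$, so a damped Newton iteration from an interior starting point is a reasonable choice here; a boundary solution such as $\hat{\lambda}_3 = 0$ would need separate handling.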

5. Goodness-of-Fit Testing

A range of goodness-of-fit (GoF) tests are available for the pseudo-Poisson model (Veeranna et al., 2023):

| Test class | Principle | Comments |
|---|---|---|
| Supremum (pgf-based) | Maximum standardized absolute deviation of empirical vs theoretical pgf over a $(t_1,t_2)$ grid | Bootstrap for critical values |
| Fisher-index–based | Difference between empirical and model-based generalized dispersion index (GDI) | Not consistent vs all alternatives, simple |
| Muñoz–Gamero quadratic (MG) | Integrated squared deviation of empirical minus model pgf against weight $w(t_1,t_2)$ | Bootstrap for critical values |
| Pointwise $K$ | Standardized difference at fixed $(t_1, t_2)$ | Highly sensitive to location |
| Classical $\chi^2$ | Cell-based comparison of observed and expected counts | Simple, least powerful |

Supremum and MG tests are consistent against broad alternatives and show good power properties at moderate sample sizes ($n \geq 30$). Fisher-index–based tests target overdispersion specifically.
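To give a feel for the pgf-based idea, the sketch below computes an unstandardized maximum absolute deviation between the empirical pgf and the linear-model pgf over a grid on $[0,1]^2$. This is a simplified illustration, not the standardized statistic of the cited paper: the grid, the lack of standardization, and the function names are all assumptions, and critical values (obtained by bootstrap in the source) are not shown.

```python
import math


def model_pgf(t1, t2, lam1, lam2, lam3):
    """Joint pgf of the linear pseudo-Poisson model."""
    return math.exp(lam1 * (t1 * math.exp(lam3 * (t2 - 1.0)) - 1.0) + lam2 * (t2 - 1.0))


def empirical_pgf(data, t1, t2):
    """Sample average of t1**x1 * t2**x2 (Python evaluates 0**0 as 1)."""
    return sum(t1 ** x1 * t2 ** x2 for x1, x2 in data) / len(data)


def sup_pgf_distance(data, lam1, lam2, lam3, grid_size=11):
    """Largest absolute gap between empirical and model pgf on a regular grid."""
    grid = [i / (grid_size - 1) for i in range(grid_size)]
    return max(
        abs(empirical_pgf(data, t1, t2) - model_pgf(t1, t2, lam1, lam2, lam3))
        for t1 in grid
        for t2 in grid
    )


# Tiny made-up sample, used only to show the call pattern.
toy_data = [(0, 0), (1, 1), (2, 1), (3, 2), (1, 0)]
print(round(sup_pgf_distance(toy_data, 1.4, 0.3, 0.4), 4))
```

In practice the parameters would be replaced by their estimates, and the observed distance would be compared with its distribution under resampling from the fitted model.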

The R package PseudoPoissonGoF (under development) automates fitting and all proposed GoF procedures (Veeranna et al., 2023).

6. Extensions and Generalizations

Symmetric Poisson Conditional Models

Allowing both conditionals to be Poisson (so neither marginal is Poisson except in independence) leads to the symmetric or "bivariate Poisson-conditional" model:

$$X \mid Y = y \sim \mathrm{Poisson}(\lambda_1 \lambda_3^y),\quad Y \mid X = x \sim \mathrm{Poisson}(\lambda_2 \lambda_3^x)$$

with $0 < \lambda_3 \leq 1$. The corresponding joint pmf is

$$P\{X = x, Y = y\} = K(\lambda_1, \lambda_2, \lambda_3) \frac{\lambda_1^x \lambda_2^y \lambda_3^{xy}}{x!\,y!}$$

where $K(\cdot)$ is a normalizing constant that must be evaluated by infinite summation. Correlation is negative for $\lambda_3 < 1$, and zero under independence ($\lambda_3 = 1$) (Arnold et al., 2023).
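Because the normalizing constant has no simple form, one practical option is to approximate it by truncating the double sum. The sketch below does this and then computes the correlation from the truncated table; the truncation limits, function names, and parameter values are assumptions made for illustration.

```python
import math


def conditional_model_table(lam1, lam2, lam3, max_x=60, max_y=60):
    """Approximate K and the joint pmf table by truncating the infinite sums."""
    weights = [
        [
            math.exp(
                x * math.log(lam1) + y * math.log(lam2) + x * y * math.log(lam3)
                - math.lgamma(x + 1) - math.lgamma(y + 1)
            )
            for y in range(max_y)
        ]
        for x in range(max_x)
    ]
    k_const = 1.0 / sum(sum(row) for row in weights)
    return k_const, [[k_const * w for w in row] for row in weights]


def table_correlation(table):
    """Pearson correlation computed from a joint pmf table."""
    ex = ey = exx = eyy = exy = 0.0
    for x, row in enumerate(table):
        for y, p in enumerate(row):
            ex += x * p
            ey += y * p
            exx += x * x * p
            eyy += y * y * p
            exy += x * y * p
    return (exy - ex * ey) / math.sqrt((exx - ex ** 2) * (eyy - ey ** 2))


k_indep, table_indep = conditional_model_table(2.0, 3.0, 1.0)
k_dep, table_dep = conditional_model_table(2.0, 3.0, 0.5)
print(round(table_correlation(table_indep), 6), round(table_correlation(table_dep), 4))
```

When the interaction parameter equals 1 the table factorizes into two Poisson pmfs, so the constant reduces to $e^{-(\lambda_1+\lambda_2)}$ and the correlation vanishes; values below 1 give a negative correlation.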

Nonlinear Curvature Models

Generalizing the conditional mean to bounded, nonlinear kernels (e.g., exponential, Lomax) enables modeling negative as well as positive correlation, and accommodates complex boundary behavior, such as at $(0,0)$. Akaike Information Criterion (AIC) comparisons indicate that such models can substantially outperform linear models, especially when negative correlation is present (Lakhani, 18 Nov 2025). For example, setting $\beta<0$ in the exponential kernel induces strictly decreasing conditional means and negative dependence, a feature unavailable in the linear form, where $\lambda_3 \ge 0$.

7. Applications and Empirical Results

Pseudo-Poisson models are particularly advantageous where one variable is equi-dispersed (truly Poisson) and the other is over-dispersed. Empirical case studies include:

  • Health and Retirement Study: $X_1=$ number of chronic conditions, $X_2=$ health-care utilizations; $X_1$ is nearly equi-dispersed, $X_2$ slightly over-dispersed, with estimated parameters $\hat{\lambda}_1=2.643$, $\hat{\lambda}_2=0.688$, $\hat{\lambda}_3=0.031$ (Arnold et al., 2020).
  • Interstate-95 Traffic Accidents: $X_1=$ fatalities, $X_2=$ number of injury accidents; both margins are over-dispersed, and AIC comparison favors the model that treats $X_2$ as the marginal and $X_1 \mid X_2$ as Poisson (Arnold et al., 2020).
  • Negative Correlation: Simulation studies with an exponential-curvature conditional mean (e.g., $\alpha=5$, $\beta=-20$, $\gamma=0.5$, $\delta=25$) fit negative correlations ($\rho\approx-0.6$) that linear pseudo-Poisson models cannot (Lakhani, 18 Nov 2025).

Pseudo-Poisson models have been applied in epidemiology, traffic safety, and health statistics, where their analytic tractability, flexible dependence structure, and directly interpretable parameters are advantageous.

