Covariance-Free Ridge Estimator in Mixed Models
- The paper introduces a design-consistent pairwise composite likelihood estimator for two-level mixed models that remains unbiased under informative sampling.
- It leverages bivariate normal log-densities with closed-form matrix inverses to replace the full likelihood, simplifying complex survey computations.
- Simulation studies show that while efficiency for variance components may drop in moderate clusters, the approach ensures robust, unbiased estimates.
Rao’s mixed-effects model, specifically the pairwise composite-likelihood estimation framework developed by Rao and co-workers, provides a general, design-consistent approach for fitting two-level linear mixed models in the context of complex survey data when model clusters and sampling design clusters coincide. This methodology leverages sums of bivariate normal log-densities (rather than the full likelihood), yielding estimators that remain robust under informative sampling—particularly relevant in large-scale health and social science surveys. The approach has been implemented in the R package svylme via the function svy2lme (Lumley et al., 2023).
1. Two-level Linear Mixed Model Specification
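Let clusters (e.g., schools) be indexed by $i = 1, \dots, N$, with within-cluster units $j = 1, \dots, n_i$. The classical two-level linear mixed model is:
$$Y_{ij} = x_{ij}^\top \beta + z_{ij}^\top b_i + \epsilon_{ij},$$
where $b_i \sim N_q(0, \sigma^2 V(\theta))$, $\epsilon_{ij} \sim N(0, \sigma^2)$, and $b_i$ is independent of $\epsilon_{ij}$. Here, $x_{ij}$ is a $p$-vector of fixed-effect covariates, $\beta$ the fixed-effects vector; $z_{ij}$ is a $q$-vector of random-effect covariates, and $b_i$ the random-effects vector for cluster $i$. The random-effects covariance is generally written $\sigma^2 V(\theta)$, with $\theta$ collecting variance and correlation parameters, so that the full parameter vector is $(\beta, \sigma^2, \theta)$.
The marginal distribution for the cluster response vector $Y_i = (Y_{i1}, \dots, Y_{in_i})^\top$ is:
$$Y_i \sim N\left(X_i \beta,\ \sigma^2 \Xi_i(\theta)\right),$$
where $X_i$ is the $n_i \times p$ covariate matrix and $\Xi_i(\theta) = I_{n_i} + Z_i V(\theta) Z_i^\top$, with $Z_i$ the $n_i \times q$ matrix whose rows are $z_{ij}^\top$.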
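As a small illustration of the marginal covariance, the sketch below builds $\sigma^2 (I + Z V Z^\top)$ for one cluster in plain Python. The function name `marginal_cov` and the parametrization of the random-effects covariance as $\sigma^2 V$ are assumptions made for this example, not part of the svylme package.

```python
def marginal_cov(Z, V, sigma2):
    """Return sigma2 * (I + Z V Z^T) as a list of lists.

    Z: n x q list of random-effect covariate rows for one cluster.
    V: q x q relative covariance matrix of the random effects (assumed form).
    """
    n, q = len(Z), len(V)
    out = [[0.0] * n for _ in range(n)]
    for j in range(n):
        for k in range(n):
            # z_j^T V z_k: the covariance induced by the shared random effects
            quad = sum(Z[j][a] * V[a][b] * Z[k][b]
                       for a in range(q) for b in range(q))
            out[j][k] = sigma2 * ((1.0 if j == k else 0.0) + quad)
    return out


# Random-intercept example: z_ij = 1 for every unit and V = [[theta]].
cov = marginal_cov(Z=[[1.0], [1.0], [1.0]], V=[[0.5]], sigma2=2.0)
```

For a random-intercept model every diagonal entry equals $\sigma^2(1 + \theta)$ and every off-diagonal entry equals $\sigma^2 \theta$, which is the compound-symmetry structure that the pairwise method exploits.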
2. Pairwise Composite Loglikelihood
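Rao’s estimator replaces the full-data loglikelihood with a sum over all within-cluster bivariate log-densities. For responses $(Y_{ij}, Y_{ik})$ in cluster $i$ (with $j < k$), the joint covariance is $\sigma^2 \Xi_{i,jk}$, where $\Xi_{i,jk}$ denotes the $2 \times 2$ submatrix of $\Xi_i$ associated with units $j$ and $k$. The bivariate normal log-density is
$$\ell_{i,jk}(\beta, \sigma^2, \theta) = -\log(2\pi) - \tfrac{1}{2} \log\left|\sigma^2 \Xi_{i,jk}\right| - \frac{1}{2\sigma^2}\, r_{i,jk}^\top \Xi_{i,jk}^{-1} r_{i,jk},$$
where $r_{i,jk} = Y_{i,jk} - X_{i,jk}\beta$, with $Y_{i,jk} = (Y_{ij}, Y_{ik})^\top$ and $X_{i,jk}$ the corresponding two rows of $X_i$.
The (infinite-population) composite loglikelihood is
$$\ell_P(\beta, \sigma^2, \theta) = \sum_{i} \sum_{j<k} \ell_{i,jk}(\beta, \sigma^2, \theta),$$
with the inner sum running over all pairs $1 \le j < k \le n_i$. For survey samples, the design-weighted composite loglikelihood is
$$\hat\ell_P(\beta, \sigma^2, \theta) = \sum_{i} \sum_{j<k} \frac{R_{i,jk}}{\pi_{i,jk}}\, \ell_{i,jk}(\beta, \sigma^2, \theta),$$
where $R_{i,jk} = 1$ if the pair of units $(j, k)$ in cluster $i$ is sampled (and $0$ otherwise) and $\pi_{i,jk}$ is the inclusion probability for the pair.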
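A minimal sketch of the design-weighted pairwise composite loglikelihood follows. It assumes each pair has covariance $\sigma^2$ times a known $2 \times 2$ matrix `xi` and that the pairwise inclusion probability is supplied by the caller; the function names are hypothetical and this is not the svylme implementation.

```python
import math


def pair_loglik(y, mu, xi, sigma2):
    """Bivariate normal log-density for one pair.

    y, mu: length-2 sequences (responses and their means).
    xi: 2x2 matrix; the pair covariance is sigma2 * xi.
    """
    a, b, c, d = xi[0][0], xi[0][1], xi[1][0], xi[1][1]
    det = a * d - b * c                      # closed-form 2x2 determinant
    r0, r1 = y[0] - mu[0], y[1] - mu[1]
    # r^T xi^{-1} r using the closed-form inverse (1/det) [[d, -b], [-c, a]]
    quad = (d * r0 * r0 - (b + c) * r0 * r1 + a * r1 * r1) / det
    # |sigma2 * xi| = sigma2^2 * |xi| for a 2x2 matrix
    return (-math.log(2 * math.pi)
            - 0.5 * math.log(sigma2 ** 2 * det)
            - quad / (2 * sigma2))


def weighted_composite_loglik(pairs, sigma2):
    """Sum of pair log-densities weighted by 1 / pi_pair.

    pairs: iterable of (y, mu, xi, pi_pair) for sampled pairs only.
    """
    return sum(pair_loglik(y, mu, xi, sigma2) / pi for y, mu, xi, pi in pairs)
```

Only sampled pairs are passed in, so the sampling indicator is implicit and each pair's contribution is scaled by the reciprocal of its inclusion probability.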
3. Estimating Equations and Profile Deviance
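Partial derivatives of $\hat\ell_P$ yield composite score equations, which are unbiased estimating equations for the parameters. The fixed-effect score equation (after cancelling the common factor $1/\sigma^2$) is
$$\sum_{i} \sum_{j<k} \frac{R_{i,jk}}{\pi_{i,jk}}\, X_{i,jk}^\top \Xi_{i,jk}^{-1} \left(Y_{i,jk} - X_{i,jk}\beta\right) = 0,$$
with $X_{i,jk}$ stacking $x_{ij}^\top$, $x_{ik}^\top$.
The estimator is:
$$\hat\beta_\theta = \left(X^\top W^{1/2} \Xi^{-1} W^{1/2} X\right)^{-1} X^\top W^{1/2} \Xi^{-1} W^{1/2} Y,$$
where $X$ and $Y$ stack $X_{i,jk}$ and $Y_{i,jk}$ over sampled pairs, $\Xi$ is block diagonal with blocks $\Xi_{i,jk}(\theta)$, and $W$ is diagonal with entries $1/\pi_{i,jk}$ for sampled pairs (one entry for each of the two rows of a pair).
The $\sigma^2$ and $\theta$ scores are similarly specified, and solutions are found via profiling: for fixed $\theta$, setting the $\sigma^2$ score to zero gives
$$\hat\sigma^2_\theta = \frac{\sum_{i}\sum_{j<k} \frac{R_{i,jk}}{\pi_{i,jk}}\, \hat r_{i,jk}^\top \Xi_{i,jk}^{-1} \hat r_{i,jk}}{2 \sum_{i}\sum_{j<k} \frac{R_{i,jk}}{\pi_{i,jk}}}, \qquad \hat r_{i,jk} = Y_{i,jk} - X_{i,jk}\hat\beta_\theta.$$
The profile deviance is:
$$d(\theta) = -2\, \hat\ell_P\left(\hat\beta_\theta, \hat\sigma^2_\theta, \theta\right).$$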
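The profiling step can be sketched for the simplest case: a random-intercept model with an intercept and one slope, where every pair has relative covariance `[[1 + theta, theta], [theta, 1 + theta]]`. The function `profile`, its argument layout, and the two-parameter mean model are assumptions chosen to keep the example dependency-free; the pair weights are taken as given.

```python
import math


def profile(theta, pairs):
    """Profile out beta and sigma^2 for a fixed theta (random-intercept model).

    pairs: list of (x_j, x_k, y_j, y_k, w) for sampled pairs, w = 1 / pi_pair.
    Mean model (an assumption of this sketch): intercept + slope * x.
    Returns (beta, sigma2, deviance).
    """
    a, b = 1.0 + theta, theta            # Xi = [[a, b], [b, a]] for every pair
    det = a * a - b * b                  # equals 1 + 2 * theta
    inv = [[a / det, -b / det], [-b / det, a / det]]   # closed-form inverse

    A = [[0.0, 0.0], [0.0, 0.0]]         # weighted sum of X^T Xi^{-1} X
    c = [0.0, 0.0]                       # weighted sum of X^T Xi^{-1} Y
    for xj, xk, yj, yk, w in pairs:
        X, Y = [[1.0, xj], [1.0, xk]], [yj, yk]
        for m in range(2):
            for n in range(2):
                for s in range(2):
                    c[s] += w * X[m][s] * inv[m][n] * Y[n]
                    for t in range(2):
                        A[s][t] += w * X[m][s] * inv[m][n] * X[n][t]

    # Solve the 2x2 weighted normal equations A beta = c in closed form.
    d = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    beta = [(A[1][1] * c[0] - A[0][1] * c[1]) / d,
            (A[0][0] * c[1] - A[1][0] * c[0]) / d]

    rss, sw = 0.0, 0.0
    for xj, xk, yj, yk, w in pairs:
        r = [yj - beta[0] - beta[1] * xj, yk - beta[0] - beta[1] * xk]
        rss += w * sum(r[m] * inv[m][n] * r[n]
                       for m in range(2) for n in range(2))
        sw += w
    sigma2 = rss / (2.0 * sw)            # root of the weighted sigma^2 score

    # -2 times the weighted composite loglikelihood at (beta, sigma2, theta)
    deviance = (sw * (2 * math.log(2 * math.pi) + 2 * math.log(sigma2)
                      + math.log(det)) + rss / sigma2)
    return beta, sigma2, deviance
```

With `theta = 0` and unit weights the pairs are treated as independent, so `beta` reduces to ordinary least squares on the stacked observations, which gives a quick sanity check. An outer optimizer would then minimize the returned deviance over `theta`.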
4. Computational Implementation and Algorithm
In R, the approach is implemented in svylme via svy2lme(formula, design), where design is a survey design object. The workflow includes:
- Pair selection: For each sampled cluster, enumerate all observed within-cluster pairs and compute pairwise inclusion weights using available sampling probabilities or approximations if joint probabilities are unavailable.
- Profiling: For each optimizer trial value $\theta$, compute $\hat\beta_\theta$ and $\hat\sigma^2_\theta$ by generalized least squares on pseudo-observations.
- Optimization: Minimize the profile deviance over $\theta$ using Powell’s bound-constrained derivative-free optimizer BOBYQA (minqa package), with starting values from an unweighted lme4 fit (lmer).
- Internal acceleration: Determinants and inverses of the $2 \times 2$ pairwise covariance matrices are computed in closed form.
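The pair-selection step can be illustrated with a sketch that enumerates within-cluster pairs and attaches pairwise weights. It assumes a two-stage design in which a cluster is selected with known probability and units within it are drawn by simple random sampling without replacement, so the pairwise inclusion probability has a simple closed form. The function name and arguments are hypothetical, and other designs need their own joint probabilities or an approximation.

```python
from itertools import combinations


def pairwise_weights(cluster_prob, n_pop, sampled_units):
    """Enumerate sampled pairs in one cluster with weights 1 / pi_pair.

    cluster_prob: inclusion probability of the cluster (stage 1).
    n_pop: number of units in the cluster population.
    sampled_units: identifiers of the units sampled at stage 2, assumed
        drawn by simple random sampling without replacement.
    """
    m = len(sampled_units)
    # P(both j and k sampled | cluster sampled) = m(m-1) / (N(N-1)) under SRSWOR
    pi_pair = cluster_prob * (m * (m - 1)) / (n_pop * (n_pop - 1))
    return [((j, k), 1.0 / pi_pair) for j, k in combinations(sampled_units, 2)]


# A cluster sampled with probability 0.5, with 3 of its 4 units observed.
weights = pairwise_weights(cluster_prob=0.5, n_pop=4, sampled_units=[0, 2, 3])
```

Each returned weight is the reciprocal of the probability that both units of the pair appear in the sample, which is the quantity that scales that pair's log-density in the weighted composite loglikelihood.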
5. Variance Estimation and Asymptotic Properties
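Under mild regularity conditions (law of large numbers and central limit theorem for weighted sums of pairwise scores), the composite-likelihood estimator $\hat\psi$ of the full parameter vector $\psi = (\beta, \sigma^2, \theta)$ is consistent and asymptotically normal:
$$\hat\psi \overset{\text{approx}}{\sim} N\left(\psi,\ H^{-1} J H^{-1}\right),$$
where $H = E\left[-\partial U(\psi)/\partial \psi^\top\right]$ is the sensitivity matrix and $J = \operatorname{var}\left[U(\psi)\right]$ the variability matrix for the composite score $U(\psi)$, the gradient of the design-weighted composite loglikelihood. Design-based sandwich estimators are used for practical variance estimation, with
$$\widehat{\operatorname{var}}(\hat\psi) = \hat H^{-1} \hat J \hat H^{-1}.$$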
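A sandwich variance of this form can be sketched for a two-parameter problem. The version below assumes a single stratum with primary sampling units treated as sampled with replacement, so the variability matrix is estimated from the spread of weighted per-cluster score totals. This is a common survey approximation adopted here for illustration, not necessarily the estimator used in svylme, and all function names are hypothetical.

```python
def inv2(m):
    """Closed-form inverse of a 2x2 matrix."""
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [[m[1][1] / det, -m[0][1] / det],
            [-m[1][0] / det, m[0][0] / det]]


def matmul2(a, b):
    """Product of two 2x2 matrices."""
    return [[sum(a[s][u] * b[u][t] for u in range(2)) for t in range(2)]
            for s in range(2)]


def sandwich(H, cluster_scores):
    """Return H^{-1} J H^{-1} for a symmetric 2x2 sensitivity matrix H.

    cluster_scores: one length-2 weighted score total per sampled cluster,
        evaluated at the parameter estimate.
    J is estimated as n/(n-1) * sum (u_i - ubar)(u_i - ubar)^T
    (with-replacement PSU approximation; an assumption of this sketch).
    """
    n = len(cluster_scores)
    mean = [sum(u[s] for u in cluster_scores) / n for s in range(2)]
    J = [[n / (n - 1) * sum((u[s] - mean[s]) * (u[t] - mean[t])
                            for u in cluster_scores)
          for t in range(2)] for s in range(2)]
    Hinv = inv2(H)
    return matmul2(matmul2(Hinv, J), Hinv)
```

Square roots of the diagonal entries of the returned matrix give standard errors, and the off-diagonal entries give the estimated covariance between the two parameter estimates.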
A plausible implication is that, with appropriate weighting, confidence intervals and hypothesis tests retain their design-consistency properties in large samples.
6. Efficiency, Bias, and Applicability
Simulation studies indicate that under noninformative sampling, the pairwise composite-likelihood estimator is nearly unbiased for $\beta$, $\sigma^2$, and $\theta$, but generally less efficient than full maximum likelihood or stagewise pseudolikelihood estimators with proper weight-scaling. The loss of efficiency is most prominent for variance component estimation (standard errors for the random intercept variance may be two to three times larger in moderate cluster sizes), but this efficiency gap narrows as cluster sizes decrease. In the limiting case $n_i = 2$, each cluster contributes a single pair, so the composite likelihood coincides with the full likelihood and the pairwise estimator is as efficient as maximum likelihood.
Under strongly informative sampling, where the probability of observation depends on the cluster-level random effect, the pairwise composite scores remain unbiased. Stagewise pseudolikelihood estimators, by contrast, can be substantially biased for both fixed effects and variance components unless clusters become very large. This robustness to informative sampling makes the approach broadly applicable in survey settings with complex or multistage design features (Lumley et al., 2023).
7. Summary and Practical Considerations
Rao’s pairwise composite likelihood framework yields a general and design-consistent estimator for two-level linear mixed models under complex survey sampling, applicable even when clusters are not large and within-cluster observation probabilities are not uniform. The tradeoff is reduced efficiency for certain parameters, particularly variance components, in exchange for unbiasedness and robustness to informative sampling. The method is implemented in the R svylme package, which uses closed-form determinants and inverses for the pairwise covariance matrices and the derivative-free BOBYQA optimizer for the profile deviance. This estimator is particularly recommended when survey designs are informative and standard methods are either inapplicable or yield biased results (Lumley et al., 2023).