Hierarchical Folded Normal Model
- The hierarchical folded normal model is a probabilistic framework that embeds the folded normal distribution, obtained by taking the absolute value of a normal variate, into multilevel models.
- The model features boundary coercivity and a unique profile likelihood maximizer, ensuring stable parameter estimation even under nonregular likelihood conditions.
- Rigorous results establish Hausdorff consistency, nonstandard asymptotics (including n^(1/4) convergence for μ = 0), and practical implications for hierarchical and penalized inference.
The hierarchical folded normal model refers to probabilistic models in which the folded normal distribution is embedded as a prior, likelihood, or data-generating component within a broader, typically multilevel, statistical structure. The folded normal distribution arises by taking the absolute value of a normal variate, yielding a nonregular model with challenging likelihood geometry; these features propagate into hierarchical extensions. The most recent theoretical treatment establishes rigorous likelihood properties, full identification results, nonstandard asymptotics, and robust estimation principles for such models, addressing key boundary and uniform convergence issues (Mallik, 25 Aug 2025).
1. Folded Normal Distribution and Likelihood Structure
The folded normal distribution is defined as the distribution of $Y = |X|$, where $X \sim \mathcal{N}(\mu, \sigma^2)$. Its density for $y \ge 0$ is
$$f(y;\mu,\sigma) = \frac{1}{\sigma}\left[\varphi\!\left(\frac{y-\mu}{\sigma}\right) + \varphi\!\left(\frac{y+\mu}{\sigma}\right)\right],$$
where $\varphi$ is the standard normal density. When incorporated as part of a (possibly hierarchical) stochastic model, the log-likelihood for observed $y_1, \dots, y_n$ is
$$\ell_n(\mu, \sigma) = \sum_{i=1}^{n} \log f(y_i; \mu, \sigma).$$
This likelihood is even in $\mu$, resulting in non-identifiability of the sign and intricate geometry, especially in hierarchical constructions where $\mu$ and $\sigma$ may themselves depend on latent variables or hyperparameters.
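As a concrete illustration, the following minimal Python sketch (helper names are illustrative, not from the cited paper) evaluates this log-likelihood stably and exhibits its evenness in $\mu$:

```python
import numpy as np

def folded_normal_loglik(y, mu, sigma):
    """Log-likelihood of y_1..y_n under Y = |X|, X ~ N(mu, sigma^2)."""
    a = -0.5 * ((y - mu) / sigma) ** 2      # log-kernel of phi((y - mu)/sigma)
    b = -0.5 * ((y + mu) / sigma) ** 2      # log-kernel of phi((y + mu)/sigma)
    # log f(y) = log[phi_- + phi_+] - log(sqrt(2*pi)) - log(sigma), summed over data
    return np.sum(np.logaddexp(a, b) - 0.5 * np.log(2 * np.pi) - np.log(sigma))

rng = np.random.default_rng(0)
y = np.abs(rng.normal(1.5, 2.0, size=500))       # simulated folded normal data
print(folded_normal_loglik(y, 1.5, 2.0))
print(folded_normal_loglik(y, -1.5, 2.0))        # identical value: even in mu
```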
2. Likelihood Geometry: Boundary Coercivity and Maximizers
The folded normal log-likelihood exhibits boundary coercivity: as $\sigma \to 0$ or $\sigma \to \infty$, $\ell_n(\mu, \sigma) \to -\infty$, provided the data exhibit nonzero sample variance. Explicitly, for any $\mu$:
$$\ell_n(\mu, \sigma) \le n \log 2 - n \log\!\left(\sqrt{2\pi}\,\sigma\right) - \frac{n s_n^2}{2\sigma^2},$$
where $s_n^2$ is the sample variance of the data. This property ensures that, within a hierarchical or penalized framework, likelihood maximization is not compromised by non-informative extrema at the domain boundaries.
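A quick numerical check of this envelope (a sketch assuming the bound as reconstructed above, with an inline copy of the log-likelihood helper):

```python
import numpy as np

def loglik(y, mu, sigma):
    a = -0.5 * ((y - mu) / sigma) ** 2
    b = -0.5 * ((y + mu) / sigma) ** 2
    return np.sum(np.logaddexp(a, b) - 0.5 * np.log(2 * np.pi) - np.log(sigma))

rng = np.random.default_rng(1)
y = np.abs(rng.normal(1.0, 1.0, size=200))
n, s2 = len(y), y.var()                     # s2 = sample variance s_n^2
for sigma in [1e-3, 1e-1, 1.0, 10.0, 1e3]:
    bound = n * np.log(2) - n * np.log(np.sqrt(2 * np.pi) * sigma) - n * s2 / (2 * sigma**2)
    print(f"sigma={sigma:9.3f}  loglik={loglik(y, 1.0, sigma):14.2f}  bound={bound:14.2f}")
# Both columns diverge to -infinity as sigma -> 0 or sigma -> infinity (coercivity).
```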
For profile likelihood optimization, for each fixed $\sigma > 0$, the function $\mu \mapsto \ell_n(\mu, \sigma)$ has a unique maximizer $\hat\mu(\sigma) \ge 0$ (up to the sign symmetry), determined by the fixed-point equation
$$\hat\mu(\sigma) = \frac{1}{n}\sum_{i=1}^{n} y_i \tanh\!\left(\frac{\hat\mu(\sigma)\, y_i}{\sigma^2}\right).$$
The solution path $\sigma \mapsto \hat\mu(\sigma)$ is strictly decreasing and $C^1$ in $\sigma$:
$$\hat\mu'(\sigma) = -\left.\frac{\partial^2_{\mu\sigma}\,\ell_n}{\partial^2_{\mu\mu}\,\ell_n}\right|_{(\hat\mu(\sigma),\,\sigma)} < 0,$$
with $\hat\mu(\sigma) \to 0$ as $\sigma \to \infty$. The profile likelihood in $\sigma$, $\ell_n^{\mathrm p}(\sigma) = \ell_n(\hat\mu(\sigma), \sigma)$, is strictly unimodal, with exactly one maximizer in $(0, \infty)$.
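The fixed-point characterization suggests a simple profiling algorithm. The sketch below (an illustration, not the paper's code) iterates the tanh equation at each $\sigma$ on a grid and then maximizes the resulting profile:

```python
import numpy as np

def loglik(y, mu, sigma):
    a = -0.5 * ((y - mu) / sigma) ** 2
    b = -0.5 * ((y + mu) / sigma) ** 2
    return np.sum(np.logaddexp(a, b) - 0.5 * np.log(2 * np.pi) - np.log(sigma))

def mu_hat(y, sigma, tol=1e-12, max_iter=1000):
    """Iterate mu <- mean(y * tanh(mu * y / sigma^2)), starting at the sample mean."""
    mu = y.mean()
    for _ in range(max_iter):
        mu_new = np.mean(y * np.tanh(mu * y / sigma**2))
        if abs(mu_new - mu) < tol:
            break
        mu = mu_new
    return mu

rng = np.random.default_rng(2)
y = np.abs(rng.normal(2.0, 1.0, size=2000))
sigmas = np.linspace(0.5, 2.0, 61)
profile = [loglik(y, mu_hat(y, s), s) for s in sigmas]       # unimodal in sigma
s_best = sigmas[int(np.argmax(profile))]
print(s_best, mu_hat(y, s_best))                             # near (1.0, 2.0)
```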
3. Identification, Consistency, and Asymptotic Rates
Let $(\mu_0, \sigma_0)$ denote the true parameters, with the parameter space identifiable only up to sign, $\Theta_0 = \{(\mu_0, \sigma_0), (-\mu_0, \sigma_0)\}$. The estimation procedures yield the following guarantees:
- Hausdorff Consistency: The set of maximizers $\widehat{\Theta}_n$ converges in the Hausdorff metric to the true set $\Theta_0$: $d_H(\widehat{\Theta}_n, \Theta_0) \to 0$ as $n \to \infty$.
- Kullback-Leibler Separation: The expected log-likelihood is strictly maximized on $\Theta_0$, and the population KL divergence is strictly positive elsewhere.
Asymptotic Distribution:
| Property | Regular ($\mu_0 \neq 0$) | Nonregular ($\mu_0 = 0$) |
|---|---|---|
| MLE convergence rate | $n^{1/2}$ | $n^{1/4}$ (in $\mu$) |
| Limiting law | Gaussian | Argmax of quadratic-minus-quartic |
| Fisher information | Nonsingular | Singular in $\mu$-direction |
| Confidence set geometry | Elliptical (Wald/LR) | Highly non-elliptical; Wald fails |
| Inference method | Classical (Wald/LR) | Nonstandard (LR, subsampling, etc.) |
In the regular case $\mu_0 \neq 0$, standard $\sqrt{n}$ normal asymptotics apply. In the nonregular case $\mu_0 = 0$, the Fisher information matrix is singular in the $\mu$-direction, resulting in an estimator with convergence rate $n^{1/4}$ and a limiting law described as an argmax of a quadratic-minus-quartic contrast:
$$n^{1/4}\,|\hat\mu_n| \;\xrightarrow{d}\; \operatorname*{arg\,max}_{t \ge 0} \left\{ \frac{Z t^2}{2\sigma_0^4} - \frac{t^4}{4\sigma_0^4} \right\}, \qquad Z \sim \mathcal{N}(0,\, 2\sigma_0^4).$$
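A Monte Carlo sketch of the nonregular rate (with $\sigma_0 = 1$ treated as known for simplicity; all names and tuning choices are illustrative): when $\mu_0 = 0$, the distribution of the rescaled estimates $n^{1/4}\hat\mu_n$ should be roughly stable across sample sizes, while $\hat\mu_n$ itself shrinks.

```python
import numpy as np

def mu_hat(y, sigma=1.0, tol=1e-8, max_iter=300):
    """Fixed-point iteration for the profile maximizer at known sigma."""
    mu = y.mean()
    for _ in range(max_iter):
        mu_new = np.mean(y * np.tanh(mu * y / sigma**2))
        if abs(mu_new - mu) < tol:
            break
        mu = mu_new
    return mu

rng = np.random.default_rng(3)
for n in [400, 6400]:
    est = np.array([mu_hat(np.abs(rng.normal(0.0, 1.0, n))) for _ in range(200)])
    # 0.9-quantile of n^{1/4} * mu_hat is approximately constant in n under the
    # n^{1/4} rate, while the raw quantile of mu_hat shrinks toward zero.
    print(n, np.quantile(n**0.25 * est, 0.9).round(3), np.quantile(est, 0.9).round(3))
```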
4. Uniform Laws, Envelope Bounds, and Finite Sample Control
The uniform law of large numbers (ULLN) for the folded normal log-likelihood is established via explicit envelope functions on the log-likelihood, score, and Hessian. This approach avoids covering-number or entropy-based machinery and supports finite-sample deviation bounds. The principle is that, over any parameter set bounded away from the identification set, the probability that uniform deviations exceed specified constants decreases rapidly with sample size.
Envelope bounds grant practical error control for both inference and hierarchical extensions, since parameter spaces can be large or growing with model complexity.
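The flavor of these uniform deviation results can be seen empirically. The sketch below (an illustration of the ULLN only, not the paper's explicit constants) compares the per-observation empirical log-likelihood with a large-sample proxy for its population counterpart, uniformly over a bounded parameter grid:

```python
import numpy as np

def avg_loglik(y, mu, sigma):
    a = -0.5 * ((y - mu) / sigma) ** 2
    b = -0.5 * ((y + mu) / sigma) ** 2
    return np.mean(np.logaddexp(a, b) - 0.5 * np.log(2 * np.pi) - np.log(sigma))

rng = np.random.default_rng(4)
grid = [(m, s) for m in np.linspace(0.0, 3.0, 7) for s in np.linspace(0.5, 2.0, 7)]
y_pop = np.abs(rng.normal(1.0, 1.0, size=10**6))          # proxy for the population law
pop = {(m, s): avg_loglik(y_pop, m, s) for m, s in grid}  # precompute population values
for n in [100, 1_000, 10_000, 100_000]:
    y = np.abs(rng.normal(1.0, 1.0, size=n))
    sup_dev = max(abs(avg_loglik(y, m, s) - pop[(m, s)]) for m, s in grid)
    print(n, round(sup_dev, 4))                           # uniform deviation shrinks with n
```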
5. Implications for Hierarchical and Penalized Models
The properties established in the non-hierarchical case extend to hierarchical or multilevel models incorporating folded normal components. Key consequences:
- Well-Behaved Posterior and Penalized Likelihoods: Boundary coercivity and unimodal profiling ensure posterior or penalized estimation procedures avoid pathological maxima.
- Explicit Control in Empirical Bayes: Envelope bounds and uniform laws allow robust calculation of marginal likelihoods and empirical Bayes estimates, accounting for nonregularity, especially in "null signal" regimes.
- Penalty Design: Boundary coercivity persists under suitable penalty functions. For mixture models and complex hierarchies, adding quadratic penalties in location and log-scale secures estimator existence and stability since the penalty dominates possible variance-collapse spikes.
A plausible implication is that sample size–dependent penalty tuning can be calibrated explicitly: if the penalty weight shrinks slowly enough that the product of sample size and penalty diverges, penalized estimators retain consistency.
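A sketch of such a penalized fit follows; the quadratic penalty in $(\mu, \log\sigma)$ and the weight $c_n = \sqrt{n}$ (one choice satisfying $c_n \to \infty$ and $c_n / n \to 0$) are illustrative assumptions, not prescriptions from the source:

```python
import numpy as np
from scipy.optimize import minimize

def neg_penalized_loglik(params, y, c_n):
    mu, log_sigma = params
    sigma = np.exp(log_sigma)                         # parameterization enforces sigma > 0
    a = -0.5 * ((y - mu) / sigma) ** 2
    b = -0.5 * ((y + mu) / sigma) ** 2
    ll = np.sum(np.logaddexp(a, b) - 0.5 * np.log(2 * np.pi) - log_sigma)
    # Quadratic penalty in location and log-scale guards against variance collapse.
    return -(ll - c_n * (mu**2 + log_sigma**2))

rng = np.random.default_rng(5)
y = np.abs(rng.normal(1.0, 0.7, size=5000))
c_n = np.sqrt(len(y))                                 # c_n -> infinity, c_n / n -> 0
res = minimize(neg_penalized_loglik, x0=[y.mean(), np.log(y.std())], args=(y, c_n))
print(res.x[0], np.exp(res.x[1]))                     # near (1.0, 0.7), up to sign
```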
6. Estimation Procedures and Practical Inference
- MLE Computation: Profile likelihood in folded normal models has a unique maximizer; Newton-Raphson or similar algorithms are stable under mild starting conditions.
- Confidence Interval Construction: For $\mu_0$ near zero, naive Wald intervals are inaccurate due to the $n^{1/4}$ rate and non-Gaussian limits. Likelihood-ratio or subsampling-based intervals are preferred (see the sketch after this list).
- Variance Estimation: Inference for $\sigma$ is regular, but joint confidence regions with $\mu$ are highly non-elliptical in near-symmetry regimes.
- Hierarchical/Bayesian Implementation: The explicit bounds and path regularity undergird robust prior and hyperparameter selection within hierarchical folded normal frameworks, especially for partially or wholly non-identifiable settings.
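For the regular case, a profile likelihood-ratio interval for $\mu$ can be built by inverting the classical $\chi^2_1$ calibration. The sketch below uses that cutoff and is therefore only appropriate away from $\mu_0 = 0$; near zero, the nonstandard calibration discussed above is required (helper names are illustrative):

```python
import numpy as np
from scipy.optimize import minimize_scalar
from scipy.stats import chi2

def loglik(y, mu, sigma):
    a = -0.5 * ((y - mu) / sigma) ** 2
    b = -0.5 * ((y + mu) / sigma) ** 2
    return np.sum(np.logaddexp(a, b) - 0.5 * np.log(2 * np.pi) - np.log(sigma))

def profile_loglik(y, mu):
    """Maximize over sigma (via log-sigma) at fixed mu."""
    res = minimize_scalar(lambda ls: -loglik(y, mu, np.exp(ls)),
                          bounds=(-4.0, 4.0), method="bounded")
    return -res.fun

rng = np.random.default_rng(6)
y = np.abs(rng.normal(1.2, 1.0, size=3000))
mus = np.linspace(0.9, 1.5, 121)
pl = np.array([profile_loglik(y, m) for m in mus])
keep = pl >= pl.max() - 0.5 * chi2.ppf(0.95, df=1)    # invert the LR test at 95%
print(mus[keep].min(), mus[keep].max())               # approximate CI for |mu|
```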
7. Summary Table and Key Formulas
| Property | Regular ($\mu_0 \neq 0$) | Nonregular ($\mu_0 = 0$) |
|---|---|---|
| MLE rate | $n^{1/2}$ | $n^{1/4}$ in $\mu$ |
| Limiting law | Gaussian | Argmax quadratic-minus-quartic |
| Fisher information | Nonsingular | Singular in $\mu$-direction |
| Inference strategy | Wald/LR | Nonstandard (not Wald) |
| Boundary behavior | Coercive | Coercive |
Key formulas:
- Folded normal density: $f(y;\mu,\sigma) = \dfrac{1}{\sigma}\left[\varphi\!\left(\dfrac{y-\mu}{\sigma}\right) + \varphi\!\left(\dfrac{y+\mu}{\sigma}\right)\right]$, for $y \ge 0$
- Profile path derivative: $\hat\mu'(\sigma) = -\,\partial^2_{\mu\sigma}\ell_n \big/ \partial^2_{\mu\mu}\ell_n \big|_{(\hat\mu(\sigma),\,\sigma)} < 0$
- Log-likelihood contrast expansion (nonregular): $\ell_n(n^{-1/4} t, \sigma_0) - \ell_n(0, \sigma_0) = \dfrac{t^2}{2\sigma_0^4} \cdot \dfrac{1}{\sqrt{n}} \sum_{i=1}^{n} \left(y_i^2 - \sigma_0^2\right) - \dfrac{t^4}{4\sigma_0^4} + o_p(1)$
- Limiting contrast (nonregular): $\Lambda(t) = \dfrac{Z t^2}{2\sigma_0^4} - \dfrac{t^4}{4\sigma_0^4}$, where $Z \sim \mathcal{N}(0,\, 2\sigma_0^4)$.
The hierarchical folded normal model thus rests on an explicit characterization of likelihood geometry and nonregularity, with finite sample error control and guaranteed consistency—features directly translatable to complex hierarchical and penalized statistical models (Mallik, 25 Aug 2025).