
Testing Parametric Distribution Family Assumptions via Differences in Differential Entropy (2512.11305v1)

Published 12 Dec 2025 in econ.EM

Abstract: We introduce a broadly applicable statistical procedure for testing which parametric distribution family generated a random sample of data. The method, termed the Difference in Differential Entropy (DDE) test, provides a unified framework applicable to a wide range of distributional families, with asymptotic validity grounded in established maximum likelihood, bootstrap, and kernel density estimation principles. The test is straightforward to implement, computationally efficient, and requires no tuning parameters or specialized regularity conditions. It compares an MLE-based estimate of differential entropy under the null hypothesis with a nonparametric bootstrapped kernel density estimate, using their divergence as an information-theoretic measure of model fit.

Summary

  • The paper introduces the DDE test, which contrasts MLE-based and KDE-based differential entropy estimates to evaluate the fit of parametric distributions.
  • It employs a bootstrap calibration method that accounts for finite-sample bias without relying on asymptotic approximations.
  • Simulation and empirical results demonstrate the test’s power in detecting deviations in tail behavior, skewness, and overall distribution shape.

Testing Parametric Distribution Family Assumptions via Differences in Differential Entropy

Introduction

This paper introduces the Difference in Differential Entropy (DDE) test, a unified, information-theoretic statistical procedure for assessing which parametric distribution family is consistent with an observed sample. The DDE test leverages the contrast between parametric (MLE-based) and nonparametric (KDE-based) estimators of differential entropy (DE). The method is notable for broad applicability, lack of tuning parameters, computational efficiency, and principled bootstrap-based p-value computation, facilitating hypothesis testing for both simple and composite parametric families without reliance on asymptotic approximations or restrictive regularity conditions. The framework is constructed to yield informative diagnostics whether or not the null is rejected, positioning the entropy difference as an omnibus discrepancy measure between hypothesized and empirical distributions.

Methodological Framework

The test operates by exploiting the convergence properties of entropy estimators:

  • Under the null, when the parametric family matches the data-generating process, the MLE-based DE and KDE-based DE estimators converge.
  • Under alternatives, systematic divergence is observed, with the magnitude and sign of the DDE statistic encapsulating global distributional discrepancies—in shape, support, skewness, kurtosis, and, in particular, tail behavior.
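The contrast above can be sketched in a few lines. This is an illustrative reconstruction, not the paper's exact implementation: it uses a Normal null (where the MLE plug-in entropy has a closed form) and SciPy's `gaussian_kde` with its default Scott's-rule bandwidth, whereas the paper uses an adaptive, kurtosis-based bandwidth.

```python
# Sketch of a DDE-style statistic under a Normal null (illustrative, not
# the paper's exact estimator): MLE plug-in entropy minus KDE plug-in entropy.
import numpy as np
from scipy.stats import gaussian_kde

def entropy_mle_normal(x):
    """Closed-form differential entropy of N(mu, sigma^2) at the MLE."""
    sigma2 = np.var(x)  # MLE of the variance (ddof=0)
    return 0.5 * np.log(2 * np.pi * np.e * sigma2)

def entropy_kde(x):
    """Resubstitution plug-in entropy from a Gaussian KDE."""
    kde = gaussian_kde(x)  # Scott's-rule bandwidth (an assumption here)
    return -np.mean(np.log(kde(x)))

def dde_statistic(x):
    """Entropy gap: small when the Normal null matches the data."""
    return entropy_mle_normal(x) - entropy_kde(x)

rng = np.random.default_rng(0)
x = rng.normal(size=200)
print(dde_statistic(x))  # near zero under the null
```

Under an alternative such as a heavy-tailed sample, the two entropy estimates diverge and the statistic moves away from zero.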

Critical to the test is the parametric bootstrap calibration. Instead of relying on asymptotic normality, which can give misleading inferences due to slow convergence and idiosyncratic biases of entropy estimators, the test resamples from the fitted parametric null, thus intrinsically capturing both smooth and local features of the null distribution, as well as finite-sample bias effects.
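A minimal version of this calibration, again sketched for a Normal null (the paper's procedure is analogous but family-general), resamples from the fitted null and compares the observed statistic to the bootstrap distribution; centering at the bootstrap mean absorbs the finite-sample bias. The two-sided p-value construction below is one reasonable choice, not necessarily the paper's.

```python
# Parametric-bootstrap calibration of an entropy-gap statistic (illustrative
# sketch for a Normal null; the paper's calibration generalizes across families).
import numpy as np
from scipy.stats import gaussian_kde

def dde_statistic(x):
    """MLE plug-in Normal entropy minus KDE resubstitution entropy."""
    h_mle = 0.5 * np.log(2 * np.pi * np.e * np.var(x))
    kde = gaussian_kde(x)
    h_kde = -np.mean(np.log(kde(x)))
    return h_mle - h_kde

def bootstrap_pvalue(x, n_boot=200, seed=0):
    rng = np.random.default_rng(seed)
    mu, sigma = np.mean(x), np.std(x)  # MLE under the Normal null
    t_obs = dde_statistic(x)
    # Resample from the fitted null and recompute the statistic each time
    t_boot = np.array([
        dde_statistic(rng.normal(mu, sigma, size=x.size))
        for _ in range(n_boot)
    ])
    # Center at the bootstrap mean so finite-sample bias is absorbed,
    # then compute a two-sided p-value
    return np.mean(np.abs(t_boot - t_boot.mean()) >= np.abs(t_obs - t_boot.mean()))

rng = np.random.default_rng(0)
x = rng.normal(size=150)
print(bootstrap_pvalue(x))  # p-value for the Normal null
```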

Estimation of Differential Entropy

  • Parametric Component: The MLE-based DE is computed using the plug-in principle, with closed-form expressions or bias-corrected estimators from the fitted parametric family.
  • Nonparametric Component: The KDE-based DE uses a Gaussian kernel, with adaptive bandwidth determined via sample kurtosis and distributional characteristics. Special handling via logarithmic transformation is used for nonnegative-support distributions, mitigating boundary bias.

A notable theoretical contribution is the analytic characterization of bias in both the MLE and KDE entropy estimators, with distribution-specific corrections derived and integrated within the bootstrap calibration. This ensures correct finite-sample sizing without explicit bias correction, as the parametric bootstrap inherently accounts for the empirical bias structure.

Simulation Evidence

Simulations across Normal, Exponential, Gamma, and Laplace nulls, with alternatives designed to probe sensitivity to location, scale, skewness, kurtosis, and smoothness, demonstrate that:

  • The DDE test achieves close control of Type I error for sample sizes as low as n = 50.
  • Power increases rapidly with n against alternatives that induce significant entropy shifts (e.g., heavy tails, pronounced skewness, multimodality).
  • The test is less powerful for pairs of distributions with similar entropy (e.g., Normal vs. Logistic), which is an inherent limitation of entropy-based tests.
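The Normal-vs-Logistic limitation is easy to verify with the closed-form entropies: at matched variance the two families differ in differential entropy by only about 0.014 nats, leaving an entropy-gap statistic almost nothing to detect.

```python
# Worked check of the Normal-vs-Logistic limitation: at unit variance the
# two closed-form differential entropies nearly coincide.
import numpy as np

h_normal = 0.5 * np.log(2 * np.pi * np.e)  # N(0,1): about 1.4189 nats
s = np.sqrt(3) / np.pi                     # logistic scale giving unit variance
h_logistic = np.log(s) + 2                 # Logistic(0, s): about 1.4046 nats
print(h_normal - h_logistic)               # gap of only ~0.014 nats
```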

A key empirical result is that, even in relatively small samples, substantial power is achieved for detecting misfit when the true distribution departs materially from the null family, particularly in the presence of tail or shape misspecification.

Empirical Applications

Three classical datasets illustrate practical relevance:

  1. Old Faithful Geyser Waiting Times: Standard models (Gamma, Lognormal, Normal) are decisively rejected, but the more expressive Generalized Gamma model is not, indicating that added distributional flexibility is needed for this dataset.
  2. Danish Fire Insurance Loss Data: All standard and flexible parametric models, including Generalized Gamma, are rejected, suggesting more complex or nonparametric models are needed for extremal risk modeling.
  3. Translog Cost Function Residuals: DDE test results cast doubt on the Normality assumption typically invoked by CLT arguments, with only weak non-rejection in several cases, emphasizing the method’s diagnostic value for model misspecification in economic modeling contexts.

Theoretical and Practical Implications

The DDE test offers several advantages over classical goodness-of-fit tests:

  • Distributional Omnibus Nature: Unlike moment-based or characteristic function-based tests, the entropy gap captures global differences, with heightened sensitivity to tail and shape features, and less dependence on pointwise discrepancies.
  • Composite Hypotheses: Parameter uncertainty is accommodated directly, maintaining validity for families rather than simple fixed-parameter nulls.
  • Automated, Adaptive Implementation: The bandwidth selection adapts to the empirical distributional shape, and the bootstrapped test statistic inherently adjusts for all sources of bias.

Limitations

  • Local Alternatives: The test can be insensitive to alternatives with similar entropy, even if the distributions differ in other respects (e.g., multimodality vs. unimodality with matched entropy).
  • Curse of Dimensionality: Extension to multivariate settings would bring increased KDE estimation challenges, particularly for adaptive bandwidth selection in high dimensions.

Future Directions

Potential extensions include:

  • Generalization to the multivariate setting and assessment of high-dimensional entropy estimation under the curse of dimensionality.
  • Refinement of the bootstrapping approach for computational efficiency and improved calibration.
  • Exploration of variable-bandwidth or local-KDE methods to improve sensitivity to local departures or multimodal alternatives.
  • Integration of DDE testing in model selection and risk assessment pipelines, particularly for heavy-tailed or complex empirical datasets.

Conclusion

The DDE test constitutes a flexible, information-theoretic framework for robust hypothesis testing of parametric distributional assumptions. By directly contrasting parametric and nonparametric entropy estimates, and calibrating through parametric bootstrapping, it combines theoretical rigor with empirical performance. The test is particularly effective for diagnosing global and tail-based deviations from the null, with constructive interpretive value for both rejections and non-rejections. While less sensitive for entropy-equivalent alternatives, its utility in model validation, selection, and risk analysis in statistics and econometrics is well-supported by both simulation evidence and empirical applications.
