Generalized Fisher-Darmois-Koopman-Pitman Theorem and Rao-Blackwell Type Estimators for Power-Law Distributions (2205.00530v2)
Abstract: This paper generalizes the notion of sufficiency for estimation problems beyond maximum likelihood. In particular, we consider estimation problems based on the Jones et al. and Basu et al. likelihood functions, which are popular among distance-based robust inference methods. We first characterize the probability distributions that always have a fixed number of sufficient statistics (independent of sample size) with respect to these likelihood functions. These distributions are power-law extensions of the usual exponential family and contain Student distributions as a special case. We then extend the notion of a minimal sufficient statistic and compute it for these power-law families. Finally, we establish a Rao-Blackwell-type theorem for finding the best estimators for a power-law family. This helps us establish Cramér-Rao-type lower bounds for power-law families.
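To make the Basu et al. objective mentioned in the abstract concrete, the sketch below (not taken from the paper) shows minimum density power divergence estimation for a univariate normal model: the empirical objective is the integral of f_theta^(1+alpha) minus (1 + 1/alpha) times the sample mean of f_theta(X_i)^alpha, and alpha -> 0 recovers maximum likelihood. The tuning constant alpha = 0.5, the contamination setup, and all variable names are illustrative assumptions.

```python
# Minimal sketch of the Basu et al. (1998) density power divergence (DPD) estimator
# for a normal model, minimized numerically; illustrative only.
import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm

def dpd_objective(params, sample, alpha):
    """Empirical DPD objective:
    integral of f^(1+alpha) dx  -  (1 + 1/alpha) * mean(f(X_i)^alpha)."""
    mu, log_sigma = params
    sigma = np.exp(log_sigma)  # optimize log(sigma) to keep sigma positive
    # For N(mu, sigma^2), the integral of f^(1+alpha) has this closed form.
    integral = (2 * np.pi * sigma**2) ** (-alpha / 2) / np.sqrt(1 + alpha)
    density_term = np.mean(norm.pdf(sample, mu, sigma) ** alpha)
    return integral - (1 + 1 / alpha) * density_term

# Contaminated sample: 90% from N(0, 1), 10% outliers near 8.
rng = np.random.default_rng(0)
sample = np.concatenate([rng.normal(0.0, 1.0, size=450),
                         rng.normal(8.0, 0.5, size=50)])

fit = minimize(dpd_objective, x0=[np.median(sample), 0.0],
               args=(sample, 0.5), method="Nelder-Mead")
mu_hat, sigma_hat = fit.x[0], np.exp(fit.x[1])
print(f"robust DPD estimate: mu={mu_hat:.3f}, sigma={sigma_hat:.3f}")
print(f"sample mean (non-robust): {sample.mean():.3f}")
```

Under this setup, the DPD estimate of mu stays close to 0 while the sample mean is pulled toward the outliers, which is the robustness property the abstract alludes to.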
- E. W. Barankin and A. P. Maitra, “Generalization of the Fisher-Darmois-Koopman-Pitman theorem on sufficient statistics,” Sankhya: The Indian Journal of Statistics, Series A, vol. 25(3), pp. 217–244, 1963.
- A. G. Bashkirov, “On maximum entropy principle, superstatistics, power-law distribution and Rényi parameter,” Phys. A, vol. 340, pp. 153–162, 2004.
- A. Basu, I. R. Harris, N. L. Hjort, and M. C. Jones, “Robust and efficient estimation by minimizing a density power divergence,” Biometrika, vol. 85, pp. 549–559, 1998.
- D. Blackwell, “Conditional expectation and unbiased sequential estimation,” Ann. Math. Statist., vol. 18(1), pp. 105–110, 1947.
- M. Broniatowski, A. Toma, and I. Vajda, “Decomposable pseudo-distances and applications in statistical estimation,” J. Statist. Plann. Inference, vol. 142, pp. 2574–2585, 2012.
- M. Broniatowski and I. Vajda, “Several applications of divergence criteria in continuous families,” Kybernetika (Prague), vol. 48, pp. 600–636, 2012.
- C. Bunte and A. Lapidoth, “Encoding tasks and Rényi entropy,” IEEE Trans. Inform. Theory, vol. 60, pp. 5065–5076, 2014.
- A. Cichocki and S. Amari, “Families of alpha-, beta-, and gamma-divergences: Flexible and robust measures of similarities,” Entropy, vol. 12, pp. 1532–1568, 2010.
- I. Csiszár, “Generalized cutoff rates and Rényi’s information measures,” IEEE Trans. Inform. Theory, vol. 41, pp. 26–34, 1995.
- I. Csiszár and F. Matúš, “Generalized minimizers of convex integral functionals, Bregman distance, Pythagorean identities,” Kybernetika (Prague), vol. 48, pp. 637–689, 2012.
- G. Darmois, “Sur les lois de probabilité à estimation exhaustive,” C. R. Acad. Sci. Paris (in French), vol. 200, pp. 1265–1266, 1935.
- S. Eguchi, O. Komori, and S. Kato, “Projective power entropy and maximum Tsallis entropy distributions,” Entropy, vol. 13, pp. 1746–1764, 2011.
- C. Field and B. Smith, “Robust estimation: A weighted maximum likelihood approach,” Int. Stat. Rev., vol. 62, pp. 405–424, 1994.
- R. A. Fisher, “On the mathematical foundations of theoretical statistics,” Philos. Trans. Roy. Soc. A., vol. 222, pp. 309–368, 1922.
- ——, “Two new properties of mathematical likelihood,” Proceedings of the Royal Society, Series A, vol. 144, pp. 285–307, 1934.
- H. Fujisawa, “Normalized estimating equation for robust parameter estimation,” Electron. J. Stat., vol. 7, pp. 1587–1606, 2013.
- H. Fujisawa and S. Eguchi, “Robust parameter estimation with a small bias against heavy contamination,” J. Multivariate Anal., vol. 99, pp. 2053–2081, 2008.
- A. Gayen and M. A. Kumar, “Generalized estimating equation for the Student-t distributions,” in 2018 IEEE International Symposium on Information Theory (ISIT), 2018, pp. 571–575.
- ——, “A generalized notion of sufficiency for power-law distributions,” in 2021 IEEE International Symposium on Information Theory (ISIT), 2021, pp. 2185–2190.
- ——, “Projection theorems and estimating equations for power-law models,” Journal of Multivariate Analysis, vol. 184, p. 104734, 2021.
- P. R. Halmos and L. J. Savage, “Application of the Radon-Nikodym theorem to the theory of sufficient statistics,” The Annals of Mathematical Statistics, vol. 20, pp. 225–241, 1949.
- H. G. Hoang, B. Vo, B. T. Vo, and R. Mahler, “The Cauchy–Schwarz divergence for Poisson point processes,” IEEE Transactions on Information Theory, vol. 61, pp. 4475–4485, 2015.
- R. Jenssen, J. C. Principe, D. Erdogmus, and T. Eltoft, “The Cauchy–Schwarz divergence and Parzen windowing: Connections to graph theory and Mercer kernels,” Journal of the Franklin Institute, vol. 343, pp. 614–629, 2006.
- M. C. Jones, N. L. Hjort, I. R. Harris, and A. Basu, “A comparison of related density based minimum divergence estimators,” Biometrika, vol. 88, pp. 865–873, 2001.
- K. Kampa, E. Hasanbelliu, and J. C. Principe, “Closed-form Cauchy-Schwarz pdf divergence for mixture of Gaussians,” Proceedings of International Joint Conference on Neural Networks, San Jose, California, USA, pp. 2578–2585, 2011.
- B. O. Koopman, “On distributions admitting a sufficient statistic,” Trans. Amer. Math. Soc., vol. 39, pp. 399–409, 1936.
- M. A. Kumar and R. Sundaresan, “Minimization problems based on relative α-entropy I: Forward projection,” IEEE Trans. Inform. Theory, vol. 61, pp. 5063–5080, 2015.
- ——, “Minimization problems based on relative α-entropy II: Reverse projection,” IEEE Trans. Inform. Theory, vol. 61, pp. 5081–5095, 2015.
- E. Lutwak, D. Yang, and G. Zhang, “Cramér-Rao and moment-entropy inequalities for Rényi entropy and generalized Fisher information,” IEEE Trans. Inform. Theory, vol. 51, pp. 473–478, 2005.
- A. Maji, A. Ghosh, and A. Basu, “The logarithmic super divergence and asymptotic inference properties,” AStA Adv. Stat. Anal, vol. 100, pp. 99–131, 2016.
- J. Naudts, “Estimators, escort probabilities, and ϕ-exponential families in statistical physics,” J. Inequal. Pure Appl. Math., vol. 5, p. 102, 2004.
- J. Neyman, “Su un teorema concernente le cosiddette statistiche sufficienti,” Giorn. Ist. Ital. Attuari (in Italian), vol. 6, pp. 320–334, 1935.
- F. Nielsen, K. Sun, and S. Marchand-Maillet, “k-means clustering with Hölder divergences,” Geometric Science of Information, pp. 856–863, 2017.
- ——, “On Hölder projective divergences,” Entropy, vol. 19, pp. 1–28, 2017.
- E. J. G. Pitman, “Sufficient statistics and intrinsic accuracy,” Mathematical Proceedings of the Cambridge Phil. Soc., vol. 32, pp. 567–579, 1936.
- C. R. Rao, “Information and accuracy attainable in the estimation of statistical parameters,” Bulletin of the Calcutta Mathematical Society, vol. 37(3), pp. 81–91, 1945.
- ——, “Minimum variance and the estimation of several parameters,” Mathematical Proceedings of the Cambridge Philosophical Society, vol. 43, pp. 280–283, 1947.
- ——, “Sufficient statistics and minimum variance estimates,” Mathematical Proceedings of the Cambridge Philosophical Society, vol. 45(2), pp. 213–218, 1949.
- Z. Shanqing, Z. Kunlong, and X. Weibin, “Application of Cauchy-Schwarz divergence in image segmentation,” Computer Engineering and Applications, vol. 49, pp. 129–131, 2013.
- R. Sundaresan, “A measure of discrimination and its geometric properties,” in 2002 IEEE International Symposium on Information Theory (ISIT), 2002, p. 264.
- ——, “Guessing under source uncertainty,” IEEE Trans. Inform. Theory, vol. 53, pp. 269–287, 2007.
- M. P. Windham, “Robustifying model fitting,” J. R. Stat. Soc. Ser. B. Stat. Methodol., vol. 57, pp. 599–609, 1995.