Optimal sub-Gaussian variance proxy for truncated Gaussian and exponential random variables (2403.08628v2)
Abstract: This paper establishes the optimal sub-Gaussian variance proxy for truncated Gaussian and truncated exponential random variables. The proofs proceed by first characterizing the optimal variance proxy as the unique solution to a system of two equations, and then observing that, for these two truncated distributions, this system admits explicit solutions. Moreover, we establish the conditions under which the optimal variance proxy coincides with the variance, thereby characterizing the strict sub-Gaussianity of the truncated random variables. Specifically, we show that truncated Gaussian variables are strictly sub-Gaussian if and only if the truncation is symmetric with respect to the mean. In contrast, truncated exponential variables are never strictly sub-Gaussian. These findings sharpen our understanding of two distributions prevalent in statistics and machine learning, providing a foundation for improved modeling and decision-making.
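The optimal variance proxy discussed in the abstract is, by definition, the supremum of 2 log E[exp(λ(X − μ))]/λ² over λ ≠ 0. While the paper derives closed-form solutions, the two qualitative claims (symmetric truncated Gaussians attain proxy = variance; truncated exponentials do not) can be checked numerically. The sketch below is an illustrative approximation using trapezoidal quadrature on a grid of λ values; all function names are ours, not the paper's.

```python
import numpy as np

def _trapz(y, x):
    # Trapezoidal rule (written out to avoid NumPy version differences).
    return float(np.sum((y[1:] + y[:-1]) * np.diff(x)) / 2.0)

def variance_and_optimal_proxy(pdf, a, b, n_x=4001):
    """Numerically estimate the variance and the optimal sub-Gaussian
    variance proxy sup_{lam != 0} 2 log E[exp(lam (X - mu))] / lam^2
    for a bounded random variable with (unnormalized) density `pdf` on [a, b].
    """
    x = np.linspace(a, b, n_x)
    w = pdf(x)
    w = w / _trapz(w, x)                 # normalize the truncated density
    mu = _trapz(x * w, x)
    var = _trapz((x - mu) ** 2 * w, x)
    # Grid of positive and negative lambda values away from 0.
    lams = np.concatenate([-np.logspace(-2, 2, 400)[::-1],
                           np.logspace(-2, 2, 400)])
    best = -np.inf
    for lam in lams:
        mgf = _trapz(np.exp(lam * (x - mu)) * w, x)
        best = max(best, 2.0 * np.log(mgf) / lam ** 2)
    return var, best

# Standard Gaussian truncated symmetrically to [-1, 1]:
# strict sub-Gaussianity predicts proxy ~= variance.
var_g, proxy_g = variance_and_optimal_proxy(lambda x: np.exp(-x**2 / 2), -1.0, 1.0)

# Exp(1) truncated to [0, 2]: the proxy should strictly exceed the variance.
var_e, proxy_e = variance_and_optimal_proxy(lambda x: np.exp(-x), 0.0, 2.0)
```

Since the supremum defining the proxy is approached as λ → 0 in the strictly sub-Gaussian case, the λ-grid must include values close to zero for the symmetric Gaussian example to recover proxy ≈ variance.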