
Entropy and Learning of Lipschitz Functions under Log-Concave Measures (2509.10355v1)

Published 12 Sep 2025 in math.PR and math.FA

Abstract: We study regression of $1$-Lipschitz functions under a log-concave measure $\mu$ on $\mathbb{R}^d$. We focus on the high-dimensional regime where the sample size $n$ is subexponential in $d$, in which distribution-free estimators are ineffective. We analyze two polynomial-based procedures: the projection estimator, which relies on knowledge of an orthogonal polynomial basis of $\mu$, and the least-squares estimator over low-degree polynomials, which requires no knowledge of $\mu$ whatsoever. Their risk is governed by the rate of polynomial approximation of Lipschitz functions in $L^2(\mu)$. When this rate matches the Gaussian one, we show that both estimators achieve minimax bounds over a wide range of parameters. A key ingredient is sharp entropy estimates for the class of $1$-Lipschitz functions in $L^2(\mu)$, which are new even in the Gaussian setting.

Summary

  • The paper presents two polynomial-based estimators—projection and least-squares—that leverage low-degree polynomial approximations to efficiently learn 1-Lipschitz functions.
  • It provides sharp upper and lower bounds on the $L^2(\mu)$ risk and on the metric entropy, showing that subexponential sample sizes suffice for high-dimensional regression.
  • The study demonstrates estimator robustness even under unknown log-concave measures, offering practical insights and computationally tractable methods for high-dimensional settings.

Entropy and Learning of Lipschitz Functions under Log-Concave Measures

Problem Setting and Motivation

This work addresses the regression problem for $1$-Lipschitz functions $f: \mathbb{R}^d \to \mathbb{R}$ under a log-concave measure $\mu$ in high dimensions, with a focus on the subexponential sample regime ($n \ll \exp(d)$). The central challenge is to construct estimators $\hat{f}$ for $f$ from noisy samples $(X_i, Y_i)$, where $Y_i = f(X_i) + \xi_i$ and the $\xi_i$ are i.i.d. Gaussian noise, and to analyze the minimax risk in $L^2(\mu)$. The paper is motivated by the inadequacy of distribution-free estimators in this regime and the need for procedures that exploit the structure of log-concave measures and the regularity of the function class.
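For reference, the minimax risk discussed below (denoted $\mathcal{R}^*_{n,d}$) is the standard quantity; the following formalization is consistent with this setup rather than a verbatim definition from the paper:

$$\mathcal{R}^*_{n,d} = \inf_{\hat{f}}\ \sup_{f \,:\, \|f\|_{\mathrm{Lip}} \leq 1}\ \mathbb{E}\,\|\hat{f} - f\|_{L^2(\mu)}^2,$$

where the infimum runs over all measurable estimators $\hat{f} = \hat{f}((X_1, Y_1), \ldots, (X_n, Y_n))$ and the expectation is over both the samples and the noise.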

Polynomial-Based Estimation Procedures

Two polynomial-based estimators are analyzed:

  1. Projection Estimator: Assumes knowledge of an orthonormal polynomial basis for $L^2(\mu)$. The estimator projects the empirical data onto the space of polynomials of degree at most $m$, with coefficients estimated from the data. For the Gaussian case, this corresponds to Hermite polynomials; for general log-concave $\mu$, any orthonormal polynomial basis suffices.
  2. Least-Squares Estimator: Does not require knowledge of $\mu$. It selects the polynomial of degree at most $m$ that minimizes the empirical squared error over the observed data.

The performance of both estimators is governed by the rate of $L^2(\mu)$-approximation of $1$-Lipschitz functions by low-degree polynomials, denoted $\Psi_\mu(m)$. For the Gaussian measure, $\Psi_\gamma(m) \lesssim 1/\sqrt{m}$ (equivalently, $\Psi_\gamma^2(m) \lesssim 1/m$); for general log-concave measures, the rate may be slower but is always subpolynomial.
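For orientation, the bound $\Psi_\gamma(m) \leq (m+1)^{-1/2}$ already follows from a one-line spectral argument (a standard fact included here as a sanity check; the paper's sharper statements require finer analysis). Writing a $1$-Lipschitz $f$ in the orthonormal Hermite basis, $f = \sum_\alpha \hat{f}_\alpha H_\alpha$, and letting $\Pi_m$ denote projection onto degree at most $m$,

$$\|f - \Pi_m f\|_{L^2(\gamma)}^2 = \sum_{|\alpha| > m} \hat{f}_\alpha^2 \leq \frac{1}{m+1} \sum_{\alpha} |\alpha|\, \hat{f}_\alpha^2 = \frac{1}{m+1}\, \mathbb{E}_\gamma |\nabla f|^2 \leq \frac{1}{m+1},$$

where the middle identity is the Gaussian (Ornstein-Uhlenbeck) Dirichlet form identity.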

Implementation Details

  • Projection Estimator: For known $\mu$, compute empirical averages of the basis polynomials against the data, with a variance reduction step for the mean coefficient. The estimator is

$$\hat{f} = \sum_{|\alpha| \leq m} \hat{f}_\alpha P_\alpha,$$

where the $\hat{f}_\alpha$ are empirical averages as specified in the paper.

  • Least-Squares Estimator: For unknown $\mu$, solve the empirical risk minimization

$$\hat{f}_{LS} = \arg\min_{\deg(P) \leq m} \sum_{i=1}^n (P(X_i) - Y_i)^2.$$

This is an unconstrained quadratic minimization (ordinary least squares) in the coefficients of the polynomial basis.

  • Choice of Degree $m$: The optimal $m$ balances the approximation error $\Psi_\mu(m)$ against the estimation error, which scales with the dimension $D = \binom{d+m}{m}$ of the polynomial space relative to the sample size $n$.
  • Computational Considerations: For moderate $m$ and large $d$, $D$ can be large, but for $m = O(\log n / \log d)$, $D$ remains subexponential in $d$ when $n$ is subexponential in $d$ (see the sketch below).
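To make the procedure concrete, here is a minimal NumPy sketch of the least-squares estimator over polynomials of total degree at most $m$. This is an illustration under the setup above, not the paper's implementation: all function names are chosen here, and a plain monomial basis is used since it spans the same space as any orthonormal basis.

```python
import itertools
import math

import numpy as np

def multi_indices(d, m):
    """All multi-indices alpha in N^d with total degree |alpha| <= m."""
    idx = []
    for total in range(m + 1):
        for combo in itertools.combinations_with_replacement(range(d), total):
            alpha = np.zeros(d, dtype=int)
            for j in combo:
                alpha[j] += 1
            idx.append(alpha)
    return np.array(idx)  # shape (D, d), with D = C(d + m, m)

def monomial_features(X, alphas):
    """Design matrix Phi[i, k] = prod_j X[i, j] ** alphas[k, j]."""
    return np.prod(X[:, None, :] ** alphas[None, :, :], axis=2)

def least_squares_estimator(X, Y, m):
    """Empirical risk minimizer over polynomials of total degree <= m."""
    alphas = multi_indices(X.shape[1], m)
    coef, *_ = np.linalg.lstsq(monomial_features(X, alphas), Y, rcond=None)
    return lambda Z: monomial_features(np.atleast_2d(Z), alphas) @ coef

# Toy illustration: standard Gaussian design, f(x) = ||x||_2 is 1-Lipschitz.
rng = np.random.default_rng(0)
d, n, m = 5, 2000, 3
print("polynomial-space dimension D =", math.comb(d + m, m))  # D = 56
X = rng.standard_normal((n, d))
f = lambda Z: np.linalg.norm(Z, axis=1)
Y = f(X) + 0.1 * rng.standard_normal(n)
f_hat = least_squares_estimator(X, Y, m)
X_test = rng.standard_normal((20_000, d))
print("held-out squared L2(mu) error:", np.mean((f_hat(X_test) - f(X_test)) ** 2))
```

Increasing $m$ shrinks the approximation error $\Psi_\mu(m)$ but inflates $D = \binom{d+m}{m}$, which is the trade-off behind keeping $m$ of order $\log n / \log d$.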

Main Theoretical Results

Upper Bounds

  • Projection Estimator: For $n$ in the range $d^5 \leq n \leq e^{\sqrt{d} \log d}$, with $m = \lfloor \log n / \log d \rfloor - 4$, the risk satisfies

$$\mathbb{E} \|f - \hat{f}\|_{L^2(\mu)}^2 \leq \Psi_\mu^2(m) + O(1/d).$$

For larger $n$, the additive error term decays as $O(1/m^2)$.

  • Least-Squares Estimator: Achieves comparable risk bounds over a slightly smaller range of $n$, with an additional logarithmic factor in the estimation error due to the lack of knowledge of $\mu$.
  • Gaussian Case: For $\mu = \gamma_d$, both estimators achieve the minimax rate, with $\Psi_\gamma^2(m) \lesssim 1/m$; a sketch of the projection estimator in this case follows below.
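To make the Gaussian case concrete, here is a minimal sketch of the projection estimator using normalized probabilists' Hermite polynomials as the orthonormal basis. This is an illustrative reconstruction: it omits the paper's variance-reduction step for the mean coefficient, and the helper names are chosen here.

```python
import itertools
import math

import numpy as np
from numpy.polynomial import hermite_e  # probabilists' Hermite polynomials He_k

def hermite_column(x, k):
    """Normalized He_k(x) / sqrt(k!), orthonormal in L2 of the standard Gaussian."""
    c = np.zeros(k + 1)
    c[k] = 1.0
    return hermite_e.hermeval(x, c) / math.sqrt(math.factorial(k))

def hermite_design(X, alphas):
    """Design matrix Phi[i, a] = prod_j He_{alpha[a][j]}(X[i, j]) / sqrt(alpha[a][j]!)."""
    Phi = np.ones((X.shape[0], len(alphas)))
    for a, alpha in enumerate(alphas):
        for j, k in enumerate(alpha):
            if k > 0:
                Phi[:, a] *= hermite_column(X[:, j], k)
    return Phi

def projection_estimator(X, Y, m):
    """hat f = sum_{|alpha| <= m} hat f_alpha H_alpha, with hat f_alpha = mean(Y_i * H_alpha(X_i))."""
    d = X.shape[1]
    # Naive enumeration of multi-indices of total degree <= m; fine for small d.
    alphas = [a for a in itertools.product(range(m + 1), repeat=d) if sum(a) <= m]
    coefs = hermite_design(X, alphas).T @ Y / len(Y)  # empirical Hermite coefficients
    return lambda Z: hermite_design(np.atleast_2d(Z), alphas) @ coefs
```

Because the normalized Hermite basis is orthonormal for $\gamma_d$ and the noise is mean-zero and independent of $X$, each empirical average $\frac{1}{n}\sum_i Y_i H_\alpha(X_i)$ is an unbiased estimate of the corresponding Hermite coefficient of $f$.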

Lower Bounds and Metric Entropy

  • Metric Entropy of Lipschitz Functions: The paper establishes sharp lower bounds for the metric entropy $H_L^\mu(\varepsilon)$ of the class of $1$-Lipschitz functions in $L^2(\mu)$, new even in the Gaussian case. For isotropic product log-concave measures,

$$H_L^\mu(\varepsilon) \gtrsim \binom{d}{\lfloor c/\varepsilon^2 \rfloor}$$

for $\varepsilon \gg d^{-1/4}$.

  • Minimax Lower Bound: Using Fano's method and the entropy estimates, the minimax risk for learning $1$-Lipschitz functions is lower bounded by

$$\mathcal{R}^*_{n,d} \gtrsim \frac{\log d}{\log n}$$

for $n$ up to $e^{c d^{2\eta} \log d}$ in the general log-concave case, or $e^{c \sqrt{d} \log d}$ in the product case.

  • Matching Upper and Lower Bounds: For measures with $\Psi_\mu^2(m) \lesssim 1/m$ (e.g., the Gaussian measure, or the uniform measure on the hypercube), the projection and least-squares estimators achieve the minimax rate in the specified sample regimes; a heuristic for how this rate arises is sketched below.
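A back-of-the-envelope way to see where this rate comes from (a heuristic reconstruction, not the paper's proof): Fano's method gives a squared-risk lower bound of order $\varepsilon^2$ as long as the metric entropy at scale $\varepsilon$ dominates the information carried by the $n$ noisy samples. Since $H_L^\mu(\varepsilon) \gtrsim \binom{d}{\lfloor c/\varepsilon^2 \rfloor} \approx d^{\Theta(1/\varepsilon^2)}$, this remains the case while $n \lesssim d^{c'/\varepsilon^2}$, i.e., down to the scale

$$\varepsilon^2 \asymp \frac{\log d}{\log n},$$

which matches the stated bound.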

Contrasts with Classical Nonparametric Regression

  • Curse of Dimensionality: Classical nonparametric estimators (e.g., nearest neighbors) require $n \gtrsim \exp(d)$ samples for nontrivial risk, whereas the polynomial-based estimators here achieve minimax rates for $n$ subexponential in $d$.
  • Sample Complexity: To achieve $L^2$ error $\varepsilon$, it suffices to take $n \simeq d^{c/\varepsilon^2}$ samples in the Gaussian case; a rough count is given below.
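The exponent $c/\varepsilon^2$ can be read off from the degree choice (a rough count consistent with the bounds above, not a quotation from the paper): reaching $L^2$ error $\varepsilon$ requires degree $m \gtrsim c/\varepsilon^2$, since $\Psi_\gamma(m) \lesssim 1/\sqrt{m}$, and fitting the resulting coefficient vector of dimension $D = \binom{d+m}{m} \approx d^m$ takes on the order of $D$ samples, which gives $n \approx d^{c/\varepsilon^2}$.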

Technical Innovations

  • Sharp Entropy Bounds: The paper provides new lower bounds for the metric entropy of $1$-Lipschitz functions under isotropic log-concave measures, using random polynomial constructions and properties of the Langevin semigroup.
  • Polynomial Approximation Rates: The analysis leverages tensorization and concentration properties of log-concave measures to obtain dimension-dependent rates for polynomial approximation of Lipschitz functions.
  • Robustness to Unknown Measure: The least-squares estimator is shown to be nearly minimax optimal even without knowledge of $\mu$, provided $\mu$ is log-concave and isotropic.

Implications and Future Directions

Practical Implications

  • High-Dimensional Regression: The results provide a principled approach for regression of Lipschitz functions in high dimensions under log-concave distributions, with provable guarantees in regimes where classical methods fail.
  • Algorithmic Simplicity: Both estimators are computationally tractable for moderate $m$ and can be implemented using standard linear algebra routines.
  • Robustness: The least-squares estimator is applicable without knowledge of the underlying measure, making it suitable for practical scenarios with unknown or complex distributions.

Theoretical Implications

  • Entropy of Function Classes: The entropy bounds for Lipschitz functions under log-concave measures are new and may have further applications in empirical process theory and statistical learning.
  • Extension to Other Function Classes: The techniques may be adapted to other regularity classes (e.g., Sobolev, bounded variation) and to other structured measures.
  • Connections to Isoperimetry and Concentration: The results highlight the interplay between functional inequalities (Poincaré, log-Sobolev), polynomial approximation, and statistical estimation.

Future Developments

  • Sharper Approximation Rates: Further work may refine the polynomial approximation rates for specific log-concave measures, especially in non-product cases.
  • Beyond Lipschitz Functions: Extending the analysis to broader function classes or to settings with weaker regularity assumptions.
  • Adaptive Procedures: Developing estimators that adapt to unknown smoothness or intrinsic dimension.
  • Applications to Active Learning and Experimental Design: Leveraging the entropy and approximation results for optimal sampling strategies in high dimensions.

Conclusion

This paper rigorously characterizes the statistical and metric complexity of learning $1$-Lipschitz functions under log-concave measures in high dimensions. By analyzing polynomial-based estimators and establishing sharp entropy bounds, it demonstrates that minimax-optimal rates are achievable in the subexponential sample regime, in stark contrast to classical nonparametric methods. The results have significant implications for high-dimensional statistics, learning theory, and the analysis of function spaces under structured measures, and open several avenues for further research in both theory and practice.
