
Real Log Canonical Threshold

Updated 27 November 2025
  • Real Log Canonical Threshold is a birational and analytic invariant that quantifies the singular behavior of real-analytic functions and algebraic structures.
  • It plays a pivotal role in singular learning theory by governing asymptotic behaviors of marginal likelihood and generalization error in Bayesian models.
  • Computational methods such as blow-up algorithms, combinatorial approaches, and Monte Carlo estimators enable practical estimation of RLCT in complex models.

The real log canonical threshold (RLCT) is a birational and analytic invariant measuring the singularity of real-analytic and algebraic structures, and plays a central role in both real algebraic geometry and singular learning theory. In the context of statistics and machine learning, the RLCT (also called the learning coefficient) governs the leading behavior of marginal likelihood and generalization error in singular models, allowing for refined asymptotic model comparison well beyond classical regular cases.

1. Analytic and Geometric Definition

Given a real-analytic function $f \colon \mathbb{R}^d \to \mathbb{R}$, the RLCT, denoted $\lambda_R(f)$, is defined via the integrability of $|f|^{-p}$ near its zero locus, or equivalently through resolution of singularities:

$$\lambda_R(f) = \inf_{P,\,i:\ a_i>0} \frac{b_i+1}{a_i}$$

where, for a real log resolution $\sigma: U \to \mathbb{R}^d$, locally $f\circ \sigma(u) = \eta(u)\, u_1^{a_1}\cdots u_d^{a_d}$ and $\det D\sigma(u) = \eta'(u)\, u_1^{b_1}\cdots u_d^{b_d}$ for non-vanishing analytic functions $\eta, \eta'$ and $a_i, b_i \in \mathbb{N}$ for $i=1,\ldots,d$ (Kosta et al., 20 Nov 2024).

The multiplicity $m_R(f)$ is defined as the maximal number of exponents $(b_i+1)/a_i$ achieving the minimum at any point. The pair $(\lambda_R, m_R)$ fully characterizes the leading singular behavior of analytic volume or zeta integrals around the singular locus:

$$V(\epsilon) = \int_{|f(w)| \leq \epsilon} \varphi(w)\, dw \sim C\, \epsilon^{\lambda_R}(-\ln \epsilon)^{m_R-1}$$

as $\epsilon \to 0$, for compact $W\subset\mathbb{R}^d$ and analytic $\varphi>0$ (Kosta et al., 20 Nov 2024).
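As a minimal sketch of how the pair is read off a single normal-crossing chart, the helper below takes the exponent vectors $a$ and $b$ and returns the smallest ratio $(b_i+1)/a_i$ together with the number of indices attaining it. The name `chart_rlct` is hypothetical and not taken from the cited paper, and the function handles one chart only; the invariant itself requires the infimum over all charts of the resolution.

```python
from fractions import Fraction

def chart_rlct(a, b):
    """Return (lambda, multiplicity) for one chart where
    f∘σ = η·u^a and det Dσ = η'·u^b (hypothetical helper)."""
    # Only coordinates with a_i > 0 contribute to the threshold.
    ratios = [Fraction(bi + 1, ai) for ai, bi in zip(a, b) if ai > 0]
    if not ratios:
        raise ValueError("f does not vanish in this chart (all a_i are 0)")
    lam = min(ratios)
    # Multiplicity: how many coordinates attain the minimum ratio.
    return lam, ratios.count(lam)

# f(x, y) = x*y is already a monomial, so the identity map serves as
# the resolution: a = (1, 1), b = (0, 0).
lam, mult = chart_rlct((1, 1), (0, 0))  # (Fraction(1, 1), 2)
```

For $f(x,y) = xy$ on the unit square with $\varphi \equiv 1$, direct integration gives $V(\epsilon) = \epsilon(1 - \ln\epsilon)$, which matches $\epsilon^{1}(-\ln\epsilon)^{2-1}$ to leading order.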

2. RLCT in Complex and Real Settings

The RLCT is the real counterpart of the log canonical threshold (lct) defined in the complex algebraic context, which is often calculated from complex resolutions of singularities. For $f\in\mathbb{R}[x_1, \ldots, x_d]$, the complex lct $\lambda_C(f)$ is calculated via divisorial data from a complex log resolution $\rho: X \to \mathbb{C}^d$:

$$\lambda_C(f) = \min_{1 \leq i \leq m} \frac{b_i+1}{a_i}$$

where $b_i, a_i$ are the discrepancy and multiplicity coefficients along the exceptional divisors $E_i$ (Kosta et al., 20 Nov 2024).

When the log resolution is defined over $\mathbb{R}$ and each exceptional divisor $E_i$ contains real points, $\lambda_R(f) = \lambda_C(f)$, with the multiplicities also coinciding under mild conditions (Kosta et al., 20 Nov 2024).

3. RLCT in Singular Learning Theory

The RLCT, also called the learning coefficient, is the key invariant dictating the large-sample asymptotics of both the Bayesian evidence and the generalization error in singular or non-identifiable statistical models. For a statistical model $p(x|w)$ and true data distribution $q(x)$, define the Kullback-Leibler divergence as a function of the parameter:

$$K(w) = \int q(x) \log\frac{q(x)}{p(x|w)}\, dx$$

The pair $(\lambda, m)$ is then determined by the largest pole $-\lambda$ (of order $m$) of the zeta function $\zeta(z) = \int_\Omega K(w)^z \varphi(w)\, dw$, where $\varphi$ is the prior density on the parameter set $\Omega$ (Imai, 2019, Hirose, 2023).
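As a quick check of the zeta-function definition, consider the one-parameter toy case $K(w) = w^2$ with uniform prior $\varphi \equiv 1$ on $\Omega = [0,1]$. This is an illustrative choice, not an example taken from the cited papers. The integral can be computed in closed form:

```latex
\zeta(z) = \int_0^1 (w^2)^z \, dw
         = \int_0^1 w^{2z} \, dw
         = \frac{1}{2z+1},
\qquad \operatorname{Re} z > -\tfrac{1}{2}.
```

The meromorphic continuation has a single simple pole at $z = -1/2$, so $\lambda = 1/2$ and $m = 1$. Replacing $w^2$ by $w^{2k}$ moves the pole to $z = -1/(2k)$, which gives the smaller value $\lambda = 1/(2k)$ for a more degenerate minimum.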

For data of size $n$, the negative log marginal likelihood (free energy) asymptotically satisfies (Watanabe's Main Formula II):

$$-\log L(n) = n L_0 + \lambda \log n - (m-1)\log \log n + O_p(1)$$

In regular cases $\lambda = d/2$ and $m = 1$; for singular models, $\lambda$ is typically smaller, yielding weaker Bayesian Occam penalties (Imai, 2019, Hirose, 2023).
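To make the size of the complexity term concrete, the sketch below evaluates $\lambda \log n - (m-1)\log\log n$ and compares it with the BIC term $(d/2)\log n$, which is the same expression with $\lambda = d/2$ and $m = 1$. The function names and the numerical values ($d = 4$, $\lambda = 1$, $m = 2$) are assumptions chosen for illustration only.

```python
import math

def singular_penalty(n, lam, m=1):
    """Complexity term lam*log(n) - (m-1)*log(log(n)) of the free energy."""
    if n <= 1:
        raise ValueError("n must exceed 1 so that log(log(n)) is defined")
    return lam * math.log(n) - (m - 1) * math.log(math.log(n))

def bic_penalty(n, d):
    """Regular-model term (d/2)*log(n), i.e. lam = d/2 and m = 1."""
    return 0.5 * d * math.log(n)

n = 1000
regular = bic_penalty(n, d=4)                  # about 13.8
singular = singular_penalty(n, lam=1.0, m=2)   # about 5.0 (hypothetical lam, m)
```

With these illustrative numbers the singular penalty is well under half the BIC penalty, which is the sense in which a smaller $\lambda$ weakens the Occam penalty.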

4. Combinatorial Formulas: Hyperplane Arrangements

For $f(x) = \prod_{j=1}^n L_j(x)^{s_j}$ with $L_j$ linear and $s_j > 0$ (not necessarily reduced), let $\mathcal{B} = \{H_j = \{L_j=0\}\}$ be the arrangement and let $L(\mathcal{B})$ be its intersection poset (all proper nontrivial intersections $W \neq \mathbb{R}^d$). For each $W \in L(\mathcal{B})$ define:

  • $\mathrm{codim}\, W = d - \dim W$
  • $s(W) = \sum_{j:\ L_j|_W \equiv 0} s_j$

Explicitly:

$$\lambda_R(f) = \min_{W \in L(\mathcal{B})} \frac{\mathrm{codim}\, W}{s(W)}$$

$$m_R(f) = \max_{\text{chains } W_0 \subsetneq \ldots \subsetneq W_{d-1}} \left|\{i : \mathrm{codim}\, W_i/s(W_i) = \lambda_R(f)\}\right|$$

If $f$ has real coefficients, $\lambda_R(f) = \lambda_C(f)$ and $m_R(f) = m_C(f)$ (Kosta et al., 20 Nov 2024).

Examples:

  • In $\mathbb{R}^2$, for $f=\prod_{i=1}^n L_i^{s_i}$ ($s_1 \leq \cdots \leq s_n$), $\lambda_R = \min\{1/s_n,\ 2/\sum_i s_i\}$.
  • For $f=x\,y^2\,z^2\,(x+y+z)$ in $\mathbb{R}^3$, $\lambda_R = 1/2$, $m_R = 3$ (Kosta et al., 20 Nov 2024).
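The combinatorial formulas can be checked by brute force on small examples. The sketch below assumes a central arrangement (every hyperplane passes through the origin, so each $L_j$ is given by a normal vector) and integer exponents. The function names are hypothetical, and the enumeration over all subsets is exponential in the number of hyperplanes, so this is an illustration rather than the cited paper's algorithm.

```python
from fractions import Fraction
from itertools import combinations

def rank(rows):
    """Exact rank of a list of rational row vectors (Gaussian elimination)."""
    m = [[Fraction(x) for x in r] for r in rows]
    rk = 0
    for col in range(len(m[0]) if m else 0):
        pivot = next((i for i in range(rk, len(m)) if m[i][col] != 0), None)
        if pivot is None:
            continue
        m[rk], m[pivot] = m[pivot], m[rk]
        for i in range(rk + 1, len(m)):
            factor = m[i][col] / m[rk][col]
            m[i] = [x - factor * y for x, y in zip(m[i], m[rk])]
        rk += 1
    return rk

def rlct_central_arrangement(normals, exponents):
    """Return (lambda_R, m_R) for f = prod_j L_j^{s_j}, L_j = <normal_j, x>."""
    n = len(normals)
    flats = {}  # set of hyperplanes containing the flat W -> (codim W, s(W))
    for k in range(1, n + 1):
        for subset in combinations(range(n), k):
            rows = [normals[j] for j in subset]
            codim = rank(rows)
            # L_j vanishes on W exactly when its normal lies in the row span.
            closure = frozenset(j for j in range(n)
                                if rank(rows + [normals[j]]) == codim)
            flats[closure] = (codim, sum(exponents[j] for j in closure))
    ratios = {c: Fraction(codim, s) for c, (codim, s) in flats.items()}
    lam = min(ratios.values())
    # Longest chain among flats attaining the minimum; nested flats
    # correspond to nested hyperplane sets, ordered here by size.
    critical = sorted((c for c, r in ratios.items() if r == lam), key=len)
    longest = {}
    for c in critical:
        longest[c] = 1 + max((longest[p] for p in longest if p < c), default=0)
    return lam, max(longest.values())

# f = x * y^2 * z^2 * (x + y + z) in R^3
lam, mult = rlct_central_arrangement(
    [(1, 0, 0), (0, 1, 0), (0, 0, 1), (1, 1, 1)], [1, 2, 2, 1]
)  # (Fraction(1, 2), 3)
```

In this example the minimum $1/2$ is attained along the chain $\{0\} \subsetneq \{y=z=0\} \subsetneq \{y=0\}$, which accounts for the multiplicity 3.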

5. Algorithmic and Blow-up Approaches

Computation of RLCTs for general polynomial models hinges on real-analytic resolution of singularities. For sum-of-products (sop) polynomials, dedicated blow-up algorithms, which iteratively replace singular charts via coordinate blow-ups, yield normal or locally normal crossing forms (Hirose, 2023). For binomials $f = w^p + w^q$ (with $g(w)=w^{s-1}$), critical poles can be computed explicitly:

$$1/\lambda = \max\left\{ \max_i \{\min(\rho_i^l, \rho_i^r)\},\ \max_{i \neq j} \frac{\nu_i^l \nu_j^r - \rho_i^l \rho_j^r}{\nu_i^l + \nu_j^r - \rho_i^l - \rho_j^r} \right\}$$

where $\rho_i^l = p_i/s_i$, $\rho_i^r = q_i/s_i$, $\nu_i^l = (p_i + r_i)/s_i$, etc. For more than two terms ($n>2$), a linear programming bound over the simplex $\Delta$ gives $1/\lambda \ge \max_{\alpha \in \Delta} \min_j \sum_i \alpha_i (a_{ij}/s_i)$ (Hirose, 2023).

For hyperplane arrangements, the combinatorial algorithm computes all intersection flats, their codimensions and $s(W)$, and determines $(\lambda_R, m_R)$ through chains in the inclusion poset (Kosta et al., 20 Nov 2024).

6. RLCT at Non-Singular Points and Upper Bounds

At non-singular points of the true parameter manifold in statistical models, the RLCT is immediately computable. Under analytic assumptions, and after splitting the parameter into $(\theta, \tau)$, conditions on feasibility, independence, and vanishing order yield

$$\lambda_0 = \frac{d_1 - r + r m}{2m}$$

where $d_1$ is the effective parameter dimension, $r$ counts the directions with quadratic expansion, and $m$ is the minimal order for the higher-order directions (Kurumadani, 23 Aug 2024).
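The formula can be rewritten as $\lambda_0 = r/2 + (d_1 - r)/(2m)$, which is algebraically identical. One way to read it, offered here as an interpretation rather than a statement from the cited paper, is that each quadratic direction contributes $1/2$ and each remaining direction contributes $1/(2m)$. The helper below is a hypothetical sketch that evaluates the formula exactly and assumes integer inputs.

```python
from fractions import Fraction

def nonsingular_point_rlct(d1, r, m):
    """Evaluate lambda_0 = (d1 - r + r*m) / (2*m) exactly (hypothetical helper)."""
    if not 0 <= r <= d1:
        raise ValueError("r must satisfy 0 <= r <= d1")
    if m < 1:
        raise ValueError("m must be at least 1")
    return Fraction(d1 - r + r * m, 2 * m)

# Sanity checks: m = 1 or r = d1 both recover the regular value d1/2.
regular_a = nonsingular_point_rlct(4, 1, 1)   # Fraction(2, 1)
regular_b = nonsingular_point_rlct(4, 4, 3)   # Fraction(2, 1)
# One quadratic direction plus two higher-order directions with m = 2.
degenerate = nonsingular_point_rlct(3, 1, 2)  # Fraction(1, 1)
```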

For global estimation in singular models, the RLCT is the minimum of the local RLCTs over the parameter set, so the value obtained from the above formula at non-singular points provides an effective upper bound, guiding practical Bayesian model comparison (Kurumadani, 23 Aug 2024, Yoshida et al., 2023).

7. Statistical Estimation and Model Selection

In practice, analytic RLCTs are known only in restricted cases. For broader models, Monte Carlo-based estimators exploit the relationship between the RLCT and the variance of the log-likelihood under tempered posteriors:

$$\hat{\lambda}_V^{(1)} = t^2 \cdot \operatorname{Var}_t[\log p(D|w)]$$

for $t \sim 1/\log n$; averaging over multiple simulated datasets increases accuracy. This enables the singular Bayesian information criterion (sBIC) to be widely deployed as WsBIC, using $n^{-\hat{\lambda}_V}$ as the complexity penalty, which empirically outperforms WBIC/BIC in mixtures, reduced-rank, and other singular models (Imai, 2019).
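The variance estimator can be sketched on a toy case where the tempered posterior is available in closed form, so no MCMC is needed. The example assumes a unit-variance Gaussian mean model with a flat prior, for which the posterior at inverse temperature $t$ is $N(\bar{x}, 1/(nt))$. The function names, the sample mean 0.3, and the sampling setup are assumptions made for illustration; this is not the cited paper's implementation.

```python
import math
import random

def rlct_variance_estimate(loglik_samples, t):
    """t^2 times the sample variance of log p(D|w) over tempered-posterior draws."""
    k = len(loglik_samples)
    mean = sum(loglik_samples) / k
    var = sum((x - mean) ** 2 for x in loglik_samples) / (k - 1)
    return t * t * var

def toy_gaussian_logliks(n, xbar, t, num_draws, rng):
    """Log-likelihoods (up to a constant that does not affect the variance)
    at draws w ~ N(xbar, 1/(n*t)), the exact tempered posterior of the toy model."""
    sd = 1.0 / math.sqrt(n * t)
    return [-0.5 * n * (rng.gauss(xbar, sd) - xbar) ** 2
            for _ in range(num_draws)]

rng = random.Random(0)       # fixed seed for reproducibility
n = 1000
t = 1.0 / math.log(n)        # inverse temperature of order 1/log n
samples = toy_gaussian_logliks(n, xbar=0.3, t=t, num_draws=100_000, rng=rng)
lam_hat = rlct_variance_estimate(samples, t)
```

In this toy model $\log p(D|w)$ equals $-Z^2/(2t)$ plus a constant with $Z \sim N(0,1)$, so $t^2 \operatorname{Var}_t = 1/2$ exactly, and `lam_hat` should land close to the regular value $d/2$ with $d = 1$. For a genuinely singular model the draws would instead come from an MCMC sampler targeting the tempered posterior.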

Table: RLCT Calculation Methods and Their Domains

| Method | Domain/Applicability | Reference |
| --- | --- | --- |
| Real log resolution | General analytic or algebraic $f$ | (Kosta et al., 20 Nov 2024) |
| Combinatorial poset (arrangements) | Hyperplane arrangements | (Kosta et al., 20 Nov 2024) |
| Blow-up algorithm (sop polynomials) | Sum-of-products, binomial polynomials | (Hirose, 2023) |
| Non-singular point expansion | Regular points in parametric models | (Kurumadani, 23 Aug 2024) |
| Monte Carlo (variance-based) | General statistical models | (Imai, 2019) |

The RLCT is thus a unifying invariant at the interface of algebraic geometry, singularity theory, and Bayesian statistics, encoding local analytic complexity and model selection criteria in both theoretical and computational practice. Calculating or estimating the RLCT, exactly or via bounds, is essential for the asymptotic characterization of evidence and generalization in nonregular models, including applications in high-dimensional arrangements, tensor decompositions, and non-identifiable latent-variable models (Kosta et al., 20 Nov 2024, Yoshida et al., 2023).
