
Real Log Canonical Threshold (RLCT) in Bayesian Models

Updated 18 November 2025
  • Real Log Canonical Threshold (RLCT) is an invariant that quantifies the local complexity of singularities in statistical models, generalizing the classical log canonical threshold to real-analytic settings.
  • RLCT governs the asymptotic behavior of Bayesian inference by determining leading error rates and influencing model selection techniques such as sBIC and WBIC.
  • Its computation leverages algebraic-geometric and combinatorial methods, including resolution of singularities, Newton polyhedron analysis, and blow-up algorithms.

The real log canonical threshold (RLCT) is a birational-geometric invariant central to singular learning theory, Bayesian model selection, and the asymptotic analysis of integrals on real algebraic and analytic varieties. It quantifies the local complexity of singularities in statistical models and governs the rate at which Bayesian generalization error and marginal likelihood converge. The RLCT, often called the learning coefficient in statistics, generalizes the classical log canonical threshold (LCT) from complex algebraic geometry to the real-analytic and applied settings, where it directly determines the leading-order asymptotics of Bayesian inference in both regular and singular models.

1. Definition and Analytic Foundations

The RLCT is defined in terms of the pole structure of certain zeta functions associated with real-analytic (often polynomial) functions describing singularities. Given a real-analytic function $f : \mathbb{R}^d \to \mathbb{R}$, choose a proper real-analytic resolution of singularities $\sigma : U \to \mathbb{R}^d$ such that in local coordinates $u = (u_1, \dots, u_d)$,

$$f \circ \sigma(u) = \eta(u) \prod_{i=1}^d u_i^{a_i}, \qquad \det \mathrm{Jac}(\sigma)(u) = \eta'(u) \prod_{i=1}^d u_i^{b_i},$$

with $\eta, \eta'$ units and integers $a_i \ge 0$, $b_i \ge 0$. The RLCT (with multiplicity) is given by

$$\lambda_R = \inf_{P,\,a_i>0} \frac{b_i+1}{a_i}, \qquad m_R = \max_{P} \left\lvert \left\{i : a_i > 0,\, \frac{b_i+1}{a_i} = \lambda_R \right\} \right\rvert,$$

where $P$ ranges over the points (local charts) of the resolution and only coordinates with $a_i > 0$ contribute.
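The formula above is a minimum of simple ratios once the resolution data are known. The following minimal sketch assumes the exponents $(a_i, b_i)$ of each chart have already been obtained by hand or by other software; the function name `rlct_from_charts` and the input layout are illustrative choices, not part of any cited work. The usage example encodes the three charts of the standard three-step resolution of the cusp $y^2 - x^3$, where the last exceptional divisor has $(a, b) = (6, 4)$.

```python
from fractions import Fraction


def rlct_from_charts(charts):
    """Return (lambda, m) from resolution data.

    charts: list of (a, b) pairs, where a and b are equal-length tuples of
    non-negative integers: a_i is the exponent of u_i in f o sigma and b_i
    the exponent of u_i in the Jacobian determinant (hypothetical layout).
    """
    best, mult = None, 0
    for a, b in charts:
        # Only coordinates with a_i > 0 contribute a ratio (b_i + 1) / a_i.
        ratios = [Fraction(bi + 1, ai) for ai, bi in zip(a, b) if ai > 0]
        if not ratios:
            continue
        local_min = min(ratios)
        local_mult = ratios.count(local_min)
        if best is None or local_min < best:
            best, mult = local_min, local_mult
        elif local_min == best:
            mult = max(mult, local_mult)
    return best, mult


# Cusp y^2 - x^3: charts where the last exceptional divisor E3 (a=6, b=4)
# meets the strict transform (1, 0), E1 (2, 1), and E2 (3, 2) respectively.
cusp_charts = [
    ((6, 1), (4, 0)),
    ((6, 2), (4, 1)),
    ((6, 3), (4, 2)),
]
lam, m = rlct_from_charts(cusp_charts)
print(lam, m)  # 5/6 1
```

Exact rational arithmetic is used so that ties between ratios, which determine the multiplicity, are detected reliably.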

In the context of Bayesian learning, the RLCT is computed for the Kullback–Leibler divergence function $K(\theta) = \mathbb{E}_X\left[\log \frac{p(X|\theta_*)}{p(X|\theta)}\right]$ near the realization set $\Theta_* = \{\theta : K(\theta) = 0\}$.

Equivalently, the RLCT can be read off the zeta function

$$\zeta(z) = \int_\Omega K(\theta)^z \varphi(\theta)\, d\theta,$$

defined for an analytic prior $\varphi$ and a domain $\Omega$ around a true parameter $\theta_*$. Its meromorphic continuation has poles on the negative real axis; the pole closest to the origin is $z = -\lambda$, and its order is the multiplicity $m$ (Kurumadani, 2024, Kosta et al., 2024).
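As a worked illustration of the zeta-function definition, consider the toy choice $K(\theta) = \theta_1^2 \theta_2^2$ on $\Omega = [0,1]^2$ with uniform prior $\varphi \equiv 1$. This example is an assumption made for illustration and is not taken from the cited papers.

```latex
\zeta(z) = \int_0^1\!\!\int_0^1 \theta_1^{2z}\,\theta_2^{2z}\, d\theta_1\, d\theta_2
         = \left(\int_0^1 t^{2z}\, dt\right)^{2}
         = \frac{1}{(2z+1)^2}, \qquad \operatorname{Re} z > -\tfrac{1}{2}.
```

The continuation has a single pole at $z = -1/2$ of order 2, so $\lambda = 1/2$ and $m = 2$. For comparison, the regular choice $K(\theta) = \theta_1^2 + \theta_2^2$ has $\lambda = d/2 = 1$ and $m = 1$.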

2. Role in Asymptotics of Bayesian Inference and Model Selection

In singular learning theory, the RLCT governs the non-regular asymptotics of the free energy (negative log-marginal likelihood) and the Bayes generalization loss. For $n$ samples, and in expectation over the sample,

$$F(n) = -\log \int p(X^n|\theta)\,\varphi(\theta)\,d\theta = nS + \lambda \log n - (m-1)\log\log n + O(1),$$

$$G(n) = S + \frac{\lambda}{n} + o\left(\frac{1}{n}\right),$$

where $S$ is the entropy of the true distribution and $\lambda \in \mathbb{Q}$ is the RLCT (Kurumadani, 2024). The multiplicity $m$ enters only through the $\log\log n$ term, with coefficient $-(m-1)$.
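To make the size of the complexity penalty concrete, the sketch below compares the singular penalty $\lambda \log n - (m-1)\log\log n$ (the standard form of the expansion, assumed here) with the regular BIC penalty $(d/2)\log n$. The function names are hypothetical, and the values $\lambda = 1/2$, $m = 2$, $d = 2$ correspond to the toy function $K(\theta) = \theta_1^2\theta_2^2$ rather than to any model from the cited papers.

```python
import math


def singular_penalty(n, lam, m=1):
    """Penalty lam*log(n) - (m-1)*log(log(n)); requires n > 1."""
    return lam * math.log(n) - (m - 1) * math.log(math.log(n))


def bic_penalty(n, d):
    """Regular-model penalty (d/2)*log(n) with d free parameters."""
    return 0.5 * d * math.log(n)


for n in (100, 10_000, 1_000_000):
    # Toy singular model: d = 2 parameters, lam = 1/2, m = 2.
    print(n, round(singular_penalty(n, 0.5, 2), 3), round(bic_penalty(n, 2), 3))
```

For a regular model, $\lambda = d/2$ and $m = 1$, so the two penalties coincide; in the singular toy case the penalty is strictly smaller, which is why BIC tends to over-penalize singular models.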

In model selection, methods such as singular BIC (sBIC) and widely applicable BIC (WBIC) rely on accurate estimation or knowledge of the RLCT. The sBIC achieves an $O_p(1)$ approximation to the log-marginal likelihood when RLCTs and their multiplicities are known exactly; WBIC uses thermodynamic expectations and does not require explicit RLCTs. The bias and variance of these criteria are directly determined by the RLCT, which acts as a complexity penalty in the Bayesian evidence expansion (Imai, 2019).
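For reference, the thermodynamic expectation behind WBIC can be written out as follows. This is the standard definition as recalled here, stated as an assumption rather than quoted from the cited paper; $L_n(\theta) = -\frac{1}{n}\sum_{i=1}^n \log p(X_i|\theta)$ denotes the empirical negative log-likelihood.

```latex
\mathrm{WBIC} = \mathbb{E}_\theta^{\beta}\!\left[\, n L_n(\theta) \,\right]
  = \frac{\int n L_n(\theta)\, e^{-\beta n L_n(\theta)}\, \varphi(\theta)\, d\theta}
         {\int e^{-\beta n L_n(\theta)}\, \varphi(\theta)\, d\theta},
  \qquad \beta = \frac{1}{\log n},

\mathrm{WBIC} = n L_n(\theta_*) + \lambda \log n + O_p\!\left(\sqrt{\log n}\right).
```

The RLCT therefore appears as the coefficient of $\log n$ even though it is never computed explicitly; only posterior samples at the single inverse temperature $\beta = 1/\log n$ are needed.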

3. Combinatorial and Geometric Computation Techniques

3.1. Binomial Ideals and Monomial Models

For binomial and monomial ideals, the RLCT can be reduced to piecewise-linear optimization over fans determined by the Newton polyhedron. For a binomial ideal $\langle x^{a_i} - u_i x^{b_i} \rangle$, the RLCT is computed as the minimum of a function $\mathrm{LCT}(v)$ over rays $v$ of the associated fan, allowing an explicit combinatorial algorithm (Blanco et al., 2014). The core steps are summarized below:

| Step | Description | Reference |
|---|---|---|
| Form matrices | $M^+, M^-, c$ from exponents | (Blanco et al., 2014) |
| Hyperplane arrangement | Build rational fan splitting $\mathbb{R}^n$ | (Blanco et al., 2014) |
| Minimize $\mathrm{LCT}(v)$ | Over rays of the fan | (Blanco et al., 2014) |
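The monomial special case can be sketched in a few lines. The code below is not the algorithm of Blanco et al. for binomial ideals; it assumes the standard Newton polyhedron rule for a monomial ideal in two variables, namely that the threshold is $1/t$, where $t$ is the smallest value with $(t, t)$ in the Newton polyhedron (the convex hull of the exponent vectors plus the positive quadrant). The function name `monomial_lct_2d` is hypothetical.

```python
from fractions import Fraction
from itertools import combinations


def monomial_lct_2d(exponents):
    """Threshold of the ideal generated by x^a * y^b for (a, b) in exponents.

    Finds the smallest t with (t, t) in the Newton polyhedron. In two
    variables the optimum is reached either at a single exponent vector or
    where a segment between two exponent vectors crosses the diagonal.
    Returns None for the unit ideal (exponent (0, 0) present).
    """
    candidates = [Fraction(max(a, b)) for a, b in exponents]
    for (a1, b1), (a2, b2) in combinations(exponents, 2):
        d1, d2 = a1 - b1, a2 - b2
        if d1 * d2 < 0:  # the two points lie on opposite sides of the diagonal
            s = Fraction(d1, d1 - d2)  # parameter where the segment hits x = y
            candidates.append(a1 + s * (a2 - a1))
    t = min(candidates)
    return 1 / t if t > 0 else None


print(monomial_lct_2d([(2, 0), (0, 3)]))  # 5/6
```

Note the convention: this is the threshold of the ideal $\langle x^2, y^3 \rangle$. For the sum of squares $K = x^4 + y^6$, the pole of $\int K^z$ sits at half that value, $\lambda = 5/12$.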

3.2. Newton Polyhedron in Two Variables

In $\mathbb{R}^2$, the RLCT (critical integrability index) can be computed by geometric formulas derived from the Newton polyhedron (Collins, 2017). The RLCT is the minimum of expressions involving the combinatorics of the compact faces of the Newton diagram, which yields an explicit classification and establishes the ascending chain condition (ACC) for RLCTs.

3.3. Plane Curves

For reduced germs of plane curves, explicit formulas for the RLCT depend only on the first two maximal contact values (semigroup generators) of branches and their pairwise intersection multiplicities. If the minimizing exceptional divisor on the complex resolution is real, the (complex) LCT and RLCT coincide (Galindo et al., 2012).

3.4. Hyperplane Arrangements

For real hyperplane arrangements $f(x) = \prod_{j=1}^n L_j(x)^{s_j}$, the RLCT and its multiplicity admit closed-form combinatorial formulas:

$$\lambda_R = \min_{W} \frac{\mathrm{codim}(W)}{s(W)}, \qquad m_R = \max_{\text{chains}} \left|\left\{ i : \frac{\mathrm{codim}(W_i)}{s(W_i)} = \lambda_R \right\}\right|,$$

where $W$ runs over the elements of positive codimension in the intersection lattice of the arrangement, $s(W)$ is the sum of the exponents $s_j$ of the hyperplanes containing $W$, and the maximum is taken over chains (flags) $W_1 \supset W_2 \supset \cdots$ in the lattice. These formulas can be implemented efficiently in SageMath (Kosta et al., 2024).
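For arrangements of lines in the plane the intersection lattice is small enough to enumerate directly. The sketch below is plain Python, not the SageMath implementation mentioned above, and assumes that $s(W)$ is the sum of the exponents of the lines containing $W$, that the lines are pairwise distinct, and that coefficients are integers. The name `line_arrangement_rlct` and the input format are illustrative.

```python
from fractions import Fraction
from itertools import combinations


def line_arrangement_rlct(lines):
    """lines: list of (a, b, c, s) for the line a*x + b*y = c with exponent s.

    Returns (lambda, m) from min codim(W)/s(W) over lines and intersection
    points, with m the longest chain (line containing point) attaining it.
    """
    line_ratio = [Fraction(1, s) for (_, _, _, s) in lines]

    # Group the lines by the points where they meet (codimension-2 elements).
    points = {}
    for i, j in combinations(range(len(lines)), 2):
        a1, b1, c1, _ = lines[i]
        a2, b2, c2, _ = lines[j]
        det = a1 * b2 - a2 * b1
        if det == 0:
            continue  # parallel lines do not meet
        p = (Fraction(c1 * b2 - c2 * b1, det), Fraction(a1 * c2 - a2 * c1, det))
        points.setdefault(p, set()).update((i, j))

    point_ratio = {
        p: Fraction(2, sum(lines[i][3] for i in idx)) for p, idx in points.items()
    }
    lam = min(line_ratio + list(point_ratio.values()))

    # A chain has at most two elements here: a line and a point on it.
    mult = 1
    for p, idx in points.items():
        if point_ratio[p] == lam and any(line_ratio[i] == lam for i in idx):
            mult = 2
    return lam, mult


concurrent = [(1, 0, 0, 1), (0, 1, 0, 1), (1, -1, 0, 1)]  # x=0, y=0, x=y
generic = [(1, 0, 0, 1), (0, 1, 0, 1), (1, 1, 1, 1)]      # x=0, y=0, x+y=1
print(line_arrangement_rlct(concurrent))  # (Fraction(2, 3), 1)
print(line_arrangement_rlct(generic))     # (Fraction(1, 1), 2)
```

The two examples show that the result depends on the combinatorics of the arrangement: three lines through a common point give $\lambda_R = 2/3$, while three lines in general position behave locally like the normal crossing $xy$ and give $\lambda_R = 1$ with $m_R = 2$.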

3.5. Sum-of-Products Polynomials

The RLCT for sum-of-products (SOP) polynomials can be computed via specific sequences of blow-ups, reducing the analysis to combinatorial invariants and explicit algorithms, particularly for binomials and in low dimension (Hirose, 2023). For SOPs that are not exactly reducible to binomials, a simplex upper bound can be established by linear programming.

4. RLCT in Singular Bayesian Models

In non-regular or singular models, such as mixture models, reduced-rank regression, tensor decompositions, or neural networks, the RLCT determines the leading term of the learning-theoretic generalization error:

$$\mathbb{E}[G_n] = S + \frac{\lambda}{n} + o(n^{-1}),$$

where $\lambda$ reflects the geometric nature and singularities of the model. Explicit upper bounds for the RLCT in models such as CP-tensor decompositions and mixtures are available by decomposing the Kullback–Leibler function near the realization locus and applying blow-up and resolution strategies (Yoshida et al., 2023, Kurumadani, 2024). For tensor decompositions, the RLCT upper bound is

$$\lambda \leq \frac{H_0 (I+J+K)-2}{2} + \min\{m_1,m_2,m_3\},$$

where $H_0$ is the true rank and the $m_i$ are functions of the parameter configuration.

For general singular models, the trivial bound $\lambda \leq d_1/2$ (with $d_1$ the parameter dimension) is often not tight, and refinement via explicit local calculations at non-singular points can yield sharper bounds (Kurumadani, 2024). For mixture models, such as the mixed binomial model, the RLCT at generic non-singular points is given by

$$\lambda_0 = \frac{3H_0 + H - 2}{4},$$

where $H$ is the number of components in the model and $H_0$ is the number of true components.
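The mixed binomial formula is easy to tabulate. The helper below simply evaluates the expression quoted above with exact fractions; the function name and the requirement $1 \le H_0 \le H$ are assumptions made for this sketch.

```python
from fractions import Fraction


def mixed_binomial_rlct(h0, h):
    """Evaluate (3*H0 + H - 2) / 4 for H fitted and H0 true components."""
    if not 1 <= h0 <= h:
        raise ValueError("expected 1 <= h0 <= h")
    return Fraction(3 * h0 + h - 2, 4)


# How the coefficient grows when the model is over-specified (H0 = 2 fixed).
for h in range(2, 6):
    print(h, mixed_binomial_rlct(2, h))
```

Each extra redundant component raises $\lambda_0$ by only $1/4$, which illustrates how mildly over-specification is penalized in this model according to the formula.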

5. Statistical and Computational Applications

The RLCT has immediate implications for Bayesian model selection, generalization bound estimation, and the design of learning algorithms. When RLCTs are known for candidate models, sBIC can be computed with $O_p(1)$ error. When no closed form is available, a variance-based, single-temperature estimator of the RLCT derived from thermodynamic integration makes practical estimation and model scoring possible (Imai, 2019). This estimator is consistent and asymptotically normal, and its implementation requires only parallel simulations and modest MCMC diagnostics.
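The idea behind a variance-based estimator can be illustrated on a toy problem. The sketch below is not Imai's implementation; it assumes the identity $\lambda \approx \beta^2\, \mathrm{Var}_\theta^\beta[n L_n(\theta)]$ at a single inverse temperature $\beta$, and uses a regular one-parameter Gaussian mean model with flat prior, chosen because its tempered posterior is exactly $N(\bar{x}, 1/(n\beta))$ and needs no MCMC. All names are hypothetical.

```python
import math
import random


def rlct_variance_estimate(n, beta, num_draws, seed=0):
    """Estimate lambda as beta^2 * Var[n * L_n(theta)] under the tempered posterior.

    Toy model: X ~ N(theta, 1) with flat prior, so the tempered posterior is
    N(xbar, 1 / (n * beta)) and n * L_n(theta) = n * (theta - xbar)^2 / 2 up
    to a constant that does not affect the variance.
    """
    rng = random.Random(seed)
    xbar = 0.0  # the sample mean only shifts theta; its value is irrelevant here
    sd = 1.0 / math.sqrt(n * beta)
    energies = []
    for _ in range(num_draws):
        theta = rng.gauss(xbar, sd)
        energies.append(0.5 * n * (theta - xbar) ** 2)
    mean = sum(energies) / num_draws
    variance = sum((e - mean) ** 2 for e in energies) / num_draws
    return beta ** 2 * variance


n = 1000
beta = 1.0 / math.log(n)  # the WBIC temperature, used here as an assumption
print(rlct_variance_estimate(n, beta, num_draws=100_000))
```

For this regular model the exact value is $\lambda = d/2 = 1/2$, so the printed estimate should be close to 0.5. In a singular model the same statistic would be computed from MCMC draws of the tempered posterior.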

In high-dimensional analysis, the RLCT and its multiplicity govern not only statistical rates but also the volume growth of sublevel sets $\{|f| \leq \varepsilon\}$, whose volume scales as $\varepsilon^{\lambda_R} (-\log\varepsilon)^{m_R-1}$ as $\varepsilon \to 0$ (Kosta et al., 2024). Algebraic geometry further connects the RLCT to birational invariants, singularity classification, and the minimal model program.
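The volume asymptotics can be checked by hand in the simplest normal-crossing case. The example $f(x, y) = xy$ on the unit square is chosen here for illustration and is not taken from the cited paper; it has $\lambda_R = 1$ and $m_R = 2$.

```latex
V(\varepsilon) = \operatorname{Vol}\{(x,y) \in [0,1]^2 : xy \le \varepsilon\}
  = \int_0^1 \min\!\left(1, \frac{\varepsilon}{x}\right) dx
  = \varepsilon + \varepsilon \int_\varepsilon^1 \frac{dx}{x}
  = \varepsilon\,(1 - \log \varepsilon), \qquad 0 < \varepsilon < 1.
```

As $\varepsilon \to 0$ the leading behavior is $\varepsilon\,(-\log\varepsilon)$, which matches $\varepsilon^{\lambda_R}(-\log\varepsilon)^{m_R - 1}$ with $\lambda_R = 1$ and $m_R = 2$.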

6. Limitations and Current Research Directions

While RLCTs admit effective computation for large classes of models (binomials, monomials, hyperplane arrangements, certain polynomial classes, and resolved plane curves), general explicit calculation remains elusive for models with more intricate singularity structure, especially at singular points of the realization set $\Theta_*$. Methods relying on blow-up algorithms and piecewise-linear analysis can be infeasible in high dimensions or for functions not amenable to decomposition (Hirose, 2023, Kurumadani, 2024). The extension of RLCT calculation to singular points in parameter space and the determination of multiplicities for arbitrary models remain open problems. Further research is directed at enhancing computational algorithms, refining upper bounds, and developing new invariants that capture the geometry of more general singular statistical models.

7. Illustrative Examples

Concrete examples of RLCT computation include:

  • Hyperplane arrangements: For three concurrent lines in $\mathbb{R}^2$ (all passing through a common point) with exponents $s_i = 1$, $\lambda_R = 2/3$ and $m_R = 1$ (Kosta et al., 2024).
  • Plane curve cusp: $f(x,y) = y^2 - x^3$ yields RLCT $5/6$ (Galindo et al., 2012).
  • Mixed binomial model: For $H$ components with $H_0$ true, the RLCT at a generic point is $\lambda_0 = (3H_0 + H - 2)/4$ (Kurumadani, 2024).
  • Tensor decomposition: With true rank $H_0$, the RLCT is upper-bounded by a function of $H_0$ and the dimensions $I, J, K$, as given above (Yoshida et al., 2023).

These concrete instances demonstrate both the diversity of models where RLCT is tractable and the centrality of combinatorial-geometric resolution techniques in its determination.
