Real Log Canonical Threshold (RLCT) in Bayesian Models
- Real Log Canonical Threshold (RLCT) is an invariant that quantifies the local complexity of singularities in statistical models, generalizing the classical log canonical threshold to real-analytic settings.
- RLCT governs the asymptotic behavior of Bayesian inference by determining leading error rates and influencing model selection techniques such as sBIC and WBIC.
- Its computation leverages algebraic-geometric and combinatorial methods, including resolution of singularities, Newton polyhedron analysis, and blow-up algorithms.
The real log canonical threshold (RLCT) is a birational-geometric invariant central to singular learning theory, Bayesian model selection, and the asymptotic analysis of integrals on real algebraic and analytic varieties. It quantifies the local complexity of singularities in statistical models and governs the rate at which Bayesian generalization error and marginal likelihood converge. The RLCT, often called the learning coefficient in statistics, generalizes the classical log canonical threshold (LCT) from complex algebraic geometry to the real-analytic and applied settings, where it directly determines the leading-order asymptotics of Bayesian inference in both regular and singular models.
1. Definition and Analytic Foundations
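The RLCT is defined through the pole structure of a zeta function associated with a real-analytic (often polynomial) function that describes the singularities. Given a real-analytic function $f$ on a neighborhood $W$ of the point of interest, choose a proper real-analytic resolution of singularities $g$ such that, in local coordinates $u = (u_1, \dots, u_d)$ of each chart,

$$f(g(u)) = a(u)\, u_1^{k_1} \cdots u_d^{k_d}, \qquad |g'(u)| = b(u)\, |u_1^{h_1} \cdots u_d^{h_d}|,$$

with $a$, $b$ units (nowhere-vanishing analytic functions), $k_j \ge 0$, $h_j \ge 0$. The RLCT $\lambda$ (with multiplicity $m$) is given by

$$\lambda = \min_{\text{charts}} \; \min_{j \,:\, k_j > 0} \frac{h_j + 1}{k_j},$$

where $m$ is the largest number of indices $j$ within a single chart that attain this minimum.
In the context of Bayesian learning, the RLCT is computed for the Kullback–Leibler divergence function $K(w)$ near the realization set $\{w : K(w) = 0\}$.
Equivalently, the RLCT is the smallest pole (with the multiplicity given by its order) of the zeta function

$$\zeta(z) = \int_W |f(w)|^{-z}\, \varphi(w)\, dw$$

for an analytic prior $\varphi$ and a domain $W$ around a true parameter (Kurumadani, 2024, Kosta et al., 2024).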
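Once a resolution is known, reading off the threshold is pure bookkeeping. The following minimal sketch assumes each chart is summarized by two exponent tuples, `k` (orders of vanishing of the resolved function) and `h` (orders of the Jacobian determinant), and returns the smallest ratio $(h_j + 1)/k_j$ together with the largest number of coordinates attaining it in one chart. The function name and the input layout are illustrative assumptions, and the cusp chart data are the standard multiplicities of its embedded resolution, supplied by hand rather than computed.

```python
from fractions import Fraction


def rlct_from_charts(charts):
    """Return (lambda, m) from normal-crossing data.

    charts: list of (k, h) pairs, one per chart, where k and h are
    equal-length tuples of non-negative integers: the exponents of
    f(g(u)) and of the Jacobian determinant in that chart.
    """
    best = None  # smallest ratio (h_j + 1) / k_j seen so far
    mult = 0     # largest number of coordinates attaining it in one chart
    for k, h in charts:
        ratios = [Fraction(hj + 1, kj) for kj, hj in zip(k, h) if kj > 0]
        if not ratios:
            continue  # f(g(u)) is a unit in this chart: no contribution
        lam = min(ratios)
        count = ratios.count(lam)
        if best is None or lam < best:
            best, mult = lam, count
        elif lam == best:
            mult = max(mult, count)
    return best, mult


# f = x^2 y^2 is already in normal-crossing form (identity resolution).
square = [((2, 2), (0, 0))]

# Hand-supplied charts for the resolved cusp y^2 = x^3: exceptional
# divisors with k = 2, 3, 6 and h = 1, 2, 4, strict transform with k = 1, h = 0.
cusp = [((2, 6), (1, 4)), ((3, 6), (2, 4)), ((6, 1), (4, 0))]

print(rlct_from_charts(square))
print(rlct_from_charts(cusp))
```

Exact rational arithmetic is used so that ties between coordinates, which decide the multiplicity, are detected reliably.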
2. Role in Asymptotics of Bayesian Inference and Model Selection
In singular learning theory, the RLCT governs the non-regular asymptotics of the free energy (negative log-marginal likelihood) and the Bayes generalization loss. For $n$ samples,

$$F_n = n S_n + \lambda \log n - (m - 1) \log\log n + O_p(1),$$

where $S_n$ is the empirical entropy of the true distribution, and $\lambda$ is the RLCT (Kurumadani, 2024). The multiplicity $m$ appears in the second sub-leading term as a coefficient of $\log\log n$.
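To make the size of these terms concrete, the sketch below compares the singular complexity penalty $\lambda \log n - (m - 1)\log\log n$ with the regular BIC penalty $(d/2)\log n$. The function names and the example values ($d = 4$, $\lambda = 1$, $m = 2$) are hypothetical and chosen only for illustration; the entropy term and the bounded remainder are deliberately left out.

```python
import math


def singular_penalty(lam, m, n):
    """lam*log(n) - (m - 1)*log(log(n)); entropy and O_p(1) terms omitted."""
    if n <= 1:
        raise ValueError("n must exceed 1 so that log(log(n)) is defined")
    return lam * math.log(n) - (m - 1) * math.log(math.log(n))


def bic_penalty(d, n):
    """Regular-model penalty (d/2)*log(n) for a d-dimensional parameter."""
    return 0.5 * d * math.log(n)


# Hypothetical singular model: d = 4 parameters, RLCT 1, multiplicity 2.
for n in (100, 1000, 10000):
    print(n, round(singular_penalty(1.0, 2, n), 3), round(bic_penalty(4, n), 3))
```

For a regular model, $\lambda = d/2$ and $m = 1$, so the two penalties coincide; in a singular model the penalty is smaller, which is why BIC tends to over-penalize such models.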
In model selection, methods such as singular BIC (sBIC) and widely applicable BIC (WBIC) rely on accurate estimation or knowledge of the RLCT. The sBIC achieves an $O_p(1)$ approximation to the log-marginal likelihood when the RLCTs and their multiplicities are known exactly; WBIC uses thermodynamic expectations and does not require explicit RLCTs. The bias and variance of these criteria are directly determined by the RLCT, which acts as a complexity penalty in the Bayesian evidence expansion (Imai, 2019).
3. Combinatorial and Geometric Computation Techniques
3.1. Binomial Ideals and Monomial Models
For binomial and monomial ideals, the RLCT can be reduced to piecewise-linear optimization over fans determined by the Newton polyhedron. For a binomial ideal, the RLCT is computed as the minimum of a piecewise-linear function over the rays of the associated fan, which gives an explicit combinatorial algorithm (Blanco et al., 2014). The following table summarizes the core steps:
| Step | Description | Reference |
|---|---|---|
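| Form matrices | Build the matrices from the exponents of the generators | (Blanco et al., 2014) |
| Hyperplane arrangement | Build the rational fan defined by the splitting hyperplanes | (Blanco et al., 2014) |
| Minimize | Evaluate the objective over the rays of the fan and take the minimum | (Blanco et al., 2014) |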
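For the simpler monomial case in two variables, the Newton polyhedron computation fits in a few lines. The sketch below is not the binomial algorithm of the table; it assumes the standard fact that, with a unit prior at the origin, the threshold of $\sum_i |x^{a_i} y^{b_i}|$ equals $1/t$, where $(t, t)$ is the point at which the diagonal enters the Newton polyhedron (for a sum of squared monomials the value is halved). The function name is hypothetical.

```python
from fractions import Fraction
from itertools import combinations


def monomial_rlct_2d(exponents):
    """Threshold 1/t of a monomial ideal in two variables (unit prior).

    exponents: list of (a, b) exponent pairs of the generators x^a y^b.
    (t, t) is where the diagonal enters conv(exponents) + R^2_{>=0}.
    """
    pts = [(Fraction(a), Fraction(b)) for a, b in exponents]
    # Case 1: the diagonal enters through the orthant attached to one point.
    candidates = [max(a, b) for a, b in pts]
    # Case 2: the diagonal crosses the segment joining two points.
    for (a1, b1), (a2, b2) in combinations(pts, 2):
        d1, d2 = a1 - b1, a2 - b2
        if d1 * d2 < 0:  # endpoints lie strictly on opposite sides of x = y
            s = d1 / (d1 - d2)  # segment parameter of the crossing point
            candidates.append(a1 + s * (a2 - a1))
    t = min(candidates)
    if t == 0:
        raise ValueError("the ideal contains a unit; no finite threshold")
    return 1 / t


print(monomial_rlct_2d([(2, 0), (0, 3)]))  # generators x^2 and y^3
print(monomial_rlct_2d([(1, 0), (0, 1)]))  # generators x and y
```

Checking all pairs of points rather than only the edges of the polygon is redundant but harmless, since interior segments can only give larger values of $t$.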
3.2. Newton Polyhedron in Two Variables
In two variables, the RLCT (critical integrability index) can be computed by geometric formulas derived from the Newton polyhedron (Collins, 2017). The RLCT is the minimum of expressions involving the combinatorics of the compact faces of the Newton diagram, which yields an explicit classification and establishes the ascending chain condition (ACC) for RLCTs.
3.3. Plane Curves
For reduced germs of plane curves, explicit formulas for the RLCT depend only on the first two maximal contact values (semigroup generators) of branches and their pairwise intersection multiplicities. If the minimizing exceptional divisor on the complex resolution is real, the (complex) LCT and RLCT coincide (Galindo et al., 2012).
3.4. Hyperplane Arrangements
For real hyperplane arrangements, the RLCT and its multiplicity admit closed-form combinatorial formulas indexed by the elements of the intersection lattice of the arrangement; these can be implemented efficiently in SageMath (Kosta et al., 2024).
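A brute-force version of such a lattice computation can be sketched in plain Python. It assumes the threshold of a central arrangement $\prod_i |\ell_i|^{s_i}$ at the origin takes the classical form $\min_W \operatorname{codim}(W) / \sum_{\ell_i \supseteq W} s_i$ over flats $W$; this form is an assumption taken from the standard theory of arrangements, not quoted from the cited paper, and the multiplicity is not computed. Since enlarging a subset of hyperplanes to its closure keeps the rank and can only increase the exponent sum, minimizing over all nonempty subsets gives the same value as minimizing over flats.

```python
from fractions import Fraction
from itertools import combinations


def rank(rows):
    """Rank of a list of rational vectors via exact Gaussian elimination."""
    m = [[Fraction(x) for x in r] for r in rows]
    rk = 0
    ncols = len(m[0]) if m else 0
    for col in range(ncols):
        pivot = next((i for i in range(rk, len(m)) if m[i][col] != 0), None)
        if pivot is None:
            continue
        m[rk], m[pivot] = m[pivot], m[rk]
        for i in range(rk + 1, len(m)):
            factor = m[i][col] / m[rk][col]
            m[i] = [x - factor * y for x, y in zip(m[i], m[rk])]
        rk += 1
    return rk


def arrangement_rlct(normals, exponents):
    """min over nonempty subsets S of rank(S) / (sum of exponents in S).

    normals: normal vectors of hyperplanes through the origin.
    exponents: positive integer exponent of each hyperplane.
    """
    best = None
    for size in range(1, len(normals) + 1):
        for subset in combinations(range(len(normals)), size):
            value = Fraction(rank([normals[i] for i in subset]),
                             sum(exponents[i] for i in subset))
            if best is None or value < best:
                best = value
    return best


# Three concurrent lines x = 0, y = 0, x + y = 0 with unit exponents.
lines = [(1, 0), (0, 1), (1, 1)]
print(arrangement_rlct(lines, [1, 1, 1]))
```

The subset enumeration is exponential in the number of hyperplanes, so this is only suitable for small arrangements; a lattice-based implementation avoids the redundancy.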
3.5. Sum-of-Products Polynomials
The RLCT for sum-of-products (SOP) polynomials can be computed via specific sequences of blow-ups, reducing the analysis to combinatorial invariants and explicit algorithms, particularly for binomials and in low dimension (Hirose, 2023). For SOPs that do not reduce exactly to binomials, a simplex upper bound can be established by linear programming.
4. RLCT in Singular Bayesian Models
In non-regular or singular models, such as mixture models, reduced-rank regression, tensor decompositions, or neural networks, the RLCT determines the leading term of the learning-theoretic generalization error:

$$\mathbb{E}[G_n] = \frac{\lambda}{n} + o\!\left(\frac{1}{n}\right),$$

where $\lambda$ reflects the geometric nature and singularities of the model. Explicit upper bounds for the RLCT in models such as CP-tensor decompositions and mixtures are available by decomposing the Kullback–Leibler function near the realization locus and applying blow-up and resolution strategies (Yoshida et al., 2023, Kurumadani, 2024). For tensor decompositions, the upper bound is expressed in terms of the true rank and quantities determined by the parameter configuration (Yoshida et al., 2023).
For general singular models, the trivial bound $\lambda \le d/2$ (with $d$ the parameter dimension) is often not tight, and refinement via explicit local calculations at nonsingular points can yield sharper bounds (Kurumadani, 2024). For mixture models, such as the mixed binomial model, the RLCT at generic non-singular points has a closed form that depends on the number of true components (Kurumadani, 2024).
5. Statistical and Computational Applications
The RLCT has immediate implications for Bayesian model selection, the estimation of generalization bounds, and the design of learning algorithms. When RLCTs are known for the candidate models, sBIC can be computed with $O_p(1)$ error. A variance-based single-temperature estimator of the RLCT based on thermodynamic integration allows for practical estimation, facilitating model scoring even in the absence of closed-form RLCTs (Imai, 2019). This estimator is consistent and asymptotically normal, and its implementation requires only parallel simulations and modest MCMC diagnostics.
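A variance-based estimate of this kind can be sketched as follows. The exact estimator is not spelled out here, so the code assumes the form $\hat\lambda = \beta^2 \operatorname{Var}_\beta[n L_n(w)]$, with $n L_n(w)$ the negative log-likelihood and the variance taken under the posterior tempered at inverse temperature $\beta = 1/\log n$. To keep the example free of MCMC, it is checked on a hypothetical regular toy model (a one-dimensional Gaussian mean with flat prior), where the tempered posterior is available exactly and the expected answer is $d/2 = 1/2$.

```python
import math
import random


def rlct_variance_estimate(nll_samples, beta):
    """beta^2 times the sample variance of n*L_n(w) under the tempered posterior."""
    count = len(nll_samples)
    mean = sum(nll_samples) / count
    var = sum((x - mean) ** 2 for x in nll_samples) / (count - 1)
    return beta ** 2 * var


def toy_regular_samples(n, num_draws, seed=0):
    """Exact tempered-posterior draws of n*L_n(w) for a 1-D Gaussian mean model.

    With unit variance and a flat prior, w | data ~ N(xbar, 1/(beta*n)) and
    n*L_n(w) = const + n*(w - xbar)^2 / 2. The constant is dropped because
    it does not affect the variance.
    """
    rng = random.Random(seed)
    beta = 1.0 / math.log(n)
    sd = 1.0 / math.sqrt(beta * n)
    draws = [0.5 * n * rng.gauss(0.0, sd) ** 2 for _ in range(num_draws)]
    return draws, beta


samples, beta = toy_regular_samples(n=1000, num_draws=200000)
print(rlct_variance_estimate(samples, beta))
```

In a real application the draws of $n L_n(w)$ would come from an MCMC run at the single temperature $\beta$, and the quality of the estimate then depends on chain mixing.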
In high-dimensional analysis, the RLCT and its multiplicity govern not only statistical rates but also the volume growth of sublevel sets $\{w : |f(w)| \le \epsilon\}$, with asymptotics $V(\epsilon) \sim C\, \epsilon^{\lambda} (-\log \epsilon)^{m-1}$ as $\epsilon \to 0$ (Kosta et al., 2024). Algebraic geometry further connects the RLCT to birational invariants, singularity classification, and the minimal model program.
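As a worked instance of this volume growth, consider the hypothetical choice $f(w) = w_1 w_2$ on the unit square $[0,1]^2$ with uniform weight. Writing $V(\epsilon)$ for the volume of the sublevel set, for $0 < \epsilon < 1$ a direct integration gives

$$V(\epsilon) = \int_{[0,1]^2} \mathbf{1}[\, w_1 w_2 \le \epsilon \,]\, dw = \epsilon + \int_{\epsilon}^{1} \frac{\epsilon}{w_1}\, dw_1 = \epsilon\,(1 - \log \epsilon).$$

The leading behavior as $\epsilon \to 0$ is $\epsilon^{1} (-\log \epsilon)^{1}$, which corresponds to RLCT $1$ with multiplicity $2$. This agrees with the normal-crossing count: $f$ is already a monomial with exponents $(1, 1)$ and trivial Jacobian, so both coordinates attain the ratio $(0 + 1)/1 = 1$.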
6. Limitations and Current Research Directions
While RLCTs admit effective computation for large classes of models (binomials, monomials, hyperplane arrangements, certain polynomial classes, and resolved plane curves), general explicit calculation remains elusive for models with more intricate singularity structure, especially at singular points of the realization set. Methods relying on blow-up algorithms and piecewise-linear analysis can be infeasible in high dimensions or for functions not amenable to decomposition (Hirose, 2023, Kurumadani, 2024). The extension of RLCT calculation to singular points in parameter space and the determination of multiplicities for arbitrary models remain open problems. Further research is directed at improving computational algorithms, refining upper bounds, and developing new invariants that capture the geometry of more general singular statistical models.
7. Illustrative Examples
Examples concretizing RLCT computation include:
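- Hyperplane arrangements: For three lines in the plane with prescribed exponents, the RLCT and its multiplicity follow from the closed-form lattice formulas (Kosta et al., 2024).
- Plane curve cusp: The cusp $y^2 = x^3$ yields RLCT $5/6$ (Galindo et al., 2012).
- Mixed binomial model: For a model fitted with more components than the true number, the RLCT at a generic point is given in closed form in terms of the number of true components (Kurumadani, 2024).
- Tensor decomposition: For a given true rank, the RLCT is upper-bounded by a function of the true rank and the parameter configuration, as described above (Yoshida et al., 2023).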
These concrete instances demonstrate both the diversity of models where RLCT is tractable and the centrality of combinatorial-geometric resolution techniques in its determination.