
CosmicFlows-3 Multi-Attractor Flow Model

Updated 30 December 2025
  • The paper introduces a maximum-likelihood orthogonal regression framework using empirical Bayes and variational inference to yield unbiased estimates in the presence of heteroscedastic noise.
  • It details a convex-analytic solution with closed-form iterations for 2D orthogonal line fitting that ensures global optimality and computational efficiency.
  • The application to CosmicFlows-3 leverages GPU acceleration and normalizing flows to robustly model complex astronomical datasets with varying measurement uncertainties.

Maximum-likelihood Orthogonal Regression (BayesLineFit) refers to a family of statistically rigorous approaches for fitting linear relationships to data with measurement uncertainty in both variables, under the maximum-likelihood principle. The most recent developments synthesize the classical errors-in-variables perspective with modern Bayesian nonparametric modeling, notably using normalizing flows. The “BayesLineFit” framework encompasses both (1) a general maximum-likelihood orthogonal regression model leveraging empirical Bayes via variational inference and (2) a convex-analytic solution to the Gaussian linear errors-in-variables problem, with efficient algorithms and closed-form reduction for the 2-D orthogonal line case. These methods yield unbiased orthogonal regression estimates, explicitly accounting for heteroscedastic measurement noise, nontrivial population distributions, and latent variable dependencies.

1. Model Formulation and Likelihood Principle

Maximum-likelihood orthogonal regression assumes each observed point

$$(x_{\mathrm{obs},i},\ \sigma_{x,i},\ y_{\mathrm{obs},i},\ \sigma_{y,i})$$

is a noisy observation of a latent point $(x_i, y_i)$. The noise is modeled as

$$x_{\mathrm{obs},i} \sim \mathcal{N}(x_i, \sigma_{x,i}^2), \qquad y_{\mathrm{obs},i} \sim \mathcal{N}(y_i, \sigma_{y,i}^2).$$

In the 2D orthogonal-line case, the regression law is

$$d_i = \frac{y_i - k x_i - b}{\sqrt{1 + k^2}} \sim \mathcal{N}(0, \sigma_{\mathrm{int}}^2),$$

where $k$ is the slope, $b$ is the intercept, and $\sigma_{\mathrm{int}}$ is the intrinsic orthogonal scatter. The joint likelihood for a single datum is

$$P_i \equiv P(D_i \mid \theta, \phi) = \iint dx_i\, dy_i\; \mathcal{N}(x_{\mathrm{obs},i} \mid x_i, \sigma_{x,i}^2)\, \mathcal{N}(y_{\mathrm{obs},i} \mid y_i, \sigma_{y,i}^2) \left[\sqrt{1+k^2}\,\mathcal{N}\!\left(\frac{y_i - k x_i - b}{\sqrt{1+k^2}} \,\middle|\, 0, \sigma_{\mathrm{int}}^2\right)\right] P_{\phi_x}(x_i, \sigma_{x,i})\, P_{\phi_y}(y_i, \sigma_{y,i}),$$

where $P_{\phi_x}$ and $P_{\phi_y}$ characterize the "population" distributions, modeled nonparametrically, of the independent and dependent variables and their uncertainties. The global likelihood is $L(\theta, \phi) = \prod_{i=1}^{N} P_i$, where $\theta = (k, b, \sigma_{\mathrm{int}})$ are the regression parameters and $\phi$ are the nuisance (population) parameters (Jing et al., 13 Nov 2024).
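
As a concrete illustration, a minimal numerical sketch of this per-datum likelihood is shown below, using tensor-product Gauss–Hermite quadrature for the double integral (the quadrature scheme the paper adopts in 1D). The population densities pop_x and pop_y are hypothetical flat placeholders standing in for the learned normalizing-flow densities, and all names are illustrative rather than taken from the paper's code.

```python
import numpy as np
from numpy.polynomial.hermite import hermgauss

def norm_pdf(v, mu, sigma):
    """Gaussian density N(v | mu, sigma^2)."""
    return np.exp(-0.5 * ((v - mu) / sigma) ** 2) / (sigma * np.sqrt(2.0 * np.pi))

def per_datum_likelihood(x_obs, sx, y_obs, sy, k, b, s_int,
                         pop_x=lambda x: 1.0, pop_y=lambda y: 1.0,
                         n_nodes=40):
    """P_i via tensor-product Gauss-Hermite quadrature over the latent (x_i, y_i).

    Substituting x = x_obs + sqrt(2)*sx*t turns the Gaussian noise kernel into
    the e^{-t^2} weight of Gauss-Hermite quadrature (and likewise for y).
    """
    t, w = hermgauss(n_nodes)                  # nodes/weights for ∫ e^{-t^2} f(t) dt
    xs = x_obs + np.sqrt(2.0) * sx * t         # latent-x quadrature nodes
    ys = y_obs + np.sqrt(2.0) * sy * t         # latent-y quadrature nodes
    X, Y = np.meshgrid(xs, ys, indexing="ij")
    W = np.outer(w, w) / np.pi                 # combined weights (two 1/sqrt(pi) factors)
    d = (Y - k * X - b) / np.sqrt(1.0 + k**2)  # orthogonal distance to the line
    line_term = np.sqrt(1.0 + k**2) * norm_pdf(d, 0.0, s_int)
    return np.sum(W * line_term * pop_x(X) * pop_y(Y))
```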

2. Empirical Bayes, Variational Inference, and Normalizing Flows

Because the true population distributions $P(x)$ and $P(\sigma_x \mid x)$ (and their analogs for $y$) are typically unknown in practice, an empirical Bayes methodology is employed. Both are modeled with normalizing flows: invertible bijective maps $f_{\phi}: (x, \sigma_x) \to z \in \mathbb{R}^2$ such that $z$ follows a tractable base distribution, usually a standard Gaussian, so that

$$P_{\phi}(x, \sigma_x) = P_z\big(f_{\phi}(x, \sigma_x)\big) \left|\det \frac{\partial f_{\phi}}{\partial (x, \sigma_x)}\right|.$$

Empirical Bayes estimation fits these flows to maximize the marginal likelihood of the observed $(x_{\mathrm{obs},i}, \sigma_{x,i})$ data, which involves integrating over the latent true $x$ using Gauss–Hermite quadrature in 1D. An analogous process applies independently to $y_i$ (Jing et al., 13 Nov 2024).
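
The change-of-variables identity is easy to exercise with a toy bijection. The sketch below uses a simple affine map in place of the spline-coupling flows actually used, since its Jacobian determinant is trivial; it illustrates the density formula, not pzflow itself.

```python
import numpy as np

def toy_flow_log_density(x, sigma_x, A=None, b=None):
    """log P_phi(x, sigma_x) for a toy affine bijection z = A u + b,
    u = (x, sigma_x), with a standard 2D Gaussian base density.
    A stand-in for the RollingSplineCoupling flows used in practice."""
    A = np.eye(2) if A is None else A
    b = np.zeros(2) if b is None else b
    u = np.array([x, sigma_x])
    z = A @ u + b
    log_base = -0.5 * (z @ z) - np.log(2.0 * np.pi)  # log N(z | 0, I_2)
    log_det = np.log(abs(np.linalg.det(A)))          # log |det dz/du| for an affine map
    return log_base + log_det
```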

After learning $\phi_x$ and $\phi_y$, the regression parameters $\theta$ are optimized by maximizing the full composite likelihood, leveraging automatic differentiation. The entire process is GPU-accelerated, and the normalizing flows are implemented with the pzflow library using RollingSplineCoupling layers (Jing et al., 13 Nov 2024).
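
A sketch of this autodiff step in JAX follows. To stay short it uses the common "effective variance" simplification, folding the measurement variances into the orthogonal scatter instead of integrating over latent points and population terms, and plain gradient descent in place of Adam; it illustrates the machinery, not the paper's full composite likelihood.

```python
import jax
import jax.numpy as jnp

def neg_log_likelihood(theta, x, y, sx, sy):
    """Simplified orthogonal-regression objective in (k, b, log sigma_int).

    Effective-variance approximation: the orthogonal distance of an observed
    point has variance sigma_int^2 + (sy^2 + k^2 sx^2) / (1 + k^2).
    """
    k, b, log_s = theta
    var = jnp.exp(2.0 * log_s) + (sy**2 + k**2 * sx**2) / (1.0 + k**2)
    d = (y - k * x - b) / jnp.sqrt(1.0 + k**2)
    return jnp.sum(0.5 * d**2 / var + 0.5 * jnp.log(var))

grad_fn = jax.jit(jax.grad(neg_log_likelihood))

def fit_theta(theta0, data, lr=3e-3, steps=5000):
    theta = theta0
    for _ in range(steps):
        theta = theta - lr * grad_fn(theta, *data)  # gradient descent stand-in for Adam
    return theta
```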

3. Convex-analytic and Algorithmic Solution: GRV-ML and BayesLineFit

The "GRV-ML" (generalized RV-ML) algorithm solves the Gaussian errors-in-variables MLE for general linear regression, admitting both overdetermined and underdetermined cases (Guo et al., 15 Jul 2025). Given observed yRMy \in \mathbb{R}^M,

y=(H+E)x+ϵy = (H + E)x + \epsilon

where HH is a known matrix, EE is random Gaussian i.i.d. matrix uncertainty, and ϵ\epsilon is observation noise. Marginalizing over EE yields

yN(Hx, (σe2x22+σϵ2)IM)y \sim \mathcal{N}(Hx,~(\sigma_e^2 \|x\|_2^2 + \sigma_\epsilon^2) I_M)

The negative log-likelihood is

$$f(y; x) = \frac{\|y - Hx\|_2^2}{2(\sigma_e^2 \|x\|_2^2 + \sigma_\epsilon^2)} + \frac{M}{2} \ln\big(\sigma_e^2 \|x\|_2^2 + \sigma_\epsilon^2\big).$$

This objective is generally nonconvex in $x$, but "lifting" to suitable slack variables yields a convex program. This admits strong duality, explicit KKT conditions, and a bisection/root-finding algorithm for the global solution. In the case of 2D orthogonal regression (i.e., classical line fitting with noisy $x_i$ and $y_i$), the solution reduces to a closed-form iteration:

$$m(\nu) = \frac{\sum_i x_i y_i / \big[1 + 2\nu \sigma_e^2 x_i^2\big]}{\sum_i x_i^2 / \big[1 + 2\nu \sigma_e^2 x_i^2\big]},$$

$$c(\nu) = \frac{\sum_i \big(y_i - m(\nu) x_i\big) / \big[1 + 2\nu \sigma_e^2 x_i^2\big]}{\sum_i 1 / \big[1 + 2\nu \sigma_e^2 x_i^2\big]},$$

where the Lagrange multiplier $\nu$ is found by scalar bisection. This specific formulation is conventionally called BayesLineFit (Guo et al., 15 Jul 2025).
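
A minimal sketch of this 2D reduction follows. The KKT feasibility condition that the bisection targets is given in the paper but not reproduced in this summary, so the residual argument below is a hypothetical caller-supplied hook; only the $O(M)$ closed-form evaluation per $\nu$ comes from the formulas above.

```python
import numpy as np

def line_params_at_nu(nu, x, y, sigma_e):
    """Closed-form slope m(nu) and intercept c(nu) at a fixed multiplier nu;
    O(M) work per call."""
    w = 1.0 / (1.0 + 2.0 * nu * sigma_e**2 * x**2)  # per-point weights
    m = np.sum(w * x * y) / np.sum(w * x**2)
    c = np.sum(w * (y - m * x)) / np.sum(w)
    return m, c

def fit_line_bisection(x, y, sigma_e, residual, lo=0.0, hi=1e6, iters=20):
    """Scalar bisection on nu.  `residual(nu, m, c)` is a hypothetical
    stand-in for the paper's KKT condition; it must change sign on [lo, hi]."""
    for _ in range(iters):                 # typically 10-20 iterations suffice
        nu = 0.5 * (lo + hi)
        m, c = line_params_at_nu(nu, x, y, sigma_e)
        if residual(nu, m, c) > 0.0:
            hi = nu
        else:
            lo = nu
    return line_params_at_nu(0.5 * (lo + hi), x, y, sigma_e)
```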

4. Implementation Details and Computational Scheme

Key implementation practices include:

  • Pretraining normalizing flows on "shrunk" data for stability.
  • Training flows via a full-batch Adam optimizer (learning rate $3 \times 10^{-3}$, early stopping with a patience of 10 epochs); a minimal sketch of this loop appears after this list.
  • Marginal likelihood evaluations via 1D Gauss–Hermite quadrature (approx. 40 nodes per integral).
  • For $\theta$ optimization, automatically differentiating the log-likelihood with respect to $(k, b, \sigma_{\mathrm{int}})$.
  • GPU acceleration for flow training, quadrature, and posterior sampling.
  • For the analytic GRV-ML form, the main computational steps are SVD factorization, sign-invariant transformations, and, in the 2D case, $O(M)$ per-iteration cost for the $\nu$-bisection. Convergence is typically achieved within 10–20 iterations (Jing et al., 13 Nov 2024, Guo et al., 15 Jul 2025).
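
The patience-based early stopping mentioned above amounts to the usual loop; a minimal sketch, with train_epoch and val_loss as hypothetical hooks standing in for the flow-training internals:

```python
def train_with_early_stopping(train_epoch, val_loss, max_epochs=1000, patience=10):
    """Full-batch training with patience-based early stopping.

    train_epoch() performs one Adam step (learning rate 3e-3 in the paper)
    and val_loss() returns the current validation loss; both are
    hypothetical hooks, not part of any published API.
    """
    best, since_best = float("inf"), 0
    for _ in range(max_epochs):
        train_epoch()
        loss = val_loss()
        if loss < best:
            best, since_best = loss, 0      # improvement: reset the patience counter
        else:
            since_best += 1
            if since_best >= patience:      # stop after `patience` flat epochs
                break
    return best
```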

5. Performance Characteristics and Regimes of Validity

BayesLineFit exhibits unbiasedness and self-consistency, correctly accounting for heteroscedastic errors on both axes, arbitrary intrinsic distributions, and latent error correlations. Tests show that with sample sizes exceeding 1000, both the ML-based and KS-test-based methods substantially outperform previous approaches, especially at low signal-to-noise. For moderate samples ($300 < N < 1000$), the ML-based method yields the best performance. In the low-sample regime ($N < 300$), competitive results are maintained versus state-of-the-art alternatives (Jing et al., 13 Nov 2024).

The analytic GRV-ML admits a complete KKT characterization distinguishing four regimes, depending on the rank of $H$ and the signs of certain feasibility scalars; in underdetermined cases it allows for non-uniqueness, or benefits from the randomness of $H$, which supplies additional norm information on $x$ (Guo et al., 15 Jul 2025). The Cramér–Rao bound under the marginalized model $y \sim \mathcal{N}(Hx, (\sigma_e^2 \|x\|^2 + \sigma_\epsilon^2) I)$ is explicit, and in the limit $\sigma_e \to 0$ it reduces to the classical least-squares CRB (Guo et al., 15 Jul 2025).
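
For concreteness, the Fisher information implied by this marginalized model takes the standard Gaussian form (a derivation sketch consistent with the stated model, not an expression quoted from the paper):

$$\mathcal{I}(x) = \frac{H^\top H}{\sigma_e^2 \|x\|_2^2 + \sigma_\epsilon^2} + \frac{2 M \sigma_e^4\, x x^\top}{\big(\sigma_e^2 \|x\|_2^2 + \sigma_\epsilon^2\big)^2},$$

whose inverse is the CRB; setting $\sigma_e = 0$ recovers the least-squares bound $\sigma_\epsilon^2 (H^\top H)^{-1}$.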

6. Applications, Extensions, and Comparative Context

BayesLineFit is specifically developed for regression on astronomical data with realistic, non-Gaussian distributions and varying uncertainties. Its performance on both mock and real datasets from PHANGS-ALMA and PHANGS-JWST demonstrates reliability under challenging, signal-poor and structurally complex regimes (Jing et al., 13 Nov 2024). The normalizing flow approach allows the method to adapt to arbitrary intrinsic population and error structures beyond any fixed parametric form.

The GPU-based Python code "raddest" implementing these methods is slated for public release, ensuring reproducibility and extensibility. In contrast to expectation-maximization approaches and previous RV-ML/ODR solvers, the convexity and lifting arguments underlying GRV-ML/BayesLineFit yield global guarantees and superior algorithmic efficiency, as well as robust handling of both underdetermined and overdetermined problems (Guo et al., 15 Jul 2025).

7. Relation to Broader Regression Paradigms

Maximum-likelihood orthogonal regression via BayesLineFit provides a unified, statistically principled solution to the errors-in-variables linear fitting problem, augmenting classical total least squares, ODR, and Bayesian hierarchical regression by:

  • Explicitly integrating both measurement and intrinsic scatter
  • Learning and modeling population structure nonparametrically via normalizing flows
  • Accommodating arbitrary error correlations and non-Gaussian data
  • Delivering provably convex optimization and analytic KKT-based solution structure in the linear Gaussian case (Jing et al., 13 Nov 2024, Guo et al., 15 Jul 2025)

A plausible implication is that this framework generalizes to multivariate and nonlinear settings, provided the computational cost of the empirical Bayes and normalizing flow components remains tractable. The method thus advances the state-of-the-art in orthogonal regression for complex, noise-dominated scientific data sets.
