Formal proof of convexity for the level cost function in the Lagrange-multiplier bit-width optimization

Prove that, in the per-level bit-width optimization for the nested multilevel Monte Carlo (MLMC) framework described in Section 6.1, the level objective function f_ell(λ) = sqrt(V_ell * C̃_ell(d(λ))) + sqrt(V^{Δ}_ell(d(λ)) * C_ell) is convex with respect to the Lagrange multiplier λ, where for each λ the vector of real-valued bit-widths d(λ) = (d_{1,ell}, ..., d_{m_ell,ell}) solves the stationarity conditions ∂V^{Δ}_ell/∂d_{i,ell} + λ * ∂C̃_ell/∂d_{i,ell} = 0 for all i. Here C̃_ell(d) is the fixed-point cost model defined by the per-variable quadratic form, V^{Δ}_ell(d) is the variance bound used for the correction term, and V_ell and C_ell are the full-precision variance and cost for level ell. Establishing convexity would rigorously justify the observed single optimum in the λ-based optimization procedure.

Background

The paper proposes a per-level bit-width optimization using a Lagrange multiplier to trade off the correction-term variance and the low-precision computation cost on FPGAs. The level objective to be minimized is the sum sqrt(V_ell * C̃_ell(d)) + sqrt(V^{Δ}_ell(d) * C_ell), where C̃_ell(d) is a quadratic cost model in the bit-widths and V^{Δ}_ell(d) is an analytical upper bound on the variance due to rounding and approximate random numbers. For a given λ, the optimal bit-widths d(λ) are obtained by solving uncoupled nonlinear equations ∂V^{Δ}_ell/∂d_{i,ell} + λ * ∂C̃_ell/∂d_{i,ell} = 0.
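As a possible starting point for a proof, the stationarity conditions can be used to simplify the derivative of the level objective along the path d(λ). The sketch below assumes, beyond what the paper states, that C̃_ell and V^{Δ}_ell are differentiable, that d(λ) is a differentiable function of λ, and that both C̃_ell and V^{Δ}_ell are strictly positive along the path; \ell stands for the subscript written as ell in the text.

```latex
\begin{align*}
f_\ell(\lambda)
  &= \sqrt{V_\ell\,\tilde{C}_\ell(d(\lambda))}
   + \sqrt{V^{\Delta}_\ell(d(\lambda))\,C_\ell},\\
\frac{\mathrm{d}V^{\Delta}_\ell}{\mathrm{d}\lambda}
  &= \sum_{i=1}^{m_\ell}
     \frac{\partial V^{\Delta}_\ell}{\partial d_{i,\ell}}\,
     \frac{\mathrm{d}d_{i,\ell}}{\mathrm{d}\lambda}
   = -\lambda \sum_{i=1}^{m_\ell}
     \frac{\partial \tilde{C}_\ell}{\partial d_{i,\ell}}\,
     \frac{\mathrm{d}d_{i,\ell}}{\mathrm{d}\lambda}
   = -\lambda\,\frac{\mathrm{d}\tilde{C}_\ell}{\mathrm{d}\lambda},\\
f_\ell'(\lambda)
  &= \frac{1}{2}\,\frac{\mathrm{d}\tilde{C}_\ell}{\mathrm{d}\lambda}
     \left(
       \sqrt{\frac{V_\ell}{\tilde{C}_\ell}}
       - \lambda\,\sqrt{\frac{C_\ell}{V^{\Delta}_\ell}}
     \right).
\end{align*}
```

Under these assumptions, any interior stationary point of f_ell with dC̃_ell/dλ ≠ 0 satisfies λ = sqrt(V_ell * V^{Δ}_ell / (C_ell * C̃_ell)). This only characterizes candidate optima; convexity additionally requires the sign of the second derivative, which depends on the specific forms of C̃_ell and V^{Δ}_ell and is not settled by this identity.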

The authors empirically plot the level objective as a function of λ and observe a convex shape, indicating a unique optimum. However, they explicitly state that they do not provide a formal proof of convexity. A rigorous proof would validate the optimization strategy and the use of golden-section search to locate the optimum λ.
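The procedure described above can be illustrated numerically with a minimal Python sketch. The models here are hypothetical stand-ins, not the paper's: the cost is assumed to be sum_i (a_i d_i^2 + b_i d_i), the variance bound is assumed to be sum_i k_i 2^(-2 d_i), and all constants, function names, and the λ bracket are invented for illustration. For each λ the uncoupled stationarity equations are solved by bisection, and a hand-written golden-section search then locates a minimizer of the level objective.

```python
import math

# Hypothetical toy models, NOT the paper's: a per-variable quadratic cost and a
# rounding-variance bound that decays like 2^(-2 d). All constants are made up.
A = [1.0, 0.5, 2.0]          # quadratic cost coefficients a_i (assumed)
B = [0.5, 1.0, 0.2]          # linear cost coefficients b_i (assumed)
K = [1.0, 4.0, 0.25]         # variance scale factors k_i (assumed)
V_FULL, C_FULL = 1.0, 50.0   # stand-ins for V_ell and C_ell (assumed)


def cost(d):
    """Toy cost model: sum_i a_i d_i^2 + b_i d_i."""
    return sum(a * x * x + b * x for a, b, x in zip(A, B, d))


def var_delta(d):
    """Toy variance bound: sum_i k_i 2^(-2 d_i)."""
    return sum(k * 2.0 ** (-2.0 * x) for k, x in zip(K, d))


def solve_bitwidth(lam, a, b, k, lo=0.0, hi=64.0):
    """Solve dV/dd_i + lam * dC/dd_i = 0 for one variable by bisection."""
    def g(x):
        # Stationarity residual; strictly increasing in x for lam > 0.
        return -2.0 * math.log(2.0) * k * 2.0 ** (-2.0 * x) + lam * (2.0 * a * x + b)

    if g(lo) >= 0.0:
        return lo  # no positive root: clamp at zero bits
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if g(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def bitwidths(lam):
    """Real-valued bit-widths d(lam), one uncoupled solve per variable."""
    return [solve_bitwidth(lam, a, b, k) for a, b, k in zip(A, B, K)]


def level_objective(lam):
    """f(lam) = sqrt(V * C~(d(lam))) + sqrt(V^Delta(d(lam)) * C)."""
    d = bitwidths(lam)
    return math.sqrt(V_FULL * cost(d)) + math.sqrt(var_delta(d) * C_FULL)


def golden_section(f, lo, hi, tol=1e-8):
    """Golden-section search; finds the minimizer only if f is unimodal on [lo, hi]."""
    invphi = (math.sqrt(5.0) - 1.0) / 2.0
    c = hi - invphi * (hi - lo)
    d = lo + invphi * (hi - lo)
    fc, fd = f(c), f(d)
    while hi - lo > tol:
        if fc < fd:
            hi, d, fd = d, c, fc
            c = hi - invphi * (hi - lo)
            fc = f(c)
        else:
            lo, c, fc = c, d, fd
            d = lo + invphi * (hi - lo)
            fd = f(d)
    return 0.5 * (lo + hi)


LAM_LO, LAM_HI = 1e-4, 1.0   # search bracket for lambda (assumed)
lam_star = golden_section(level_objective, LAM_LO, LAM_HI)
print("lambda* ~", lam_star)
print("f(lambda*) ~", level_objective(lam_star))
print("d(lambda*) ~", bitwidths(lam_star))
```

Golden-section search is only guaranteed to find the global minimizer when the objective is unimodal on the bracket, which is exactly the property a convexity proof would supply. The sketch does not establish that property, even for the toy model.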

References

Although we do not prove it formally we can see that the resulting function is convex which ensures the existence of an optimum.

A nested MLMC framework for efficient simulations on FPGAs (2502.07123 - Haas et al., 10 Feb 2025) in Subsection 6.1 (Bit-width optimisation using a Lagrange multiplier), following Figure 3 (fig:cost_lambda)