
Bounded Shape-Constrained Lasso Estimator

Updated 19 November 2025
  • The paper introduces a penalized regression estimator that integrates sparsity and shape constraints with explicit range (box) and sum restrictions.
  • It employs fused-lasso and nearly-isotonic penalties to promote blockwise structure and monotonicity, and derives the estimator's asymptotic distribution under DAG-induced partial orders.
  • The estimator extends to discrete distribution estimation and is computed efficiently using quadratic programming and ADMM strategies.

A bounded shape-constrained lasso-type estimator is a penalized regression or signal recovery procedure that enforces both sparsity and shape constraints—such as monotonicity, unimodality, or other graph-induced order restrictions—while subjecting the estimates to explicit range (box) and total-sum constraints. This methodology is particularly relevant for estimation over discrete structures (e.g., signals indexed by vertices of a directed acyclic graph, or probability mass functions) where one wishes to combine the adaptivity and regularization of lasso-type methods with global structural properties. The theoretical formulation covers general fused and nearly-isotonic penalties, with detailed asymptotic characterization, numerical strategies, and key applications in nonparametric discrete distribution estimation and constrained function inference.

1. Statistical Model and Problem Formulation

Let the underlying signal be $\beta^{\circ} = (\beta^{\circ}_1, \dots, \beta^{\circ}_s) \in \mathbb{R}^s$, indexed by a finite set $V$. A partial order $\preceq$ on $V$ is encoded by a directed acyclic graph (DAG) $G = (V, E)$, with oriented incidence matrix $D \in \mathbb{R}^{m \times s}$ for $m = |E|$. The base estimator $\hat\beta_n$ is assumed to satisfy

$$n^q (\hat\beta_n - \beta^{\circ}) \overset{d}{\longrightarrow} \psi, \qquad q > 0,$$

for some random vector $\psi \in \mathbb{R}^s$ with nondegenerate law. The target is a refined correction of $\hat\beta_n$ under shape constraints and penalties.

The constrained estimator is defined as

$$\hat\beta_n^* = \operatorname*{arg\,min}_{\beta \in \mathbb{R}^s} \left\{ \tfrac{1}{2} \|\hat\beta_n - \beta\|_2^2 + \lambda_n^F \|D\beta\|_1 + \lambda_n^{NI} \|D\beta\|_{+} \right\}$$

subject to

$$\sum_{i=1}^s \beta_i = \sum_{i=1}^s \hat\beta_{n,i}, \qquad \min_i \hat\beta_{n,i} \leq \beta_j \leq \max_i \hat\beta_{n,i} \quad \text{for all } j.$$

Here, $\|D\beta\|_1$ is the fused-lasso (total variation) penalty and $\|D\beta\|_{+}$ is the nearly-isotonic penalty, with $x_{+} = \max(x, 0)$ applied coordinatewise; both promote blockwise structure and monotonicity. Notably, the probability-type constraints (sum and range) are redundant: the unconstrained minimization with these penalties alone yields estimators that automatically satisfy the box and sum conditions (Pastukhov, 18 Nov 2025). A minimal numerical sketch of the full estimator appears below.
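To make the formulation concrete, the following sketch solves the problem with cvxpy on a chain DAG, where $D$ reduces to the first-difference matrix. It is an illustration under those assumptions, not the paper's reference implementation; the helper and parameter names (`make_chain_incidence`, `bounded_shape_lasso`, `lam_f`, `lam_ni`) are ad hoc.

```python
# Minimal sketch of the estimator (not the paper's reference implementation).
# Assumptions: cvxpy as solver; chain DAG, so D is the first-difference matrix.
import numpy as np
import cvxpy as cp

def make_chain_incidence(s):
    """Oriented incidence matrix D of the chain 1 -> 2 -> ... -> s,
    so (D @ beta)[i] = beta[i] - beta[i+1]."""
    D = np.zeros((s - 1, s))
    for i in range(s - 1):
        D[i, i], D[i, i + 1] = 1.0, -1.0
    return D

def bounded_shape_lasso(beta_hat, D, lam_f, lam_ni):
    """Solve (1/2)||beta_hat - beta||^2 + lam_f*||D beta||_1
    + lam_ni*sum((D beta)_+), subject to the sum and box constraints."""
    beta = cp.Variable(beta_hat.size)
    obj = (0.5 * cp.sum_squares(beta_hat - beta)
           + lam_f * cp.norm1(D @ beta)
           + lam_ni * cp.sum(cp.pos(D @ beta)))
    cons = [cp.sum(beta) == beta_hat.sum(),   # total-sum constraint
            beta >= beta_hat.min(),           # box (range) constraints
            beta <= beta_hat.max()]
    cp.Problem(cp.Minimize(obj), cons).solve()
    return beta.value

rng = np.random.default_rng(0)
s = 30
beta_true = np.sort(rng.normal(size=s))           # isotonic ground truth
beta_hat = beta_true + 0.3 * rng.normal(size=s)   # noisy base estimate
D = make_chain_incidence(s)
beta_star = bounded_shape_lasso(beta_hat, D, lam_f=0.1, lam_ni=0.5)
```

The objective is strictly convex, so the minimizer is unique; by the redundancy result above, dropping `cons` should return the same solution, which makes a useful sanity check.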

2. Penalty Structure and Tuning Parameter Asymptotics

The estimator incorporates two types of L1 penalties:

  • Fused-lasso penalty: $\|D\beta\|_1 = \sum_{(i,j) \in E} |\beta_i - \beta_j|$, encouraging blockwise-constant fits and edge sparsity.
  • Nearly-isotonic penalty: $\|D\beta\|_{+} = \sum_{(i,j)\in E} (\beta_i - \beta_j)_{+}$, which promotes monotonicity with respect to the partial order (a direct numerical evaluation follows this list).
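As a quick illustrative check (the helper names `fused_penalty` and `nearly_isotonic_penalty` are ad hoc, not from the paper), both penalties can be evaluated directly from the incidence matrix:

```python
import numpy as np

def fused_penalty(D, beta):
    """||D beta||_1 = sum over edges (i,j) of |beta_i - beta_j|."""
    return np.abs(D @ beta).sum()

def nearly_isotonic_penalty(D, beta):
    """||D beta||_+ = sum over edges (i,j) of max(beta_i - beta_j, 0);
    zero exactly when beta_i <= beta_j along every edge (i, j)."""
    return np.maximum(D @ beta, 0.0).sum()

D = np.array([[1.0, -1.0, 0.0],   # edge 1 -> 2
              [0.0, 1.0, -1.0]])  # edge 2 -> 3
beta = np.array([0.5, 0.2, 0.9])
print(fused_penalty(D, beta))            # |0.3| + |-0.7| = 1.0
print(nearly_isotonic_penalty(D, beta))  # 0.3 + 0.0     = 0.3
```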

Tuning parameters are scaled as:

$$\lambda_n^F / n^q \to \lambda_0^F \in [0, \infty), \qquad \lambda_n^{NI} / n^q \to \lambda_0^{NI} \in [0, \infty)$$

to ensure that neither penalty dominates the stochastic fluctuations at the $n^{-q}$ rate.
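As a worked instance of this calibration (numbers purely illustrative, not from the paper): with the parametric rate $q = 1/2$ and $n = 10^4$ observations, the scaling gives $\lambda_n^F \approx \lambda_0^F\, n^{q} = 100\,\lambda_0^F$, so the raw penalty weight grows like $\sqrt{n}$ while the normalized ratio $\lambda_n^F / n^q$ stays bounded, keeping the penalty of the same order as the noise.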

3. Asymptotic Distribution and Limiting Law

Under regularity conditions and the scaling above, the asymptotic law is as follows:

$$n^q (\hat\beta_n^* - \beta^{\circ}) \overset{d}{\longrightarrow} \operatorname*{arg\,min}_{w \in \mathbb{R}^s} V(w),$$

where, for noise $\psi$,
\begin{align*}
V(w) ={}& -2\psi^{T} w + \|w\|_2^2 + \lambda^F_0 \sum_{(i,j)\in E} (w_i - w_j)\,\mathrm{sign}(\beta^{\circ}_i - \beta^{\circ}_j)\, 1\{\beta^{\circ}_i \neq \beta^{\circ}_j\} \\
& + \lambda^F_0 \sum_{(i,j)\in E} |w_i - w_j|\, 1\{\beta^{\circ}_i = \beta^{\circ}_j\} \\
& + \lambda^{NI}_0 \sum_{(i,j)\in E} (w_i - w_j)\, 1\{\beta^{\circ}_i > \beta^{\circ}_j\} + \lambda^{NI}_0 \sum_{(i,j)\in E} (w_i - w_j)_{+}\, 1\{\beta^{\circ}_i = \beta^{\circ}_j\}.
\end{align*}
Equivalently, the limit is characterized as the unique minimizer of

$$\|w - \psi\|_2^2 + \lambda_0^F \|D w\|_1 + \lambda_0^{NI} \|D w\|_{+}.$$

This describes the distributional limit as a penalized projection of the base noise $\psi$ onto the space of allowable shapes, preserving the convergence rate $n^{-q}$ of the original estimator (Pastukhov, 18 Nov 2025).
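Because the limit is just this penalized projection applied to $\psi$, its law can be approximated by simulation. A minimal sketch, assuming cvxpy, a chain graph, and Gaussian $\psi$ with identity covariance (all choices illustrative):

```python
# Monte Carlo sketch of the limit law: draw psi and solve the projection
#   min_w ||w - psi||_2^2 + lam0_f*||D w||_1 + lam0_ni*sum((D w)_+).
import numpy as np
import cvxpy as cp

def limit_draw(psi, D, lam0_f, lam0_ni):
    w = cp.Variable(psi.size)
    obj = (cp.sum_squares(w - psi)
           + lam0_f * cp.norm1(D @ w)
           + lam0_ni * cp.sum(cp.pos(D @ w)))
    cp.Problem(cp.Minimize(obj)).solve()
    return w.value

rng = np.random.default_rng(1)
s, n_rep = 10, 200
D = (np.eye(s) - np.eye(s, k=1))[:-1]     # chain incidence matrix
draws = np.array([limit_draw(rng.normal(size=s), D, 0.5, 0.5)
                  for _ in range(n_rep)])
print(draws.std(axis=0))  # empirical spread of each limit coordinate
```

Empirical quantiles of such draws approximate the limiting distribution, which is what underlies the confidence-band construction mentioned in Section 5.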

4. Specialization: Nearly-Isotonic Without Fusion Penalty

If $\lambda_n^F \equiv 0$ and $\lambda_n^{NI} / n^q \to \lambda_0^{NI} > 0$, and $\beta^{\circ}$ is already isotonic with respect to $G$, the asymptotic law decomposes over the maximal blocks $B_1, \ldots, B_K$ on which $\beta^{\circ}$ is constant. The limit is the concatenation of blockwise nearly-isotonic regressions:

$$n^q(\hat\beta_n^* - \beta^{\circ}) \overset{d}{\longrightarrow} \bigl(\hat{w}^{(1)}, \ldots, \hat{w}^{(K)}\bigr),$$

where each $\hat{w}^{(\ell)}$ solves, with $\psi$ restricted to the coordinates in $B_\ell$,

$$\min_{w} \|w - \psi\|_2^2 + \lambda^{NI}_0 \sum_{(i,j)\in E \cap (B_{\ell}\times B_{\ell})} (w_i - w_j)_{+}.$$

This mirrors classical isotonic-regression limit theory and corroborates the known blockwise decoupling (Pastukhov, 18 Nov 2025). A numerical sketch of the blockwise construction follows.
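A minimal sketch of this blockwise construction on a chain graph, assuming cvxpy; the block sizes and the helper name `nearly_isotonic_limit` are illustrative:

```python
# Blockwise-decoupled limit on a chain (lambda^F = 0): solve the
# nearly-isotonic limit problem independently on each constant block.
import numpy as np
import cvxpy as cp

def nearly_isotonic_limit(psi_block, lam0_ni):
    """min_w ||w - psi||_2^2 + lam0_ni * sum_i (w_i - w_{i+1})_+ on one block."""
    w = cp.Variable(psi_block.size)
    obj = (cp.sum_squares(w - psi_block)
           + lam0_ni * cp.sum(cp.pos(w[:-1] - w[1:])))
    cp.Problem(cp.Minimize(obj)).solve()
    return w.value

rng = np.random.default_rng(2)
blocks = [4, 3, 5]                 # sizes of the maximal constant blocks
psi = rng.normal(size=sum(blocks))
pieces, start = [], 0
for b in blocks:
    pieces.append(nearly_isotonic_limit(psi[start:start + b], lam0_ni=0.5))
    start += b
w_limit = np.concatenate(pieces)   # one draw from the concatenated limit
```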

5. Discrete Distribution Estimation on Directed Acyclic Graphs

For the estimation of discrete probability mass functions under order constraints, the methodology applies directly. Letting $\hat{p}_n$ denote the empirical pmf over $V$ and letting $(V, E)$ encode the targeted shape (e.g., unimodality, multivariate monotonicity), the penalized estimator is

$$\hat{p}_n^* = \operatorname*{arg\,min}_{p \in \mathbb{R}^s} \Bigl\{ \tfrac{1}{2}\|\hat{p}_n - p\|_2^2 + \lambda_n^F \|D p\|_1 + \lambda_n^{NI} \|D p\|_{+} \Bigr\} \quad \text{subject to } p \geq 0,\ \sum_i p_i = 1.$$

This estimator enforces the nonnegativity, sum, and range constraints intrinsically, and its limit law matches that of the general signal case. Notably, if $\hat{p}_n$ is the empirical pmf, then $q = 1/2$ and $\psi \sim N(0, \Sigma)$, with $\Sigma = \mathrm{diag}(p^{\circ}) - p^{\circ}(p^{\circ})^{T}$ under i.i.d. sampling, yielding asymptotically valid confidence bands under order restrictions (Pastukhov, 18 Nov 2025).
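A minimal sketch for a monotone (decreasing) pmf on a chain, again assuming cvxpy; the sample size, penalty levels, and orientation convention are illustrative:

```python
# Order-constrained pmf estimation on a chain: fused + nearly-isotonic
# penalties with the intrinsic simplex constraints.
import numpy as np
import cvxpy as cp

rng = np.random.default_rng(3)
s, n = 8, 2000
p_true = np.arange(s, 0, -1.0)
p_true /= p_true.sum()                     # decreasing ground-truth pmf
p_hat = rng.multinomial(n, p_true) / n     # empirical pmf

D = (np.eye(s) - np.eye(s, k=1))[:-1]      # (D p)_i = p_i - p_{i+1}
p = cp.Variable(s)
obj = (0.5 * cp.sum_squares(p_hat - p)
       + 0.05 * cp.norm1(D @ p)            # fused penalty
       + 0.5 * cp.sum(cp.pos(-(D @ p))))   # penalize increases p_{i+1} > p_i
cons = [p >= 0, cp.sum(p) == 1]            # intrinsic simplex constraints
cp.Problem(cp.Minimize(obj), cons).solve()
print(np.round(p.value, 3))                # shape-regularized pmf estimate
```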

6. Key Proof Techniques and Theoretical Structure

Principal ingredients in the distributional analysis are:

  • Redundancy of the box and sum constraints is demonstrated by partitioning $V$ into fused constant regions and leveraging first-order optimality: region means stay within the observed ranges and the total sum is preserved (Theorem 2.1).
  • The limit transition employs epiconvergence: the random processes $V_n(w)$ are shown to converge in distribution and, by convexity and uniqueness of the minimizer, Geyer's epiconvergence theorem yields convergence of the argmins in distribution (Theorem 2.2).
  • For $\lambda^F = 0$ and isotonic $\beta^{\circ}$, the absence of the fusion penalty in the limit across constant blocks leads to independent decoupling on each block.
  • The penalized estimator retains the original estimator's rate $n^{-q}$; shape-constraint regularization at the matched scaling does not degrade statistical efficiency (Pastukhov, 18 Nov 2025).

7. Algorithmic and Computational Approaches

Practical computation of bounded shape-constrained lasso-type estimators follows frameworks developed for the generalized or constrained lasso:

  • Quadratic programming (for small to moderate problem sizes): recasts the $\ell_1$-penalized objective with box and linear constraints as a standard QP by splitting variables into positive and negative parts (Gaines et al., 2016).
  • Alternating direction method of multipliers (ADMM): splits the parameter updates from the constraint projections, efficiently handling large $p$ and arbitrary box and linear equality/inequality constraints (a minimal sketch follows this list).
  • Piecewise-linear solution-path algorithms: compute solutions for all $\lambda$ along a path, which is especially useful when repeated solutions are needed for model selection.
  • The transformation from the generalized lasso to the constrained lasso further enables implementation for arbitrary penalty matrices and DAG structures (Gaines et al., 2016).
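As a concrete instance of the ADMM strategy, here is a generic sketch of the standard splitting $z = D\beta$ for the fused-lasso subproblem; it illustrates the approach, not the solvers of Gaines et al. (2016), and all names are ad hoc:

```python
# Generic ADMM sketch for  min_beta 0.5*||y - beta||_2^2 + lam*||D beta||_1
# via the splitting z = D beta.
import numpy as np

def admm_fused(y, D, lam, rho=1.0, n_iter=500):
    m, s = D.shape
    beta, z, u = y.copy(), np.zeros(m), np.zeros(m)
    A = np.eye(s) + rho * (D.T @ D)   # beta-update system matrix
    for _ in range(n_iter):
        # beta-update: (I + rho D^T D) beta = y + rho D^T (z - u)
        beta = np.linalg.solve(A, y + rho * D.T @ (z - u))
        # z-update: soft-thresholding of D beta + u at level lam / rho
        v = D @ beta + u
        z = np.sign(v) * np.maximum(np.abs(v) - lam / rho, 0.0)
        u += D @ beta - z             # scaled dual ascent
    return beta

rng = np.random.default_rng(4)
s = 50
y = np.repeat([0.0, 2.0, 1.0], [20, 15, 15]) + 0.3 * rng.normal(size=s)
D = (np.eye(s) - np.eye(s, k=1))[:-1]
beta = admm_fused(y, D, lam=1.0)      # approximately blockwise-constant fit
```

Box, sum, and nearly-isotonic terms can be folded into the same scheme with additional splitting variables and cheap projection steps.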

These methods automatically respect the relevant shape, sum, and range constraints under the specified penalty structures, with the first-order optimality system supporting the convergence, rate, and uniqueness guarantees.


References:

  • "Asymptotic Distribution of Bounded Shape Constrained Lasso-Type Estimator for Graph-Structured Signals and Discrete Distributions" (Pastukhov, 18 Nov 2025)
  • "Algorithms for Fitting the Constrained Lasso" (Gaines et al., 2016)