
Bounded Shape-Constrained Lasso Estimator

Updated 19 November 2025
  • The paper introduces a penalized regression estimator that integrates sparsity and shape constraints with explicit range (box) and sum restrictions.
  • It employs fused-lasso and nearly-isotonic penalties to promote blockwise structure and monotonicity, and derives the estimator's asymptotic distribution under DAG-induced partial orders.
  • The estimator extends to discrete distribution estimation and is computed efficiently using quadratic programming and ADMM strategies.

A bounded shape-constrained lasso-type estimator is a penalized regression or signal recovery procedure that enforces both sparsity and shape constraints—such as monotonicity, unimodality, or other graph-induced order restrictions—while subjecting the estimates to explicit range (box) and total-sum constraints. This methodology is particularly relevant for estimation over discrete structures (e.g., signals indexed by vertices of a directed acyclic graph, or probability mass functions) where one wishes to combine the adaptivity and regularization of lasso-type methods with global structural properties. The theoretical formulation covers general fused and nearly-isotonic penalties, with detailed asymptotic characterization, numerical strategies, and key applications in nonparametric discrete distribution estimation and constrained function inference.

1. Statistical Model and Problem Formulation

Let the underlying signal be $\beta^{\circ} = (\beta^{\circ}_1, \dots, \beta^{\circ}_s) \in \mathbb{R}^s$, indexed by a finite set $V$. A partial order $\preceq$ on $V$ is encoded by a directed acyclic graph (DAG) $G = (V, E)$, with oriented incidence matrix $D \in \mathbb{R}^{m \times s}$ for $m = |E|$. The base estimator $\hat\beta_n$ is assumed to satisfy

$$n^q (\hat\beta_n - \beta^{\circ}) \overset{d}{\longrightarrow} \psi, \qquad q > 0,$$

for some random vector $\psi \in \mathbb{R}^s$ with nondegenerate law. The target is a refined correction of $\hat\beta_n$ under shape constraints and penalties.

The constrained estimator is defined as

$$\hat\beta_n^* = \operatorname*{arg\,min}_{\beta \in \mathbb{R}^s} \left\{ \tfrac{1}{2} \|\hat\beta_n - \beta\|_2^2 + \lambda_n^F \|D\beta\|_1 + \lambda_n^{NI} \|D\beta\|_{+} \right\}$$

subject to

$$\sum_{i=1}^s \beta_i = \sum_{i=1}^s \hat\beta_{n,i}, \qquad \min_i \hat\beta_{n,i} \leq \beta_j \leq \max_i \hat\beta_{n,i} \quad \text{for all } j.$$

Here, $\|D\beta\|_1$ is the fused-lasso (total variation) penalty and $\|D\beta\|_{+}$ is the nearly-isotonic penalty, with $x_{+} = \max(x, 0)$ applied coordinatewise; both promote blockwise structure and monotonicity. Notably, the probability-type constraints (sum and range) are redundant: the unconstrained minimization with these penalties alone yields estimators that automatically satisfy the box and sum conditions (Pastukhov, 18 Nov 2025). A minimal numerical sketch of the full estimator appears below.
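To make the formulation concrete, the following sketch solves the problem with cvxpy on a chain DAG, where $D$ reduces to the first-difference matrix. It is an illustration under those assumptions, not the paper's reference implementation; the helper and parameter names (`make_chain_incidence`, `bounded_shape_lasso`, `lam_f`, `lam_ni`) are ad hoc.

```python
# Minimal sketch of the estimator (not the paper's reference implementation).
# Assumptions: cvxpy as solver; chain DAG, so D is the first-difference matrix.
import numpy as np
import cvxpy as cp

def make_chain_incidence(s):
    """Oriented incidence matrix D of the chain 1 -> 2 -> ... -> s,
    so (D @ beta)[i] = beta[i] - beta[i+1]."""
    D = np.zeros((s - 1, s))
    for i in range(s - 1):
        D[i, i], D[i, i + 1] = 1.0, -1.0
    return D

def bounded_shape_lasso(beta_hat, D, lam_f, lam_ni):
    """Solve (1/2)||beta_hat - beta||^2 + lam_f*||D beta||_1
    + lam_ni*sum((D beta)_+), subject to the sum and box constraints."""
    beta = cp.Variable(beta_hat.size)
    obj = (0.5 * cp.sum_squares(beta_hat - beta)
           + lam_f * cp.norm1(D @ beta)
           + lam_ni * cp.sum(cp.pos(D @ beta)))
    cons = [cp.sum(beta) == beta_hat.sum(),   # total-sum constraint
            beta >= beta_hat.min(),           # box (range) constraints
            beta <= beta_hat.max()]
    cp.Problem(cp.Minimize(obj), cons).solve()
    return beta.value

rng = np.random.default_rng(0)
s = 30
beta_true = np.sort(rng.normal(size=s))           # isotonic ground truth
beta_hat = beta_true + 0.3 * rng.normal(size=s)   # noisy base estimate
D = make_chain_incidence(s)
beta_star = bounded_shape_lasso(beta_hat, D, lam_f=0.1, lam_ni=0.5)
```

The objective is strictly convex, so the minimizer is unique; by the redundancy result above, dropping `cons` should return the same solution, which makes a useful sanity check.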

2. Penalty Structure and Tuning Parameter Asymptotics

The estimator incorporates two types of L1 penalties:

  • Fused-lasso penalty: $\|D\beta\|_1 = \sum_{(i,j) \in E} |\beta_i - \beta_j|$, encouraging blockwise-constant fits and edge sparsity.
  • Nearly-isotonic penalty: $\|D\beta\|_{+} = \sum_{(i,j)\in E} (\beta_i - \beta_j)_{+}$, which promotes monotonicity with respect to the partial order (a direct numerical evaluation follows this list).
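As a quick illustrative check (the helper names `fused_penalty` and `nearly_isotonic_penalty` are ad hoc, not from the paper), both penalties can be evaluated directly from the incidence matrix:

```python
import numpy as np

def fused_penalty(D, beta):
    """||D beta||_1 = sum over edges (i,j) of |beta_i - beta_j|."""
    return np.abs(D @ beta).sum()

def nearly_isotonic_penalty(D, beta):
    """||D beta||_+ = sum over edges (i,j) of max(beta_i - beta_j, 0);
    zero exactly when beta_i <= beta_j along every edge (i, j)."""
    return np.maximum(D @ beta, 0.0).sum()

D = np.array([[1.0, -1.0, 0.0],   # edge 1 -> 2
              [0.0, 1.0, -1.0]])  # edge 2 -> 3
beta = np.array([0.5, 0.2, 0.9])
print(fused_penalty(D, beta))            # |0.3| + |-0.7| = 1.0
print(nearly_isotonic_penalty(D, beta))  # 0.3 + 0.0     = 0.3
```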

Tuning parameters are scaled as:

$$\lambda_n^F / n^q \to \lambda_0^F \in [0, \infty), \qquad \lambda_n^{NI} / n^q \to \lambda_0^{NI} \in [0, \infty)$$

to ensure that neither penalty dominates the stochastic fluctuations at the $n^{-q}$ rate.
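As a worked instance of this calibration (numbers purely illustrative, not from the paper): with the parametric rate $q = 1/2$ and $n = 10^4$ observations, the scaling gives $\lambda_n^F \approx \lambda_0^F\, n^{q} = 100\,\lambda_0^F$, so the raw penalty weight grows like $\sqrt{n}$ while the normalized ratio $\lambda_n^F / n^q$ stays bounded, keeping the penalty of the same order as the noise.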

3. Asymptotic Distribution and Limiting Law

Under regularity conditions and the scaling above, the asymptotic law is as follows:

$$n^q (\hat\beta_n^* - \beta^{\circ}) \overset{d}{\longrightarrow} \operatorname*{arg\,min}_{w \in \mathbb{R}^s} V(w),$$

where, for noise $\psi$,
\begin{align*}
V(w) ={}& -2\psi^{T} w + \|w\|_2^2 + \lambda^F_0 \sum_{(i,j)\in E} (w_i - w_j)\,\mathrm{sign}(\beta^{\circ}_i - \beta^{\circ}_j)\, 1\{\beta^{\circ}_i \neq \beta^{\circ}_j\} \\
& + \lambda^F_0 \sum_{(i,j)\in E} |w_i - w_j|\, 1\{\beta^{\circ}_i = \beta^{\circ}_j\} \\
& + \lambda^{NI}_0 \sum_{(i,j)\in E} (w_i - w_j)\, 1\{\beta^{\circ}_i > \beta^{\circ}_j\} + \lambda^{NI}_0 \sum_{(i,j)\in E} (w_i - w_j)_{+}\, 1\{\beta^{\circ}_i = \beta^{\circ}_j\}.
\end{align*}
Equivalently, the limit is characterized as the unique minimizer of

$$\|w - \psi\|_2^2 + \lambda_0^F \|D w\|_1 + \lambda_0^{NI} \|D w\|_{+}.$$

This describes the distributional limit as a penalized projection of the base noise $\psi$ onto the space of allowable shapes, preserving the convergence rate $n^{-q}$ of the original estimator (Pastukhov, 18 Nov 2025).
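Because the limit is just this penalized projection applied to $\psi$, its law can be approximated by simulation. A minimal sketch, assuming cvxpy, a chain graph, and Gaussian $\psi$ with identity covariance (all choices illustrative):

```python
# Monte Carlo sketch of the limit law: draw psi and solve the projection
#   min_w ||w - psi||_2^2 + lam0_f*||D w||_1 + lam0_ni*sum((D w)_+).
import numpy as np
import cvxpy as cp

def limit_draw(psi, D, lam0_f, lam0_ni):
    w = cp.Variable(psi.size)
    obj = (cp.sum_squares(w - psi)
           + lam0_f * cp.norm1(D @ w)
           + lam0_ni * cp.sum(cp.pos(D @ w)))
    cp.Problem(cp.Minimize(obj)).solve()
    return w.value

rng = np.random.default_rng(1)
s, n_rep = 10, 200
D = (np.eye(s) - np.eye(s, k=1))[:-1]     # chain incidence matrix
draws = np.array([limit_draw(rng.normal(size=s), D, 0.5, 0.5)
                  for _ in range(n_rep)])
print(draws.std(axis=0))  # empirical spread of each limit coordinate
```

Empirical quantiles of such draws approximate the limiting distribution, which is what underlies the confidence-band construction mentioned in Section 5.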

4. Specialization: Nearly-Isotonic Without Fusion Penalty

If $\lambda_n^F \equiv 0$ and $\lambda_n^{NI} / n^q \to \lambda_0^{NI} > 0$, and $\beta^{\circ}$ is already isotonic with respect to $G$, the asymptotic law decomposes over the maximal blocks $B_1, \ldots, B_K$ on which $\beta^{\circ}$ is constant. The limit is the concatenation of blockwise nearly-isotonic regressions:

$$n^q(\hat\beta_n^* - \beta^{\circ}) \overset{d}{\longrightarrow} \bigl(\hat{w}^{(1)}, \ldots, \hat{w}^{(K)}\bigr),$$

where each $\hat{w}^{(\ell)}$ solves, with $\psi$ restricted to the coordinates in $B_\ell$,

$$\min_{w} \|w - \psi\|_2^2 + \lambda^{NI}_0 \sum_{(i,j)\in E \cap (B_{\ell}\times B_{\ell})} (w_i - w_j)_{+}.$$

This mirrors classical isotonic-regression limit theory and corroborates the known blockwise decoupling (Pastukhov, 18 Nov 2025). A numerical sketch of the blockwise construction follows.
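A minimal sketch of this blockwise construction on a chain graph, assuming cvxpy; the block sizes and the helper name `nearly_isotonic_limit` are illustrative:

```python
# Blockwise-decoupled limit on a chain (lambda^F = 0): solve the
# nearly-isotonic limit problem independently on each constant block.
import numpy as np
import cvxpy as cp

def nearly_isotonic_limit(psi_block, lam0_ni):
    """min_w ||w - psi||_2^2 + lam0_ni * sum_i (w_i - w_{i+1})_+ on one block."""
    w = cp.Variable(psi_block.size)
    obj = (cp.sum_squares(w - psi_block)
           + lam0_ni * cp.sum(cp.pos(w[:-1] - w[1:])))
    cp.Problem(cp.Minimize(obj)).solve()
    return w.value

rng = np.random.default_rng(2)
blocks = [4, 3, 5]                 # sizes of the maximal constant blocks
psi = rng.normal(size=sum(blocks))
pieces, start = [], 0
for b in blocks:
    pieces.append(nearly_isotonic_limit(psi[start:start + b], lam0_ni=0.5))
    start += b
w_limit = np.concatenate(pieces)   # one draw from the concatenated limit
```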

5. Discrete Distribution Estimation on Directed Acyclic Graphs

For the estimation of discrete probability mass functions under order constraints, the methodology applies directly. Letting $\hat{p}_n$ denote the empirical pmf over $V$ and letting $(V, E)$ encode the targeted shape (e.g., unimodality, multivariate monotonicity), the penalized estimator is

$$\hat{p}_n^* = \operatorname*{arg\,min}_{p \in \mathbb{R}^s} \Bigl\{ \tfrac{1}{2}\|\hat{p}_n - p\|_2^2 + \lambda_n^F \|D p\|_1 + \lambda_n^{NI} \|D p\|_{+} \Bigr\} \quad \text{subject to } p \geq 0,\ \sum_i p_i = 1.$$

This estimator enforces the nonnegativity, sum, and range constraints intrinsically, and its limit law matches that of the general signal case. Notably, if $\hat{p}_n$ is the empirical pmf, then $q = 1/2$ and $\psi \sim N(0, \Sigma)$, with $\Sigma = \mathrm{diag}(p^{\circ}) - p^{\circ}(p^{\circ})^{T}$ under i.i.d. sampling, yielding asymptotically valid confidence bands under order restrictions (Pastukhov, 18 Nov 2025).
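A minimal sketch for a monotone (decreasing) pmf on a chain, again assuming cvxpy; the sample size, penalty levels, and orientation convention are illustrative:

```python
# Order-constrained pmf estimation on a chain: fused + nearly-isotonic
# penalties with the intrinsic simplex constraints.
import numpy as np
import cvxpy as cp

rng = np.random.default_rng(3)
s, n = 8, 2000
p_true = np.arange(s, 0, -1.0)
p_true /= p_true.sum()                     # decreasing ground-truth pmf
p_hat = rng.multinomial(n, p_true) / n     # empirical pmf

D = (np.eye(s) - np.eye(s, k=1))[:-1]      # (D p)_i = p_i - p_{i+1}
p = cp.Variable(s)
obj = (0.5 * cp.sum_squares(p_hat - p)
       + 0.05 * cp.norm1(D @ p)            # fused penalty
       + 0.5 * cp.sum(cp.pos(-(D @ p))))   # penalize increases p_{i+1} > p_i
cons = [p >= 0, cp.sum(p) == 1]            # intrinsic simplex constraints
cp.Problem(cp.Minimize(obj), cons).solve()
print(np.round(p.value, 3))                # shape-regularized pmf estimate
```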

6. Key Proof Techniques and Theoretical Structure

Principal ingredients in the distributional analysis are:

  • Redundancy of the box and sum constraints is demonstrated by partitioning $V$ into fused constant regions and leveraging first-order optimality: region means stay within the observed ranges and the total sum is preserved (Theorem 2.1).
  • The limit transition employs epiconvergence: the random processes $V_n(w)$ are shown to converge in distribution and, by convexity and uniqueness of the minimizer, Geyer's epiconvergence theorem yields convergence of the argmins in distribution (Theorem 2.2).
  • For $\lambda^F = 0$ and isotonic $\beta^{\circ}$, the absence of the fusion penalty in the limit across constant blocks leads to independent decoupling on each block.
  • The penalized estimator retains the original estimator's rate $n^{-q}$; shape-constraint regularization at the matched scaling does not degrade statistical efficiency (Pastukhov, 18 Nov 2025).

7. Algorithmic and Computational Approaches

Practical computation of bounded shape-constrained lasso-type estimators follows frameworks developed for the generalized or constrained lasso:

  • Quadratic programming (for small to moderate problem sizes): recasts the $\ell_1$-penalized objective with box and linear constraints as a standard QP by splitting variables into positive and negative parts (Gaines et al., 2016).
  • Alternating direction method of multipliers (ADMM): splits the parameter updates from the constraint projections, efficiently handling large $p$ and arbitrary box and linear equality/inequality constraints (a minimal sketch follows this list).
  • Piecewise-linear solution-path algorithms: compute solutions for all $\lambda$ along a path, which is especially useful when repeated solutions are needed for model selection.
  • The transformation from the generalized lasso to the constrained lasso further enables implementation for arbitrary penalty matrices and DAG structures (Gaines et al., 2016).
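As a concrete instance of the ADMM strategy, here is a generic sketch of the standard splitting $z = D\beta$ for the fused-lasso subproblem; it illustrates the approach, not the solvers of Gaines et al. (2016), and all names are ad hoc:

```python
# Generic ADMM sketch for  min_beta 0.5*||y - beta||_2^2 + lam*||D beta||_1
# via the splitting z = D beta.
import numpy as np

def admm_fused(y, D, lam, rho=1.0, n_iter=500):
    m, s = D.shape
    beta, z, u = y.copy(), np.zeros(m), np.zeros(m)
    A = np.eye(s) + rho * (D.T @ D)   # beta-update system matrix
    for _ in range(n_iter):
        # beta-update: (I + rho D^T D) beta = y + rho D^T (z - u)
        beta = np.linalg.solve(A, y + rho * D.T @ (z - u))
        # z-update: soft-thresholding of D beta + u at level lam / rho
        v = D @ beta + u
        z = np.sign(v) * np.maximum(np.abs(v) - lam / rho, 0.0)
        u += D @ beta - z             # scaled dual ascent
    return beta

rng = np.random.default_rng(4)
s = 50
y = np.repeat([0.0, 2.0, 1.0], [20, 15, 15]) + 0.3 * rng.normal(size=s)
D = (np.eye(s) - np.eye(s, k=1))[:-1]
beta = admm_fused(y, D, lam=1.0)      # approximately blockwise-constant fit
```

Box, sum, and nearly-isotonic terms can be folded into the same scheme with additional splitting variables and cheap projection steps.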

These methods automatically respect the relevant shape, sum, and range constraints under the specified penalty structures, with the first-order optimality system supporting the convergence, rate, and uniqueness guarantees.


References:

  • "Asymptotic Distribution of Bounded Shape Constrained Lasso-Type Estimator for Graph-Structured Signals and Discrete Distributions" (Pastukhov, 18 Nov 2025)
  • "Algorithms for Fitting the Constrained Lasso" (Gaines et al., 2016)