Private Mask Pre-Training (PMP)
- PMP is a framework that embeds a hidden sparse binary mask during pre-training to restrict unauthorized fine-tuning in foundation models.
- It uses an early-bird lottery ticket algorithm to identify and stabilize the mask, ensuring only the optimized subnetwork is updated in subsequent training.
- Empirical findings show that PMP maintains base performance while reducing adaptation gains from unauthorized fine-tuning by up to 20 points.
Private Mask Pre-Training (PMP) is a pre-training framework developed to embed intrinsic barriers against unauthorized downstream fine-tuning in open-sourced foundation models. PMP achieves this by identifying and privatizing a sparse subnetwork during pre-training via a binary mask, which is hidden prior to model release. As a result, unauthorized fine-tuning—absent knowledge of the mask—provably yields limited gains and is destabilized by a geometrical mismatch between the pre-training and adaptation subspaces. The PMP framework is architecture-agnostic and preserves base model usability, while granting model owners the ability to retain adaptation control without requiring architectural or policy obfuscation (Wang et al., 31 Jan 2026).
1. Formalization and Core Principles
Let $W \in \mathbb{R}^d$ denote the dense parameter vector of a foundation model. During PMP, a sparse binary mask $m \in \{0,1\}^d$ is selected, with sparsity ratio $s = \|m\|_0 / d$ ($s < 1$), partitioning $W$ into $W_m$ (the active "ticket") and $W_{\bar m}$ (the inactive complement). Pre-training proceeds such that, after a brief warm-up, all learning updates are confined to $W_m$, with $W_{\bar m}$ frozen at its initial values. The final released model is $W^* = (W_m^*, W_{\bar m}^0)$, and only the dense weights are published; the mask $m$ remains secret.
The central PMP principle is that unauthorized fine-tuning without access to $m$ must indiscriminately update both $W_m$ and $W_{\bar m}$. Since only $W_m$ was optimized during pre-training, downstream SGD in the orthogonal frozen subspace encounters a loss surface with high curvature and misaligned gradients, inducing instability and bounding any net fine-tuning gain.
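The partition and the masked update rule can be sketched in a few lines of numpy; the toy dimensions, seed, and `masked_sgd_step` helper are illustrative, not from the paper:

```python
import numpy as np

rng = np.random.default_rng(0)

d = 10          # toy parameter count (illustrative)
s = 0.3         # sparsity ratio: fraction of weights in the private ticket

W = rng.normal(size=d)          # dense parameters W
m = np.zeros(d, dtype=bool)     # private binary mask m
m[rng.choice(d, size=int(s * d), replace=False)] = True

W0_frozen = W[~m].copy()        # W_mbar stays at its initial values

def masked_sgd_step(W, grad, m, lr=0.1):
    """Pre-training update confined to the ticket W_m; W_mbar is frozen."""
    W = W.copy()
    W[m] -= lr * grad[m]
    return W

grad = rng.normal(size=d)
W_new = masked_sgd_step(W, grad, m)

# The frozen complement is untouched; only the masked subnetwork moved.
assert np.allclose(W_new[~m], W0_frozen)
```

Releasing `W_new` alone reveals a dense weight vector; without `m`, an outside party cannot tell which coordinates form the optimized ticket.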
2. Mask Identification Procedure
PMP adopts an early-bird lottery-ticket algorithm to derive the private mask $m$. The procedure begins with a warm-up phase (typically 500 steps) on a pre-training data subset. At each warm-up step, the absolute parameter gradients are computed, and the top-$K$ entries by magnitude (with $K$ determined by the target sparsity ratio $s$) define a candidate support. Mask stability is assessed using the intersection-over-union (IoU) between consecutive candidates together with a hit counter; once the candidate stabilizes for the required number of consecutive steps, $m$ is fixed. The main pre-training then proceeds, updating only $W_m$.
The process is summarized in the following table:
| Step | Operation | Hyperparameters |
|---|---|---|
| Gradient evaluation | Compute $\lvert \nabla_W L \rvert$ for all parameters | warm-up steps, warm-up subset |
| Top-$K$ mask assignment | $m_i = 1$ if $\lvert \nabla_{W_i} L \rvert$ is among the top-$K$ magnitudes, else $0$ | $K$ (set by sparsity ratio $s$) |
| Mask stabilization | Fix $m$ once the IoU between consecutive candidates exceeds the threshold for enough hits | IoU threshold, hit count |
After mask selection, all subsequent training is strictly in the masked subspace.
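The selection loop above can be sketched as follows; `iou_thresh` and `patience` are illustrative stand-ins for the paper's threshold and hit-count settings, which are not reproduced here:

```python
import numpy as np

def topk_mask(grads, k):
    """Candidate mask: the k parameters with largest gradient magnitude."""
    idx = np.argsort(np.abs(grads))[-k:]
    m = np.zeros(grads.shape, dtype=bool)
    m[idx] = True
    return m

def iou(a, b):
    """Intersection-over-union of two boolean masks."""
    return np.logical_and(a, b).sum() / max(np.logical_or(a, b).sum(), 1)

def early_bird_mask(grad_stream, k, iou_thresh=0.95, patience=3):
    """Fix the mask once consecutive candidates agree (IoU >= iou_thresh)
    for `patience` steps in a row; both values are illustrative."""
    prev, hits = None, 0
    for grads in grad_stream:
        cand = topk_mask(grads, k)
        if prev is not None and iou(prev, cand) >= iou_thresh:
            hits += 1
            if hits >= patience:
                return cand
        else:
            hits = 0
        prev = cand
    return prev  # fall back to the last candidate if never stabilized
```

The early-exit design is what makes the mask "early-bird": once the top-$K$ support stops churning during warm-up, further gradient evaluation adds nothing, so selection stops and sparse-subspace pre-training begins.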
3. Theoretical Analysis of Fine-Tuning Instability
PMP provides formal guarantees that unauthorized fine-tuning is globally destabilized by the mask secrecy. The analysis rests on a local geometrical assumption: in the vicinity of the trained parameters, the loss landscape along $W_m$ is flat (null Hessian block), while along $W_{\bar m}$ the Hessian is strictly positive definite, representing steep directions untouched during pre-training.
Let the downstream objective be $L_{\rm down}(W) = L_{\rm pre}(W) + \Delta(W)$, where $\Delta$ encodes the task shift. For any standard SGD step with step size $\eta$ whose update is not restricted to $m$: $\mathbb{E}[L_{\rm pre}(W_m', W_{\bar m}')] \geq L_{\rm pre}(W_m^*, W_{\bar m}^0) + c\eta^2,$ for some $c > 0$, so long as the gradient in the frozen subspace is non-zero with positive probability. Thus, arbitrary adaptation not guided by $m$ systematically increases the pre-training loss, bounding downstream gains over many steps. The proof adapts a second-order Taylor expansion to show that the quadratic curvature along $W_{\bar m}$ dominates.
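The shape of the bound can be seen from a second-order expansion under the stated curvature assumptions; the block notation $H_{mm}$, $H_{\bar m \bar m}$ and the curvature lower bound $\mu$ are introduced here for illustration and are not the paper's exact notation:

```latex
% Expand L_pre around the released optimum W^* = (W_m^*, W_mbar^0), with
% H_{mm} = 0 (flat ticket directions) and H_{mbar mbar} >= mu I, mu > 0
% (steep frozen directions), and zero gradient at W^*:
\begin{align*}
L_{\rm pre}(W^* + \Delta)
  &\approx L_{\rm pre}(W^*)
   + \nabla L_{\rm pre}(W^*)^\top \Delta
   + \tfrac{1}{2}\,\Delta^\top H \Delta \\
  &= L_{\rm pre}(W^*)
   + \tfrac{1}{2}\,\Delta_{\bar m}^\top H_{\bar m \bar m}\,\Delta_{\bar m}
   \qquad (\nabla L_{\rm pre}(W^*) = 0,\ H_{mm} = 0) \\
  &\geq L_{\rm pre}(W^*) + \tfrac{\mu}{2}\,\|\Delta_{\bar m}\|^2 .
\end{align*}
```

For an unrestricted SGD step $\Delta = -\eta g$, the frozen-subspace displacement is $\Delta_{\bar m} = -\eta g_{\bar m}$, so the penalty term scales as $\tfrac{\mu}{2}\eta^2\|g_{\bar m}\|^2$, giving the $c\eta^2$ lower bound whenever $g_{\bar m}$ is non-zero with positive probability.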
4. Implementation Aspects
The PMP pipeline comprises three principal stages:
- Mask Discovery: Early-bird mask selection (500 steps).
- Sparse-Subspace Pre-Training: Training proceeds with gradients confined to $W_m$, using standard optimizer settings (AdamW, cosine learning-rate decay, a fixed batch size, gradient-norm clipping).
- Release: Only the dense weights $W^*$ are released; the mask $m$ is withheld.
Experiments adopt TinyLlama-1.1B (22 layers, hidden size 2048, 32 heads) and GPT-2 architectures, pre-trained on SlimPajama-6B (sequences tokenized to 256 tokens, causal LM loss). Authorized fine-tuning can be enabled for select users by providing $m$, so that only $W_m$ updates. Storage and release do not expose $m$. Empirically, adversaries are unable to reconstruct $m$ from observed gradients or outputs, as the gradient-magnitude distributions of masked and unmasked weights overlap.
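The contrast between authorized (mask-aware) and unauthorized (dense) fine-tuning can be simulated on a toy loss surface built to match the paper's curvature assumption; the quadratic `L_pre`, the curvature value, and the task-shift vector are all illustrative constructions, not the paper's experimental setup:

```python
import numpy as np

rng = np.random.default_rng(0)
d, s = 50, 0.2
m = np.zeros(d, dtype=bool)
m[: int(s * d)] = True          # private mask (kept secret by the owner)

# Toy loss matching the assumed geometry: flat along W_m (curvature 0),
# strictly convex (curvature 4) along the frozen complement W_mbar.
h = np.where(m, 0.0, 4.0)
W_star = np.zeros(d)            # released weights sit at a pre-training optimum

def L_pre(W):
    return 0.5 * np.sum(h * (W - W_star) ** 2)

# Downstream gradient = pre-training gradient + a task-shift direction.
shift = rng.normal(size=d)

def downstream_grad(W):
    return h * (W - W_star) + shift

def fine_tune(mask, steps=100, lr=0.05):
    W = W_star.copy()
    for _ in range(steps):
        g = downstream_grad(W)
        if mask is not None:
            g = g * mask        # authorized users update only W_m
        W -= lr * g
    return W

unauthorized = fine_tune(mask=None)   # dense updates, no knowledge of m
authorized = fine_tune(mask=m)        # updates confined to the ticket

# Unauthorized steps leave the flat valley and raise the pre-training loss;
# authorized steps stay in the flat subspace and leave L_pre unchanged.
assert L_pre(unauthorized) > L_pre(authorized)
```

In a real deployment the same gating is achieved by zeroing gradients outside $m$ before the optimizer step (e.g., via per-parameter gradient hooks), which is why withholding $m$ suffices to withhold this capability.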
5. Empirical Findings
PMP is evaluated on base (zero-shot, pre-finetuning) capabilities and unauthorized fine-tuning across GLUE tasks (CoLA, SST-2, MRPC, QQP, STS-B, MNLI, QNLI, RTE). The experimental protocol holds all training and adaptation hyperparameters constant across non-PMP and PMP settings.
Key outcomes:
- Base performance is essentially unchanged with PMP (TinyLlama zero-shot scores match with and without PMP).
- Unauthorized fine-tuning accuracy is sharply reduced for the PMP-trained TinyLlama relative to the non-PMP baseline.
- Varying the mask ratio modulates the barrier: base accuracy remains stable across ratios, while unauthorized fine-tuning accuracy shifts markedly with the ratio.
- Across a grid of learning rates and epoch counts, PMP consistently suppresses adaptation gains by up to 20 points, while non-PMP models fine-tune successfully under the same settings.
- Authorized fine-tuning (with access to $m$) recovers high post-adaptation accuracy on GLUE.
6. Discussion and Open Problems
PMP establishes non-fine-tunability by imposing a geometry mismatch at the pre-training level, supported graphically by loss-landscape sweeps showing flat valleys along the masked directions $W_m$ and steep walls in orthogonal directions. The mask ratio operates as a tunable control on the model's resilience to adaptation, trading adaptation difficulty against training speed.
Empirical findings suggest that the mask cannot be reliably inferred from black-box queries, due to the statistical overlap of gradient magnitudes between masked and unmasked weights.
Several limitations and open directions are recognized:
- Theoretical guarantees focus on single-step adaptation; multi-step SGD dynamics, especially for sophisticated adversaries, warrant deeper investigation.
- The secrecy of $m$ is central; partial leakage or white-box access could present vulnerabilities, necessitating future robustness analyses.
- Current experiments focus on GLUE; effectiveness on instruction-tuning, multimodal, or adversarial tasks remains to be evaluated.
- Interactions with other release strategies, such as quantization or differential privacy, are unexplored.
PMP provides an architecture-agnostic, low-overhead method to regulate foundation model adaptation post-release, preserving base utility and authorized fine-tuning while bounding unauthorized reuse through theoretically and empirically supported mechanisms (Wang et al., 31 Jan 2026).