Reassessed Labels (ReaL): Theory & Applications

Updated 31 March 2026

Reassessed Labels (ReaL) is a dual framework that refines classical rank theory in projective varieties and enhances policy optimization in reinforcement learning by using structured label decompositions.
In algebraic geometry, ReaL defines admissible decompositions by quantifying real and complex-conjugate components, offering tighter bounds and improved uniqueness compared to traditional real rank methods.
In reinforcement learning, ReaL treats verifiable rewards as binary labels, employing an anchor-logit mechanism to produce bounded, monotonic gradient updates and superior Pass@1 performance.

Reassessed Labels (ReaL) constitute a conceptual and computational framework that arises independently in two modern research contexts. The first is in the structure theory of real projective varieties, where labels quantify the composition of decompositions into real and complex-conjugate points, with deep relevance to admissible rank and typical rank phenomena. The second is in machine learning, especially reinforcement learning for LLMs, where “Rewards as Labels” (ReaL) reformulates policy optimization as a classification task, assigning discrete labels to rollouts based on verifiable reward signals. In both domains, ReaL provides refined instruments for distinguishing and improving upon classical approaches in rank theory and policy gradient methods, respectively.

1. Admissible Rank and Label Decompositions for Projective Varieties

Let $X \subset \mathbb{P}^r(\mathbb{C})$ be a nondegenerate irreducible complex projective variety defined over $\mathbb{R}$, and let $\sigma$ denote complex conjugation. For a real point $q \in \mathbb{P}^r(\mathbb{R})$, the classical $X(K)$-rank is the minimal cardinality of a set $S \subset X(K)$ (for $K = \mathbb{R}$ or $\mathbb{C}$) such that $q \in \langle S \rangle_K$. The admissible rank, $\mathrm{r}_{X(\mathbb{C}),\mathrm{adm}}(q)$, is defined by restricting attention to finite subsets $S \subset X(\mathbb{C})$ that are globally stable under complex conjugation, $\sigma(S) = S$, and such that $q$ lies in their real span. This intermediate notion interpolates between real and complex rank, satisfying $\mathrm{r}_{X(\mathbb{C})}(q) \le \mathrm{r}_{X(\mathbb{C}),\mathrm{adm}}(q) \le 2 \mathrm{r}_{X(\mathbb{C})}(q)$ for all real $q$ (Ballico et al., 2019). The upper bound holds because, for any $S$ computing the complex rank, the set $S \cup \sigma(S)$ is conjugation-stable, still spans $q$, and has at most $2\,\#S$ points.

2. Label Structure: Definition and Weight

Given an admissible decomposition $S$, its label is the pair $(a, b) \in \mathbb{N}^2$, where $b$ is the number of real points of $S$ and $a$ is the number of complex-conjugate pairs; its weight is the total cardinality $w(a,b) = 2a + b$. Formally,

$b = \#(S \cap X(\mathbb{R})), \quad a = \frac{\#S - b}{2}$

Labels serve as refined invariants of decompositions. For a decomposition of minimal cardinality, the label is tied to the admissible rank through the constraint $2a + b = \mathrm{r}_{X(\mathbb{C}),\mathrm{adm}}(q)$. For a given admissible rank $k$, labels therefore obey $0 \leq b \leq k$ and $0 \leq a \leq \lfloor k/2\rfloor$ (Ballico et al., 2019).
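The bookkeeping behind a label can be sketched in a few lines of Python. This is a minimal illustration, not code from Ballico et al.: it assumes the points of $S$ are given as tuples of homogeneous coordinates and are pairwise distinct, and the helper names `normalize`, `same_point`, `label` and the tolerance `tol` are hypothetical.

```python
from typing import Sequence, Tuple

Point = Tuple[complex, ...]


def normalize(p: Sequence[complex], tol: float = 1e-9) -> Point:
    """Rescale homogeneous coordinates so the first nonzero entry is 1."""
    pivot = next((z for z in p if abs(z) > tol), None)
    if pivot is None:
        raise ValueError("the zero vector is not a projective point")
    return tuple(complex(z) / pivot for z in p)


def same_point(p: Point, q: Point, tol: float = 1e-9) -> bool:
    """Compare two normalized points coordinate by coordinate."""
    return len(p) == len(q) and all(abs(x - y) <= tol for x, y in zip(p, q))


def label(points: Sequence[Sequence[complex]], tol: float = 1e-9) -> Tuple[int, int]:
    """Return (a, b): a conjugate pairs and b real points (assumes distinct points)."""
    pts = [normalize(p, tol) for p in points]
    conj = [tuple(z.conjugate() for z in p) for p in pts]
    # Admissibility: the conjugate of every point must again lie in the set.
    for c in conj:
        if not any(same_point(c, p, tol) for p in pts):
            raise ValueError("set is not stable under complex conjugation")
    # A point is real exactly when it coincides with its own conjugate.
    b = sum(1 for p, c in zip(pts, conj) if same_point(p, c, tol))
    a, remainder = divmod(len(pts) - b, 2)
    if remainder:
        raise ValueError("nonreal points do not pair up; are the points distinct?")
    return a, b


# Three points on the twisted cubic t -> (1, t, t^2, t^3) at t = i, -i, 2.
print(label([(1, 1j, -1, -1j), (1, -1j, -1, 1j), (1, 2, 4, 8)]))  # (1, 1)
```

Normalizing first means a real point given with a nonreal scaling, such as $(2i, 4i)$, is still counted in $b$.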

3. Typical Labels and Generic Behaviour

A label $(a, b)$ is typical if there exists a full-dimensional Euclidean open set $U \subset \mathbb{P}^r(\mathbb{R})$ such that each $q \in U$ admits an admissible decomposition with exactly that label and with minimal admissible rank. Notably, if $X(\mathbb{R})$ is Zariski dense in $X(\mathbb{C})$ and $g = r_{\mathrm{gen}}$ is the generic complex rank, every $(a,b)$ with $2a + b = g$ is typical. For specific curves, further structure arises; for example, linearly normal real elliptic curves of odd degree have typical labels of weight $g$ as well as of weight $g + 1$ (Ballico et al., 2019).

Context	Condition	Typical Labels
Rational normal curves	Degree $d$	$2a+b=\lceil(d+1)/2\rceil$
Generic projective var.	Zariski-dense real part	$2a+b = r_{\mathrm{gen}}$
Real rank	Only real points in support	Labels of form $(0,b)$
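To make the weight constraint concrete, the following sketch enumerates every pair $(a, b)$ with $2a + b = g$. The function names are illustrative assumptions, and the enumeration only lists candidate labels: whether each one is actually typical depends on the hypotheses stated above (for instance, Zariski density of $X(\mathbb{R})$).

```python
from typing import List, Tuple


def labels_of_weight(g: int) -> List[Tuple[int, int]]:
    """All labels (a, b) with a, b >= 0 and 2a + b = g."""
    if g < 0:
        raise ValueError("weight must be non-negative")
    return [(a, g - 2 * a) for a in range(g // 2 + 1)]


def generic_rank_rational_normal_curve(d: int) -> int:
    """ceil((d + 1) / 2), written in integer arithmetic."""
    return (d + 2) // 2


for d in (3, 4, 7):
    g = generic_rank_rational_normal_curve(d)
    print(d, g, labels_of_weight(g))
# 3 2 [(0, 2), (1, 0)]
# 4 3 [(0, 3), (1, 1)]
# 7 4 [(0, 4), (1, 2), (2, 0)]
```

A weight $g$ admits $\lfloor g/2 \rfloor + 1$ labels, and the purely nonreal label $(g/2, 0)$ appears only when $g$ is even.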

4. Labels for Rational Normal Curves

For the rational normal curve $X_d \subset \mathbb{P}^d$ over $\mathbb{R}$, the admissible rank equals the complex rank for all real points, i.e., $\mathrm{r}_{X_d(\mathbb{C}),\mathrm{adm}}(q) = \mathrm{r}_{X_d(\mathbb{C})}(q)$. All typical labels $(a, b)$ satisfy $2a + b = \lceil (d+1)/2\rceil$, and there are no labels of larger weight occurring generically (Ballico et al., 2019).

Table of typical labels for $X_d$:

$d$ (deg)	Generic complex rank $r_{\mathrm{gen}}$	Typical labels $(a, b)$
Odd $d=2m+1$	$m+1$	All with $2a + b = m+1$
Even $d=2m$	$m+1$	All with $2a + b = m+1$

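The extreme labels of weight $g = \lceil (d+1)/2 \rceil$ are $(0, g)$, representing a decomposition entirely over the reals, and, when $g$ is even, $(g/2, 0)$, representing a decomposition consisting entirely of nonreal points in conjugate pairs. For odd $d$ these are $(0, \frac{d+1}{2})$ and, when $d \equiv 3 \pmod 4$, $(\frac{d+1}{4}, 0)$.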

5. Scheme-Theoretic Labels and Cactus Rank Analogues

The scheme-theoretic (cactus) version of admissible rank considers 0-dimensional schemes $Z \subset X(\mathbb{C})$ fixed by complex conjugation $\sigma$, i.e., $\sigma(Z) = Z$. The admissible cactus rank of $q$ is the minimal length of such a scheme whose span contains $q$. Scheme-labels are richer, recording lengths of conjugation-stable connected components. In the reduced case, the scheme-label reduces to the pair $(a, b)$. For points of cactus-admissible rank below the linear independence threshold, the decomposition is unique and the scheme-label is well-defined (Ballico et al., 2019).

6. Comparison With Real Rank Theory

Real rank admits only real support in decompositions, with labels restricted to $(0, b)$. Admissible rank, in contrast, allows decompositions with complex conjugate summands, facilitating lower or more flexible bounds. For rational normal curves $X_d$, typical real ranks range from $\lceil(d+1)/2\rceil$ to $d$, whereas all admissible labels are concentrated at $\lceil(d+1)/2\rceil$. Admissible rank maintains much of the good generic uniqueness and bound properties of complex rank, yet tracks real versus complex phenomena via the integer $a$ in the label (Ballico et al., 2019).
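As a concrete illustration for $d = 3$ (a standard example, not taken from the cited paper), identify $\mathbb{P}^3$ with binary cubic forms, so that the points of $X_3$ are the cubes of linear forms. Then

$x^3 - 3xy^2 = \tfrac{1}{2}\left[(x + iy)^3 + (x - iy)^3\right]$

is a decomposition supported on one complex-conjugate pair, so it has label $(1, 0)$ and weight $2 = \lceil (3+1)/2 \rceil$. The same cubic factors as $x(x - \sqrt{3}\,y)(x + \sqrt{3}\,y)$, with three distinct real roots, and a real binary cubic of this kind is classically known to have real rank $3$, so any decomposition of it with label $(0, b)$ needs $b \geq 3$. By contrast, $x^3 + y^3$ is already a real decomposition with label $(0, 2)$. Both labels of weight $2$ therefore occur, while the real rank alone jumps between $2$ and $3$.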

7. “Rewards as Labels” (ReaL) in Reinforcement Learning

In a distinct context, the ReaL framework for RLVR (Reinforcement Learning with Verifiable Rewards) treats verifiable reward signals as categorical binary labels, a conceptual shift from scalar-reward policy gradients to classification-style updates. Each rollout $o_k$ for a prompt $q$ is labeled $y_k = 1$ (positive) if its reward is $r(o_k|q) = 1$, and $y_k = 0$ (negative) otherwise. The policy is optimized by minimizing a binary cross-entropy loss on “relative log-probability” scores $s_k$, computed as

$s_k = \frac{1}{|o_k|} \sum_{t=1}^{|o_k|} \left[ \log \pi_\theta(o_{k,t} | q, o_{k,<t}) - \log \pi_{\text{old}}(o_{k,t} | q, o_{k,<t}) \right].$
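The score and a plain binary cross-entropy on it can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: per-token log-probabilities are passed in as plain lists, the score is mapped to a probability with a sigmoid (the source does not spell out this mapping), the anchor logit is left out, and all function names are hypothetical.

```python
import math
from typing import Sequence


def relative_logprob_score(new_logps: Sequence[float],
                           old_logps: Sequence[float]) -> float:
    """Length-normalized sum of per-token log-ratios, i.e. the score s_k."""
    if len(new_logps) != len(old_logps) or len(new_logps) == 0:
        raise ValueError("need two non-empty sequences of equal length")
    return sum(n - o for n, o in zip(new_logps, old_logps)) / len(new_logps)


def softplus(x: float) -> float:
    """Numerically stable log(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def bce_on_scores(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Mean binary cross-entropy with sigmoid(s_k) as the predicted probability.

    Assumption: anchor-free sigmoid BCE, used here only for illustration.
    """
    if len(scores) != len(labels) or len(scores) == 0:
        raise ValueError("need two non-empty sequences of equal length")
    # -log(sigmoid(s)) = softplus(-s) and -log(1 - sigmoid(s)) = softplus(s).
    losses = [softplus(-s) if y == 1 else softplus(s)
              for s, y in zip(scores, labels)]
    return sum(losses) / len(losses)


# Two-token rollout whose tokens each gained 0.5 nats under the new policy.
s = relative_logprob_score([-1.0, -2.0], [-1.5, -2.5])
print(s)  # 0.5
```

Dividing by $|o_k|$ keeps scores of long and short rollouts on the same scale, which is why the sketch averages rather than sums the log-ratios.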

To enhance separation of positive and negative samples, a fixed anchor logit is introduced, leading to a softmax cross-entropy loss incorporating this anchor. The resulting loss yields monotonic and bounded gradient weighting, preventing hard negatives from dominating and properly prioritizing under-confident positives. Empirical results demonstrate gains in Pass@1 on mathematical reasoning benchmarks: at 1.5B parameters, ReaL outperforms DAPO by 6.7 percentage points; at 7B, ReaL surpasses DAPO and GSPO by 6.2 and 1.7 percentage points, respectively (Zhai et al., 5 Feb 2026).
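For intuition about bounded, monotonic weighting, consider the simpler anchor-free case. This is an assumption made for illustration only, since the anchored loss is not written out here. With $p(s) = 1/(1 + e^{-s})$ and per-sample loss

$\ell(s_k, y_k) = -y_k \log p(s_k) - (1 - y_k) \log\left(1 - p(s_k)\right),$

the derivative with respect to the score is

$\frac{\partial \ell}{\partial s_k} = p(s_k) - y_k.$

For a positive ($y_k = 1$) the magnitude $1 - p(s_k)$ decreases monotonically in $s_k$, so low-scoring positives receive the largest weight. For a negative ($y_k = 0$) the magnitude $p(s_k)$ increases monotonically in $s_k$. In both cases it stays strictly between $0$ and $1$, so no single sample can contribute an unbounded gradient.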

Table of Pass@1 scores (%) by model size and method:

Model size	GRPO	DAPO	GSPO	ReaL
1.5B	43.1	45.9	51.9	52.6
7B	59.2	57.0	61.5	63.2
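The quoted margins can be checked directly against the table. The short sketch below copies the Pass@1 values from the rows above; the dictionary layout and the `margin` helper are illustrative only.

```python
# Pass@1 (%) values copied from the table above.
pass_at_1 = {
    "1.5B": {"GRPO": 43.1, "DAPO": 45.9, "GSPO": 51.9, "ReaL": 52.6},
    "7B": {"GRPO": 59.2, "DAPO": 57.0, "GSPO": 61.5, "ReaL": 63.2},
}


def margin(size: str, baseline: str) -> float:
    """ReaL's lead over a baseline, in percentage points."""
    row = pass_at_1[size]
    # Round to one decimal to hide floating-point noise in the subtraction.
    return round(row["ReaL"] - row[baseline], 1)


print(margin("1.5B", "DAPO"))  # 6.7
print(margin("7B", "DAPO"))    # 6.2
print(margin("7B", "GSPO"))    # 1.7
```

These are absolute differences in Pass@1, which is why they are reported as percentage points and not as relative improvements.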

Key properties include monotonic, bounded gradient assignment, empirical stability, and generalization across tasks. The classification formulation and anchor-logit mechanism distinguish ReaL from prior reward-weighted policy gradient methods (Zhai et al., 5 Feb 2026).

References

Ballico et al. (2019). Labels of real projective varieties.

Zhai et al. (2026). Rewards as Labels: Revisiting RLVR from a Classification Perspective.
