Margin-Preserving Curveball Null

Updated 4 July 2026

Margin-Preserving Curveball Null is a null model for binary matrices that preserves row and column sums exactly while randomizing which entries hold the 1s.
It employs the Curveball algorithm, a heat-bath variant of switch chains, to sample approximately uniformly from the space of matrices with fixed marginals, even in the presence of forbidden entries.
In social-highlighting studies, it isolates reader sub-group structure by comparing observed agreement metrics with their expectations under fixed margins.

A margin-preserving curveball null is a null model for binary matrices or bipartite incidence structures in which randomization preserves row sums and column sums exactly while reassigning which specific row–column pairs contain $1$s. In the social-highlighting setting studied in "Factions Within, Uncertain Across: Within-Document Reader Sub-Groups in Social Highlighting," the randomized object is a binary reader $\times$ sentence matrix, and the null is used to ask whether observed reader–reader agreement exceeds what shared salience, reader-specific highlight density, and sentence popularity would predict (Nakayashiki et al., 10 Jun 2026). In the broader Markov-chain literature, the same construction is implemented by the Curveball algorithm, which samples from the space of binary matrices with fixed marginals, or with fixed marginals and forbidden entries, and is analyzed as a heat-bath variant of switch-based chains for approximately uniform sampling (Carstens et al., 2017).

1. Formal definition and state space

In the social-highlighting formulation, each document induces a binary matrix $X$ whose rows are readers $r=1,\dots,R$, whose columns are sentences $s=1,\dots,S$, and whose entries satisfy

$X_{rs} = \begin{cases} 1 & \text{if reader } r \text{ highlighted sentence } s \\ 0 & \text{otherwise.} \end{cases}$

The row sums

$k_r=\sum_{s=1}^S X_{rs}$

are per-reader highlight counts, and the column sums

$c_s=\sum_{r=1}^R X_{rs}$

are per-sentence popularity counts. A margin-preserving null randomizes $X$ subject to

$\sum_s X_{rs}^{(\text{null})}=k_r \quad \text{for all } r,$

and

$\sum_r X_{rs}^{(\text{null})}=c_s \quad \text{for all } s.$

Only the assignment of marks to specific reader–sentence pairs is randomized; overall highlighting intensity by reader and overall popularity by sentence are held fixed (Nakayashiki et al., 10 Jun 2026).
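As a minimal sketch, assuming the matrix is held in a NumPy array, the two margin constraints can be checked directly. The helper name `same_margins` and the toy matrices are illustrative and are not taken from the paper.

```python
import numpy as np

def same_margins(X, X_null):
    """Return True if X_null has the same row sums and column sums as X."""
    X = np.asarray(X)
    X_null = np.asarray(X_null)
    rows_ok = np.array_equal(X.sum(axis=1), X_null.sum(axis=1))  # per-reader counts k_r
    cols_ok = np.array_equal(X.sum(axis=0), X_null.sum(axis=0))  # per-sentence counts c_s
    return bool(rows_ok and cols_ok)

# Hypothetical toy data: 3 readers x 4 sentences.
X = np.array([[1, 0, 1, 0],
              [0, 1, 1, 0],
              [1, 1, 0, 1]])

# Flipping the 2x2 checkerboard at rows 0,1 and columns 0,1 keeps both margins.
X_swapped = X.copy()
X_swapped[[0, 1], [0, 1]] = 0
X_swapped[[0, 1], [1, 0]] = 1

# Rearranging one row on its own keeps that row sum but breaks column sums.
X_broken = X.copy()
X_broken[0] = [0, 0, 1, 1]

print(same_margins(X, X_swapped), same_margins(X, X_broken))
```

The second comparison illustrates why a row-wise shuffle is not a margin-preserving null: it holds reader density fixed but changes sentence popularity.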

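In the general MCMC treatment, the state space is the set of binary matrices with fixed margins,

$\left\{ X \in \{0,1\}^{R \times S} : \sum_s X_{rs} = k_r \ \text{for all } r, \ \sum_r X_{rs} = c_s \ \text{for all } s \right\},$

or, when structural zeros are present, the constrained space

$\left\{ X \in \{0,1\}^{R \times S} : \sum_s X_{rs} = k_r \ \text{for all } r, \ \sum_r X_{rs} = c_s \ \text{for all } s, \ X_{rs} = 0 \ \text{for every forbidden pair } (r,s) \right\}.$

A margin-preserving null model is then a probability distribution over this state space, typically the uniform distribution, used to generate randomized matrices that exactly preserve row sums, exactly preserve column sums, and honor forbidden entries (Carstens et al., 2017).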

Within the highlighting study, “the margin-preserving null” and “curveball permutation null” are synonymous: both refer to curveball-based randomization of the reader $\times$ sentence matrix (Nakayashiki et al., 10 Jun 2026).

2. Why preserve margins

The central inferential purpose of the null is to remove agreement that is mechanically induced by marginal structure. In the highlighting setting, the paper identifies three confounds. Shared salience or sentence popularity means that some sentences are generally more likely to be highlighted. Reader-specific density means that some readers mark many sentences while others mark few, which changes expected agreement even in the absence of sub-group structure. Document-level popularity profile means that some regions of a document are widely marked while others are scarcely marked. By preserving both margins, the null conditions on all three in the sense used by the paper: sentence popularity is fixed by the column sums, and reader density is fixed by the row sums (Nakayashiki et al., 10 Jun 2026).

This yields a specific interpretive baseline. Under the null, any pairwise agreement arises purely from marginal structure and random assignment of marks consistent with that structure. Agreement that survives the null is described in the paper as “residual reader-agreement structure beyond shared salience, density, and popularity,” which the authors interpret descriptively as reader sub-grouping (Nakayashiki et al., 10 Jun 2026).

The same rationale appears in the broader binary-matrix literature in more abstract form. A margin-preserving null is intended to test structure beyond degree sequences or marginals, not structure caused by them. This is why Curveball and switch chains are defined on the set of binary matrices that share the observed margins: they randomize within that fixed-margin state space rather than replacing the observed data by an independent model that ignores the observed row and column sums (Carstens et al., 2017).

A common misconception is that any random shuffle provides an adequate baseline. The highlighting study explicitly avoids independent Bernoulli models, uniform random assignment of marks ignoring sentence popularity, and random permutation schemes that do not preserve both reader-level densities and sentence-level popularities simultaneously, because such nulls cannot distinguish genuine sub-group structure from marginal effects (Nakayashiki et al., 10 Jun 2026).

3. Curveball trades and exact preservation of marginals

Curveball is a Markov-chain algorithm for binary matrices that preserves row sums and column sums by repeatedly trading items between pairs of rows. In the highlighting paper, each row is a reader’s set of highlighted sentences. A trade chooses two rows $i$ and $j$, forms the corresponding sentence sets $A_i$ and $A_j$, identifies their shared items $A_i \cap A_j$, identifies the unique items $A_i \setminus A_j$ and $A_j \setminus A_i$, pools the unique items into $B = (A_i \setminus A_j) \cup (A_j \setminus A_i)$, and then randomly repartitions $B$ into two sets of sizes $|A_i \setminus A_j|$ and $|A_j \setminus A_i|$. Shared items remain with both rows, while the unique items are reassigned. This preserves the number of highlights in each row and the number of marks in each column, because items are neither created nor destroyed (Nakayashiki et al., 10 Jun 2026).
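The trade described above can be sketched in a few lines of Python. This is an illustrative implementation under stated assumptions, not the code used in either paper: the function names, the uniform choice of row pairs, and the `n_trades` and `seed` parameters are all hypothetical.

```python
import numpy as np

def curveball_trade(sets, i, j, rng):
    """Perform one Curveball trade between rows i and j, in place.

    `sets[r]` is the set of column indices where row r holds a 1.
    """
    shared = sets[i] & sets[j]            # items both rows keep
    only_i = sets[i] - sets[j]            # items unique to row i
    only_j = sets[j] - sets[i]            # items unique to row j
    pool = rng.permutation(sorted(only_i | only_j))  # shuffled tradeable items
    n_i = len(only_i)                     # row i gets back as many as it gave
    sets[i] = shared | {int(c) for c in pool[:n_i]}
    sets[j] = shared | {int(c) for c in pool[n_i:]}

def curveball(X, n_trades, seed=0):
    """Return a randomized copy of binary matrix X with both margins kept."""
    X = np.asarray(X)
    rng = np.random.default_rng(seed)
    sets = [set(np.flatnonzero(row).tolist()) for row in X]
    for _ in range(n_trades):
        # Assumption: row pairs are drawn uniformly without replacement.
        i, j = rng.choice(len(sets), size=2, replace=False)
        curveball_trade(sets, int(i), int(j), rng)
    out = np.zeros_like(X)
    for r, cols in enumerate(sets):
        if cols:
            out[r, sorted(cols)] = 1
    return out

# Hypothetical toy data: 4 readers x 6 sentences.
X = np.array([[1, 1, 0, 0, 1, 0],
              [0, 1, 1, 0, 0, 0],
              [1, 0, 0, 1, 1, 1],
              [0, 0, 1, 1, 0, 0]])
X_null = curveball(X, n_trades=100, seed=42)
```

Working with per-row sets rather than the dense matrix makes the invariants easy to see: each row ends with the same number of items, and every item in the pool is handed to exactly one of the two rows.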

The general formulation describes the same operation as a binomial trade. For a chosen row pair $(i,j)$, the tradeable columns are those where the rows differ and where moving a $1$ does not violate any forbidden entry. The algorithm then samples uniformly from all $2 \times m$ binary submatrices on the $m$ tradeable columns whose column sums all equal $1$ and whose two row sums are fixed at the observed counts. The resulting transition stays inside the state space, preserves forbidden entries, and defines a symmetric reversible Markov chain with uniform stationary distribution on the state space (Carstens et al., 2017).
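A hedged sketch of the binomial trade with forbidden entries follows, working on the dense matrix. The function name `trade_with_forbidden` and the boolean mask `forbidden` are assumptions made for illustration; the example uses a forbidden diagonal, as for a simple directed graph.

```python
import numpy as np

def trade_with_forbidden(X, forbidden, i, j, rng):
    """One binomial trade between rows i and j of binary matrix X, in place.

    `forbidden` is a boolean array shaped like X; True marks a structural
    zero that must never hold a 1.
    """
    differ = X[i] != X[j]
    # A column is tradeable only if its single 1 may sit in either row.
    tradeable = np.flatnonzero(differ & ~forbidden[i] & ~forbidden[j])
    if tradeable.size == 0:
        return                                   # nothing to trade
    n_i = int(X[i, tradeable].sum())             # 1s row i holds on these columns
    # Uniform choice of which tradeable columns row i holds afterwards.
    chosen = rng.choice(tradeable, size=n_i, replace=False)
    X[i, tradeable] = 0
    X[j, tradeable] = 1
    X[i, chosen] = 1
    X[j, chosen] = 0

# Hypothetical example: 5-node directed graph, self-loops forbidden.
n = 5
A = np.zeros((n, n), dtype=int)
for r in range(n):
    A[r, (r + 1) % n] = 1
    A[r, (r + 2) % n] = 1
no_loops = np.eye(n, dtype=bool)

rng = np.random.default_rng(0)
for _ in range(200):
    a, b = rng.choice(n, size=2, replace=False)
    trade_with_forbidden(A, no_loops, int(a), int(b), rng)
```

Choosing a uniformly random subset of size `n_i` from the tradeable columns is the same as drawing uniformly from the $2 \times m$ submatrices with unit column sums and the two observed row sums.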

In the social-highlighting implementation, the appendix states: “Curveball null. Both margins preserved via trial swaps; per document the swap count is […] the number of 1s, well past the mixing point.” It further states: “We use […]–[…] permutations.” The same appendix explains that the per-document $z$ is a ratio to the permutation standard deviation rather than a tail count, so it is not sensitive to the exact permutation count above the mixing point (Nakayashiki et al., 10 Jun 2026).

The theoretical literature provides the structural interpretation of these trades. The state space decomposes into binomial neighborhoods, and Curveball acts as the uniform heat-bath operator on each such neighborhood, whereas switch chains move locally within the same neighborhoods. This is why Curveball is described as a heat-bath variant of switch-based sampling (Carstens et al., 2017).

4. Agreement statistics and within-document sub-groups

The highlighting study evaluates reader alignment with binary cosine similarity. For two readers with highlighted-sentence sets $A_i$ and $A_j$,

$\cos(A_i, A_j) = \frac{|A_i \cap A_j|}{\sqrt{|A_i|\,|A_j|}}.$
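For binary data the cosine reduces to the overlap count divided by the geometric mean of the two set sizes. A minimal sketch, assuming each reader is represented as a Python set of sentence indices; returning 0.0 for an empty set is a convention chosen here, not one stated in the paper.

```python
import math

def binary_cosine(a, b):
    """Cosine similarity between two sets of highlighted sentence indices."""
    if not a or not b:
        return 0.0  # assumption: a reader with no highlights agrees with no one
    return len(a & b) / math.sqrt(len(a) * len(b))

# Hypothetical readers: two shared sentences out of 3 and 4 highlights.
reader_a = {0, 2, 5}
reader_b = {2, 5, 7, 9}
similarity = binary_cosine(reader_a, reader_b)  # 2 / sqrt(3 * 4)
```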

Two per-document summaries are then used. The first is nearest-neighbour agreement: for each reader $r$, compute the maximum cosine with any other reader and average over readers. The second is the variance of all pairwise cosines within the document. For each document, the observed statistic is compared with its distribution under curveball permutations, and the effect size is summarized by

$z = \frac{T_{\text{obs}} - \operatorname{mean}(T_{\text{null}})}{\operatorname{sd}(T_{\text{null}})},$

where $T_{\text{obs}}$ is the observed statistic and $T_{\text{null}}$ ranges over its values across the curveball permutations.

Permutation $p$-values are also computed, but the primary effect size is the per-document $z$-score (Nakayashiki et al., 10 Jun 2026).
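The two summaries and the standardization can be sketched as follows. The function names are hypothetical, the null values are assumed to come from some margin-preserving sampler run elsewhere, and two details are assumptions rather than facts from the paper: the sample standard deviation (`ddof=1`) and the add-one, upper-tail form of the permutation $p$-value.

```python
import numpy as np

def pairwise_cosines(X):
    """Matrix of binary cosine similarities between the rows of X."""
    X = np.asarray(X, dtype=float)
    norms = np.sqrt(X.sum(axis=1))
    norms[norms == 0] = 1.0                 # empty rows get cosine 0 with everyone
    return (X @ X.T) / np.outer(norms, norms)

def nn_agreement(X):
    """Mean over readers of the maximum cosine with any other reader."""
    C = pairwise_cosines(X)
    np.fill_diagonal(C, -np.inf)            # exclude self-similarity
    return float(C.max(axis=1).mean())      # requires at least two readers

def cosine_variance(X):
    """Variance of all pairwise cosines (each unordered pair counted once)."""
    C = pairwise_cosines(X)
    upper = np.triu_indices_from(C, k=1)
    return float(C[upper].var())

def z_and_p(observed, null_values):
    """z-score against the null spread, plus an upper-tail permutation p-value."""
    null_values = np.asarray(null_values, dtype=float)
    z = (observed - null_values.mean()) / null_values.std(ddof=1)
    p = (1 + np.sum(null_values >= observed)) / (1 + null_values.size)
    return float(z), float(p)

# Hypothetical toy document: readers 0 and 1 agree fully, reader 2 is disjoint.
X = np.array([[1, 1, 0, 0],
              [1, 1, 0, 0],
              [0, 0, 1, 1]])
observed_nn = nn_agreement(X)
observed_var = cosine_variance(X)
```

In use, `nn_agreement` would be evaluated on the observed matrix and on each randomized matrix, and the resulting null values passed to `z_and_p`.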

Against the full margin-preserving curveball null, the paper reports that across 75 dense documents, with a median of 25 readers and 74 sentences, the observed mean nearest-neighbour cosine is […], the mean under the curveball null is […], and the excess is […]. The standardized effect is a mean $z$ of […] with a […] CI of […], and […] of documents are individually significant at […]. For the variance statistic, the mean $z$ is […] and […] of documents are significant. The paper summarizes this as evidence that, within a document, readers form strong sub-groups and that nearest-neighbour pairs agree far beyond what shared salience, mark density, and sentence popularity predict (Nakayashiki et al., 10 Jun 2026).

The same paper reports synthetic calibration. Under a no-structure control, where readers are independent given margins, nearest-neighbour $z$-scores are approximately […] with […] significant documents. Under a planted-groups control with two fixed sub-groups, the statistics yield positive $z$-scores. This supports the intended interpretation of the null: it does not spuriously create structure, and it responds in the expected direction when structure is present (Nakayashiki et al., 10 Jun 2026).

A plausible implication is that the null is being used not merely as a generic permutation device, but as a conditioning scheme tailored to a specific scientific question: whether agreement persists after the most salient document- and reader-level marginals have been fixed.

5. Region-preserving refinement and cross-document stability

The highlighting study also introduces a stricter eight-block region-preserving null. Sentences are split into 8 contiguous position-based blocks of roughly equal size; in a typical document with approximately 72 sentences, each block has approximately 9 sentences. Curveball is then run independently within each block, preserving, for every reader and every block, the number of marks in that block, while still preserving per-sentence popularity within the block. Under this null, the model randomizes only which sentence within a block gets marked, not how much each reader engages each coarse region of the document (Nakayashiki et al., 10 Jun 2026).
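The block-wise construction can be sketched by splitting the column indices into contiguous blocks and randomizing each block separately. This is an illustration under assumptions: the compact `curveball` helper, the function name `region_preserving_null`, and the `trades_per_one` multiplier are hypothetical and do not reproduce the paper's settings.

```python
import numpy as np

def curveball(X, n_trades, rng):
    """Compact Curveball randomization of a binary matrix (illustrative)."""
    sets = [set(np.flatnonzero(row).tolist()) for row in X]
    for _ in range(n_trades):
        i, j = (int(v) for v in rng.choice(len(sets), size=2, replace=False))
        shared = sets[i] & sets[j]
        pool = rng.permutation(sorted(sets[i] ^ sets[j]))  # items unique to one row
        n_i = len(sets[i]) - len(shared)
        sets[i] = shared | {int(c) for c in pool[:n_i]}
        sets[j] = shared | {int(c) for c in pool[n_i:]}
    out = np.zeros_like(X)
    for r, cols in enumerate(sets):
        if cols:
            out[r, sorted(cols)] = 1
    return out

def region_preserving_null(X, n_blocks=8, trades_per_one=5, seed=0):
    """Randomize X within contiguous column blocks, one Curveball run per block."""
    X = np.asarray(X)
    rng = np.random.default_rng(seed)
    out = X.copy()
    # Contiguous, roughly equal-sized blocks of sentence positions.
    for cols in np.array_split(np.arange(X.shape[1]), n_blocks):
        if cols.size == 0:
            continue
        block = X[:, cols]
        # Assumption: trade count scales with the number of 1s in the block.
        n_trades = trades_per_one * int(block.sum())
        out[:, cols] = curveball(block, n_trades, rng)
    return out

# Hypothetical document: 10 readers x 24 sentences.
demo_rng = np.random.default_rng(2)
X = (demo_rng.random((10, 24)) < 0.3).astype(int)
X_region = region_preserving_null(X, n_blocks=8, seed=1)
```

Because each block is randomized on its own, a reader's mark count inside every block is unchanged, which is a stronger constraint than fixing only the reader's total.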

This refinement is used to decompose the excess agreement found under the full margin-preserving null. When nearest-neighbour agreement is re-evaluated against the region-preserving null, the raw excess drops from […] to […]. Shared region engagement therefore explains approximately […] of the excess, about […], and the remaining approximately […], about […], is interpreted as finer reader-specific agreement. Standardized against the region-preserving null, the mean $z$ is […], and […] of documents remain significant. The paper therefore argues that shared engagement with the same coarse regions accounts for only part of the observed sub-group signal (Nakayashiki et al., 10 Jun 2026).

For cross-document analysis, the same per-document curveball baseline is used to define excess agreement for a pair of readers $(i,j)$ on document $d$:

$\text{excess}_{ij}^{(d)} = \cos_{\text{obs}}^{(d)}(i,j) - \mathbb{E}_{\text{null}}\!\left[\cos^{(d)}(i,j)\right].$

This quantity is the amount by which the pair’s observed agreement exceeds what their own margins and sentence popularity would predict in that document. For each pair that co-reads […] documents, the paper repeatedly splits the documents into halves, computes the mean excess in each half, correlates the two halves over 200 random splits, and averages the resulting correlations. The goal is to test whether excess agreement is reproducible across documents (Nakayashiki et al., 10 Jun 2026).
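The split-half procedure can be sketched as below. The data layout (a dictionary from reader pair to that pair's per-document excess values), the function name, and the use of a Pearson correlation across pairs are assumptions for illustration; every pair is assumed to have at least two documents.

```python
import numpy as np

def split_half_reproducibility(excess_by_pair, n_splits=200, seed=0):
    """Average split-half correlation of per-pair mean excess agreement.

    `excess_by_pair` maps a reader pair to its per-document excess values.
    """
    rng = np.random.default_rng(seed)
    correlations = []
    for _ in range(n_splits):
        half_a, half_b = [], []
        for values in excess_by_pair.values():
            shuffled = rng.permutation(np.asarray(values, dtype=float))
            mid = shuffled.size // 2         # random split of this pair's documents
            half_a.append(shuffled[:mid].mean())
            half_b.append(shuffled[mid:].mean())
        # Correlate the two half-means across pairs.
        correlations.append(np.corrcoef(half_a, half_b)[0, 1])
    return float(np.mean(correlations))

# Synthetic illustration only: each pair has a stable offset plus small noise.
demo_rng = np.random.default_rng(0)
stable_pairs = {pair: pair + 0.1 * demo_rng.standard_normal(8) for pair in range(20)}
r_stable = split_half_reproducibility(stable_pairs, n_splits=50, seed=1)
```

When a pair-level trait dominates the noise, as in the synthetic example, the two halves agree and the averaged correlation is high; with no stable trait it would hover near zero.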

The empirical conclusion is explicitly cautious. The cross-document split-half reproducibility of a pair’s agreement is near zero pooled, with […] and […] in two separately drawn samples. A power calibration shows that the test is informative only for pairs that co-read many documents. In the only informative high-overlap subset, […], point estimates are positive but small-sample, imprecise across the separately drawn samples, never significant, and attenuate under the region-preserving null. The paper therefore leaves cross-document stability unresolved and states that the data are consistent with anything from situational grouping to a weak-to-moderate stable reader trait (Nakayashiki et al., 10 Jun 2026).

6. Switch chains, spectral comparison, and scope

The theoretical literature situates Curveball among Markov chains for sampling binary matrices with fixed marginals. A classical switch move is a local $2 \times 2$ checkerboard flip that preserves row sums and column sums. The 2017 comparison paper defines a unified […]-switch chain on the fixed-margin state space and shows that, on each binomial neighborhood, the switch chain induces a lazy random walk on a Johnson graph, whereas Curveball jumps uniformly within the same neighborhood. Formally, Curveball is the heat-bath variant of the switch chain (Carstens et al., 2017).
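For contrast with a Curveball trade, a single switch attempt can be sketched as follows. This is a generic illustration of the checkerboard flip, not the specific proposal distribution of any chain analyzed in the 2017 paper; the function name `try_switch` is hypothetical.

```python
import numpy as np

def try_switch(X, rng):
    """Attempt one 2x2 checkerboard flip in place; return True if X changed."""
    n_rows, n_cols = X.shape
    i, j = rng.choice(n_rows, size=2, replace=False)
    a, b = rng.choice(n_cols, size=2, replace=False)
    block = np.ix_([i, j], [a, b])
    sub = X[block]
    # A checkerboard has 1s on exactly one diagonal of the 2x2 submatrix.
    is_checkerboard = (sub[0, 0] == sub[1, 1]
                       and sub[0, 1] == sub[1, 0]
                       and sub[0, 0] != sub[0, 1])
    if is_checkerboard:
        X[block] = 1 - sub      # swap the diagonals; row and column sums unchanged
        return True
    return False                # rejected proposal: the chain stays put

# Hypothetical toy matrix and a short run of switch attempts.
rng = np.random.default_rng(0)
X = (rng.random((6, 8)) < 0.4).astype(int)
row_sums, col_sums = X.sum(axis=1).copy(), X.sum(axis=0).copy()
accepted = sum(try_switch(X, rng) for _ in range(500))
```

A switch moves at most two 1s per accepted step, whereas a Curveball trade can reassign every item on which two rows differ, which is the intuition behind treating Curveball as the heat-bath counterpart.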

This yields a direct comparison of mixing behavior. For matrices with […] columns, Theorem 4.1 gives the comparison between the Kannan–Tetali–Vempala (KTV) switch chain and Curveball:

[…]

In particular, if the KTV chain is rapidly mixing on a given family of margin configurations, then the Curveball chain is also rapidly mixing. The same paper also states that […] only has non-negative eigenvalues when […], so the KTV chain does not have to be made lazy (Carstens et al., 2017).

The analysis extends to forbidden entries, including simple directed graphs represented as binary matrices with diagonal forbidden. Under irreducibility, the general comparison applies to the constrained state space. For regular directed graphs, Theorem 4.4 gives

[…]

so Curveball is at most a factor […] slower in relaxation time than the edge-switch chain in that regular setting (Carstens et al., 2017).

Within the highlighting application, these results provide the theoretical backdrop for the statement that the curveball algorithm, run long enough to mix, produces nearly-uniform samples from the space of all binary matrices sharing the observed margins (Nakayashiki et al., 10 Jun 2026). A plausible implication is that the empirical null used for reader–sentence matrices belongs to the same degree-sequence-preserving family as network and contingency-table null models, but is specialized to a scientific question about residual agreement.

The scope and limitations are also explicit in the two papers. In the highlighting study, Curveball supports testing for heterogeneity but does not identify explicit clusters; the region-preserving null is approximate because 8 equal position-based blocks are only a proxy for natural sections or paragraphs; and the null operates within individual documents rather than modeling cross-document dependencies or reader–topic selection (Nakayashiki et al., 10 Jun 2026). In the sampling theory, rapid-mixing guarantees depend on irreducibility of the underlying state space, which is automatic in some regimes and nontrivial in others (Carstens et al., 2017). Together, these caveats delimit what a margin-preserving curveball null can establish: it can isolate structure beyond fixed marginals, but it does not by itself recover communities, explain semantics, or guarantee cross-document stability.

References

Nakayashiki et al. (2026). Factions Within, Uncertain Across: Within-Document Reader Sub-Groups in Social Highlighting.

Carstens et al. (2017). Comparing the Switch and Curveball Markov Chains for Sampling Binary Matrices with Fixed Marginals.
