PerAugy: Augmentation for Personalized Summaries
- PerAugy is a data augmentation methodology that leverages double shuffling and stochastic Markovian perturbation to create diverse user trajectories.
- Its two-stage pipeline increases synthetic training data diversity, enabling more robust and subjectively adaptive summarization models.
- Quantitative evaluations reveal significant improvements in metrics like AUC and PSE-SU4, underscoring its impact on personalized summarization.
PerAugy is a data augmentation methodology explicitly developed to address the challenges inherent in training personalized text summarization systems, with a particular focus on increasing the diversity and dynamic complexity of user interaction histories. Its central innovation lies in a two-stage pipeline—double shuffling and stochastic Markovian perturbation—that operates on user interaction graphs (UIGs) to create synthetic yet coherent user trajectories, thereby enabling personalized summarizers to better generalize over the subjective and evolving nature of user preferences. The technique is quantitatively evaluated using a suite of diversity and personalization metrics and has demonstrated substantial gains in user-adaptive summarization frameworks.
1. Motivation and Problem Context
PerAugy targets a major bottleneck in personalized summarization: the lack of sufficiently diverse, user-specific training data that simultaneously reflects both user preference trajectories (e.g., click/skip patterns) and associated summary targets. Canonical datasets such as MS/CAS PENS capture user behaviors but lack labeled, gold-standard summaries for each interaction, which precludes fully supervised personalization. Furthermore, existing interaction histories often suffer from limited topic-transition diversity, thereby narrowing the spectrum of user behavior observable by learning algorithms.
The methodology is founded on the hypothesis that “the trajectory diversity of training data is directly proportional to the personalization capabilities” of a summarizer. By systematically enhancing the diversity and thematic shift in the training set through augmentation, PerAugy aims to build models more robust to the multifaceted and subjective nature of summarization tasks (Chatterjee et al., 11 Oct 2025).
2. Augmentation Pipeline: Double Shuffling and Stochastic Markovian Perturbation
PerAugy comprises two principal augmentation mechanisms:
Double Shuffling (DS)
Double Shuffling is a cross-trajectory augmentation technique that modifies an existing set of user trajectories as follows:
- A seed set of user trajectories is sampled from the base UIG dataset.
- For each target trajectory, an offset is randomly selected to preserve the initial, natural portion.
- At predetermined indices, separated by a fixed gap length, a segment of the target trajectory is substituted with a segment drawn from a different (source) user's trajectory.
- This segmental substitution is performed bi-directionally, producing hybrid user profiles that amplify interaction pattern diversity.
For example, given a click/skip sequence such as:
CLK:MT → CLK:YR → SKP:PR → CLK:GW → SKP:TA
the procedure may protect the first three events, then insert two consecutive steps from another user's trajectory, followed by additional substitutions downstream.
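The DS procedure described above can be sketched in a few lines. The function name and the `offset`/`gap`/`seg_len` parameters below are illustrative assumptions, not the paper's exact formulation:

```python
def double_shuffle(target, source, offset, gap, seg_len):
    """Hypothetical sketch of Double Shuffling (DS).

    Starting after `offset`, segments of length `seg_len`, separated by
    `gap` untouched steps, are swapped between the target and source
    trajectories. Parameter names are illustrative assumptions.
    """
    t, s = list(target), list(source)
    i = offset
    while i + seg_len <= min(len(t), len(s)):
        # bi-directional substitution: both trajectories exchange the segment
        t[i:i + seg_len], s[i:i + seg_len] = s[i:i + seg_len], t[i:i + seg_len]
        i += seg_len + gap
    return t, s

u1 = ["CLK:MT", "CLK:YR", "SKP:PR", "CLK:GW", "SKP:TA", "CLK:QQ", "SKP:ZZ"]
u2 = ["SKP:AA", "CLK:BB", "CLK:CC", "SKP:DD", "CLK:EE", "SKP:FF", "CLK:GG"]
# protect the first three events, then swap a two-step segment
h1, h2 = double_shuffle(u1, u2, offset=3, gap=1, seg_len=2)
print(h1)  # → ['CLK:MT', 'CLK:YR', 'SKP:PR', 'SKP:DD', 'CLK:EE', 'CLK:QQ', 'SKP:ZZ']
```

Because the swap is applied in both directions, a single pass over a trajectory pair yields two hybrid profiles.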
Stochastic Markovian Perturbation (SMP)
While DS injects diversity, synthetically constructed user paths may exhibit incongruent summary nodes (“s-nodes”) that do not correspond to a natural thematic flow. SMP addresses this issue through local coherence repair:
- For every substituted s-node at time step $t$, the context is defined as a window of the $w$ preceding steps.
- Candidate replacement sentences for the s-node are identified within the content of the previous document node $d_{t-1}$.
- The optimal candidate $s^{*}$ minimizes a context-sensitive distance function:

$s^{*} = \arg\min_{s \in \mathcal{C}} \sum_{i=1}^{w} e^{-\gamma i} \, d(\mathbf{e}_{s}, \mathbf{e}_{c_i})$

where $\mathbf{e}_{s}$ and $\mathbf{e}_{c_i}$ are (e.g., SBERT-based) embeddings for the candidate sentence and context tokens, $d(\cdot,\cdot)$ denotes a distance metric (such as RMSD or Manhattan distance), $i$ is a position indicator within the window, and $\gamma$ is an exponential decay constant.
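The repair step amounts to a recency-weighted nearest-candidate search. The sketch below substitutes a toy bag-of-words embedding and Manhattan distance for the SBERT embeddings discussed above; the function names, vocabulary, and decay constant are all illustrative assumptions:

```python
import math

def smp_select(candidates, context, embed, decay=0.5):
    """Sketch of the SMP repair step: choose the candidate sentence whose
    embedding minimizes a recency-weighted distance to the context window.
    The embedding function, decay constant, and Manhattan distance are
    illustrative assumptions."""
    def manhattan(a, b):
        return sum(abs(x - y) for x, y in zip(a, b))

    def score(cand):
        v = embed(cand)
        w = len(context)
        # context[-1] is the most recent step; older steps are down-weighted
        return sum(math.exp(-decay * (w - 1 - i)) * manhattan(v, embed(c))
                   for i, c in enumerate(context))

    return min(candidates, key=score)

# toy bag-of-words embedding over a tiny vocabulary (assumption)
VOCAB = ["market", "tech", "sports", "rates"]
def embed(text):
    words = text.lower().split()
    return [float(words.count(t)) for t in VOCAB]

context = ["rates rise as market reacts", "tech market rally continues"]
candidates = ["market outlook improves", "sports final tonight"]
print(smp_select(candidates, context, embed))  # → market outlook improves
```

The thematically continuous candidate wins because its embedding stays close to the recent context, which is the local-coherence behavior SMP is designed to restore.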
Through the interplay of DS and SMP, PerAugy generates augmented user interaction graphs that are both thematically diverse and locally coherent.
3. Evaluation and Comparative Baselines
PerAugy is benchmarked against several state-of-the-art user-encoder and summarization methods:
- Four user-encoders—NAML, NRMS, EBNR, and TrRMIo—are retrained from scratch on PerAugy-augmented data. With DS+SMP, these models consistently achieve higher accuracy (e.g., AUC 0.59) relative to previous augmentation methods (e.g., PENS-SH, S3-Aug).
- For summarization, two architectures are studied:
- PENS: a pointer-generator network that incorporates user embeddings during summary (headline) generation.
- GTP (“General Then Personal”): decomposes summarization into a general stage and a user-specific adaptation stage using refined user encodings.
When user encoders trained via PerAugy are integrated into or fine-tuned within these frameworks, personalized summarization accuracy demonstrably increases. Notably, the GTP+TrRMIo configuration enhanced with PerAugy achieves a 61.2% average relative boost in the PSE-SU4 degree-of-personalization metric.
The table below summarizes improvements with PerAugy augmentation:
| Baseline | Metric | Gain with PerAugy |
|---|---|---|
| NAML, NRMS, EBNR, TrRMIo | AUC, MRR, nDCG | Up to 0.132↑ (AUC) |
| GTP+TrRMIo | PSE-SU4 | 61.2%↑ |
| PENS | PSE-SU4 | Up to 75%↑ (in some settings) |
4. Metrics: Accuracy, Personalization, and Diversity
PerAugy’s performance is evaluated along two main axes:
Accuracy of User-Encoders
- AUC (Area Under the ROC Curve): Probability that a positive (clicked) document ranks above negative (skipped) ones.
- MRR (Mean Reciprocal Rank): Measures how early the first correct (clicked) item appears in the ranking.
- nDCG@k (Normalized Discounted Cumulative Gain): Assesses both the relevance and position of positive items in the ranking.
All metrics improve consistently when training with PerAugy-augmented user trajectories.
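These three ranking metrics can be computed directly from per-impression relevance scores and click/skip labels. The minimal sketch below follows their standard definitions, not any implementation from the paper:

```python
import math

def auc(scores, labels):
    """Pairwise AUC: probability a clicked doc outscores a skipped one."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def mrr(scores, labels):
    """Reciprocal rank of the first clicked doc in the score ordering."""
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    for rank, i in enumerate(order, start=1):
        if labels[i] == 1:
            return 1.0 / rank
    return 0.0

def ndcg_at_k(scores, labels, k):
    """DCG of the top-k ranking, normalized by the ideal ordering."""
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    dcg = sum(labels[i] / math.log2(r + 1)
              for r, i in enumerate(order[:k], start=1))
    idcg = sum(rel / math.log2(r + 1)
               for r, rel in enumerate(sorted(labels, reverse=True)[:k], start=1))
    return dcg / idcg if idcg else 0.0

labels = [1, 0, 1, 0]          # 1 = clicked, 0 = skipped
scores = [0.9, 0.8, 0.3, 0.1]  # user-encoder relevance scores
print(auc(scores, labels))                     # → 0.75
print(mrr(scores, labels))                     # → 1.0
print(round(ndcg_at_k(scores, labels, 4), 3))  # → 0.92
```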
Personalization: PSE-SU4
- PSE-SU4 (PerSEval with ROUGE-SU4): Measures the degree of personalization, computed from the skip-bigram (ROUGE-SU4) overlap between generated and reference summaries for each user.
Successful augmentation raises PSE-SU4 by up to 75% depending on the model, confirming gains in personalized response quality.
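ROUGE-SU4, the overlap measure underlying PSE-SU4, counts unigrams plus skip-bigrams whose positions are close together. A minimal sketch, where the exact gap convention and F1 aggregation are common-implementation assumptions:

```python
from collections import Counter
from itertools import combinations

def su4_grams(tokens, max_dist=4):
    """Unigrams plus in-order skip-bigrams whose positions are at most
    `max_dist` apart (one common reading of the SU4 gap convention)."""
    grams = [(t,) for t in tokens]
    grams += [(tokens[i], tokens[j])
              for i, j in combinations(range(len(tokens)), 2)
              if j - i <= max_dist]
    return grams

def rouge_su4_f1(candidate, reference):
    """F1 over the multiset overlap of SU4 grams."""
    c, r = Counter(su4_grams(candidate)), Counter(su4_grams(reference))
    overlap = sum((c & r).values())
    if overlap == 0:
        return 0.0
    prec = overlap / sum(c.values())
    rec = overlap / sum(r.values())
    return 2 * prec * rec / (prec + rec)

cand = "stocks rally on rate cut".split()
ref = "stocks rally after rate cut".split()
print(round(rouge_su4_f1(cand, ref), 3))  # → 0.667
```

PerSEval then aggregates such per-user overlap scores into a degree-of-personalization measure; that aggregation step is not reproduced here.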
Diversity Metrics
Three explicit metrics quantify the degree of variation in user trajectories:
- Topics per Trajectory (TP): Number of unique topics addressed in a trajectory.
- Rate of Topic Change (RTC): Proportion of adjacent time steps with a topic switch.
- Degree-of-Diversity (DegreeD): Measures alignment between the divergence of document nodes and summary nodes across a user’s trajectory.
$DegreeD(\mathcal{D}) = \frac{\alpha}{|\mathbf{U}|} \sum_{j=1}^{|\mathbf{U}|} \delta_{s_j} \cdot \mathbb{E}_j$
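The two simpler diversity measures, TP and RTC, can be computed directly from a trajectory's topic sequence; a minimal sketch (DegreeD is omitted because it depends on node-level divergence terms not reproduced here):

```python
def topics_per_trajectory(topics):
    """TP: number of unique topics visited in one trajectory."""
    return len(set(topics))

def rate_of_topic_change(topics):
    """RTC: fraction of adjacent time steps where the topic switches."""
    if len(topics) < 2:
        return 0.0
    switches = sum(a != b for a, b in zip(topics, topics[1:]))
    return switches / (len(topics) - 1)

traj = ["finance", "finance", "sports", "tech", "sports"]
print(topics_per_trajectory(traj))  # → 3
print(rate_of_topic_change(traj))   # → 0.75
```

Note that a trajectory merely oscillating between two topics scores a high RTC but a low TP, which illustrates why RTC alone can overestimate diversity.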
Empirical analysis confirms that both TP and DegreeD correlate strongly with user-encoder accuracy (Pearson correlation), while RTC can overestimate diversity if topic switching is limited to a small topic subset.
5. Impact on Personalized Summarization Frameworks
Injection of PerAugy-augmented user-encoder representations into leading summarizer frameworks translates directly into improved personalization and generalization. Both PENS and GTP benefit from user encoders expressing more diverse and representative behavioral patterns: the resultant summaries show higher divergence across user profiles, increased alignment with user histories, and robust performance even in regimes of limited labeled data.
A key implication is that dataset-induced diversity, particularly as captured by TP and DegreeD, is a controlling variable for model personalization performance. This facilitates more nuanced assessments of dataset quality and encourages targeted data construction or augmentation strategies as a core component in system design.
6. Broader Implications and Future Directions
PerAugy sets a precedent for augmentation-based personalization in sequence-modeling applications beyond summarization. Potential extensions include:
- Combining trajectory augmentation with LLM-driven context perturbations for more adaptive and realistic user-path synthesis. This suggests opportunities for hybrid augmentation pipelines.
- Establishing formal connections to stochastic process theory, such as the Itô process, to ground the perturbation steps, which may yield better-grounded models of user evolution.
- Applying the proposed diversity metrics (TP, RTC, DegreeD) to guide selection and pruning of synthetic samples, further optimizing training resource allocation.
PerAugy provides a generalizable augmentation paradigm with an extensible toolkit for quantifying and exploiting user-level diversity, serving as a reference point for future work in user-adaptive text summarization and personalized sequence modeling.
7. Summary
PerAugy is a two-stage data augmentation framework (DS and SMP) operating on user interaction graphs to increase diversity and thematic realism in synthetic user trajectories. Its effectiveness is validated via substantial improvements in user-encoder and personalized summarizer accuracy across AUC, MRR, nDCG, and PSE-SU4, with accompanying quantitative validation using TP, RTC, and DegreeD diversity metrics (Chatterjee et al., 11 Oct 2025). The methodology demonstrates that increased and well-aligned diversity in user data is pivotal to improved personalization in text summarization, and it offers both design principles and evaluative criteria applicable across a range of sequence-based personalization systems.