
PlanPers: Personalized Plan-RAG Framework

Updated 28 January 2026
  • The paper introduces PlanPers, which integrates planning modules and personalized signals into RAG systems to achieve up to 15% performance gains on key benchmarks.
  • PlanPers is a modular architecture that formalizes personalization as a control problem, using structured planning and contrastive retrieval to tailor responses based on user-specific features.
  • The system employs a multi-stage pipeline—personalized query rewriting, adaptive retrieval, and action plan generation—resulting in enhanced metrics like ROUGE, BLEU, and improved stylistic fidelity.

Plan-RAG Personalization (PlanPers) is a class of architectures that extend retrieval-augmented generation (RAG) systems with explicit personalization through planning modules, fine-grained user or author features, and contrastive or adaptive retrieval protocols. PlanPers frameworks are characterized by their integration of structured, user- or context-tailored planning and retrieval strategies into the RAG pipeline. The goal is to enhance the semantic alignment, stylistic fidelity, and goal-orientation of generated responses or actions, leveraging diverse personalization signals and advanced pipeline control.

1. Formal Definitions and Mathematical Foundations

PlanPers formalizes personalization as a control problem over the RAG pipeline in which user-specific signals and an explicit planning module guide both retrieval and generation. Let $q \in Q$ denote the user query, $p \in P$ the user (or author) profile, $C$ an external corpus, and $M$ a dynamic user memory. Model parameters are grouped as $\theta_P$ (planner), $\theta_R$ (retriever), and $\theta_G$ (generator). The four core stages are:

  • Pre-retrieval: personalized query rewriting, $q^* = \mathcal{Q}(q, p)$
  • Retrieval: conditional retrieval, $D^* = \mathcal{R}(q^*, C, p; \theta_R)$
  • Planning: generation of action plans, $\pi = \mathcal{P}(q^*, p, D^*, M; \theta_P)$ with $\pi = (a_1, \ldots, a_T)$
  • Generation: response synthesis, $g = \mathcal{G}(D^*, p, \pi; \theta_G)$

The composite objective incorporates task success and personalization alignment: $L(\theta_P, \theta_G) = \mathbb{E}_{q,p}\left[ L_{\mathrm{task}}(g, y) + \lambda L_{\mathrm{pref}}(g, p) \right]$, where $y$ is the ground-truth response. In reinforcement settings, rewards $r(q, p) = \alpha\, r_{\mathrm{succ}}(g, y) + \beta\, r_{\mathrm{align}}(g, p)$ are used to update planning/generation policies by policy gradient (Li et al., 14 Apr 2025, Yazan et al., 24 Mar 2025).
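The four stages and the composite objective can be sketched end-to-end as follows. This is a minimal toy illustration: every function body (lexical overlap as retrieval, string-building as generation) is an illustrative stand-in, not the papers' implementation.

```python
# Minimal sketch of the four PlanPers stages q* -> D* -> pi -> g, plus the
# composite loss L = L_task + lambda * L_pref. All internals are toy stand-ins.

def rewrite_query(q, profile):
    """Pre-retrieval: personalize the query with profile terms (q* = Q(q, p))."""
    return q + " " + " ".join(profile.get("interests", []))

def retrieve(q_star, corpus, k=2):
    """Retrieval: score documents by token overlap with the rewritten query."""
    q_tokens = set(q_star.lower().split())
    scored = sorted(corpus, key=lambda d: -len(q_tokens & set(d.lower().split())))
    return scored[:k]

def plan(q_star, docs, memory):
    """Planning: emit an action sequence pi = (a_1, ..., a_T)."""
    return [("read", d) for d in docs] + [("synthesize", q_star)]

def generate(docs, profile, pi):
    """Generation: synthesize a response conditioned on docs, profile, and plan."""
    return f"[{profile['style']}] answer from {len(docs)} docs in {len(pi)} steps"

def composite_loss(l_task, l_pref, lam=0.5):
    """Single-sample version of L(theta_P, theta_G) = L_task + lambda * L_pref."""
    return l_task + lam * l_pref

corpus = ["neural retrieval methods", "cooking pasta recipes", "rag planning agents"]
profile = {"interests": ["retrieval", "planning"], "style": "concise"}
q_star = rewrite_query("rag agents", profile)
docs = retrieve(q_star, corpus)
pi = plan(q_star, docs, memory={})
print(generate(docs, profile, pi))
print(composite_loss(0.8, 0.4))  # 0.8 + 0.5 * 0.4
```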

2. Pipeline Architectures and Algorithmic Modules

PlanPers extends canonical RAG by sandwiching a planner between retrieval and generation. This planner outputs a structured plan $\pi$ (e.g., sub-queries, tool/API calls, memory operations) reflecting personal context.

Generic PlanPers Pipeline:

  1. Feature extraction and context summarization (user/author features, historical traits)
  2. Personalized or adaptive retrieval, including contrastive selection (e.g., hard negatives from out-of-profile sources)
  3. Action plan generation via neural planner or sequence generator
  4. Per-action retrieval/generation as dictated by $\pi$
  5. Final integration and output synthesis

A typical PlanPers pseudocode block in author/persona modeling (Yazan et al., 24 Mar 2025, Li et al., 14 Apr 2025) is:

Input: query q, profile p, profile history P_p, global profile P_all
Output: personalized generation ŷ
1. Extract user-specific features (sentiment, word frequency, syntax patterns)
2. Retrieve top-K from P_p based on q
3. Sample hard-negative documents from globally least-similar profiles
4. Form enriched prompt with retrieved contexts and user feature vectors
5. Generate response ŷ with LLM using the personalized context
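The pseudocode above can be sketched in Python. The similarity function and feature extractor below are toy lexical stand-ins for the learned embedding models and feature projectors the papers describe; all helper names are hypothetical.

```python
from collections import Counter

def similarity(a, b):
    """Toy lexical similarity: token-overlap count (stand-in for an embedding dot product)."""
    return len(set(a.lower().split()) & set(b.lower().split()))

def extract_features(history):
    """Step 1: crude user features -- here just word frequencies over the history."""
    return Counter(w for doc in history for w in doc.lower().split())

def personalized_prompt(q, profile_history, global_profiles, k=2, n_neg=1):
    # Step 2: retrieve top-K documents from the user's own profile history.
    top_k = sorted(profile_history, key=lambda d: -similarity(q, d))[:k]
    # Step 3: sample hard negatives from globally least-similar profiles.
    negatives = sorted(global_profiles, key=lambda d: similarity(q, d))[:n_neg]
    # Step 4: form the enriched prompt; an LLM would consume this in step 5.
    feats = extract_features(profile_history)
    return (f"Query: {q}\nUser docs: {top_k}\nAvoid style of: {negatives}\n"
            f"Top user words: {feats.most_common(3)}")

prompt = personalized_prompt(
    "rocket launch news",
    profile_history=["rocket launch delayed", "cat food review"],
    global_profiles=["unrelated doc"],
    k=1,
)
print(prompt)
```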

In dialogue or multi-source settings, PlanPers can be realized as a unified sequence-to-sequence system with special tokens (“acting tokens” for source/tool selection, “evaluation tokens” for relevance scoring) (Wang et al., 2024). Each phase—planning, retrieval, generation—is cast as token prediction within a single autoregressive paradigm.

3. Personalization Mechanisms and Signal Integration

Personalization in PlanPers operates at several abstraction levels:

  • Explicit Feature Injection: Features such as sentiment polarity $s_a$, word-frequency vectors $w_a$, and dependency-pattern counts $d_a$ are projected via learned matrices into the LLM’s embedding space, concatenated to the context, and made available to the generator (Yazan et al., 24 Mar 2025).
  • Contrastive Examples: Negative or contrastive samples from dissimilar profiles/authors are incorporated, refining the system’s discrimination between target and generic traits. Scoring functions penalize redundancy with sampled contrastive negatives:

$\mathrm{score}(q, r) = q^\top r - \lambda \sum_{r^-} q^\top r^-$

Adaptive retriever training can include a contrastive loss, pushing embeddings of personal content close and negatives apart (Yazan et al., 24 Mar 2025).

  • Planner-Guided Personalization: An LLM-based (or similar) planner proposes a high-level sequence of actions, including staged retrievals keyed to personal memory “slots,” API invocations, or evidence fusion steps (Li et al., 14 Apr 2025).
  • Personalized Query Expansion: Systems such as PBR layer user style and semantic structure signals directly into the embedding used for retrieval. Components include Pseudo-Relevance Feedback (P-PRF) for style, and P-Anchor for graph-based structural alignment, culminating in a fused personalized query representation (Zhang et al., 10 Oct 2025).
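The contrastive scoring rule above can be illustrated directly: each candidate's dot product with the query is penalized by its similarity to sampled hard negatives. The embeddings below are hand-picked 2-D toy vectors, not model outputs.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def contrastive_score(q, r, negatives, lam=0.1):
    """score(q, r) = q^T r - lambda * sum over r_neg of q^T r_neg."""
    return dot(q, r) - lam * sum(dot(q, rn) for rn in negatives)

q = [1.0, 0.0]            # query embedding (toy)
r_personal = [0.9, 0.1]   # document aligned with the user's direction
r_generic = [0.5, 0.5]    # generic content
negs = [[0.6, 0.4], [0.5, 0.5]]  # hard negatives from dissimilar profiles

# The penalty term is identical for both candidates, so ranking is preserved
# while absolute scores shrink: ~0.79 for personal vs ~0.39 for generic.
print(contrastive_score(q, r_personal, negs))
print(contrastive_score(q, r_generic, negs))
```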

4. Evaluation Protocols and Experimental Evidence

Benchmarks cover domains including news, scholarly abstracts, tweet paraphrasing (LaMP-4/5/7), synthetic dialogue (PersonaBench), medical planning (MedPlan), and e-commerce. Key evaluation metrics include:

Metric | Definition (where applicable)
ROUGE-1, ROUGE-L | Lexical n-gram overlap (information preservation, fluency)
Personalization Accuracy | Task-specific measure; e.g., frequency of personalized entities/style in output
BLEU, METEOR, BERTScore | Lexical and semantic overlap (medical, dialogue)
Planning Success Rate (PSR) | Fraction of plans achieving user goals: $\frac{1}{N} \sum_i \mathbf{1}(\mathrm{exec}(\pi_i) \models \mathrm{goal}_i)$
Preference Alignment Score | Average cosine similarity of user profile–output embeddings
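The two formula-based metrics are straightforward to compute. In this sketch, the plan executor is a hypothetical predicate over symbolic plans and the embeddings are raw vectors; real evaluations use executed plans and learned encoders.

```python
import math

def psr(plans, goal_check):
    """Planning Success Rate: fraction of executed plans satisfying the user goal."""
    return sum(1 for p in plans if goal_check(p)) / len(plans)

def cosine(u, v):
    nu = math.sqrt(sum(x * x for x in u))
    nv = math.sqrt(sum(x * x for x in v))
    return sum(a * b for a, b in zip(u, v)) / (nu * nv)

def preference_alignment(profile_embs, output_embs):
    """Average cosine similarity of user-profile vs. output embeddings."""
    sims = [cosine(p, o) for p, o in zip(profile_embs, output_embs)]
    return sum(sims) / len(sims)

plans = [["search", "answer"], ["answer"], ["search"]]
print(psr(plans, lambda p: "answer" in p))  # 2 of the 3 plans reach an answer
```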

Key outcomes presented in (Yazan et al., 24 Mar 2025):

  • PlanPers (WF + CE) attains ROUGE-L of 0.210 on LaMP-4, a 7.1% increase over baseline RAG.
  • On LaMP-7, PlanPers (DPF + CE) yields a 16.1% relative gain in ROUGE-L over RAG.
  • Ablations confirm the importance of both author features (e.g., dependency patterns) and contrastive examples, which contribute cumulative improvements of up to 15% relative.

Qualitative findings include enhanced alignment with idiosyncratic entity naming, syntactic variation, and domain-specific term use.

5. Exemplars and Domain-Specific Variations

Medical Plan-RAG (MedPlan) (Hsu et al., 23 Mar 2025):

Implements a two-stage personalized pipeline:

  • Assessment generation (S+O → A) via retrieval-augmented LLMs using both cross-patient and self-history.
  • Plan generation (S+O+A → P) further personalized by incorporating both current and historical patient context.
  • Retrieval employs bi-encoder semantic search followed by cross-encoder re-ranking; LLM attention mechanisms process structured longitudinal data.
  • Evaluation on >350,000 notes demonstrates superior performance in BLEU, METEOR, ROUGE, and BERTScore, with a 66% higher clinical appropriateness rating over direct generation baselines.
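The two-stage retrieval pattern MedPlan uses (fast bi-encoder search, then cross-encoder re-ranking over the shortlist) can be sketched as below. The token-overlap embedder and cross-scorer are toy stand-ins for the learned models, and the note texts are invented examples.

```python
def bi_encoder_scores(query, docs, embed):
    """Stage 1: fast semantic search -- score every doc by embedding dot product."""
    q = embed(query)
    return [(d, sum(a * b for a, b in zip(q, embed(d)))) for d in docs]

def rerank(query, candidates, cross_score, top_k=2):
    """Stage 2: cross-encoder re-ranking -- jointly reads query+doc on a shortlist."""
    return sorted(candidates, key=lambda d: -cross_score(query, d))[:top_k]

def embed(text):
    """Toy bag-of-words embedder over a fixed vocabulary (stand-in for a bi-encoder)."""
    vocab = ["chest", "pain", "diabetes", "insulin", "plan"]
    toks = text.lower().split()
    return [float(w in toks) for w in vocab]

def cross_score(q, d):
    """Toy cross-encoder: token overlap between the full query and document."""
    return len(set(q.lower().split()) & set(d.lower().split()))

notes = ["chest pain workup plan", "insulin dosing for diabetes", "routine visit"]
scored = bi_encoder_scores("diabetes insulin plan", notes, embed)
shortlist = [d for d, _ in sorted(scored, key=lambda x: -x[1])[:2]]
print(rerank("diabetes insulin plan", shortlist, cross_score, top_k=1))
```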

Dialogue and Multi-Source Planning (UniMS-RAG) (Wang et al., 2024):

All planning, retrieval, and generation functions are integrated via sequence-to-sequence modeling with “acting tokens” for knowledge source selection and “evaluation tokens” for adaptive, per-evidence scoring. Self-refinement iterations optimize evidence coherence and persona consistency. Experimental evaluation achieves superior BLEU/ROUGE/persona consistency over domain and persona selection baselines.

Personalized Query Expansion (PBR) (Zhang et al., 10 Oct 2025):

User-aware query expansion fuses pseudo-feedback on user style and reasoning paths with graph-based semantic anchors derived from user corpora. This integration boosts recall and ranking metrics by up to 10% relative to strong query expansion baselines.
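One way to picture the fusion step is as a weighted combination of the base query embedding with style-feedback and graph-anchor vectors. The linear weighting below is purely illustrative; PBR's actual fusion mechanism is not specified here, and all vectors are toy values.

```python
def fuse_query(q_emb, style_emb, anchor_emb, alpha=0.6, beta=0.2, gamma=0.2):
    """Fuse base-query, style-feedback (P-PRF-like), and graph-anchor (P-Anchor-like)
    embeddings into one personalized query vector. Weights are illustrative."""
    return [alpha * a + beta * b + gamma * c
            for a, b, c in zip(q_emb, style_emb, anchor_emb)]

fused = fuse_query([1.0, 0.0], [0.0, 1.0], [0.5, 0.5])
print(fused)
```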

6. Limitations, Challenges, and Future Directions

Several constraints shape current PlanPers architectures:

  • Scalability and Efficiency: Multi-step planning and dynamic memory management remain computationally intensive, particularly for large dynamic user graphs or long-horizon tasks (Li et al., 14 Apr 2025).
  • Cold-Start and Data Sparsity: Sparse or noisy user data impedes effective personalization and may erode retrieval performance (Zhang et al., 10 Oct 2025).
  • Evaluation Gaps: Many benchmarks treat single-domain or one-shot scenarios, lacking direct measures for user satisfaction or longitudinal personalization quality.
  • Hallucination and Drift: Erroneous plan steps may induce retrieval drift or incoherence in downstream generation.

Active research directions include:

  • Lightweight adaptation: Meta-planning with adapters for resource-constrained or on-device scenarios.
  • Meta-reasoning: Planners that dynamically arbitrate between memory recall and tool invocation using uncertainty or context.
  • Advanced feature projection: Learning richer, continuous encodings for user attributes (e.g., NumeroLogic-inspired) (Yazan et al., 24 Mar 2025).
  • Multi-modal extension: Extending plan-personalized retrieval and execution to visual and audio modalities.
  • Privacy preservation: Federated learning for on-device personalization, minimizing centralized data exposure (Li et al., 14 Apr 2025).

A plausible implication is that further integration of reinforcement learning, meta-learning for contrastive selection, and robust multi-modal orchestration will generalize the PlanPers approach to a broader set of user-adaptive, knowledge-intensive applications, including complex dialogue systems, personalized QA, and agent-based planning.

7. Synthesis and Outlook

Plan-RAG Personalization (PlanPers) unifies fine-grained user modeling, dynamic planning, contrastive retrieval, and flexible action sequencing within the RAG pipeline. Across domains, PlanPers demonstrates significant quantitative and qualitative improvements—up to 15% relative gain—over baseline RAG in capturing and deploying idiosyncratic user traits without the need for per-user parameter updates or model retraining (Yazan et al., 24 Mar 2025, Li et al., 14 Apr 2025). PlanPers architectures combine explicit feature engineering, neural planning modules, and integrated contrastive strategies, setting a foundation for the next generation of user-adaptive retrieval-augmented generation systems.
