
REXHA: Hierarchical Recommendation Explanation

Updated 16 July 2025
  • The paper introduces a novel framework that overcomes profile deviation and high retrieval overhead in LLM-based explainable recommendation systems.
  • It deploys a two-stage hierarchical review aggregation and dual-query retrieval strategy to construct concise, noise-reduced user/item profiles.
  • Experimental results show up to a 12.6% improvement in explanation quality and retrieval latency under 1 second at inference, highlighting both efficacy and efficiency.

Retrieval-Augmented Recommendation Explanation Generation with Hierarchical Aggregation (REXHA) is a framework designed to generate high-quality, efficient, and personalized explanations within explainable recommender systems that utilize LLMs. REXHA combines collaborative filtering signals, hierarchical review aggregation, and retrieval-augmented natural language generation to address both the issue of “profile deviation” (loss of context in user/item profiles) and high retrieval overhead often seen in earlier LLM-based explainable recommendation models (Sun et al., 12 Jul 2025).

1. System Overview and Motivation

REXHA aims to enhance the transparency and trustworthiness of the recommendation process by generating explanations that are not only accurate and contextually grounded, but also computationally efficient. The motivation is to address two key shortcomings of previous LLM-powered explainable recommender systems:

  • Profile Deviation: Naïve approaches construct user/item profiles by randomly sampling a small set of reviews, which leads to lost contextual information and limited profile representativeness.
  • High Retrieval Overhead: Traditional retrieval techniques (e.g., graph traversal or exhaustive search) have high computational complexity, leading to impractical inference latency for large-scale deployments.

REXHA integrates three main modules:

  • Collaborative Signal Extraction (e.g., LightGCN)
  • Hierarchical Aggregation-Based Profiling (multi-layer review summarization)
  • Efficient Review Retrieval (pseudo-document queries)

The resulting system produces enriched user/item profiles and relevant review evidence, which are combined with collaborative embeddings and input to an LLM (such as LLaMA-2-7B) to generate contextualized explanation texts.

2. Hierarchical Aggregation for Profile Construction

The hierarchical aggregation module is foundational for constructing comprehensive and noise-reduced user and item profiles. The process unfolds in two major stages:

a. Raw Review Summarization

  • At the leaf level, each individual review is considered a raw input.
  • Reviews are grouped (typically as pairs) and summarized using an LLM with a specifically designed prompt. Each group is condensed to a shorter summary.
  • This process forms the first layer of a hierarchical tree, analogous to merging in a $k$-ary tree (binary when reviews are grouped in pairs).

b. Multi-Layered Aggregation

  • The first-level summaries are further grouped and summarized recursively, layer by layer.
  • Summarization continues up the hierarchy until a single root summary (“profile”) is produced for the user or item.
  • For users, aggregation is performed across all reviews of interacted items; for items, it includes all written reviews.

This design works around LLM context-length limits and minimizes information loss, ensuring the resulting profile succinctly captures long-term preferences and thematic coherence while filtering out review noise and redundancy.
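
As an illustration, the two-stage aggregation can be viewed as a bottom-up reduction over a review tree. The following is a minimal sketch, not the paper's implementation: `summarize_group` is a hypothetical stand-in for an LLM call with the paper's summarization prompt.

```python
from typing import Callable, List

def hierarchical_aggregate(
    reviews: List[str],
    summarize_group: Callable[[List[str]], str],
    group_size: int = 2,
) -> str:
    """Bottom-up aggregation: merge groups of texts layer by layer
    until a single root summary (the profile) remains."""
    if not reviews:
        return ""
    level = reviews
    while len(level) > 1:
        next_level = []
        for i in range(0, len(level), group_size):
            group = level[i:i + group_size]
            # Singleton leftovers pass through; full groups are condensed.
            next_level.append(group[0] if len(group) == 1 else summarize_group(group))
        level = next_level
    return level[0]

# Toy usage with a stand-in "summarizer" that just joins texts:
profile = hierarchical_aggregate(
    ["Great battery life.", "Screen is dim.", "Fast shipping.", "Sturdy build."],
    summarize_group=lambda texts: " / ".join(texts),
)
```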

3. Efficient Review Retrieval via Pseudo-Document Queries

To supply the explanation generator with additional auxiliary evidence, REXHA introduces a dual-query retrieval module that sharply reduces retrieval latency while maintaining recall of relevant review information:

a. Latent Representation Query

  • All summaries of the user’s and item’s reviews are embedded via an encoder $f$.
  • The user and item embedding sets are $Q_u = \{\hat{r}_{u,1}, \ldots, \hat{r}_{u,N}\}$ and $Q_v = \{\hat{r}_{v,1}, \ldots, \hat{r}_{v,M}\}$.
  • The latent query vector is constructed as the average:

$$q_\text{latent} = \frac{1}{2} \left( \frac{1}{N}\sum_{i=1}^{N} \hat{r}_{u,i} + \frac{1}{M}\sum_{j=1}^{M} \hat{r}_{v,j} \right)$$

  • This query captures salient, global review semantics for efficient retrieval.
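
Concretely, the latent query is the mean of the two per-side mean embeddings. A minimal NumPy sketch, assuming `user_embs` and `item_embs` hold the summary embeddings produced by the encoder $f$:

```python
import numpy as np

def latent_query(user_embs: np.ndarray, item_embs: np.ndarray) -> np.ndarray:
    """q_latent = 0.5 * (mean user review embedding + mean item review embedding).

    user_embs: shape (N, d); item_embs: shape (M, d).
    """
    return 0.5 * (user_embs.mean(axis=0) + item_embs.mean(axis=0))
```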

b. Profile Query

  • The constructed user and item profiles are encoded (potentially with a distinct, contrastively trained embedding model $f'$).
  • The retrieval objective aligns the profile’s embedding to relevant review embeddings by minimizing the contrastive loss:

$$\mathcal{L}_\text{contrastive} = -\log \frac{\exp(\text{sim}(p_{u,v}, \hat{r}_{u,v})/\tau)}{\exp(\text{sim}(p_{u,v}, \hat{r}_{u,v})/\tau) + \sum_{(u',v')\neq(u,v)} \exp(\text{sim}(p_{u',v'}, \hat{r}_{u',v'})/\tau)}$$

where $\text{sim}(\cdot, \cdot)$ denotes cosine similarity and $\tau$ is a temperature parameter.
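
This loss has the familiar InfoNCE form. Below is a PyTorch sketch of one plausible in-batch implementation; the batching scheme is an assumption, and the paper's training details may differ.

```python
import torch
import torch.nn.functional as F

def contrastive_loss(profile_embs: torch.Tensor,
                     review_embs: torch.Tensor,
                     tau: float = 0.07) -> torch.Tensor:
    """In-batch InfoNCE: row i of each tensor is a positive (profile, review)
    pair; all other rows in the batch serve as negatives."""
    p = F.normalize(profile_embs, dim=-1)    # unit vectors: dot product == cosine
    r = F.normalize(review_embs, dim=-1)
    logits = p @ r.T / tau                   # [B, B] similarity matrix
    targets = torch.arange(p.size(0), device=p.device)
    return F.cross_entropy(logits, targets)  # -log softmax over the positives
```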

Both queries retrieve the top-$q$ reviews by cosine similarity, providing rich and relevant review segments to ground explanation generation. This approach is computationally efficient, with inference retrieval consistently below 1 second, in contrast to multi-minute latencies in prior art.
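
Retrieval then reduces to a cosine top-$q$ lookup per query. A minimal NumPy sketch of the dual-query step follows; combining the two result sets by union is an assumption, and the paper may merge them differently.

```python
import numpy as np

def top_q(query: np.ndarray, corpus: np.ndarray, q: int) -> np.ndarray:
    """Indices of the q corpus rows most cosine-similar to `query`."""
    qn = query / np.linalg.norm(query)
    cn = corpus / np.linalg.norm(corpus, axis=1, keepdims=True)
    return np.argsort(-(cn @ qn))[:q]

def dual_query_retrieve(q_latent, q_profile, review_embs, q=5):
    # Union of hits from the latent query and the profile query.
    hits = set(top_q(q_latent, review_embs, q)) | set(top_q(q_profile, review_embs, q))
    return sorted(hits)
```

Exact cosine search over precomputed summary embeddings is linear in corpus size, which is what keeps per-request retrieval sub-second.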

4. Integration with Collaborative Filtering and Explanation Generation

A graph neural network (often LightGCN) is employed to extract collaborative signals from the user–item interaction graph. These embeddings encode broader behavior patterns and interaction structure, complementing the textual profiles and retrieved reviews.

All these inputs—user profile, item profile, retrieved reviews, and collaborative embeddings—are concatenated and passed to an LLM for natural language explanation generation. The LLM, leveraging this enriched, multi-source context, generates explanations that exhibit both factual precision (by referencing retrieved evidence) and holistic understanding of user–item relations.
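
For intuition, the textual portion of the LLM input might be assembled as below. The field labels are illustrative, not from the paper, and the collaborative embeddings would in practice be injected as continuous vectors (e.g., soft prompt tokens) rather than as text.

```python
def build_explanation_prompt(user_profile: str, item_profile: str,
                             retrieved_reviews: list) -> str:
    """Assemble the textual context passed to the explanation LLM.

    Collaborative embeddings are omitted here: in practice they would be
    injected as continuous vectors, not serialized into the prompt.
    """
    evidence = "\n".join(f"- {r}" for r in retrieved_reviews)
    return (
        f"User profile:\n{user_profile}\n\n"
        f"Item profile:\n{item_profile}\n\n"
        f"Relevant reviews:\n{evidence}\n\n"
        "Explain why this item is recommended to this user."
    )
```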

5. Experimental Evaluation and Empirical Results

REXHA was evaluated across standard explainable recommendation datasets, including Amazon-books, Yelp, and Google-reviews. Key findings include:

  • Explanation Quality: REXHA achieved up to a 12.6% improvement in BERTScore precision over state-of-the-art models (XRec, G-Refer), with consistent gains in recall, F1, and GPT-based (LLM) evaluation scores.
  • Retrieval Efficiency: Inference retrieval time for REXHA was under 1 second, compared to over 4 minutes for competitive baselines.
  • Ablation Studies: Both hierarchical aggregation and the retrieval module were shown to be critical; removing either degraded quality and/or efficiency.

Relevant formulas:

  • Collaborative signal extraction (LightGCN propagation; a code sketch follows these formulas):

$$e_u^{(l+1)} = \sum_{i \in \mathcal{N}_u} \frac{1}{\sqrt{|\mathcal{N}_u|\,|\mathcal{N}_i|}}\, e_i^{(l)}$$

  • Latent query construction:

$$q_\text{latent} = (q_u + q_v)/2$$

where $q_u$ and $q_v$ are the mean user and item review-summary embeddings, respectively.
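
A sketch of the propagation rule above (this is the standard LightGCN update; the sparse-matrix details are illustrative, not from the paper):

```python
import numpy as np
import scipy.sparse as sp

def lightgcn_propagate(adj: sp.csr_matrix, emb: np.ndarray, n_layers: int = 3) -> np.ndarray:
    """Apply e^{(l+1)} = D^{-1/2} A D^{-1/2} e^{(l)} and average the layers.

    adj: symmetric (users+items) x (users+items) interaction adjacency.
    emb: initial embeddings e^{(0)}, shape (num_nodes, d).
    """
    deg = np.asarray(adj.sum(axis=1)).ravel()
    d_inv_sqrt = np.zeros_like(deg, dtype=np.float64)
    nz = deg > 0
    d_inv_sqrt[nz] = deg[nz] ** -0.5
    a_hat = sp.diags(d_inv_sqrt) @ adj @ sp.diags(d_inv_sqrt)
    layers = [emb]
    for _ in range(n_layers):
        layers.append(a_hat @ layers[-1])  # neighborhood-normalized propagation
    return np.mean(layers, axis=0)         # standard LightGCN layer-mean readout
```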

Performance was measured with automatic metrics (BERTScore and GPTScore), with standard deviations reported to assess stability.

6. Analysis of Computational and Practical Considerations

While the hierarchical aggregation module incurs higher preprocessing cost (potentially up to 20 hours for large datasets), it is significantly faster and more scalable than prior review summarization approaches. Inference retrieval and explanation generation are highly efficient, supporting scalable real-world deployment. The review retrieval module is parallelizable and compatible with fast nearest neighbor search implementations.
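
For example, with L2-normalized vectors inner-product search coincides with cosine search, so a FAISS index can serve the lookup directly. A minimal sketch under assumed parameters; the paper does not specify an index type:

```python
import numpy as np
import faiss  # pip install faiss-cpu

d = 768                                    # embedding dimension (assumed)
review_embs = np.random.rand(100_000, d).astype("float32")
faiss.normalize_L2(review_embs)            # unit vectors: inner product == cosine

index = faiss.IndexFlatIP(d)               # exact search; swap in IVF/HNSW at scale
index.add(review_embs)

query = np.random.rand(1, d).astype("float32")
faiss.normalize_L2(query)
scores, ids = index.search(query, 5)       # top-5 most similar reviews
```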

Identified limitations include preprocessing overhead and the absence of deeper post-retrieval filtering (e.g., re-ranking or information extraction), suggesting directions for further optimization. The framework’s modular design accommodates advances in both LLM architectures and collaborative filtering models.

7. Implications, Limitations, and Future Directions

REXHA marks a significant advance in explainable recommender systems, combining hierarchical review aggregation, efficient retrieval, and LLM-based generation for high-quality, trustworthy explanations. Its efficient retrieval module supports low-latency inference, and hierarchical summarization ensures context-rich profiles.

Potential areas for future work include:

  • Dynamic pruning of aggregation trees to further reduce computation.
  • Enhanced post-retrieval processing (e.g., re-ranking, keyphrase extraction).
  • Stronger alignment of textual profiles and review representations via improved contrastive learning, especially for lexically diverse domains.

The framework holds strong promise for real-world explainable recommendation settings where both interpretability and computational efficiency are required. Its combination of architectural rigor and empirical performance positions REXHA as a compelling state-of-the-art solution for retrieval-augmented, hierarchically aggregated recommendation explanation generation (Sun et al., 12 Jul 2025).
