Interactive Recommendation Feed is a paradigm enabling real-time, explicit user control via natural language commands for adaptive, personalized recommendations.
It employs a dual-agent architecture with a parser and planner to update structured user preferences and dynamically re-rank items through modular tools like filtering and semantic matching.
Empirical evaluations on datasets such as Amazon and MovieLens show that IRF improves recall, NDCG, and business metrics, reaching target recommendations in fewer interaction rounds and enhancing user satisfaction.
An Interactive Recommendation Feed (IRF) is a modern recommender system paradigm that integrates fine-grained user interactions—ranging from explicit natural language commands and attribute constraints to nuanced sequence-based feedback—directly within the mainstream recommendation interface or feed. This approach replaces or augments traditional recommendation pipelines that rely mainly on passive signals (such as clicks and likes), enabling adaptive, controllable, real-time personalization at scale. IRF architectures typically employ modular agents to parse user input, update structured preference models, and dynamically orchestrate the ranking and selection of items in response to user commands, all with the objective of improving both user satisfaction and business outcomes (Tang et al., 25 Sep 2025).
1. Paradigm Shift: From Passive Feedback to Active Explicit Control
Traditional recommender systems aggregate implicit, often ambiguous feedback such as clicks, likes, or dwell time. This approach suffers from inherent limitations:
Coarse-grained signals cannot distinguish which item attributes drive satisfaction or dissatisfaction.
The system cannot easily model nuanced behavioral motivations or rapid intention drift.
User influence over recommendations is strictly indirect, leading to a persistent gap between user intentions and system interpretations.
In contrast, IRF enables active explicit control by allowing users to express preferences, constraints, or objectives in real time via natural language commands within the mainstream feed. These commands can specify desired attributes, exclude unwanted features, or combine multiple requirements (e.g., “show only long skirts for autumn, not floral, under $200”). Users can iteratively refine their interests through ongoing interaction, and these modifications immediately influence the feed composition and ranking (Tang et al., 25 Sep 2025).
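A command like the one above might be parsed into a structured preference state along these lines. This is a hypothetical sketch: the field names and nesting are illustrative assumptions, not the paper's schema.

```python
# Illustrative parse of "show only long skirts for autumn, not floral, under $200".
# Field names and structure are assumptions, not the paper's exact schema.
preference_state = {
    "positive": {
        # Hard constraints strictly filter the candidate pool.
        "hard": [
            {"attribute": "category", "value": "long skirt"},
            {"attribute": "price", "op": "<", "value": 200},
        ],
        # Soft constraints only bias the scoring.
        "soft": [{"attribute": "season", "value": "autumn"}],
    },
    "negative": {
        "hard": [{"attribute": "pattern", "value": "floral"}],
        "soft": [],
    },
}
```

Separating hard from soft constraints lets the system distinguish non-negotiable requirements (the price ceiling) from preferences that should merely tilt the ranking (the autumn styling).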
2. RecBot: Modular Dual-Agent Architecture

The IRF framework is operationalized via RecBot, which introduces a modular dual-agent architecture:
- Parser Agent: Processes free-form natural language commands. It transforms the tuple of the current recommendation list, raw user command, and previous preference state, $(R_t, c_t, P_t)$, into updated structured preferences $P_{t+1}$:

$$\mathcal{P}: (R_t, c_t, P_t) \rightarrow P_{t+1}$$

Preferences are decomposed into positive ($P_{t+1}^+$) and negative ($P_{t+1}^-$) groups, and further differentiated into "hard" and "soft" constraints. Hard constraints enforce strict filtration (e.g., price ceilings), while soft constraints bias the recommendation scoring toward nuanced, context-dependent interests.

- Planner Agent: Adapts the recommendation policy using an extensible toolset. The planner maps the updated preference state, user history $H_t$, and candidate pool $I$ to an item scoring vector $S_{t+1}$:

$$\mathcal{A}: (P_{t+1}, H_t, I) \rightarrow S_{t+1}$$

Tools include:

- Filter: Strictly removes candidates violating explicit hard constraints:

$$I' = \{ i \in I : \mathcal{C}^+(i, C_{t+1}^{(+, \text{hard})}) = 1 \wedge \mathcal{C}^-(i, C_{t+1}^{(-, \text{hard})}) = 0 \}$$

- Matcher: Computes positive relevance scores by leveraging both semantic similarity (e.g., $s_\text{sem}(i, P_{t+1}^+) = \text{sim}(e_\text{item}(i), e_\text{intent}(P_{t+1}^+))$ using BGE or Sentence-BERT embeddings) and collaborative attention over historical preferences.
- Attenuator: Penalizes items according to negative constraints.
- Aggregator: Produces final item scores as a weighted sum:

$$s_\text{final}(i) = \alpha \cdot s_\text{match}(i) + (1 - \alpha) \cdot s_\text{atten}(i)$$

The top-$K$ items are selected to form the updated feed $R_{t+1}$.

Cross-turn memory consolidation allows user preference states to evolve fluidly across multiple commands within the same browsing session, supporting multi-turn, context-sensitive adaptation.

3. Linguistic Command Processing and Structured Preference Extraction

A key technical advance lies in RecBot's parser, which interprets free-form language (ranging from attribute requests to negative exclusions and preference ranking) and produces actionable, structured representations. Preferences are expressed at two levels:

- Attribute-level: Capturing user requirements for specific features (e.g., color, price, style).
- Item-level: Expressing explicit like, dislike, or neutrality toward particular entities.

Preferences are modeled as a set $P_{t+1} = \{ P_{t+1}^+, P_{t+1}^- \}$, partitioned into hard and soft categories. The parser consolidates new user feedback with existing memory via dynamic preference state updates, supporting both intention drift and explicit retraction or overwriting of earlier requirements. Multi-turn consolidation ensures robust, up-to-date models of user intention.

4. Dynamic Policy Adjustment: Tool Orchestration and Real-Time Feed Update

Upon parsing linguistic input, the planner agent instantiates, composes, and orchestrates a modular sequence of "tools" that adapt the recommendation pipeline:

- The filter tool first strictly enforces hard constraints, pruning the candidate pool.
- The matcher evaluates candidates against positive signals, using both semantic and collaborative (historical) features.
- The attenuator applies negative scoring based on exclusion rules or dislikes.
- The aggregator then fuses all signals, producing a re-ranked list with explicit trade-offs between encouragement and penalization (tunable via the weight $\alpha$).
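The filter-then-aggregate flow can be sketched minimally in Python. All helper names, the toy candidate pool, and the weight value are illustrative assumptions, not the RecBot implementation:

```python
# Minimal sketch of the planner's tool pipeline: filter -> aggregate.
# Constraints are modeled as predicates over item attribute dicts.

def filter_items(items, hard_pos, hard_neg):
    """Keep items satisfying every positive hard constraint and no negative one."""
    return [i for i in items
            if all(c(i) for c in hard_pos) and not any(c(i) for c in hard_neg)]

def aggregate(s_match, s_atten, alpha=0.7):
    """Weighted fusion: s_final = alpha * s_match + (1 - alpha) * s_atten."""
    return alpha * s_match + (1 - alpha) * s_atten

# Toy candidate pool with attribute dicts.
items = [
    {"id": 1, "category": "long skirt", "price": 150, "pattern": "plain"},
    {"id": 2, "category": "long skirt", "price": 250, "pattern": "plain"},
    {"id": 3, "category": "long skirt", "price": 120, "pattern": "floral"},
]
hard_pos = [lambda i: i["category"] == "long skirt", lambda i: i["price"] < 200]
hard_neg = [lambda i: i["pattern"] == "floral"]

# Only item 1 survives: item 2 violates the price ceiling, item 3 is floral.
survivors = filter_items(items, hard_pos, hard_neg)
scores = {i["id"]: aggregate(s_match=0.8, s_atten=0.0) for i in survivors}
```

The matcher and attenuator would supply `s_match` and `s_atten` per item (here fixed placeholders); the top-$K$ survivors by `scores` then form the updated feed.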
Semantic similarity calculations leverage pre-trained embedding models, while collaborative matching is realized through multi-head attention mechanisms over user–item interaction histories. This schema allows flexible, real-time, per-session adaptation without retraining the entire underlying model.
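The semantic component reduces to a cosine similarity between item and intent embeddings. A toy sketch follows, with hand-written vectors standing in for the encoder outputs that BGE or Sentence-BERT would supply in practice:

```python
import math

# Cosine similarity between an item embedding and an intent embedding.
# The 3-d vectors below are toy stand-ins for real encoder outputs.

def cosine_sim(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

e_item = [0.9, 0.1, 0.3]    # e_item(i), assumed precomputed
e_intent = [0.8, 0.2, 0.4]  # e_intent(P_{t+1}^+), assumed precomputed
s_sem = cosine_sim(e_item, e_intent)  # closer to 1 = closer to user intent
```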
5. Deployment via Simulation-Augmented Knowledge Distillation

Practical deployment in large-scale commercial systems is achieved by combining simulation-augmented knowledge distillation with a lightweight model architecture:
Teacher models (e.g., GPT-4.1) generate synthetic multi-turn command–response interactions covering diverse intent expressions and edge cases.
Student models (e.g., Qwen3-14B) are fine-tuned to emulate the teacher’s parsing and planning strategies, maintaining strong reasoning while significantly reducing inference latency and resource consumption.
This approach ensures that the RecBot system can deliver cost-effective real-time adaptation and nuanced reasoning in production settings, at the scale required for mainstream e-commerce or content platforms.
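One plausible shape for a single distilled training example — a teacher model's command-to-parse demonstration used to fine-tune the student — is sketched below. The keys and values are assumptions for illustration, not the production schema:

```python
# Hypothetical distillation record: teacher (e.g., GPT-4.1) demonstrates the
# mapping from conversational context + command to a structured parse, which
# the student (e.g., Qwen3-14B) is fine-tuned to reproduce.
distill_example = {
    "context": {
        "recommendation_list": ["item_17", "item_42"],
        "previous_preferences": {"positive_soft": ["autumn styles"]},
    },
    "user_command": "no floral patterns, and keep it under $200",
    "teacher_parse": {
        "negative_hard": [{"attribute": "pattern", "value": "floral"}],
        "positive_hard": [{"attribute": "price", "op": "<", "value": 200}],
    },
}
```

A corpus of such records, covering diverse intents and edge cases, serves as supervised fine-tuning data for the student parser.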
6. Empirical Results and Impact
Comprehensive evaluations, both offline and in long-term online field experiments, show that IRF powered by RecBot provides substantial improvements:
On public datasets such as Amazon, MovieLens, and Taobao, IRF significantly outperforms baseline sequential and agent methods in Recall@N, NDCG@N, and Condition Satisfaction Rate.
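For reference, NDCG@N with binary relevance can be computed as follows. This is the standard textbook definition, not the paper's evaluation code:

```python
import math

# NDCG@N: discounted cumulative gain of a ranked relevance list,
# normalized by the gain of the ideal (best possible) ordering.

def dcg_at_n(rels, n):
    return sum(r / math.log2(pos + 2) for pos, r in enumerate(rels[:n]))

def ndcg_at_n(rels, n):
    ideal = dcg_at_n(sorted(rels, reverse=True), n)
    return dcg_at_n(rels, n) / ideal if ideal > 0 else 0.0

# A relevant item at rank 1 scores higher than the same item at rank 3.
assert ndcg_at_n([1, 0, 0], 3) > ndcg_at_n([0, 0, 1], 3)
```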
In multi-turn scenarios with changing user intentions, RecBot achieves the target recommendation with fewer average rounds.
Long-term online A/B tests demonstrate measurable business gains: negative feedback frequency decreases by 0.71%, while add-to-cart rates and gross merchandise volume increase by 1.28% and 1.40%, respectively. These improvements are directly linked to the system’s capacity for rapid, precise adaptation to explicit user commands.
7. Future Directions
Several avenues for further evolution of IRF systems are evident:
Continuous online learning using in-production feedback to refine preference parsing and ranking strategies.
Enhanced transparency and explainability, enabling user-facing “rationales” for each item’s inclusion based on explicit parsed constraints and collaborative signals.
Broader multi-agent integrations (e.g., incorporating proactive suggestion agents or natural language explanation agents).
Advanced control interfaces, including multimodal (text, voice, touch) interaction and real-time scenario-based preference adjustment.
In summary, the Interactive Recommendation Feed, as instantiated by RecBot’s dual-agent, tool-driven architecture, enables a paradigm shift from passive, implicit behavioral modeling to active, explicit, real-time preference control using natural language commands directly within the recommendation feed. Empirical evidence substantiates both usability and business gains, while the modular design, supporting formulas, and deployment strategies delineate a robust foundation for future interactive recommender system development (Tang et al., 25 Sep 2025).