Interactive Recommendation Feed is a paradigm enabling real-time, explicit user control via natural language commands for adaptive, personalized recommendations.
It employs a dual-agent architecture with a parser and planner to update structured user preferences and dynamically re-rank items through modular tools like filtering and semantic matching.
Empirical evaluations on datasets such as Amazon and MovieLens show that IRF improves recall, NDCG, and business metrics, reaching target recommendations in fewer interaction rounds and enhancing user satisfaction.
An Interactive Recommendation Feed (IRF) is a modern recommender system paradigm that integrates fine-grained user interactions—ranging from explicit natural language commands and attribute constraints to nuanced sequence-based feedback—directly within the mainstream recommendation interface or feed. This approach replaces or augments traditional recommendation pipelines that rely mainly on passive signals (such as clicks and likes), enabling adaptive, controllable, real-time personalization at scale. IRF architectures typically employ modular agents to parse user input, update structured preference models, and dynamically orchestrate the ranking and selection of items in response to user commands, all with the objective of improving both user satisfaction and business outcomes (Tang et al., 25 Sep 2025).
1. Paradigm Shift: From Passive Feedback to Active Explicit Control
Traditional recommender systems aggregate implicit, often ambiguous feedback such as clicks, likes, or dwell time. This approach suffers from inherent limitations:
Coarse-grained signals cannot distinguish which item attributes drive satisfaction or dissatisfaction.
The system cannot easily model nuanced behavioral motivations or rapid intention drift.
User influence over recommendations is strictly indirect, leading to a persistent gap between user intentions and system interpretations.
In contrast, IRF enables active explicit control by allowing users to express preferences, constraints, or objectives in real time via natural language commands within the mainstream feed. These commands can specify desired attributes, exclude unwanted features, or combine multiple requirements (e.g., “show only long skirts for autumn, not floral, under $200”). Users can iteratively refine their interests through ongoing interaction, and these modifications immediately influence the feed composition and ranking (Tang et al., 25 Sep 2025).
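A command like the one above might be parsed into a structured preference state along these lines. This is a hypothetical sketch: the field names and nesting are illustrative assumptions, not the paper's schema.

```python
# Illustrative parse of "show only long skirts for autumn, not floral, under $200".
# Field names and structure are assumptions, not the paper's exact schema.
preference_state = {
    "positive": {
        # Hard constraints strictly filter the candidate pool.
        "hard": [
            {"attribute": "category", "value": "long skirt"},
            {"attribute": "price", "op": "<", "value": 200},
        ],
        # Soft constraints only bias the scoring.
        "soft": [{"attribute": "season", "value": "autumn"}],
    },
    "negative": {
        "hard": [{"attribute": "pattern", "value": "floral"}],
        "soft": [],
    },
}
```

Separating hard from soft constraints lets the system distinguish non-negotiable requirements (the price ceiling) from preferences that should merely tilt the ranking (the autumn styling).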
2. RecBot: Modular Dual-Agent Architecture

The IRF framework is operationalized via RecBot, which introduces a modular dual-agent architecture:
- Parser Agent: Processes free-form natural language commands. It transforms the tuple of the current recommendation list, raw user command, and previous preference state, $(R_t, c_t, P_t)$, into updated structured preferences $P_{t+1}$:

$$\mathcal{P}: (R_t, c_t, P_t) \rightarrow P_{t+1}$$

Preferences are decomposed into positive ($P_{t+1}^+$) and negative ($P_{t+1}^-$) groups, and further differentiated into "hard" and "soft" constraints. Hard constraints enforce strict filtration (e.g., price ceilings), while soft constraints bias the recommendation scoring toward nuanced, context-dependent interests.

- Planner Agent: Adapts the recommendation policy using an extensible toolset. The planner maps the updated preference state, user history $H_t$, and candidate pool $I$ to an item scoring vector $S_{t+1}$:

$$\mathcal{A}: (P_{t+1}, H_t, I) \rightarrow S_{t+1}$$

Tools include:

- Filter: Strictly removes candidates violating explicit hard constraints:

$$I' = \{ i \in I : \mathcal{C}^+(i, C_{t+1}^{(+, \text{hard})}) = 1 \wedge \mathcal{C}^-(i, C_{t+1}^{(-, \text{hard})}) = 0 \}$$

- Matcher: Computes positive relevance scores by leveraging both semantic similarity (e.g., $s_\text{sem}(i, P_{t+1}^+) = \text{sim}(e_\text{item}(i), e_\text{intent}(P_{t+1}^+))$ using BGE or Sentence-BERT embeddings) and collaborative attention over historical preferences.
- Attenuator: Penalizes items according to negative constraints.
- Aggregator: Produces final item scores as a weighted sum:

$$s_\text{final}(i) = \alpha \cdot s_\text{match}(i) + (1 - \alpha) \cdot s_\text{atten}(i)$$

The top-$K$ items are selected to form the updated feed $R_{t+1}$.

Cross-turn memory consolidation allows user preference states to evolve fluidly across multiple commands within the same browsing session, supporting multi-turn, context-sensitive adaptation.

3. Linguistic Command Processing and Structured Preference Extraction

A key technical advance lies in RecBot's parser, which interprets free-form language (ranging from attribute requests to negative exclusions and preference ranking) and produces actionable, structured representations. Preferences are expressed at two levels:

- Attribute-level: Capturing user requirements for specific features (e.g., color, price, style).
- Item-level: Expressing explicit like, dislike, or neutrality toward particular entities.

Preferences are modeled as a set $P_{t+1} = \{ P_{t+1}^+, P_{t+1}^- \}$, partitioned into hard and soft categories. The parser consolidates new user feedback with existing memory via dynamic preference state updates, supporting both intention drift and explicit retraction or overwriting of earlier requirements. Multi-turn consolidation ensures robust, up-to-date models of user intention.

4. Dynamic Policy Adjustment: Tool Orchestration and Real-Time Feed Update

Upon parsing linguistic input, the planner agent instantiates, composes, and orchestrates a modular sequence of "tools" that adapt the recommendation pipeline:

- The filter tool first strictly enforces hard constraints, pruning the candidate pool.
- The matcher evaluates candidates against positive signals, using both semantic and collaborative (historical) features.
- The attenuator applies negative scoring based on exclusion rules or dislikes.
- The aggregator then fuses all signals, producing a re-ranked list with explicit trade-offs between encouragement and penalization (tunable via the weight $\alpha$).
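The filter-then-aggregate flow can be sketched minimally in Python. All helper names, the toy candidate pool, and the weight value are illustrative assumptions, not the RecBot implementation:

```python
# Minimal sketch of the planner's tool pipeline: filter -> aggregate.
# Constraints are modeled as predicates over item attribute dicts.

def filter_items(items, hard_pos, hard_neg):
    """Keep items satisfying every positive hard constraint and no negative one."""
    return [i for i in items
            if all(c(i) for c in hard_pos) and not any(c(i) for c in hard_neg)]

def aggregate(s_match, s_atten, alpha=0.7):
    """Weighted fusion: s_final = alpha * s_match + (1 - alpha) * s_atten."""
    return alpha * s_match + (1 - alpha) * s_atten

# Toy candidate pool with attribute dicts.
items = [
    {"id": 1, "category": "long skirt", "price": 150, "pattern": "plain"},
    {"id": 2, "category": "long skirt", "price": 250, "pattern": "plain"},
    {"id": 3, "category": "long skirt", "price": 120, "pattern": "floral"},
]
hard_pos = [lambda i: i["category"] == "long skirt", lambda i: i["price"] < 200]
hard_neg = [lambda i: i["pattern"] == "floral"]

# Only item 1 survives: item 2 violates the price ceiling, item 3 is floral.
survivors = filter_items(items, hard_pos, hard_neg)
scores = {i["id"]: aggregate(s_match=0.8, s_atten=0.0) for i in survivors}
```

The matcher and attenuator would supply `s_match` and `s_atten` per item (here fixed placeholders); the top-$K$ survivors by `scores` then form the updated feed.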
Semantic similarity calculations leverage pre-trained embedding models, while collaborative matching is realized through multi-head attention mechanisms over user–item interaction histories. This schema allows flexible, real-time, per-session adaptation without retraining the entire underlying model.
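The semantic component reduces to a cosine similarity between item and intent embeddings. A toy sketch follows, with hand-written vectors standing in for the encoder outputs that BGE or Sentence-BERT would supply in practice:

```python
import math

# Cosine similarity between an item embedding and an intent embedding.
# The 3-d vectors below are toy stand-ins for real encoder outputs.

def cosine_sim(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

e_item = [0.9, 0.1, 0.3]    # e_item(i), assumed precomputed
e_intent = [0.8, 0.2, 0.4]  # e_intent(P_{t+1}^+), assumed precomputed
s_sem = cosine_sim(e_item, e_intent)  # closer to 1 = closer to user intent
```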
5. Deployment via Simulation-Augmented Knowledge Distillation

Practical deployment in large-scale commercial systems is achieved by combining simulation-augmented knowledge distillation with a lightweight model architecture:
Teacher models (e.g., GPT-4.1) generate synthetic multi-turn command–response interactions covering diverse intent expressions and edge cases.
Student models (e.g., Qwen3-14B) are fine-tuned to emulate the teacher’s parsing and planning strategies, maintaining strong reasoning while significantly reducing inference latency and resource consumption.
This approach ensures that the RecBot system can deliver cost-effective real-time adaptation and nuanced reasoning in production settings, at the scale required for mainstream e-commerce or content platforms.
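One plausible shape for a single distilled training example — a teacher model's command-to-parse demonstration used to fine-tune the student — is sketched below. The keys and values are assumptions for illustration, not the production schema:

```python
# Hypothetical distillation record: teacher (e.g., GPT-4.1) demonstrates the
# mapping from conversational context + command to a structured parse, which
# the student (e.g., Qwen3-14B) is fine-tuned to reproduce.
distill_example = {
    "context": {
        "recommendation_list": ["item_17", "item_42"],
        "previous_preferences": {"positive_soft": ["autumn styles"]},
    },
    "user_command": "no floral patterns, and keep it under $200",
    "teacher_parse": {
        "negative_hard": [{"attribute": "pattern", "value": "floral"}],
        "positive_hard": [{"attribute": "price", "op": "<", "value": 200}],
    },
}
```

A corpus of such records, covering diverse intents and edge cases, serves as supervised fine-tuning data for the student parser.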
6. Empirical Results and Impact
Comprehensive evaluations, both offline and in long-term online field experiments, show that IRF powered by RecBot provides substantial improvements:
On public datasets such as Amazon, MovieLens, and Taobao, IRF significantly outperforms baseline sequential and agent methods in Recall@N, NDCG@N, and Condition Satisfaction Rate.
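For reference, NDCG@N with binary relevance can be computed as follows. This is the standard textbook definition, not the paper's evaluation code:

```python
import math

# NDCG@N: discounted cumulative gain of a ranked relevance list,
# normalized by the gain of the ideal (best possible) ordering.

def dcg_at_n(rels, n):
    return sum(r / math.log2(pos + 2) for pos, r in enumerate(rels[:n]))

def ndcg_at_n(rels, n):
    ideal = dcg_at_n(sorted(rels, reverse=True), n)
    return dcg_at_n(rels, n) / ideal if ideal > 0 else 0.0

# A relevant item at rank 1 scores higher than the same item at rank 3.
assert ndcg_at_n([1, 0, 0], 3) > ndcg_at_n([0, 0, 1], 3)
```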
In multi-turn scenarios with changing user intentions, RecBot achieves the target recommendation with fewer average rounds.
Long-term online A/B tests demonstrate measurable business gains: negative feedback frequency decreases by 0.71%, while add-to-cart rates and gross merchandise volume increase by 1.28% and 1.40%, respectively. These improvements are directly linked to the system’s capacity for rapid, precise adaptation to explicit user commands.
7. Future Directions
Several avenues for further evolution of IRF systems are evident:
Continuous online learning using in-production feedback to refine preference parsing and ranking strategies.
Enhanced transparency and explainability, enabling user-facing “rationales” for each item’s inclusion based on explicit parsed constraints and collaborative signals.
Broader multi-agent integrations (e.g., incorporating proactive suggestion agents or natural language explanation agents).
Advanced control interfaces, including multimodal (text, voice, touch) interaction and real-time scenario-based preference adjustment.
In summary, the Interactive Recommendation Feed, as instantiated by RecBot’s dual-agent, tool-driven architecture, enables a paradigm shift from passive, implicit behavioral modeling to active, explicit, real-time preference control using natural language commands directly within the recommendation feed. Empirical evidence substantiates both usability and business gains, while the modular design, supporting formulas, and deployment strategies delineate a robust foundation for future interactive recommender system development (Tang et al., 25 Sep 2025).