Leveraging Translation For Optimal Recall: Tailoring LLM Personalization With User Profiles (2402.13500v1)
Abstract: This paper explores a novel technique for improving recall in cross-language information retrieval (CLIR) systems through iterative query refinement grounded in the user's lexical-semantic space. The methodology combines multi-level translation, semantic embedding-based expansion, and user-profile-centered augmentation to address the mismatch between the wording of user queries and that of relevant documents. Through an initial BM25 retrieval, translation into intermediate languages, embedding lookup of similar terms, and iterative re-ranking, the technique aims to broaden the set of potentially relevant results, personalized to the individual user. Comparative experiments on news and Twitter datasets show that the proposed approach outperforms baseline BM25 ranking across ROUGE metrics. The translation step also preserved semantic accuracy through the multi-step process. This personalized CLIR framework paves the way for context-aware retrieval attentive to the nuances of user language.
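The pipeline named in the abstract (BM25 retrieval, translation through an intermediate language, embedding lookup of similar terms, re-ranking) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the pivot dictionaries `PIVOT` and `BACK`, the two-dimensional `embeddings`, the `profile_vocab` set, the similarity `threshold`, and the toy documents are all hypothetical stand-ins, and the BM25 formula is the common Okapi variant with assumed parameters `k1=1.5` and `b=0.75`.

```python
import math
from collections import Counter

def tokenize(text):
    return text.lower().split()

def bm25_scores(query_terms, docs, k1=1.5, b=0.75):
    """Score each tokenized document against the query with Okapi BM25."""
    n = len(docs)
    avgdl = sum(len(d) for d in docs) / n
    df = Counter(t for d in docs for t in set(d))
    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for t in query_terms:
            if t not in tf:
                continue
            idf = math.log(1 + (n - df[t] + 0.5) / (df[t] + 0.5))
            norm = tf[t] + k1 * (1 - b + b * len(d) / avgdl)
            score += idf * tf[t] * (k1 + 1) / norm
        scores.append(score)
    return scores

# Hypothetical pivot dictionaries standing in for a real translation system.
PIVOT = {"car": "voiture"}            # source -> intermediate language
BACK = {"voiture": ["car", "auto"]}   # intermediate -> source alternatives

def translate_variants(term):
    """Round-trip a term through the pivot language to collect variants."""
    pivot = PIVOT.get(term)
    return BACK.get(pivot, [term]) if pivot else [term]

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def expand_query(terms, embeddings, profile_vocab, threshold=0.8):
    """Add translation variants, then profile terms close in embedding space."""
    expanded = list(terms)
    for t in terms:
        for v in translate_variants(t):
            if v not in expanded:
                expanded.append(v)
        if t not in embeddings:
            continue
        for cand in sorted(profile_vocab):
            if cand in embeddings and cand not in expanded:
                if cosine(embeddings[t], embeddings[cand]) >= threshold:
                    expanded.append(cand)
    return expanded

# Toy data, invented for illustration only.
docs = [tokenize(s) for s in (
    "the car was repaired at the garage",
    "an automobile show opened downtown",
    "stock markets fell sharply today",
)]
embeddings = {
    "car": [1.0, 0.0],
    "automobile": [0.9, 0.1],
    "garage": [0.5, 0.5],
    "markets": [0.0, 1.0],
}
profile_vocab = {"automobile", "garage", "markets"}

query = ["car"]
base = bm25_scores(query, docs)
expanded = expand_query(query, embeddings, profile_vocab)
boosted = bm25_scores(expanded, docs)

# Re-rank document indices by the expanded-query score, best first.
ranking = sorted(range(len(docs)), key=lambda i: boosted[i], reverse=True)
print(expanded, ranking)
```

In this toy setup the second document shares no term with the original query, so it only becomes retrievable after expansion, which is the recall effect the abstract describes. A real system would replace the dictionaries with a translation model and the hand-written vectors with learned embeddings.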
Authors: Karthik Ravichandran, Sarmistha Sarna Gomasta