Recommending Missed Citations Identified by Reviewers: A New Task, Dataset and Baselines (2403.01873v1)

Published 4 Mar 2024 in cs.IR

Abstract: Citing comprehensively and appropriately has become a challenging task with the explosive growth of scientific publications. Current citation recommendation systems aim to recommend a list of scientific papers for a given text context or draft paper. However, no existing work focuses on the citations already included in full papers, which are imperfect and still leave much room for improvement. In peer review, reviewers commonly identify submissions as missing vital citations, which can undermine the credibility and validity of the research presented. To help improve the citations of full papers, we first define a novel task of Recommending Missed Citations Identified by Reviewers (RMC) and construct a corresponding expert-labeled dataset called CitationR. We conduct an extensive evaluation of several state-of-the-art methods on CitationR. Furthermore, we propose a new framework, RMCNet, with an Attentive Reference Encoder module that mines the relevance between papers, already-made citations, and missed citations. Empirical results show that RMC is challenging, with the proposed architecture outperforming previous methods on all metrics. We release our dataset and benchmark models to motivate future research on this challenging new task.
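
The page does not reproduce RMCNet itself, but the abstract suggests a natural structure: encode the submission and its already-made citations, attend from the submission to those citations, and score each candidate (potentially missed) citation against the fused representation. The following is a minimal, hypothetical PyTorch sketch of such an attentive reference encoder; the dimensions, attention layout, and bilinear scoring head are assumptions for illustration, not the authors' RMCNet implementation.

# Hypothetical sketch of an attentive reference encoder (not the authors' code).
# A submission embedding attends over its existing citations, and the fused
# representation is used to score candidate citations.
import torch
import torch.nn as nn

class AttentiveReferenceEncoder(nn.Module):
    def __init__(self, dim: int = 768, heads: int = 8):
        super().__init__()
        # Cross-attention from the submission to its already-made citations.
        self.ref_attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        # Assumed relevance head: bilinear score between fused paper and candidate.
        self.score = nn.Bilinear(dim, dim, 1)

    def forward(
        self,
        paper: torch.Tensor,       # (B, dim)    submission embedding
        references: torch.Tensor,  # (B, R, dim) embeddings of cited papers
        candidates: torch.Tensor,  # (B, C, dim) embeddings of candidate citations
    ) -> torch.Tensor:
        query = paper.unsqueeze(1)                               # (B, 1, dim)
        fused, _ = self.ref_attn(query, references, references)  # (B, 1, dim)
        fused = fused.expand(-1, candidates.size(1), -1).contiguous()  # (B, C, dim)
        return self.score(fused, candidates).squeeze(-1)         # (B, C) relevance scores

if __name__ == "__main__":
    enc = AttentiveReferenceEncoder()
    scores = enc(torch.randn(2, 768), torch.randn(2, 12, 768), torch.randn(2, 50, 768))
    print(scores.shape)  # torch.Size([2, 50])

In this sketch, a higher score indicates a candidate estimated as more relevant to the submission given its existing reference list; training such a model would require pairs of submissions and reviewer-identified missed citations, which is the kind of supervision the CitationR dataset is described as providing.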

