
Overview of the TREC 2023 NeuCLIR Track (2404.08071v1)

Published 11 Apr 2024 in cs.IR

Abstract: The principal goal of the TREC Neural Cross-Language Information Retrieval (NeuCLIR) track is to study the impact of neural approaches to cross-language information retrieval. The track has created four collections: large collections of Chinese, Persian, and Russian newswire, and a smaller collection of Chinese scientific abstracts. The principal tasks are ranked retrieval of news in one of the three languages, using English topics. Results for a multilingual task, also with English topics but with documents from all three newswire collections, are also reported. New in this second year of the track is a pilot technical documents CLIR task for ranked retrieval of Chinese technical documents using English topics. A total of 220 runs across all tasks were submitted by six participating teams and, as baselines, by track coordinators. Task descriptions and results are presented.
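To make the core task concrete, below is a minimal sketch of cross-language ranked retrieval (an English topic against non-English documents) using an off-the-shelf multilingual dense encoder. This is only an illustration of the CLIR setup the track studies, not the track's official baselines or any participant's system; the model name and toy documents are assumptions.

```python
# Illustrative CLIR ranking sketch: English topic, Chinese documents.
# Assumes the sentence-transformers package and the
# 'paraphrase-multilingual-MiniLM-L12-v2' checkpoint are available.
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("paraphrase-multilingual-MiniLM-L12-v2")

# English topic (query) and a toy Chinese document collection.
topic = "export controls on technology companies"
docs = [
    "中国科技公司面临新的出口管制措施。",  # tech companies face new export controls
    "昨晚的足球比赛以平局结束。",          # unrelated sports story
]

# Encode the query and documents into a shared multilingual embedding space.
topic_emb = model.encode(topic, convert_to_tensor=True)
doc_embs = model.encode(docs, convert_to_tensor=True)

# Rank documents by cosine similarity to the English topic.
scores = util.cos_sim(topic_emb, doc_embs)[0]
ranking = sorted(zip(docs, scores.tolist()), key=lambda x: x[1], reverse=True)
for doc, score in ranking:
    print(f"{score:.3f}  {doc}")
```

In this setup the relevant Chinese document should score above the unrelated one, which is the essence of the ranked-retrieval tasks the track evaluates at much larger scale.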

Authors (7)
  1. Dawn Lawrie (31 papers)
  2. Sean MacAvaney (75 papers)
  3. James Mayfield (21 papers)
  4. Paul McNamee (10 papers)
  5. Douglas W. Oard (18 papers)
  6. Luca Soldaini (62 papers)
  7. Eugene Yang (38 papers)
Citations (7)
