Tevatron: An Efficient and Flexible Toolkit for Dense Retrieval

Published 11 Mar 2022 in cs.IR and cs.CL | (2203.05765v1)

Abstract: Recent rapid advancements in deep pre-trained LLMs and the introductions of large datasets have powered research in embedding-based dense retrieval. While several good research papers have emerged, many of them come with their own software stacks. These stacks are typically optimized for some particular research goals instead of efficiency or code structure. In this paper, we present Tevatron, a dense retrieval toolkit optimized for efficiency, flexibility, and code simplicity. Tevatron provides a standardized pipeline for dense retrieval including text processing, model training, corpus/query encoding, and search. This paper presents an overview of Tevatron and demonstrates its effectiveness and efficiency across several IR and QA data sets. We also show how Tevatron's flexible design enables easy generalization across datasets, model architectures, and accelerator platforms(GPU/TPU). We believe Tevatron can serve as an effective software foundation for dense retrieval system research including design, modeling, and optimization.

Abstract PDF Upgrade to Chat

Citations (67)

View on Semantic Scholar

Summary

No one has generated a summary of this paper yet.

Paper to Video (Beta)

No one has generated a video about this paper yet.

Whiteboard

No one has generated a whiteboard explanation for this paper yet.

Paper Prompts

Top Community Prompts

Explain it Like I'm 14

off on

Knowledge Gaps

off on

Practical Applications

off on

Glossary

off on

Conceptual Simplification

off on

Open Problems

We haven't generated a list of open problems mentioned in this paper yet.

Generate Now

Continue Learning

We haven't generated follow-up questions for this paper yet.

Generate Now

Tevatron: An Efficient and Flexible Toolkit for Dense Retrieval

Summary

Paper to Video (Beta)

Whiteboard

Paper Prompts

Top Community Prompts

Open Problems

Continue Learning

Authors (4)

Collections

Tevatron: An Efficient and Flexible Toolkit for Dense Retrieval

Summary

Paper to Video (Beta)

Whiteboard

Paper Prompts

Top Community Prompts

Open Problems

Continue Learning

Related Papers

Authors (4)

Collections