
Bipartite Graph Matching Algorithms for Clean-Clean Entity Resolution: An Empirical Evaluation (2112.14030v3)

Published 28 Dec 2021 in cs.DB

Abstract: Entity Resolution (ER) is the task of finding records that refer to the same real-world entities. A common scenario is when entities across two clean sources need to be resolved, which we refer to as Clean-Clean ER. In this paper, we perform an extensive empirical evaluation of 8 bipartite graph matching algorithms that take as input a bipartite similarity graph and produce as output a set of matched entities. We consider a wide range of matching algorithms, including algorithms that have not previously been applied to ER or have been evaluated only in other ER settings. We assess the relative performance of the algorithms with respect to accuracy and time efficiency over 10 established, real datasets, from which we extract more than 700 different similarity graphs. Our results provide insights into the relative performance of these algorithms and guidelines for choosing the best one, depending on the data at hand.
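
To make the input and output described in the abstract concrete, the following is a minimal sketch of one simple matching strategy: a greedy heuristic that scans edges in descending similarity and keeps an edge only if neither endpoint is already matched. It is an illustration only and is not claimed to be one of the eight algorithms the paper evaluates. The function name `greedy_bipartite_matching`, the edge-list representation, and the `threshold` parameter are assumptions made for this example.

```python
from typing import Hashable, List, Set, Tuple

# Hypothetical representation: (left_record_id, right_record_id, similarity)
Edge = Tuple[Hashable, Hashable, float]


def greedy_bipartite_matching(
    edges: List[Edge], threshold: float = 0.5
) -> List[Tuple[Hashable, Hashable]]:
    """Return a one-to-one matching between the two sides of a similarity graph.

    Edges are visited from highest to lowest similarity. An edge is accepted
    only if its similarity reaches the threshold and both of its endpoints
    are still unmatched, which reflects the Clean-Clean assumption that each
    record has at most one counterpart in the other source.
    """
    matched_left: Set[Hashable] = set()
    matched_right: Set[Hashable] = set()
    matches: List[Tuple[Hashable, Hashable]] = []

    for left, right, sim in sorted(edges, key=lambda e: e[2], reverse=True):
        if sim < threshold:
            break  # all remaining edges are weaker still
        if left in matched_left or right in matched_right:
            continue  # one endpoint already has a partner
        matched_left.add(left)
        matched_right.add(right)
        matches.append((left, right))

    return matches


# Toy similarity graph (made-up values for illustration)
example_edges: List[Edge] = [
    ("a1", "b1", 0.9),
    ("a1", "b2", 0.8),
    ("a2", "b2", 0.7),
    ("a3", "b1", 0.6),
    ("a3", "b3", 0.3),
]
example_matches = greedy_bipartite_matching(example_edges, threshold=0.5)
```

On the toy graph, `a3` is left unmatched: its strongest candidate `b1` is already paired with `a1`, and its remaining edge falls below the threshold. A greedy pass like this is fast but does not guarantee a maximum-weight matching, which is one reason a comparison across several algorithms is informative.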

Authors (4)
  1. George Papadakis (31 papers)
  2. Vasilis Efthymiou (14 papers)
  3. Emanouil Thanos (1 paper)
  4. Oktie Hassanzadeh (16 papers)
Citations (4)
