
Large scale deduplication based on fingerprints (2101.04976v1)

Published 13 Jan 2021 in cs.CV

Abstract: In fingerprint-based systems, database size grows considerably with the population. In developing countries, because a central system is difficult to use when enrolling voters, several regional voter databases are often created and then merged into a central database. A deduplication process then removes duplicates and ensures that each voter appears only once. Until now, companies specializing in biometrics have used several costly computing servers to perform large-scale deduplication based on fingerprints. Their algorithms take considerable time because their complexity is O(n²), where n is the size of the database. This article presents an algorithm that performs the operation in O(2n) on a single computer. It is based on an index computed for each fingerprint using a 5 × 5 matrix. The index makes it possible to build clusters of O(1) size within which fingerprints are compared. The approach has been evaluated on close to 114 000 fingerprints, and the results show a penetration rate of less than 1%, almost O(1) identification, and O(n) deduplication. A database of 10 000 000 fingerprints can be deduplicated on a single computer in less than two hours, compared with several days on multiple servers for the usual tools.

Keywords: fingerprint, cluster, index, deduplication.
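
The index-then-cluster idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' algorithm: the abstract does not say how the index is derived from the 5 × 5 matrix, so `toy_index_key` and `toy_matches` below are hypothetical stand-ins (a coarse quantization of a feature tuple and a simple tolerance check), and the records are made-up toy data. The sketch only shows why comparing records within small index buckets avoids the all-pairs O(n²) comparison.

```python
from collections import defaultdict
from itertools import combinations


def deduplicate(records, index_key, matches):
    """Return pairs of record ids judged to be duplicates.

    records   : iterable of (record_id, features)
    index_key : function mapping features to a hashable bucket key (assumption)
    matches   : function deciding whether two feature sets match (assumption)
    """
    # Pass 1: place every record in the bucket named by its index key.
    buckets = defaultdict(list)
    for rec_id, features in records:
        buckets[index_key(features)].append((rec_id, features))

    # Pass 2: compare records only inside each bucket. If bucket sizes stay
    # bounded, the total work grows linearly with the number of records.
    duplicates = []
    for members in buckets.values():
        for (id_a, feat_a), (id_b, feat_b) in combinations(members, 2):
            if matches(feat_a, feat_b):
                duplicates.append((id_a, id_b))
    return duplicates


# Hypothetical index: coarse quantization of each feature value.
def toy_index_key(features):
    return tuple(value // 10 for value in features)


# Hypothetical matcher: every feature must agree within a small tolerance.
def toy_matches(feat_a, feat_b):
    return all(abs(x - y) <= 2 for x, y in zip(feat_a, feat_b))


# Made-up toy records standing in for fingerprint feature vectors.
records = [
    ("v1", (12, 40, 77)),
    ("v2", (13, 41, 78)),
    ("v3", (55, 20, 31)),
    ("v4", (90, 90, 90)),
]

found = deduplicate(records, toy_index_key, toy_matches)
print(found)
```

A hard quantization like this one can miss true duplicates whose features fall on opposite sides of a bucket boundary; a real index would need to tolerate that kind of variation between two captures of the same finger.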
