Application of Advanced Record Linkage Techniques for Complex Population Reconstruction (1612.04286v1)

Published 13 Dec 2016 in cs.DB and cs.AI

Abstract: Record linkage is the process of identifying records that refer to the same entities from several databases. This process is challenging because commonly no unique entity identifiers are available. Linkage therefore has to rely on partially identifying attributes, such as names and addresses of people. Recent years have seen the development of novel techniques for linking data from diverse application areas, where a major focus has been on linking complex data that contain records about different types of entities. Advanced approaches that exploit both the similarities between record attributes and the relationships between entities to identify clusters of matching records have been developed. In this application paper, we study the novel problem where, rather than different types of entities, we have databases in which the same entity can have different roles, and where these roles change over time. We specifically develop novel techniques for linking historical birth, death, marriage and census records with the aim to reconstruct the population covered by these records over a period of several decades. Our experimental evaluation on real Scottish data shows that even with advanced linkage techniques that consider group, relationship, and temporal aspects, it is challenging to achieve high-quality linkage from such complex data.
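
To make the idea of linking on partially identifying attributes concrete, the following is a minimal sketch of pairwise attribute comparison. It is not the method developed in the paper, which additionally uses group, relationship, and temporal information. The field names, the sample records, the averaging rule, and the 0.85 threshold are all illustrative assumptions, and the string similarity comes from Python's standard `difflib` module rather than any linkage-specific comparator.

```python
from difflib import SequenceMatcher
from typing import Mapping, Sequence


def attribute_similarity(a: str, b: str) -> float:
    """Return a similarity in [0, 1] between two attribute values."""
    return SequenceMatcher(None, a.lower().strip(), b.lower().strip()).ratio()


def record_similarity(rec_a: Mapping[str, str],
                      rec_b: Mapping[str, str],
                      fields: Sequence[str]) -> float:
    """Average the per-attribute similarities over the compared fields."""
    scores = [attribute_similarity(rec_a[f], rec_b[f]) for f in fields]
    return sum(scores) / len(scores)


def is_match(rec_a: Mapping[str, str],
             rec_b: Mapping[str, str],
             fields: Sequence[str],
             threshold: float = 0.85) -> bool:
    """Classify a record pair as a match if it clears the (assumed) threshold."""
    return record_similarity(rec_a, rec_b, fields) >= threshold


# Hypothetical records: no shared identifier, only names and a place.
FIELDS = ["first_name", "surname", "parish"]
birth = {"first_name": "Margaret", "surname": "McDonald", "parish": "Kilmarnock"}
death = {"first_name": "Margret", "surname": "MacDonald", "parish": "Kilmarnock"}
other = {"first_name": "John", "surname": "Smith", "parish": "Portree"}

if __name__ == "__main__":
    for label, candidate in [("death", death), ("other", other)]:
        score = record_similarity(birth, candidate, FIELDS)
        print(label, round(score, 3), is_match(birth, candidate, FIELDS))
```

A purely pairwise comparison like this treats every record in isolation, which is the limitation the abstract points to: it cannot use the fact that the same person may appear as a baby in one record and as a parent or deceased person in another, decades apart.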

Authors (1)
  1. Peter Christen (31 papers)
Citations (9)