
It's AI Match: A Two-Step Approach for Schema Matching Using Embeddings (2203.04366v1)

Published 8 Mar 2022 in cs.DB and cs.CL

Abstract: Since data is often stored in different sources, it needs to be integrated to obtain the global view required to create value and derive knowledge from it. A critical step in data integration is schema matching, which aims to find semantic correspondences between elements of two schemata. To reduce the manual effort involved in schema matching, many solutions for automatically determining schema correspondences have already been developed. In this paper, we propose a novel end-to-end approach for schema matching based on neural embeddings. The main idea is a two-step approach consisting of a table matching step followed by an attribute matching step. In both steps, we use embeddings at different levels, representing either the whole table or single attributes. Our results show that our approach determines correspondences in a robust and reliable way and, compared to traditional schema matching approaches, can find non-trivial correspondences.
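
The two-step idea in the abstract (match tables first, then match attributes within the matched tables) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the embedding model or the matching rule, so the sketch assumes a toy character-trigram vector as a stand-in for a neural embedding, cosine similarity as the score, and a greedy best-match rule. The names `embed`, `table_embedding`, and `match_schemas` are hypothetical.

```python
import math
from collections import Counter


def embed(text, n=3):
    """Toy stand-in for a neural embedding (assumption): a bag of character n-grams."""
    padded = f" {text.lower()} "
    return Counter(padded[i:i + n] for i in range(len(padded) - n + 1))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a if k in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def table_embedding(name, attributes):
    """Table-level vector (assumption): sum of the table-name and attribute vectors."""
    vec = embed(name)
    for attr in attributes:
        vec = vec + embed(attr)
    return vec


def match_schemas(source, target):
    """Match two schemata given as {table_name: [attribute_name, ...]}.

    Returns a list of (src_table, src_attr, tgt_table, tgt_attr, score).
    """
    target_vecs = {t: table_embedding(t, attrs) for t, attrs in target.items()}
    matches = []
    for s_table, s_attrs in source.items():
        s_vec = table_embedding(s_table, s_attrs)
        # Step 1: table matching, pick the most similar target table.
        t_table = max(target_vecs, key=lambda t: cosine(s_vec, target_vecs[t]))
        # Step 2: attribute matching, restricted to the matched table.
        for s_attr in s_attrs:
            s_attr_vec = embed(s_attr)
            t_attr = max(target[t_table],
                         key=lambda a: cosine(s_attr_vec, embed(a)))
            score = cosine(s_attr_vec, embed(t_attr))
            matches.append((s_table, s_attr, t_table, t_attr, score))
    return matches


if __name__ == "__main__":
    source = {"customer": ["cust_id", "cust_name"],
              "product": ["prod_id", "price"]}
    target = {"customers": ["customer_id", "customer_name"],
              "products": ["product_id", "unit_price"]}
    for match in match_schemas(source, target):
        print(match)
```

Restricting the attribute step to the table chosen in the first step keeps the number of attribute comparisons small, which is one plausible motivation for a two-step design. A trigram vector only captures lexical overlap, so finding the non-trivial correspondences the abstract mentions would require replacing `embed` with a learned embedding model.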

Authors (4)
  1. Benjamin Hättasch (5 papers)
  2. Michael Truong-Ngoc (1 paper)
  3. Andreas Schmidt (12 papers)
  4. Carsten Binnig (38 papers)
Citations (13)