Optimal column serialization for LLMs on tabular data
Determine which column serialization strategies most effectively convert tabular column content into textual inputs for large language models (LLMs) in schema matching, so that the resulting representations support accurate assessment of column correspondences in LLM-based reranking pipelines.
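To make the design space concrete, the sketch below shows three candidate ways a column might be turned into text before being passed to an LLM. The function names, templates, and the choices of sample size and statistics are all hypothetical illustrations. They are not the templates used by Magneto or any other system, and nothing here indicates which variant performs better.

```python
from collections import Counter
from typing import Sequence


def serialize_header_only(name: str) -> str:
    """Hypothetical strategy A: expose only the column name."""
    return f"Column: {name}"


def serialize_with_samples(name: str, values: Sequence[object], k: int = 3) -> str:
    """Hypothetical strategy B: column name plus its k most frequent non-null values."""
    non_null = [str(v) for v in values if v is not None]
    top = [value for value, _count in Counter(non_null).most_common(k)]
    return f"Column: {name} | Sample values: {', '.join(top)}"


def serialize_with_stats(name: str, values: Sequence[object]) -> str:
    """Hypothetical strategy C: column name plus coarse type and cardinality."""
    non_null = [v for v in values if v is not None]
    is_numeric = bool(non_null) and all(isinstance(v, (int, float)) for v in non_null)
    dtype = "numeric" if is_numeric else "text"
    distinct = len(set(non_null))
    return f"Column: {name} | Type: {dtype} | Rows: {len(values)} | Distinct: {distinct}"


if __name__ == "__main__":
    states = ["NY", "NY", "NY", "CA", "CA", "TX", None]
    print(serialize_header_only("state"))
    print(serialize_with_samples("state", states, k=2))
    print(serialize_with_stats("state", states))
```

Each variant trades prompt length against the amount of evidence the model sees about the column. Comparing such variants on a matching benchmark is one way the question above could be studied.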
References
Selecting the right serialization strategy is still an open research problem that has attracted substantial attention.
— Magneto: Combining Small and Large Language Models for Schema Matching
(2412.08194 - Liu et al., 11 Dec 2024) in Section 1, Introduction (Our Approach)