How to Encode Domain Information in Relation Classification (2404.13760v1)
Published 21 Apr 2024 in cs.CL
Abstract: Current LLMs require a lot of training data to obtain high performance. For Relation Classification (RC), many datasets are domain-specific, so combining datasets to obtain better performance is non-trivial. We explore a multi-domain training setup for RC and attempt to improve performance by encoding domain information. Our proposed models improve by more than 2 Macro-F1 points over the baseline setup, and our analysis reveals that not all labels benefit equally: classes that occupy a similar space across domains (i.e., their interpretation is close across them, e.g., "physical") benefit the least, while domain-dependent relations (e.g., "part-of") improve the most when domain information is encoded.
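The abstract does not spell out how domain information is encoded; a common, minimal approach in multi-domain setups is to prepend a special domain marker token to each input before classification. The sketch below illustrates that idea only; the function and example names are assumptions, not the paper's actual method.

```python
# Illustrative sketch (not the paper's implementation): condition a relation
# classifier on the source domain by prepending a special domain marker token.

def mark_domain(text: str, domain: str) -> str:
    """Prepend a bracketed domain token so a model can condition on it."""
    return f"[{domain.upper()}] {text}"

# Hypothetical multi-domain examples.
examples = [
    ("news", "The engine is part of the car."),
    ("bio", "The protein binds to the receptor."),
]

for domain, sentence in examples:
    print(mark_domain(sentence, domain))
    # e.g. "[NEWS] The engine is part of the car."
```

In practice, such a marker would be added to the tokenizer's vocabulary as a special token so the encoder learns a dedicated embedding for each domain.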
- Elisa Bassignana (14 papers)
- Viggo Unmack Gascou (1 paper)
- Frida Nøhr Laustsen (1 paper)
- Gustav Kristensen (1 paper)
- Marie Haahr Petersen (1 paper)
- Rob van der Goot (38 papers)
- Barbara Plank (130 papers)