German FinBERT: A German Pre-trained Language Model

Published 15 Nov 2023 in cs.CL and stat.ML | (2311.08793v1)

Abstract: This study presents German FinBERT, a novel pre-trained German language model tailored to financial text. The model is trained through a comprehensive pre-training process on a substantial corpus of financial reports, ad-hoc announcements, and news related to German companies. The corpus size is comparable to that of the data sets commonly used for training standard BERT models. I evaluate German FinBERT against generic German language models on three downstream tasks: sentiment prediction, topic recognition, and question answering. The results show improved performance on finance-specific data, indicating that German FinBERT captures domain-specific nuances. These findings suggest that German FinBERT is a promising tool for financial text analysis, with potential benefits for a range of applications in the financial domain.