RuCoCo: a new Russian corpus with coreference annotation (2206.04925v1)
Published 10 Jun 2022 in cs.CL
Abstract: We present a new corpus with coreference annotation, the Russian Coreference Corpus (RuCoCo). The goal of RuCoCo is to obtain a large number of annotated texts while maintaining high inter-annotator agreement. RuCoCo contains news texts in Russian: some were annotated from scratch, and for the rest, machine-generated annotations were refined by human annotators. The corpus contains one million words and around 150,000 mentions. We make the corpus publicly available.
- Vladimir Dobrovolskii
- Mariia Michurina
- Alexandra Ivoylova