Towards a Brazilian History Knowledge Graph (2403.19856v1)

Published 28 Mar 2024 in cs.AI and cs.DL

Abstract: This short paper describes the first steps in a project to construct a knowledge graph for Brazilian history based on the Brazilian Dictionary of Historical Biographies (DHBB) and Wikipedia/Wikidata. We contend that large repositories of Brazilian named entities (people, places, organizations, and political events and movements) would be beneficial for extracting information from Portuguese texts. We show that many of the terms and entities described in the DHBB do not have corresponding concepts (Q items) in Wikidata, the largest structured database of entities associated with Wikipedia. We describe previous work on extracting information from the DHBB and outline the steps to construct a Wikidata-based historical knowledge graph.

