An Evaluation Framework for Mapping News Headlines to Event Classes in a Knowledge Graph (2312.02334v1)

Published 4 Dec 2023 in cs.CL and cs.AI

Abstract: Mapping ongoing news headlines to event-related classes in a rich knowledge base can be an important component in a knowledge-based event analysis and forecasting solution. In this paper, we present a methodology for creating a benchmark dataset of news headlines mapped to event classes in Wikidata, and resources for the evaluation of methods that perform the mapping. We use the dataset to study two classes of unsupervised methods for this task: 1) adaptations of classic entity linking methods, and 2) methods that treat the problem as a zero-shot text classification problem. For the first approach, we evaluate off-the-shelf entity linking systems. For the second approach, we explore a) pre-trained natural language inference (NLI) models, and b) pre-trained large generative LLMs. We present the results of our evaluation, lessons learned, and directions for future work. The dataset and scripts for evaluation are made publicly available.

Summary

  • The paper introduces a framework that evaluates classic entity linking, zero-shot classifiers, and large language models for mapping news headlines to event classes.
  • Large language models associated headlines with event classes more consistently than the other methods, even when the headline contained no explicit lexical match to the class label.
  • The authors identify an ensemble that combines entity linking with LLM-based methods as a direction for future work, with the aim of improving the accuracy of headline-to-event-class mapping.

The paper introduces an evaluation framework for methods that map news headlines to event-related classes in a knowledge graph. It motivates the task as a component of knowledge-based event analysis and forecasting solutions, which help businesses and organizations identify news events that could affect their operations.

A substantial part of the paper assesses different methods for this mapping task. The researchers focus on two classes of unsupervised methods: adaptations of classic entity linking and zero-shot text classification. The first uses off-the-shelf entity linking systems to recognize event-related classes in news headlines; the second uses pre-trained models that classify headlines without any task-specific training data.
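
The zero-shot framing described above can be sketched as follows: each candidate event class label is turned into an NLI-style hypothesis, a scorer rates how strongly the headline supports that hypothesis, and the classes are ranked by score. The hypothesis template, the function names, and the token-overlap scorer are all assumptions made for illustration. In practice the scorer would be replaced by the entailment probability from a pre-trained NLI model, and the paper's actual templates and models may differ.

```python
from typing import Callable, List, Tuple

# Hypothetical hypothesis template; the paper's actual wording is not given here.
TEMPLATE = "This headline is about {label}."


def overlap_scorer(premise: str, hypothesis: str) -> float:
    """Toy stand-in for an NLI entailment score.

    Returns the fraction of hypothesis tokens that also occur in the premise.
    A real system would return an entailment probability from an NLI model.
    """
    premise_tokens = set(premise.lower().split())
    hypothesis_tokens = hypothesis.lower().rstrip(".").split()
    matches = sum(token in premise_tokens for token in hypothesis_tokens)
    return matches / len(hypothesis_tokens)


def classify(
    headline: str,
    labels: List[str],
    scorer: Callable[[str, str], float] = overlap_scorer,
) -> List[Tuple[str, float]]:
    """Rank candidate event class labels for a headline, best first."""
    scored = [
        (label, scorer(headline, TEMPLATE.format(label=label)))
        for label in labels
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


ranked = classify(
    "Earthquake strikes northern Japan",
    ["election", "earthquake", "strike action"],
)
print(ranked[0][0])
```

Note that the toy scorer only rewards exact token matches, so it fails on headlines that never name the event class. That is the gap a learned entailment model or an LLM is meant to close.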

Within the zero-shot approach, the paper explores two kinds of pre-trained models: natural language inference (NLI) models and large generative language models (LLMs). Because LLMs can interpret and generate human-like text, they are a natural candidate for the unsupervised mapping of headlines to event classes, and they receive significant attention in the study.

A new dataset was created for the paper from Wikidata and Wikinews articles. It contains 110 mappings of news headlines to event classes in Wikidata. The dataset was used to evaluate the methods above. The results suggest that, while classic entity linking methods and NLI-based zero-shot classifiers each have strengths, LLM-based methods more consistently associate headlines with event classes that are not explicitly mentioned in the headline.
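
To make the evaluation setup concrete, here is a minimal sketch of scoring predicted event classes against gold mappings. It assumes each headline maps to a set of class labels and uses micro-averaged precision, recall, and F1. The metric choice, the function name `micro_prf`, and the example labels are assumptions for illustration; the paper's published evaluation scripts define the metrics actually used.

```python
from typing import Dict, Set, Tuple


def micro_prf(
    gold: Dict[str, Set[str]],
    pred: Dict[str, Set[str]],
) -> Tuple[float, float, float]:
    """Micro-averaged precision, recall, and F1 over all gold headlines.

    Headlines that appear in `pred` but not in `gold` are ignored.
    """
    tp = fp = fn = 0
    for headline, gold_classes in gold.items():
        predicted = pred.get(headline, set())
        tp += len(gold_classes & predicted)   # correctly predicted classes
        fp += len(predicted - gold_classes)   # predicted but not in gold
        fn += len(gold_classes - predicted)   # in gold but missed
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical gold mappings and system output (labels stand in for Wikidata classes).
gold = {"h1": {"earthquake"}, "h2": {"election", "protest"}}
pred = {"h1": {"earthquake"}, "h2": {"election", "riot"}}
precision, recall, f1 = micro_prf(gold, pred)
print(f"P={precision:.3f} R={recall:.3f} F1={f1:.3f}")
```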

The paper reports the results of these evaluations for each method type. LLMs showed promise in mapping headlines to event classes even when there was no clear lexical match to the event label. This ability may stem from the LLMs' training on vast text corpora, which lets them recognize subtler patterns and associations than keyword matching can capture.

Looking forward, the researchers point to the potential of an ensemble approach that combines entity linking with LLM-based techniques, which could increase accuracy in headline-to-event mapping. The paper also raises the possibility of extending the current dataset into a more comprehensive collection and of experimenting with a broader range of LLMs.
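
The summary does not say how such an ensemble would be built. As one possible illustration, the sketch below merges ranked class lists from several methods using reciprocal rank fusion, a simple and widely used rank-combination technique that is swapped in here as an assumption, not as the authors' method. The function name, the constant `k = 60`, and the example rankings are likewise hypothetical.

```python
from collections import defaultdict
from typing import Dict, List


def reciprocal_rank_fusion(rankings: List[List[str]], k: int = 60) -> List[str]:
    """Fuse several best-first rankings into one.

    Each label receives 1 / (k + rank) from every ranking it appears in,
    with ranks starting at 1. Ties are broken alphabetically.
    """
    scores: Dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, label in enumerate(ranking, start=1):
            scores[label] += 1.0 / (k + rank)
    return sorted(scores, key=lambda label: (-scores[label], label))


# Hypothetical outputs of an entity linker and an LLM for one headline.
entity_linker_ranking = ["strike action", "earthquake"]
llm_ranking = ["earthquake", "natural disaster"]
fused = reciprocal_rank_fusion([entity_linker_ranking, llm_ranking])
print(fused)
```

A class proposed by both methods rises to the top even if neither ranked it first in isolation, which is the behavior one would want from an ensemble of complementary methods.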

Finally, the paper underscores the importance of continued research on frameworks for analyzing and forecasting events. As the volume of news content grows and the media landscape becomes more complex, AI-based solutions like those discussed in the paper are likely to become increasingly important for organizations that monitor and analyze news events. The researchers have made their dataset and evaluation scripts publicly available to encourage further work in this field.