Fast and accurate annotation of short texts with Wikipedia pages

Published 17 Jun 2010 in cs.IR | (1006.3498v2)

Abstract: We address the problem of cross-referencing text fragments with Wikipedia pages, in a way that synonymy and polysemy issues are resolved accurately and efficiently. We take inspiration from a recent flow of work [Cucerzan 2007, Mihalcea and Csomai 2007, Milne and Witten 2008, Chakrabarti et al 2009], and extend their scenario from the annotation of long documents to the annotation of short texts, such as snippets of search-engine results, tweets, news, blogs, etc. These short and poorly composed texts pose new challenges in terms of efficiency and effectiveness of the annotation process, which we address by designing and engineering TAGME, the first system that performs an accurate and on-the-fly annotation of these short textual fragments. A large set of experiments shows that TAGME outperforms state-of-the-art algorithms when they are adapted to work on short texts, and that it is fast and competitive on long texts.

Citations (279)

Summary

  • The paper presents TAGME, a system that annotates short texts (search-engine snippets, tweets, news, blogs) on the fly by linking text fragments to Wikipedia pages.
  • It is designed to resolve synonymy and polysemy accurately and efficiently in short, poorly composed texts.
  • Experiments show that TAGME outperforms state-of-the-art annotators adapted to short texts and remains fast and competitive on long texts.

Overview of LaTeX Formatting for ACM SIG Proceedings

The paper "Alternate ACM SIG Proceedings Paper in LaTeX Format" serves as a template and guide for preparing documents that conform to the Association for Computing Machinery's (ACM) Special Interest Group (SIG) conference proceedings format using LaTeX. While the primary intent of this document is instructional rather than exploratory research, the systematic approach and insightful compilation of formatting guidelines present significant utility for academics and professionals involved in preparing conference documents.

Key Contributions

  1. Sample Document: The paper presents an alternate, tighter-looking formatting style for ACM SIG proceedings. This addresses author concerns regarding page constraints and uniform appearance, thereby improving overall document aesthetics.
  2. LaTeX Implementation Guidance: The document provides comprehensive examples and hands-on commands for a wide range of LaTeX functionalities such as text formatting, equation management, and figure handling. This includes the demonstration of typeface changes, inline and display mathematics, citation handling with BibTeX, and floating elements like tables and figures.
  3. Theorem and Proof Constructs: It introduces users to theorem-like environments, which scientific documents frequently require. The document explains how to use \newtheorem to define theorem and definition constructs, and recommends the proof environment it provides so that proofs are set consistently across manuscripts.
  4. Structured Elements: The article systematically explains hierarchical structure within a LaTeX document, discussing the use of sections, subsections, and the nuances of handling these within appendices. This hierarchical structuring is crucial for clarity and navigation within academic articles.
  5. Caveat for Manual Commands: To keep documents technically stable, the paper advises against manual TeX \def commands, which can cause problems when a manuscript is converted to other formats such as HTML, a step that matters for accessibility and sharing.
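
The theorem constructs in item 3 and the \def caveat in item 5 can be illustrated with a minimal sketch. It assumes the standard article class with the amsthm and amsmath packages rather than the ACM class itself, which may define its own proof environment. The \abs macro and the example statements are hypothetical, chosen only for illustration.

```latex
\documentclass{article}
% Assumption: amsthm supplies the proof environment here; the ACM
% proceedings class may provide its own instead.
\usepackage{amsmath}
\usepackage{amsthm}

% Theorem-like environments declared with \newtheorem.
\newtheorem{theorem}{Theorem}
\newtheorem{definition}{Definition}

% A LaTeX-level macro instead of a raw TeX \def: \newcommand raises an
% error if the name is already defined, so it cannot silently overwrite it.
% \abs is a hypothetical helper for this example.
\newcommand{\abs}[1]{\lvert #1 \rvert}

\begin{document}

\begin{definition}
A function $f$ is \emph{bounded} on a set $S$ if there is a constant $M$
such that $\abs{f(x)} \le M$ for all $x \in S$.
\end{definition}

\begin{theorem}
If $f$ and $g$ are bounded on $S$, then so is $f + g$.
\end{theorem}

\begin{proof}
Let $M_f$ and $M_g$ bound $f$ and $g$ on $S$. By the triangle inequality,
$\abs{f(x) + g(x)} \le \abs{f(x)} + \abs{g(x)} \le M_f + M_g$
for all $x \in S$.
\end{proof}

\end{document}
```

Declaring the environments once in the preamble keeps numbering and styling uniform, and using \newcommand keeps macro definitions at the LaTeX level, in line with the summary's advice to avoid raw \def.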

Implications and Future Directions

This formal template acts as an educational resource, promoting the adoption of LaTeX for document preparation within the ACM community and the wider academic sphere. By doing so, it facilitates higher standards of document uniformity and professionalism across publications.

In practice, this could enhance authors' preparation efficiency and ensure a smooth peer review and publishing process. The instructional design may spur further development of LaTeX templates across different types of scholarly articles, not limited to conference proceedings but including journals and books.

Beyond the immediate utility, these guidelines contribute to the structured evolution of document formats in scholarly communications, a critical need as digital and open-access platforms continue to grow. Future explorations may involve integrating dynamic elements, such as interactive graphics and data visualizations, while maintaining compliance with established academic standards.

Overall, while not a typical research paper, this document serves a practical and essential role in the academic documentation landscape by streamlining document preparation, promoting consistency, and potentially influencing future developments in academic publishing technology.

Authors (2)
