
Contrastive Document Representation Learning with Graph Attention Networks (2110.10778v1)

Published 20 Oct 2021 in cs.CL

Abstract: Recent progress in pretrained Transformer-based language models has shown great success in learning contextual representations of text. However, due to the quadratic complexity of self-attention, most pretrained Transformer models can only handle relatively short text, and modeling very long documents remains a challenge. In this work, we propose to use a graph attention network on top of an available pretrained Transformer model to learn document embeddings. This graph attention network allows us to leverage the high-level semantic structure of the document. In addition, based on our graph document model, we design a simple contrastive learning strategy to pretrain our models on a large unlabeled corpus. Empirically, we demonstrate the effectiveness of our approach on document classification and document retrieval tasks.
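
The abstract names two components: graph attention over the document's semantic structure (with node features coming from a pretrained Transformer) and a contrastive pretraining objective. The following is a minimal PyTorch sketch of both pieces, assuming sentence-level nodes are pre-encoded by a pretrained Transformer (e.g., BERT) and that each document's positive is an alternate view of the same document; the single-head layer, the adjacency convention, and the InfoNCE form are illustrative assumptions, not the authors' exact architecture.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class GraphAttentionLayer(nn.Module):
    """Single-head graph attention over sentence nodes (Velickovic et al., 2018 style).

    Illustrative stand-in for the paper's GAT component; the paper's exact
    layer configuration is not specified in the abstract.
    """
    def __init__(self, in_dim: int, out_dim: int):
        super().__init__()
        self.W = nn.Linear(in_dim, out_dim, bias=False)
        self.a = nn.Linear(2 * out_dim, 1, bias=False)

    def forward(self, h: torch.Tensor, adj: torch.Tensor) -> torch.Tensor:
        # h:   (N, in_dim) sentence embeddings from a pretrained Transformer
        # adj: (N, N) 0/1 document-structure edges; should include self-loops
        z = self.W(h)                                      # (N, out_dim)
        N = z.size(0)
        zi = z.unsqueeze(1).expand(N, N, -1)               # z_i broadcast over rows
        zj = z.unsqueeze(0).expand(N, N, -1)               # z_j broadcast over cols
        # Attention logits e_ij = LeakyReLU(a([z_i || z_j]))
        e = F.leaky_relu(self.a(torch.cat([zi, zj], dim=-1)).squeeze(-1), 0.2)
        e = e.masked_fill(adj == 0, float("-inf"))         # attend only along edges
        alpha = torch.softmax(e, dim=-1)                   # normalized over neighbors
        return F.elu(alpha @ z)                            # aggregated node features

def info_nce_loss(doc_emb: torch.Tensor, pos_emb: torch.Tensor,
                  temperature: float = 0.1) -> torch.Tensor:
    """InfoNCE contrastive loss: each document embedding is pulled toward its
    positive view and pushed away from the other in-batch documents."""
    q = F.normalize(doc_emb, dim=-1)                       # (B, D)
    k = F.normalize(pos_emb, dim=-1)                       # (B, D)
    logits = q @ k.t() / temperature                       # (B, B) similarities
    labels = torch.arange(q.size(0), device=q.device)      # diagonal = positives
    return F.cross_entropy(logits, labels)
```

A document embedding could then be obtained by mean-pooling the layer's output over nodes, with `info_nce_loss` applied to batches of (document, positive-view) embedding pairs during pretraining; how positives are constructed (e.g., disjoint spans of the same document) is an assumption here, not stated in the abstract.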

Authors (5)
  1. Peng Xu (357 papers)
  2. Xinchi Chen (15 papers)
  3. Xiaofei Ma (31 papers)
  4. Zhiheng Huang (33 papers)
  5. Bing Xiang (74 papers)
Citations (8)