
KFCNet: Knowledge Filtering and Contrastive Learning Network for Generative Commonsense Reasoning (2109.06704v1)

Published 14 Sep 2021 in cs.CL

Abstract: Pre-trained LLMs have led to substantial gains over a broad range of NLP tasks, but have been shown to have limitations for natural language generation tasks with high-quality requirements on the output, such as commonsense generation and ad keyword generation. In this work, we present a novel Knowledge Filtering and Contrastive Learning Network (KFCNet) which references external knowledge and achieves better generation performance. Specifically, we propose a BERT-based filter model to remove low-quality candidates, and apply contrastive learning separately to each of the encoder and decoder, within a general encoder--decoder architecture. The encoder contrastive module helps to capture global target semantics during encoding, and the decoder contrastive module enhances the utility of retrieved prototypes while learning general features. Extensive experiments on the CommonGen benchmark show that our model outperforms the previous state of the art by a large margin: +6.6 points (42.5 vs. 35.9) for BLEU-4, +3.7 points (33.3 vs. 29.6) for SPICE, and +1.3 points (18.3 vs. 17.0) for CIDEr. We further verify the effectiveness of the proposed contrastive module on ad keyword generation, and show that our model has potential commercial value.
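The contrastive modules the abstract describes are typically trained with an InfoNCE-style objective: matched (anchor, positive) representation pairs are pulled together while other in-batch examples act as negatives. The sketch below is a minimal NumPy illustration of that general loss, not the paper's implementation; the function name, temperature value, and batch setup are illustrative assumptions.

```python
import numpy as np

def info_nce_loss(anchors, positives, temperature=0.1):
    """Minimal InfoNCE-style contrastive loss (illustrative, not KFCNet's code).

    anchors, positives: (N, d) arrays. Row i of `positives` is the positive
    example for row i of `anchors`; all other rows in the batch serve as
    in-batch negatives.
    """
    # L2-normalise so dot products are cosine similarities.
    a = anchors / np.linalg.norm(anchors, axis=1, keepdims=True)
    p = positives / np.linalg.norm(positives, axis=1, keepdims=True)

    # (N, N) similarity matrix, scaled by the temperature.
    logits = (a @ p.T) / temperature

    # Cross-entropy where the diagonal entries are the correct classes.
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_probs))
```

With correctly paired representations the diagonal dominates and the loss is small; mismatched pairings yield a larger loss, which is what drives the encoder and decoder representations toward the target semantics.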

Authors (6)
  1. Haonan Li (43 papers)
  2. Yeyun Gong (78 papers)
  3. Jian Jiao (44 papers)
  4. Ruofei Zhang (24 papers)
  5. Timothy Baldwin (125 papers)
  6. Nan Duan (172 papers)
Citations (6)
