$gen$CNN: A Convolutional Architecture for Word Sequence Prediction (1503.05034v2)

Published 17 Mar 2015 in cs.CL

Abstract: We propose a novel convolutional architecture, named $gen$CNN, for word sequence prediction. Different from previous work on neural network-based language modeling and generation (e.g., RNN or LSTM), we choose not to greedily summarize the history of words as a fixed length vector. Instead, we use a convolutional neural network to predict the next word with the history of words of variable length. Also different from the existing feedforward networks for language modeling, our model can effectively fuse the local correlation and global correlation in the word sequence, with a convolution-gating strategy specifically designed for the task. We argue that our model can give adequate representation of the history, and therefore can naturally exploit both the short and long range dependencies. Our model is fast, easy to train, and readily parallelized. Our extensive experiments on text generation and $n$-best re-ranking in machine translation show that $gen$CNN outperforms the state-of-the-arts with big margins.
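The core idea in the abstract — predict the next word from a variable-length history using gated convolutions rather than a fixed-length recurrent summary — can be sketched as follows. This is a minimal illustrative sketch, not the paper's actual architecture: all parameter names, sizes, the tanh/sigmoid gating form, and the max-pooling step are assumptions chosen to convey the convolution-gating idea.

```python
import numpy as np

rng = np.random.default_rng(0)

vocab_size, embed_dim, kernel_width, num_filters = 20, 8, 3, 16

# Hypothetical parameters (not from the paper): word embeddings,
# convolution filters, gate filters, and an output projection.
E = rng.normal(size=(vocab_size, embed_dim))
W_conv = rng.normal(size=(num_filters, kernel_width * embed_dim))
W_gate = rng.normal(size=(num_filters, kernel_width * embed_dim))
W_out = rng.normal(size=(vocab_size, num_filters))

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def next_word_probs(history):
    """Return a next-word distribution given a variable-length word history.

    Each convolutional filter response is modulated by a sigmoid gate,
    loosely following the convolution-gating strategy the abstract describes.
    """
    x = E[history]                                # (len, embed_dim)
    if len(history) < kernel_width:               # pad short histories
        pad = np.zeros((kernel_width - len(history), embed_dim))
        x = np.vstack([pad, x])
    # Slide a window of kernel_width positions over the history.
    windows = [x[i:i + kernel_width].ravel()
               for i in range(len(x) - kernel_width + 1)]
    feats = np.stack([np.tanh(W_conv @ w) * sigmoid(W_gate @ w)
                      for w in windows])          # (n_windows, num_filters)
    pooled = feats.max(axis=0)                    # max-pool over positions
    logits = W_out @ pooled
    p = np.exp(logits - logits.max())             # stable softmax
    return p / p.sum()

probs = next_word_probs([1, 4, 9, 2])
```

Note how, unlike an RNN's fixed-length hidden state, the convolution sees the full (padded) history directly, so longer histories simply yield more windows before pooling.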

Authors (5)
  1. Mingxuan Wang (83 papers)
  2. Zhengdong Lu (35 papers)
  3. Hang Li (277 papers)
  4. Wenbin Jiang (18 papers)
  5. Qun Liu (230 papers)
Citations (28)