Enhanced Transformer Architecture for Natural Language Processing (2310.10930v1)

Published 17 Oct 2023 in cs.CL and cs.AI

Abstract: The Transformer is a state-of-the-art model in the field of NLP. Current NLP models primarily increase the number of Transformer layers to improve processing performance, but this approach requires substantial training resources such as computing capacity. In this paper, a novel Transformer structure is proposed, featuring full layer normalization, weighted residual connections, positional encoding exploiting reinforcement learning, and zero masked self-attention. The proposed model, called the Enhanced Transformer, is validated by the bilingual evaluation understudy (BLEU) score obtained on the Multi30k translation dataset. As a result, the Enhanced Transformer achieves a 202.96% higher BLEU score than the original Transformer on this translation dataset.
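
The abstract names the architectural changes but does not detail them. Below is a minimal PyTorch sketch of how two of the components might fit into a single encoder layer, assuming "full layer normalization" means normalizing at every sublayer boundary and "weighted residual connection" means a learnable scalar scaling the skip path. The class name, hyperparameters, and these interpretations are illustrative assumptions, not taken from the paper.

```python
import torch
import torch.nn as nn

class WeightedResidualEncoderLayer(nn.Module):
    """Hypothetical sketch of one Enhanced-Transformer-style encoder layer.

    Assumptions (the abstract gives no specifics):
    - "full layer normalization": LayerNorm applied at every sublayer
      boundary (input, post-attention, post-feed-forward), not only
      after the residual additions as in the original Transformer.
    - "weighted residual connection": a learnable scalar alpha scales
      the skip path before it is added to the sublayer output.
    """

    def __init__(self, d_model=512, nhead=8, dim_ff=2048, dropout=0.1):
        super().__init__()
        self.attn = nn.MultiheadAttention(
            d_model, nhead, dropout=dropout, batch_first=True
        )
        self.ff = nn.Sequential(
            nn.Linear(d_model, dim_ff),
            nn.ReLU(),
            nn.Linear(dim_ff, d_model),
        )
        self.norm_in = nn.LayerNorm(d_model)
        self.norm_attn = nn.LayerNorm(d_model)
        self.norm_ff = nn.LayerNorm(d_model)
        # Learnable residual weights (assumed form of the weighted residual).
        self.alpha_attn = nn.Parameter(torch.ones(1))
        self.alpha_ff = nn.Parameter(torch.ones(1))
        self.dropout = nn.Dropout(dropout)

    def forward(self, x, key_padding_mask=None):
        # Normalize the input before self-attention ("full" normalization).
        h = self.norm_in(x)
        attn_out, _ = self.attn(h, h, h, key_padding_mask=key_padding_mask)
        # Weighted residual: scale the skip path by a learned scalar.
        x = self.norm_attn(self.alpha_attn * x + self.dropout(attn_out))
        ff_out = self.ff(x)
        x = self.norm_ff(self.alpha_ff * x + self.dropout(ff_out))
        return x

# Usage: layer = WeightedResidualEncoderLayer()
#        out = layer(torch.randn(2, 10, 512))  # (batch, seq, d_model)
```

The positional encoding learned via reinforcement learning and the zero masked self-attention are not sketched here, since the abstract does not describe them enough to reconstruct.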

Authors (4)
  1. Woohyeon Moon (3 papers)
  2. Taeyoung Kim (23 papers)
  3. Bumgeun Park (6 papers)
  4. Dongsoo Har (34 papers)