Improving Transformers using Faithful Positional Encoding (2405.09061v2)

Published 15 May 2024 in cs.LG

Abstract: We propose a new positional encoding method for the Transformer neural network architecture. Unlike the standard sinusoidal positional encoding, our approach rests on a solid mathematical foundation and is guaranteed not to lose information about the positional order of the input sequence. We show that the new encoding systematically improves prediction performance on time-series classification tasks.
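
For orientation, the sketch below contrasts the standard sinusoidal positional encoding of "Attention is all you need" with a hypothetical variant whose frequencies sit on the length-T discrete Fourier grid, so that distinct positions always receive distinct encodings. The `dft_grid_pe` function and its frequency choice are an illustrative assumption about what an order-preserving ("faithful") encoding could look like, not the construction proposed in the paper; only the choice of frequency grid differs from the standard encoding.

```python
import numpy as np

def sinusoidal_pe(seq_len: int, d_model: int) -> np.ndarray:
    """Standard sinusoidal positional encoding (Vaswani et al., 2017).

    Wavelengths are geometrically spaced via the fixed constant 10000 and do
    not depend on the actual sequence length.
    """
    pos = np.arange(seq_len)[:, None]                 # (T, 1)
    i = np.arange(d_model // 2)[None, :]              # (1, d/2)
    angles = pos / np.power(10000.0, 2 * i / d_model) # (T, d/2)
    pe = np.zeros((seq_len, d_model))
    pe[:, 0::2] = np.sin(angles)                      # even dims: sine
    pe[:, 1::2] = np.cos(angles)                      # odd dims: cosine
    return pe

def dft_grid_pe(seq_len: int, d_model: int) -> np.ndarray:
    """Hypothetical order-preserving variant (assumption, not the paper's method).

    Frequencies are taken from the length-T discrete Fourier grid
    omega_k = 2*pi*k / T.  Even the k = 1 pair maps the T positions to
    distinct points on the unit circle, so no two positions in 0..T-1
    collide (no aliasing within the sequence).
    """
    pos = np.arange(seq_len)[:, None]                 # (T, 1)
    k = np.arange(1, d_model // 2 + 1)[None, :]       # (1, d/2), skip k = 0
    angles = 2.0 * np.pi * k * pos / seq_len          # (T, d/2)
    pe = np.zeros((seq_len, d_model))
    pe[:, 0::2] = np.sin(angles)
    pe[:, 1::2] = np.cos(angles)
    return pe

if __name__ == "__main__":
    T, d = 128, 64
    print(sinusoidal_pe(T, d).shape, dft_grid_pe(T, d).shape)  # (128, 64) (128, 64)
```

In this reading, the only design change is tying the frequency grid to the actual sequence length T instead of a fixed constant; how the paper formalizes and guarantees faithfulness should be taken from the paper itself.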

