
Building Blocks for a Complex-Valued Transformer Architecture (2306.09827v1)

Published 16 Jun 2023 in cs.LG, cs.CV, and cs.NE

Abstract: Most deep learning pipelines are built on real-valued operations to handle real-valued inputs such as images, speech, or music signals. However, many applications naturally involve complex-valued signals or images, such as MRI or remote sensing data. Additionally, the Fourier transform of a signal is complex-valued and has numerous applications. We aim to make deep learning directly applicable to these complex-valued signals without resorting to projections into $\mathbb{R}^2$. We therefore add to the recent developments in complex-valued neural networks by presenting building blocks for transferring the transformer architecture to the complex domain. We present multiple versions of a complex-valued Scaled Dot-Product Attention mechanism as well as a complex-valued layer normalization. We evaluate on a classification and a sequence generation task on the MusicNet dataset and show improved robustness to overfitting while maintaining on-par performance compared to the real-valued transformer architecture.
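The paper presents multiple versions of these building blocks; as a rough illustration of the general idea, below is a minimal sketch in PyTorch (which natively supports complex tensors) of one plausible formulation. The function names, the choice of taking the real part of the Hermitian attention scores, and the magnitude-based normalization are assumptions made for illustration, not the authors' specific designs.

```python
import torch

def complex_scaled_dot_product_attention(q, k, v):
    # q, k, v: complex tensors of shape (..., seq_len, d_model).
    d = q.shape[-1]
    # Hermitian inner product: conjugate-transpose the keys so the
    # score generalizes the real dot product to the complex domain.
    scores = torch.matmul(q, k.conj().transpose(-2, -1)) / d ** 0.5
    # Softmax requires real inputs; this sketch uses the real part of
    # the complex scores (other variants use the magnitude instead).
    weights = torch.softmax(scores.real, dim=-1)
    # The real-valued weights mix the complex values; cast the
    # weights to the complex dtype so the matmul dtypes match.
    return torch.matmul(weights.to(v.dtype), v)

def naive_complex_layer_norm(z, eps=1e-5):
    # Center by the complex mean, then scale by the standard deviation
    # of the centered magnitudes; a more principled variant would
    # whiten with the 2x2 real/imaginary covariance matrix instead.
    mean = z.mean(dim=-1, keepdim=True)
    centered = z - mean
    var = (centered.abs() ** 2).mean(dim=-1, keepdim=True)
    return centered / torch.sqrt(var + eps)

# Minimal usage check with random complex inputs.
q = torch.randn(2, 8, 16, dtype=torch.complex64)
k = torch.randn(2, 8, 16, dtype=torch.complex64)
v = torch.randn(2, 8, 16, dtype=torch.complex64)
out = complex_scaled_dot_product_attention(q, k, v)
print(out.shape, naive_complex_layer_norm(out).shape)
```

Reducing the complex scores to real numbers before the softmax is one common way to obtain a valid attention distribution from complex queries and keys; the paper's "multiple versions" of the attention mechanism correspond to different choices at exactly this step.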
