Building Blocks for a Complex-Valued Transformer Architecture (2306.09827v1)
Abstract: Most deep learning pipelines are built on real-valued operations to process real-valued inputs such as images, speech, or music signals. However, many applications naturally involve complex-valued signals or images, such as MRI or remote sensing. Additionally, the Fourier transform of signals is complex-valued and has numerous applications. We aim to make deep learning directly applicable to these complex-valued signals without resorting to projections into $\mathbb{R}^2$. We therefore add to the recent developments of complex-valued neural networks by presenting building blocks to transfer the transformer architecture to the complex domain. We present multiple versions of a complex-valued Scaled Dot-Product Attention mechanism as well as a complex-valued layer normalization. We test on a classification task and a sequence generation task on the MusicNet dataset and show improved robustness to overfitting while maintaining on-par performance compared to the real-valued transformer architecture.
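As a concrete illustration of one such building block, the minimal PyTorch sketch below shows one plausible variant of complex-valued scaled dot-product attention: the complex query-key product $QK^H$ is reduced to real attention scores via the modulus before the softmax, and the resulting real weights mix the complex-valued values. The function name, shapes, and the modulus-based reduction are illustrative assumptions, not the paper's definitive formulation, since the paper presents several variants of the mechanism.

```python
import torch


def complex_scaled_dot_product_attention(q, k, v):
    """One plausible variant of complex-valued scaled dot-product attention.

    q, k, v: complex tensors of shape (batch, seq_len, d_model).
    Attention scores are taken as the magnitude of the complex
    query-key product, so the softmax operates on real values; the
    resulting real weights then mix the complex-valued values.
    """
    d_model = q.shape[-1]
    # Complex "dot product": q @ conj(k)^T, scaled as in the real-valued case.
    scores = q @ k.conj().transpose(-2, -1) / d_model ** 0.5
    # Reduce the complex scores to real numbers via the modulus (one of
    # several possible choices) so that the softmax is well defined.
    weights = torch.softmax(scores.abs(), dim=-1)
    # Cast the real weights to the complex dtype so the matmul mixes
    # the complex-valued values directly.
    return weights.to(v.dtype) @ v


# Tiny usage example with random complex inputs.
q = torch.randn(2, 5, 8, dtype=torch.complex64)
k = torch.randn(2, 5, 8, dtype=torch.complex64)
v = torch.randn(2, 5, 8, dtype=torch.complex64)
out = complex_scaled_dot_product_attention(q, k, v)
print(out.shape, out.dtype)  # torch.Size([2, 5, 8]) torch.complex64
```

Other reductions of the complex scores to real numbers (for instance, keeping only the real part of $QK^H$) would yield different variants of the same mechanism.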