CECT: Controllable Ensemble CNN and Transformer for COVID-19 Image Classification (2302.02314v4)
Abstract: The COVID-19 pandemic has resulted in hundreds of millions of cases and numerous deaths worldwide. Here, we develop a novel classification network, CECT, built as a controllable ensemble of a convolutional neural network and a transformer, to provide timely and accurate COVID-19 diagnosis. CECT is composed of a parallel convolutional encoder block, an aggregate transposed-convolutional decoder block, and a windowed attention classification block. Each block captures features from the input at different scales, from 28 $\times$ 28 to 224 $\times$ 224, which together provide enriched and comprehensive information. Unlike existing methods, CECT can capture features at both multi-local and global scales without any sophisticated module design. Moreover, the contribution of local features at different scales can be controlled with the proposed ensemble coefficients. We evaluate CECT on two public COVID-19 datasets, and it reaches the highest accuracy of 98.1% in the intra-dataset evaluation, outperforming existing state-of-the-art methods. In the inter-dataset evaluation, CECT achieves an accuracy of 90.9% on the unseen dataset, showing strong generalization ability. Given its feature capture and generalization ability, we believe CECT can be extended to other medical scenarios as a powerful diagnosis tool. Code is available at https://github.com/NUS-Tim/CECT.
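The idea of controlling the contribution of local features with ensemble coefficients can be illustrated with a minimal sketch. It is not the authors' implementation (see the linked repository for that). It assumes, for illustration only, that three single-channel feature maps at 28 $\times$ 28, 56 $\times$ 56, and 112 $\times$ 112 are brought to 224 $\times$ 224 and summed with scalar weights. The function names, the nearest-neighbour upsampling (standing in for the transposed-convolutional decoder), and the coefficient values are all hypothetical.

```python
def upsample_nearest(fmap, factor):
    """Nearest-neighbour upsampling of a square 2D list by an integer factor.

    Stand-in for a learned transposed convolution (assumption for illustration).
    """
    out = []
    for row in fmap:
        wide = [v for v in row for _ in range(factor)]
        out.extend(list(wide) for _ in range(factor))
    return out


def weighted_ensemble(feature_maps, coefficients, target_size):
    """Upsample each map to target_size and sum them, scaled by its coefficient."""
    if len(feature_maps) != len(coefficients):
        raise ValueError("need exactly one coefficient per feature map")
    fused = [[0.0] * target_size for _ in range(target_size)]
    for fmap, coef in zip(feature_maps, coefficients):
        size = len(fmap)
        if target_size % size != 0:
            raise ValueError("target size must be a multiple of each map size")
        up = upsample_nearest(fmap, target_size // size)
        for i in range(target_size):
            for j in range(target_size):
                fused[i][j] += coef * up[i][j]
    return fused


def constant_map(size, value):
    """Toy feature map filled with a single value."""
    return [[value] * size for _ in range(size)]


# Hypothetical local features at three scales and hypothetical coefficients.
maps = [constant_map(28, 1.0), constant_map(56, 2.0), constant_map(112, 3.0)]
coefs = [0.5, 0.3, 0.2]
fused = weighted_ensemble(maps, coefs, 224)
```

Raising one coefficient relative to the others increases the weight of that scale in the fused 224 $\times$ 224 map. That is the sense of "controllable" illustrated here; in this sketch, the fused map would then be passed to the attention-based classification stage.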