
YOLO-MED: Multi-Task Interaction Network for Biomedical Images (2403.00245v1)

Published 1 Mar 2024 in cs.CV

Abstract: Object detection and semantic segmentation are pivotal components in biomedical image analysis. Current single-task networks exhibit promising outcomes in both detection and segmentation tasks. Multi-task networks have gained prominence due to their capability to tackle segmentation and detection simultaneously, while also accelerating segmentation inference. Nevertheless, recent multi-task networks face distinct limitations, such as the difficulty of striking a balance between accuracy and inference speed. Additionally, they often overlook the integration of cross-scale features, which is especially important for biomedical image analysis. In this study, we propose YOLO-Med, an efficient end-to-end multi-task network that performs object detection and semantic segmentation concurrently. Our model employs a backbone and a neck for multi-scale feature extraction, complemented by two task-specific decoders. A cross-scale task-interaction module facilitates information fusion between the tasks. Our model exhibits promising results in balancing accuracy and speed when evaluated on the Kvasir-seg dataset and a private biomedical image dataset.
