
Poly Kernel Inception Network for Remote Sensing Detection (2403.06258v2)

Published 10 Mar 2024 in cs.CV

Abstract: Object detection in remote sensing images (RSIs) faces a growing set of challenges, including large variation in object scale and wide-ranging context. Prior methods have tried to address these challenges by expanding the spatial receptive field of the backbone, through either large-kernel convolution or dilated convolution. However, the former typically introduces considerable background noise, while the latter risks generating overly sparse feature representations. In this paper, we introduce the Poly Kernel Inception Network (PKINet) to handle these challenges. PKINet employs multi-scale convolution kernels without dilation to extract object features at varying scales and to capture local context. In addition, a Context Anchor Attention (CAA) module is introduced in parallel to capture long-range contextual information. Together, these two components improve the performance of PKINet on four challenging remote sensing detection benchmarks: DOTA-v1.0, DOTA-v1.5, HRSC2016, and DIOR-R.
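
The abstract's remark that dilated convolution can yield overly sparse features can be illustrated with simple arithmetic. The sketch below is not the authors' code: the kernel sizes and dilation rates are assumptions chosen only so that each dilated 3x3 kernel spans the same window as one dense kernel, and the helper names are hypothetical. It compares how much of that window each kernel actually samples.

```python
# Illustrative arithmetic only; kernel sizes and dilation rates are assumed,
# not taken from the paper.

def effective_extent(kernel_size: int, dilation: int = 1) -> int:
    """Side length of the window spanned by a square (possibly dilated) kernel."""
    return (kernel_size - 1) * dilation + 1


def sampling_density(kernel_size: int, dilation: int = 1) -> float:
    """Fraction of positions inside the spanned window that the kernel reads."""
    extent = effective_extent(kernel_size, dilation)
    return (kernel_size ** 2) / (extent ** 2)


# Assumed dense kernel sizes, and the 3x3 dilation rates that span the same windows.
DENSE_KERNELS = (3, 5, 7, 9, 11)
DILATIONS_3X3 = (1, 2, 3, 4, 5)


def compare():
    """Return (window size, dense density, dilated 3x3 density) for each pair."""
    rows = []
    for k, d in zip(DENSE_KERNELS, DILATIONS_3X3):
        # Both kernels cover a k x k window by construction.
        assert effective_extent(3, d) == effective_extent(k)
        rows.append((k, sampling_density(k), sampling_density(3, d)))
    return rows


if __name__ == "__main__":
    for window, dense, dilated in compare():
        print(f"{window}x{window} window: dense={dense:.3f}, dilated 3x3={dilated:.3f}")
```

Under these assumptions, a dense kernel reads every position in its window, whereas a 3x3 kernel dilated to span an 11x11 window reads only 9 of 121 positions. The trade-off is that dense kernels carry more parameters and compute per window.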

Authors
  1. Xinhao Cai
  2. Qiuxia Lai
  3. Yuwei Wang
  4. Wenguan Wang
  5. Zeren Sun
  6. Yazhou Yao