Asymmetric Non-local Neural Networks for Semantic Segmentation (1908.07678v5)

Published 21 Aug 2019 in cs.CV and cs.LG

Abstract: The non-local module works as a particularly useful technique for semantic segmentation while criticized for its prohibitive computation and GPU memory occupation. In this paper, we present Asymmetric Non-local Neural Network to semantic segmentation, which has two prominent components: Asymmetric Pyramid Non-local Block (APNB) and Asymmetric Fusion Non-local Block (AFNB). APNB leverages a pyramid sampling module into the non-local block to largely reduce the computation and memory consumption without sacrificing the performance. AFNB is adapted from APNB to fuse the features of different levels under a sufficient consideration of long range dependencies and thus considerably improves the performance. Extensive experiments on semantic segmentation benchmarks demonstrate the effectiveness and efficiency of our work. In particular, we report the state-of-the-art performance of 81.3 mIoU on the Cityscapes test set. For a 256x128 input, APNB is around 6 times faster than a non-local block on GPU while 28 times smaller in GPU running memory occupation. Code is available at: https://github.com/MendelXu/ANN.git.

Authors (5)
  1. Zhen Zhu
  2. Mengde Xu
  3. Song Bai
  4. Tengteng Huang
  5. Xiang Bai
Citations (569)

Summary

Asymmetric Non-local Neural Networks for Semantic Segmentation

The paper introduces an innovative approach named Asymmetric Non-local Neural Networks (ANN) to enhance semantic segmentation by addressing the computational challenges associated with traditional non-local networks. The authors propose two main components: Asymmetric Pyramid Non-local Block (APNB) and Asymmetric Fusion Non-local Block (AFNB), which together offer improved efficiency and performance without the computational and memory overhead traditionally associated with non-local modules.

Key Contributions

  1. Asymmetric Pyramid Non-local Block (APNB):
    • APNB leverages pyramid sampling to reduce computational complexity and memory usage substantially.
    • It achieves a nearly six-fold increase in processing speed over traditional non-local blocks while requiring significantly less GPU memory (28 times less for input size 256 x 128).
    • Despite the reduced computation, APNB matches the accuracy of a standard non-local block; with both blocks in place, the full network reaches 81.3% mIoU on the Cityscapes test set (a minimal sketch of the asymmetric block follows this list).
  2. Asymmetric Fusion Non-local Block (AFNB):
    • AFNB facilitates efficient fusion of multi-level features by considering long-range dependencies across varying stages of the network.
    • The integration results in substantial performance improvements, highlighting the efficacy of combining high-level and low-level features for enhanced segmentation accuracy.
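
To make the asymmetric idea concrete, here is a minimal PyTorch sketch of such a block. This is not the authors' released implementation (see https://github.com/MendelXu/ANN.git for that); the module structure, channel counts, and pooling sizes (1, 3, 6, 8) are illustrative assumptions.

```python
# Hedged sketch of an asymmetric non-local block (illustrative, not the
# reference implementation from https://github.com/MendelXu/ANN.git).
import torch
import torch.nn as nn
import torch.nn.functional as F

class AsymmetricNonLocal(nn.Module):
    """Non-local attention where keys/values are pyramid-sampled.

    With query features x_q (N = H*W positions) and key/value features x_kv
    (assumed here to have the same channel count), the keys/values are reduced
    to S pooled anchors (S << N), so the affinity matrix is N x S, not N x N.
    """
    def __init__(self, in_channels, key_channels, pool_sizes=(1, 3, 6, 8)):
        super().__init__()
        self.query = nn.Conv2d(in_channels, key_channels, 1)
        self.key = nn.Conv2d(in_channels, key_channels, 1)
        self.value = nn.Conv2d(in_channels, key_channels, 1)
        self.out = nn.Conv2d(key_channels, in_channels, 1)
        self.pool_sizes = pool_sizes

    def _pyramid_sample(self, feat):
        # Pool the feature map to several coarse grids and concatenate the
        # pooled positions: S = sum(p * p for p in pool_sizes) anchors.
        b, c, _, _ = feat.shape
        pooled = [F.adaptive_avg_pool2d(feat, p).view(b, c, -1)
                  for p in self.pool_sizes]
        return torch.cat(pooled, dim=2)                      # (B, C, S)

    def forward(self, x_q, x_kv=None):
        # APNB: x_kv is the same map as x_q.
        # AFNB-style fusion: x_q and x_kv come from different backbone stages.
        x_kv = x_q if x_kv is None else x_kv
        b, _, h, w = x_q.shape

        q = self.query(x_q).view(b, -1, h * w).transpose(1, 2)      # (B, N, Ck)
        k = self._pyramid_sample(self.key(x_kv))                    # (B, Ck, S)
        v = self._pyramid_sample(self.value(x_kv)).transpose(1, 2)  # (B, S, Ck)

        attn = torch.softmax(q @ k, dim=-1)                  # (B, N, S) affinity
        ctx = (attn @ v).transpose(1, 2).reshape(b, -1, h, w)
        return x_q + self.out(ctx)                           # residual connection
```

Used with a single feature map, the module plays the role sketched for APNB; feeding it features from two different stages corresponds to the AFNB-style fusion described above.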

Technical Approach

  • The conventional non-local block is resource-intensive because its matrix multiplications scale quadratically with the number of spatial positions N = HW, costing roughly O(CN²) time and producing an N × N affinity matrix. This paper substitutes a pyramid sampling strategy for the full set of keys and values, thereby shrinking the matrices involved in these computations.
  • APNB utilizes a spatial pyramid pooling mechanism to retain critical semantic statistics while significantly trimming down computational requirements.
  • AFNB applies the same asymmetric attention across stages: features from one level of the backbone attend to pyramid-sampled features from another, so the fusion accounts for long-range dependencies between layers and strengthens the network's representational power and segmentation precision.
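
As a rough illustration of where the savings come from, the snippet below compares affinity-matrix sizes, assuming the block receives a 256x128 feature map and uses pooling sizes (1, 3, 6, 8). The paper's measured figures (about 6x faster, 28x less GPU memory) apply to the whole block, so they differ from this per-matrix ratio.

```python
# Back-of-the-envelope comparison of affinity-matrix sizes for a standard
# non-local block versus an asymmetric (pyramid-sampled) one. The 256x128
# feature-map size and the (1, 3, 6, 8) pooling sizes are assumptions made
# for illustration.
H, W = 256, 128
N = H * W                                # query positions (32768)
S = sum(p * p for p in (1, 3, 6, 8))     # pyramid-sampled anchors (110)

print(f"standard affinity matrix:   {N} x {N} = {N * N:,} entries")
print(f"asymmetric affinity matrix: {N} x {S} = {N * S:,} entries")
print(f"per-matrix reduction:       ~{N / S:.0f}x")
```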

Results and Implications

  • Quantitative Performance: The network achieves state-of-the-art results across multiple benchmarks, including 81.3% mIoU on the Cityscapes test set, 45.24% on ADE20K, and 52.8% on PASCAL Context.
  • Efficiency: APNB delivers substantial savings in GPU time and memory, supporting more practical deployment scenarios without compromising performance.
  • Algorithmic Impact: The integration of pyramid sampling within non-local networks presents a potential new standard for handling high-resolution feature maps efficiently, paving the way for further exploration and adaptation across different applications of semantic segmentation.

Future Perspectives

The advancements presented in this paper suggest a noteworthy direction for balancing computational efficiency with high-level performance in AI models. Future work could focus on extending these approaches to other domains, such as 3D segmentation and real-time video processing, where efficiency is paramount. Additionally, exploring similar sampling techniques within other architectures could yield further improvements in performance and scalability, particularly within resource-constrained environments.

In summary, the proposed Asymmetric Non-local Neural Networks provide a valuable contribution to the domain of semantic segmentation by addressing critical efficiency bottlenecks while enhancing segmentation accuracy and performance across challenging datasets.
