Boundary Attention: Learning curves, corners, junctions and grouping (2401.00935v3)

Published 1 Jan 2024 in cs.CV

Abstract: We present a lightweight network that infers grouping and boundaries, including curves, corners and junctions. It operates in a bottom-up fashion, analogous to classical methods for sub-pixel edge localization and edge-linking, but with a higher-dimensional representation of local boundary structure, and notions of local scale and spatial consistency that are learned instead of designed. Our network uses a mechanism that we call boundary attention: a geometry-aware local attention operation that, when applied densely and repeatedly, progressively refines a pixel-resolution field of variables that specify the boundary structure in every overlapping patch within an image. Unlike many edge detectors that produce rasterized binary edge maps, our model provides a rich, unrasterized representation of the geometric structure in every local region. We find that its intentional geometric bias allows it to be trained on simple synthetic shapes and then generalize to extracting boundaries from noisy low-light photographs.

Summary

  • The paper introduces a lightweight network built on boundary attention, which iteratively refines a per-pixel field of local geometric descriptors to localize boundaries with sub-pixel accuracy.
  • Its local attention mechanism adapts to varying noise levels, and on noisy images it gives results better than or comparable to state-of-the-art methods while running significantly faster.
  • Although trained only on simple synthetic shapes, the model generalizes to real photographs of any size or aspect ratio because it relies on low-level cues.

Introduction

Detecting and interpreting boundaries in images, including edges, corners, and junctions, is a core task in computer vision, because boundaries carry much of the geometric detail of a scene or object. Existing techniques often struggle when the boundary signal is faint or the noise level is high. Classical edge detectors lose accuracy near corners and junctions. Recent deep learning models show promise but bring their own challenges, including dependence on training datasets and difficulty achieving sub-pixel precision.

Representing Boundaries with Attention Mechanisms

The paper proposes a network that models image boundaries more robustly and accurately. Its core operation is "boundary attention," a geometry-aware local attention operation that, applied densely and repeatedly, refines the local geometric representation around every pixel. The result is a pixel-resolution field of boundary descriptors, one for every overlapping patch, that evolves over iterations to capture the image's local geometry precisely.
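The idea of refining a field by repeated local attention can be illustrated with a toy one-dimensional sketch. This is not the paper's architecture: the function names, the similarity-based weights, and the `radius` and `temperature` parameters are all assumptions chosen for illustration. Each position attends only to nearby positions, and repeating the pass smooths noise on either side of a step while leaving the step itself sharp.

```python
import math


def local_attention_step(values, radius=2, temperature=0.1):
    """One dense pass: every position attends to neighbors within `radius`.

    Hypothetical toy operator, not the paper's boundary attention.
    """
    n = len(values)
    refined = []
    for i in range(n):
        lo, hi = max(0, i - radius), min(n, i + radius + 1)
        # Attention logits: neighbors with similar values get larger weight.
        logits = [-((values[i] - values[j]) ** 2) / temperature
                  for j in range(lo, hi)]
        peak = max(logits)  # subtract the max for a numerically stable softmax
        weights = [math.exp(logit - peak) for logit in logits]
        total = sum(weights)
        refined.append(
            sum(w * values[j] for w, j in zip(weights, range(lo, hi))) / total
        )
    return refined


def refine(values, iterations=5, **kwargs):
    """Apply the local attention pass repeatedly."""
    for _ in range(iterations):
        values = local_attention_step(values, **kwargs)
    return values


# A noisy step from about 0 to about 1 between indices 5 and 6.
noisy_step = [0.05, -0.04, 0.03, -0.02, 0.04, -0.05,
              1.03, 0.96, 1.05, 0.98, 1.02, 0.95]
smoothed = refine(noisy_step, iterations=5, radius=2, temperature=0.1)
```

Because the weights fall off quickly with value difference, samples on opposite sides of the step barely influence each other, which is the edge-preserving behavior the sketch is meant to show.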

Adaptable Accuracy and Noise Resilience

A standout feature of the model is its ability to adapt to different noise levels and degrees of geometric detail. Its local attention mechanism adjusts its processing to each image region, which lets it recover faint boundaries in heavy noise. Unlike some earlier methods, the model does not rely on global features or human annotation during training; it focuses instead on low-level cues and geometric consistency. This focus lets it localize boundaries with sub-pixel accuracy while remaining resilient to high noise levels.
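For readers unfamiliar with sub-pixel localization, the following minimal sketch shows a classical one-dimensional version: fit a parabola through the gradient-magnitude peak and its two neighbors, then take the parabola's vertex as the edge position. This is a standard hand-designed technique, not the learned method described here, and the function name `subpixel_edge_1d` is hypothetical.

```python
def subpixel_edge_1d(signal):
    """Estimate the position of the strongest edge in a 1-D signal.

    Classical parabolic peak interpolation on the gradient magnitude.
    Returns a fractional index into `signal`.
    """
    # Central-difference gradient magnitude at interior samples 1..n-2.
    grad = [abs(signal[i + 1] - signal[i - 1]) / 2.0
            for i in range(1, len(signal) - 1)]
    # Index (within grad) of the first largest gradient value.
    k = max(range(len(grad)), key=grad.__getitem__)
    if k == 0 or k == len(grad) - 1:
        # No neighbor on one side, so fall back to the integer position.
        return float(k + 1)
    g_left, g_center, g_right = grad[k - 1], grad[k], grad[k + 1]
    curvature = g_left - 2.0 * g_center + g_right
    # Vertex of the parabola through the three samples, relative to k.
    offset = 0.0 if curvature == 0 else 0.5 * (g_left - g_right) / curvature
    return (k + 1) + offset


# A blurred step whose midpoint lies halfway between samples 3 and 4.
edge_position = subpixel_edge_1d([0.0, 0.0, 0.0, 0.25, 0.75, 1.0, 1.0, 1.0])
```

A hand-designed estimator like this handles a single isolated edge; it has no notion of corners or junctions, which is the gap the summary attributes to classical methods.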

Evaluating the Model

The network was evaluated on images with severe noise, where it produced results better than or comparable to other state-of-the-art methods while running significantly faster. Despite being trained only on simple synthetic data, it generalizes well to real images. It also handles images of any size and aspect ratio, which makes it flexible in practical use.
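One way to see why a purely local, densely applied operation can accept any image size is to enumerate the overlapping patches it would visit. The sketch below assumes one patch centered on every pixel, clipped at the image border. The helper names `patch_windows` and `coverage`, the patch size, and the clipping rule are illustrative assumptions, not details taken from the paper.

```python
def patch_windows(height, width, patch=3):
    """Yield ((y0, y1), (x0, x1)) bounds of the patch centered on each pixel.

    Bounds are half-open and clipped to the image, so any height and width work.
    """
    r = patch // 2
    for y in range(height):
        for x in range(width):
            yield ((max(0, y - r), min(height, y + r + 1)),
                   (max(0, x - r), min(width, x + r + 1)))


def coverage(height, width, patch=3):
    """Count how many overlapping patches contain each pixel."""
    counts = [[0] * width for _ in range(height)]
    for (y0, y1), (x0, x1) in patch_windows(height, width, patch):
        for y in range(y0, y1):
            for x in range(x0, x1):
                counts[y][x] += 1
    return counts


# Works for a non-square image without any resizing or padding.
counts = coverage(4, 7, patch=3)
```

Every pixel belongs to several patches, so neighboring patch estimates can be compared for consistency, and nothing in the enumeration depends on a fixed input resolution.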

Conclusion

This research advances boundary detection in challenging conditions, particularly in heavy noise and where fine geometric detail matters. By combining deep learning with a focus on low-level cues and local adaptability, the model sets itself apart from traditional methods and offers an efficient, robust approach for a range of computer vision applications.