DiffusionEdge: Diffusion Probabilistic Model for Crisp Edge Detection (2401.02032v2)
Abstract: Limited by the encoder-decoder architecture, learning-based edge detectors usually have difficulty predicting edge maps that are both correct and crisp. Following the recent success of the diffusion probabilistic model (DPM), we find it especially suitable for accurate and crisp edge detection, since the denoising process is applied directly at the original image size. We therefore propose the first diffusion model for the task of general edge detection, which we call DiffusionEdge. To reduce computational cost while retaining final performance, we apply the DPM in latent space and enable the classic cross-entropy loss, which is uncertainty-aware at the pixel level, to directly optimize the parameters in latent space in a distillation manner. We also adopt a decoupled architecture to speed up the denoising process and propose a corresponding adaptive Fourier filter to adjust the latent features at specific frequencies. With these technical designs, DiffusionEdge can be trained stably with limited resources, predicting crisp and accurate edge maps with far fewer augmentation strategies. Extensive experiments on four edge detection benchmarks demonstrate the superiority of DiffusionEdge in both correctness and crispness. On the NYUDv2 dataset, compared to the second-best method, we increase the ODS, OIS (without post-processing) and AC by 30.2%, 28.1% and 65.1%, respectively. Code: https://github.com/GuHuangAI/DiffusionEdge.
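The abstract mentions a filter that adjusts latent features at specific frequencies. As a rough illustration of that general idea only, the sketch below reweights the 2D spectrum of a latent feature map with per-frequency weights. It is not the authors' implementation: the function name `fourier_filter`, the tensor shapes, and the use of fixed NumPy arrays in place of learned parameters are all assumptions made for this example.

```python
import numpy as np

def fourier_filter(latent, weights):
    """Reweight the spatial frequencies of a (C, H, W) latent feature map.

    `weights` has shape (C, H, W // 2 + 1), matching the output of rfft2.
    In a trained model these would presumably be learnable parameters;
    here they are plain arrays supplied by the caller (hypothetical setup).
    """
    # Forward real-valued 2D FFT over the two spatial axes.
    spectrum = np.fft.rfft2(latent, axes=(-2, -1))
    # Scale each frequency component by its weight.
    filtered = spectrum * weights
    # Transform back to the spatial domain at the original size.
    return np.fft.irfft2(filtered, s=latent.shape[-2:], axes=(-2, -1))

rng = np.random.default_rng(0)
latent = rng.standard_normal((4, 8, 8))

# All-ones weights leave the features unchanged.
identity = np.ones((4, 8, 5))
unchanged = fourier_filter(latent, identity)

# Keeping only the zero-frequency term collapses each channel to its mean.
dc_only = np.zeros((4, 8, 5))
dc_only[:, 0, 0] = 1.0
smoothed = fourier_filter(latent, dc_only)
```

With all-ones weights the filter acts as the identity, and with only the zero-frequency weight set, each channel reduces to its spatial mean. Weights between these two extremes would emphasize or suppress particular frequency bands.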