Learning Multiple Representations with Inconsistency-Guided Detail Regularization for Mask-Guided Matting (2403.19213v1)

Published 28 Mar 2024 in cs.CV

Abstract: Mask-guided matting networks have achieved significant improvements and have shown great potential in practical applications in recent years. However, because they learn matting representations solely from synthetic matting data that lacks real-world diversity, these approaches tend to overfit low-level details in the wrong regions, generalize poorly to objects with complex structures and to real-world conditions such as shadows, and suffer from interference by background lines or textures. To address these challenges, we propose a novel auxiliary learning framework for mask-guided matting models that incorporates three auxiliary tasks besides matting: semantic segmentation, edge detection, and background line detection, so as to learn distinct and effective representations from different types of data and annotations. Our framework and model introduce the following key aspects: (1) to learn a semantic representation that adapts to objects with diverse and complex structures in real-world scenes, we introduce extra semantic segmentation and edge detection tasks trained on more diverse real-world data with segmentation annotations; (2) to avoid overfitting on low-level details, we propose a module that uses the inconsistency between the learned segmentation and matting representations to regularize detail refinement; (3) we introduce a novel background line detection task into the auxiliary learning framework to suppress interference from background lines or textures. In addition, we propose a high-quality matting benchmark, Plant-Mat, for evaluating matting methods on complex structures. Extensive quantitative and qualitative results show that our approach outperforms state-of-the-art mask-guided methods.
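The abstract describes the inconsistency-guided detail regularization only at a high level. The sketch below is one plausible reading of that idea, not the authors' implementation: where the binarized segmentation prediction and the thresholded alpha matte disagree, the region is treated as an unreliable place for low-level detail, so detail supervision is downweighted there. All names (`inconsistency_map`, `detail_loss`, `seg_logits`, `alpha_pred`) and the specific loss form are hypothetical.

```python
# Hypothetical sketch of inconsistency-guided detail regularization,
# inferred from the abstract; not the paper's actual implementation.
import torch

def inconsistency_map(seg_logits, alpha_pred, tau=0.5):
    """Per-pixel disagreement between the segmentation and matting branches.

    seg_logits: (B, 1, H, W) raw foreground logits from the segmentation head.
    alpha_pred: (B, 1, H, W) predicted alpha matte in [0, 1].
    Returns a map in {0, 1}; 1 marks pixels where the two branches disagree.
    """
    seg_prob = torch.sigmoid(seg_logits)
    # Binarize both predictions and compare (hypothetical formulation).
    return torch.abs((seg_prob > tau).float() - (alpha_pred > tau).float())

def detail_loss(alpha_pred, alpha_gt, seg_logits, eps=1e-6):
    """Gradient-based detail loss, downweighted in inconsistent regions."""
    inc = inconsistency_map(seg_logits, alpha_pred)
    weight = 1.0 - inc  # suppress detail supervision where branches disagree

    def grads(x):
        # Simple finite-difference image gradients as a stand-in detail term.
        gx = x[..., :, 1:] - x[..., :, :-1]
        gy = x[..., 1:, :] - x[..., :-1, :]
        return gx, gy

    gx_p, gy_p = grads(alpha_pred)
    gx_g, gy_g = grads(alpha_gt)
    loss_x = (weight[..., :, 1:] * (gx_p - gx_g).abs()).sum()
    loss_y = (weight[..., 1:, :] * (gy_p - gy_g).abs()).sum()
    return (loss_x + loss_y) / (weight.sum() + eps)
```

In this reading, the thresholded comparison is deliberately non-differentiable, so the inconsistency map acts purely as a spatial reweighting of the supervision rather than as a learned signal; the paper itself should be consulted for the actual regularization design.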

