- The paper introduces Mask Guided (MG) Matting, a flexible framework that accepts guidance masks of varying type and quality and uses a Progressive Refinement Network to iteratively improve matting results.
- The framework achieves state-of-the-art performance on synthetic benchmarks, showing significant improvements in metrics like SAD and MSE compared to previous methods.
- The work addresses foreground color estimation and introduces a new portrait matting benchmark dataset, demonstrating enhanced robustness and detail retention on real-world images.
Mask Guided Matting via Progressive Refinement Network
This paper introduces the Mask Guided (MG) Matting framework, which improves the flexibility and robustness of image matting by relaxing the requirements on the guidance mask. Rather than demanding a carefully annotated trimap, the framework accepts guidance of varying type and quality, including trimaps, binary segmentation masks, and low-quality alpha mattes, making it adaptable across multiple application scenarios.
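To make this flexibility concrete, the sketch below shows one way the different guidance types could be reduced to a single coarse mask channel before being fed to the network. The function name and thresholds are illustrative assumptions, not the authors' actual preprocessing.

```python
import numpy as np

def to_coarse_mask(guidance: np.ndarray, kind: str) -> np.ndarray:
    """Illustrative reduction of any supported guidance to one channel in [0, 1].

    kind: "trimap" -> uint8 values {0, 128, 255}; the unknown band maps to 0.5
          "binary" -> a hard segmentation mask in {0, 1}, used as-is
          "alpha"  -> a (possibly low-quality) alpha matte in [0, 1] or [0, 255]
    """
    g = guidance.astype(np.float32)
    if kind == "trimap":
        # Confident foreground -> 1, confident background -> 0, unknown -> 0.5.
        return np.where(g >= 200, 1.0, np.where(g <= 50, 0.0, 0.5))
    if kind in ("binary", "alpha"):
        # Normalize 8-bit inputs; pass already-normalized masks through.
        return np.clip(g / 255.0 if g.max() > 1.0 else g, 0.0, 1.0)
    raise ValueError(f"unknown guidance kind: {kind}")
```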
The core innovation is the Progressive Refinement Network (PRN), which leverages self-guidance to iteratively enhance the matting result during decoding. At each stage, a self-guidance mask derived from the previous output marks the uncertain regions, namely pixels whose predicted alpha is neither fully foreground nor fully background, and only those regions are re-estimated while confident predictions are carried forward. This contrasts with traditional methods that rely heavily on a user-supplied trimap, which is cumbersome to produce and precludes non-interactive applications.
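A minimal PyTorch sketch of one such fusion step is shown below, following the rule described above: the raw prediction at the current scale replaces the previous prediction only inside the self-guidance mask. Tensor shapes are simplified, and the mask dilation the paper applies is omitted.

```python
import torch
import torch.nn.functional as F

def prn_fuse(alpha_prev: torch.Tensor, alpha_raw: torch.Tensor) -> torch.Tensor:
    """One progressive-refinement fusion step (simplified sketch).

    alpha_prev: prediction from the previous, coarser stage, shape (N, 1, h, w)
    alpha_raw:  raw prediction at the current stage,         shape (N, 1, H, W)
    """
    # Bring the previous prediction to the current resolution.
    alpha_prev = F.interpolate(alpha_prev, size=alpha_raw.shape[-2:],
                               mode="bilinear", align_corners=False)
    # Self-guidance mask: 1 where the previous stage is uncertain
    # (alpha strictly between 0 and 1), 0 where it is confident.
    # The paper additionally dilates this mask; omitted here for brevity.
    guidance = ((alpha_prev > 0.0) & (alpha_prev < 1.0)).float()
    # Re-estimate only the uncertain region; keep confident pixels as-is.
    return alpha_raw * guidance + alpha_prev * (1.0 - guidance)
```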
The paper reports state-of-the-art results on synthetic benchmarks, with significant improvements over previous methods in key metrics such as the Sum of Absolute Differences (SAD) and Mean Squared Error (MSE). For example, on the Composition-1k test set, MG Matting achieves a SAD of 31.5, compared to 35.8 for the Context-Aware Matting method.
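For reference, SAD and MSE on alpha mattes are typically computed as below; the per-1000 scaling is the common reporting convention in matting papers, though the benchmark's exact evaluation script may differ.

```python
import numpy as np

def sad(pred: np.ndarray, gt: np.ndarray) -> float:
    """Sum of Absolute Differences over alpha values in [0, 1],
    reported divided by 1000 by convention in matting benchmarks."""
    return float(np.abs(pred - gt).sum()) / 1000.0

def mse(pred: np.ndarray, gt: np.ndarray) -> float:
    """Mean Squared Error over alpha values in [0, 1],
    usually reported scaled by 1e3 for readability."""
    return float(np.square(pred - gt).mean()) * 1000.0
```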
In addition to alpha matte prediction, the paper addresses the often-overlooked problem of foreground color estimation. The authors note that foreground labels in existing datasets suffer from limited accuracy and diversity, and propose Random Alpha Blending (RAB), a data synthesis strategy that significantly improves foreground color prediction accuracy; a sketch of the idea follows.
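The summary does not detail the mechanics, but one plausible reading of Random Alpha Blending, sketched below, is to composite two foreground images with a randomly paired alpha matte so that the foreground label of the resulting composite is exact by construction. Treat the specifics as an assumption rather than the paper's exact recipe.

```python
import numpy as np

def random_alpha_blend(fg_a: np.ndarray, fg_b: np.ndarray,
                       alpha: np.ndarray):
    """Random Alpha Blending (illustrative sketch).

    fg_a, fg_b: float32 images in [0, 1], shape (H, W, 3)
    alpha:      float32 matte in [0, 1], shape (H, W, 1),
                sampled independently of both images
    Returns a composite plus exact (foreground, alpha) labels.
    """
    composite = alpha * fg_a + (1.0 - alpha) * fg_b
    # fg_a is the ground-truth foreground for this composite by construction,
    # sidestepping the noisy foreground labels of existing matting datasets.
    return composite, fg_a, alpha
```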
An extensive evaluation on real-world images further underscores the model's robustness. A new portrait matting benchmark dataset is introduced to assess the practical applicability of matting models outside the constraints of synthetic data. Results show improved detail retention in high-frequency regions such as hair, with MG Matting outperforming commercial tools like Photoshop in certain scenarios.
The proposed MG Matting framework, with its robust adaptability to various guidance types and improved handling of practical datasets, opens potential avenues for future research. The ability to generalize across different input qualities and forms may further drive innovations in real-time video editing and interactive image processing applications.
Overall, the work makes a compelling contribution by relaxing traditional constraints on guidance and advancing the robustness of matting models across diverse scenarios. Future research could explore integrating MG Matting into real-time systems and evaluating its performance on highly dynamic scenes in video processing.