Paint Bucket Colorization Using Anime Character Color Design Sheets (2410.19424v1)
Abstract: Line art colorization plays a crucial role in hand-drawn animation production, where digital artists manually colorize segments using a paint bucket tool, guided by RGB values from character color design sheets. This process, often called paint bucket colorization, involves two main tasks: keyframe colorization, where colors are applied according to the character's color design sheet, and consecutive frame colorization, where these colors are replicated across adjacent frames. Current automated colorization methods primarily focus on reference-based and segment-matching approaches. However, reference-based methods often fail to accurately assign specific colors to each region, while matching-based methods are limited to consecutive frame colorization and struggle with issues such as significant deformation and occlusion. In this work, we introduce inclusion matching, which allows the network to understand the inclusion relationships between segments, rather than relying solely on direct visual correspondences. By integrating this approach with segment parsing and color warping modules, our inclusion matching pipeline significantly improves performance in both keyframe colorization and consecutive frame colorization. To support our network's training, we have developed a dedicated dataset named PaintBucket-Character, which includes rendered line art alongside colorized versions and shading annotations for various 3D characters. To replicate industry animation data formats, we also created color design sheets for each character, with semantic information for each color and standard pose reference images. Experiments highlight the superiority of our method, demonstrating accurate and consistent colorization across both our proposed benchmarks and hand-drawn animations.
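To make the inclusion-matching idea concrete, the sketch below shows a deliberately simplified, NumPy-only version: each target-frame segment is softly assigned to the reference segment that best "includes" it, so several target segments may inherit the color of a single reference segment. This is a minimal illustration under assumed inputs (per-segment feature vectors, cosine similarity, and the hypothetical `colorize_by_inclusion` helper are all choices made for this example), not the authors' actual network or training setup.

```python
import numpy as np

def colorize_by_inclusion(ref_feats, ref_colors, tgt_feats, temperature=0.1):
    """Toy sketch of inclusion-style segment matching (not the paper's exact model).

    ref_feats:  (R, D) per-segment feature vectors of the reference frame
    ref_colors: (R, 3) flat RGB color of each reference segment
    tgt_feats:  (T, D) per-segment feature vectors of the target frame
    returns:    (T, 3) predicted RGB color for each target segment
    """
    # Cosine similarity between every target segment and every reference segment.
    ref = ref_feats / np.linalg.norm(ref_feats, axis=1, keepdims=True)
    tgt = tgt_feats / np.linalg.norm(tgt_feats, axis=1, keepdims=True)
    sim = tgt @ ref.T                                   # (T, R)

    # Softmax over reference segments: a many-to-one "inclusion" assignment,
    # rather than a strict one-to-one segment correspondence.
    logits = sim / temperature
    probs = np.exp(logits - logits.max(axis=1, keepdims=True))
    probs /= probs.sum(axis=1, keepdims=True)

    # Hard assignment: each target segment copies the color of the reference
    # segment most likely to include it.
    return ref_colors[probs.argmax(axis=1)]
```

In this toy formulation the many-to-one assignment is what lets segments that split or merge between frames (e.g. through occlusion or large deformation) still recover a sensible color; in the actual pipeline the segment features would come from a learned encoder and the matching would be trained jointly with the segment parsing and color warping modules described in the abstract.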