
CF-Font: Content Fusion for Few-shot Font Generation (2303.14017v3)

Published 24 Mar 2023 in cs.CV

Abstract: Content and style disentanglement is an effective way to achieve few-shot font generation. It allows to transfer the style of the font image in a source domain to the style defined with a few reference images in a target domain. However, the content feature extracted using a representative font might not be optimal. In light of this, we propose a content fusion module (CFM) to project the content feature into a linear space defined by the content features of basis fonts, which can take the variation of content features caused by different fonts into consideration. Our method also allows to optimize the style representation vector of reference images through a lightweight iterative style-vector refinement (ISR) strategy. Moreover, we treat the 1D projection of a character image as a probability distribution and leverage the distance between two distributions as the reconstruction loss (namely projected character loss, PCL). Compared to L2 or L1 reconstruction loss, the distribution distance pays more attention to the global shape of characters. We have evaluated our method on a dataset of 300 fonts with 6.5k characters each. Experimental results verify that our method outperforms existing state-of-the-art few-shot font generation methods by a large margin. The source code can be found at https://github.com/wangchi95/CF-Font.

Authors (6)
  1. Chi Wang (93 papers)
  2. Min Zhou (65 papers)
  3. Tiezheng Ge (46 papers)
  4. Yuning Jiang (106 papers)
  5. Hujun Bao (134 papers)
  6. Weiwei Xu (65 papers)
Citations (24)

Summary

Content Fusion for Few-shot Font Generation

The paper "CF-Font: Content Fusion for Few-shot Font Generation" advances few-shot font generation through a novel approach to content and style disentanglement. Few-shot font generation aims to produce a complete character set in a target style from only a handful of reference images. The challenge is particularly acute for logographic languages, which contain thousands of distinct characters.

The Content Fusion Module (CFM) is the cornerstone of this approach. The authors argue that a single representative font may not adequately capture the content variation required across different styles. CFM therefore projects the content feature into a linear space spanned by the content features of a set of basis fonts, blending these features with adaptive weights. This design addresses the limitation of extracting content features from a single source font, which can lead to suboptimal style transfer.

The CFM enables a more flexible and comprehensive content representation crucial for few-shot font style adaptation. Basis fonts are selected through a clustering process on content features, ensuring diverse and representative coverage. The weights for each font's feature contribution are calculated based on similarity, fostering an adaptive, data-driven synthesis process.
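The fusion step described above can be sketched as a similarity-weighted linear combination of basis content features. This is a minimal illustration, not the paper's implementation: the function names, the softmax-over-negative-distance weighting, and the `temperature` parameter are assumptions for clarity; the paper computes weights from content-feature similarity inside the network.

```python
import numpy as np

def fusion_weights(target_content, basis_contents, temperature=0.1):
    """Hypothetical weighting: softmax over negative distances between the
    target character's content feature and each basis font's content feature,
    so more similar basis fonts contribute more."""
    dists = np.array([np.linalg.norm(target_content - b) for b in basis_contents])
    logits = -dists / temperature
    logits -= logits.max()              # numerical stability
    w = np.exp(logits)
    return w / w.sum()

def fuse_content(basis_contents, weights):
    """Project the content feature into the linear space of basis features:
    a convex combination, since the weights sum to one."""
    return sum(w * b for w, b in zip(weights, basis_contents))
```

Because the weights form a convex combination, the fused feature always lies inside the span of the basis-font features, which is the geometric intuition behind CFM.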

Complementing this is the Iterative Style-vector Refinement (ISR) strategy, a lightweight procedure for enhancing the style representation vector. ISR repeatedly optimizes the style vector of the reference images, improving the quality of the learned style representation.
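The refinement loop can be illustrated with a toy gradient-descent sketch. Everything here is an assumption made for illustration: the finite-difference gradients, the `generate` callable, and the learning-rate/step counts are stand-ins; the actual method backpropagates through the trained generator to update the style vector.

```python
import numpy as np

def refine_style_vector(style, references, generate, lr=0.1, steps=20):
    """Toy ISR loop: nudge the style vector to reduce mean reconstruction
    error against the reference images, using finite-difference gradients
    (a stand-in for backprop through the generator)."""
    style = style.copy()
    eps = 1e-4
    for _ in range(steps):
        base = np.mean([np.abs(generate(style) - r).mean() for r in references])
        grad = np.zeros_like(style)
        for i in range(style.size):
            pert = style.copy()
            pert[i] += eps
            loss = np.mean([np.abs(generate(pert) - r).mean() for r in references])
            grad[i] = (loss - base) / eps
        style -= lr * grad
    return style
```

The key idea ISR captures is that the style vector itself is treated as a free parameter to be optimized per target font, rather than a fixed encoder output.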

A key innovation introduced by the authors is the projected character loss (PCL). Here, character images are treated as one-dimensional probability distributions; distances between these distributions serve as a reconstruction loss metric. This method offers a global shape-focused analysis, surpassing traditional L2 and L1 losses that can disproportionately weigh pixel-level accuracy at the expense of global character form.
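The loss described above can be sketched as follows. This is a simplified reading, assuming the 1D projections are the row and column sums of the glyph image, normalized into probability distributions and compared with a 1-Wasserstein distance (difference of CDFs); the function name and the exact choice of distribution distance are illustrative assumptions.

```python
import numpy as np

def projected_character_loss(img_a, img_b, eps=1e-8):
    """Sketch of PCL: project each glyph image onto its rows and columns,
    normalise the marginals into probability distributions, and sum the
    1D Wasserstein distances between corresponding marginals."""
    def wasserstein_1d(p, q):
        p = p / (p.sum() + eps)
        q = q / (q.sum() + eps)
        # 1-Wasserstein distance on a discrete line = L1 distance of CDFs
        return np.abs(np.cumsum(p) - np.cumsum(q)).sum()

    loss = 0.0
    for axis in (0, 1):               # column-wise and row-wise projections
        loss += wasserstein_1d(img_a.sum(axis=axis), img_b.sum(axis=axis))
    return loss
```

Unlike a pixel-wise L1/L2 loss, a small spatial shift of a stroke changes the CDFs only slightly, which is why a distribution distance emphasizes global character shape over exact pixel alignment.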

The empirical section substantiates the efficacy of CF-Font against several state-of-the-art methods, including LF-Font, MX-Font, and DG-Font, across standard metrics: L1, RMSE, SSIM, LPIPS, and FID. In evaluations on both seen and unseen font sets, CF-Font consistently outperformed the baselines, particularly on perceptual metrics such as FID. This underscores its ability to generate visually coherent and stylistically faithful fonts even for styles starkly dissimilar from those encountered during training.

This work holds significant implications for practical applications where rapid font generation is necessary. For instance, tasks such as font reconstruction from limited historical examples or generating personalized fonts benefit substantially from this research. Theoretically, it advances the discourse on disentangling style and content in generative models, providing a robust framework that may inspire analogous applications across various domains of image generation.

Future work could explore vector-based font generation, which offers resolution independence and better fits practical design workflows. Further refinement of the basis-font selection process and more efficient computation of fusion weights could also yield additional performance gains. As AI-driven design tools evolve, methodologies like CF-Font are likely to play a significant role in shaping digital typography.
