
Differentiable Brushstroke Reconstruction

Updated 19 November 2025
  • The paper introduces a differentiable rendering pipeline that optimizes explicit brushstroke parameters using gradient descent to reconstruct and synthesize painting-like images.
  • It employs quadratic Bézier curves and soft assignment functions to enable smooth, end-to-end gradient propagation for precise stroke geometry and style transfer.
  • Advanced extensions integrate latent style vectors and robotic implementations, facilitating interactive, high-fidelity artistic reproduction in both digital and analog media.

Differentiable brushstroke reconstruction refers to a family of computational techniques that enable the recovery and generation of painting-like images where explicit, continuous brushstroke parameters (geometry, color, opacity, texture) are optimized via gradient-based methods. Central to these frameworks is a differentiable rendering pipeline, which permits end-to-end optimization of both stroke parameters and stylization objectives, enabling not only image synthesis and style transfer but also fine-grained analysis and reproduction of human artistic processes across analog and digital domains.

1. Core Concepts and Mathematical Formulation

The foundation of differentiable brushstroke reconstruction is the explicit parameterization and differentiable rasterization of strokes. Most frameworks utilize quadratic or higher-order Bézier curves to encode the centerline of each brushstroke. For example, a quadratic Bézier stroke is defined by three 2D control points $\mathbf{P}_0, \mathbf{P}_1, \mathbf{P}_2 \in \mathbb{R}^2$, a width parameter $w \in \mathbb{R}_{>0}$, an RGB or RGBA color $c$, and sometimes pressure, opacity, and additional style vectors (Jiang et al., 17 Nov 2025, Kotovenko et al., 2021, Nakano, 2019). Stroke geometry at sample $t \in [0,1]$ along the curve is given by:

$$\mathbf{B}(t) = (1-t)^2\,\mathbf{P}_0 + 2(1-t)\,t\,\mathbf{P}_1 + t^2\,\mathbf{P}_2$$

Opacity, color, and thickness may be interpolated along $t$.
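As a concrete illustration, the quadratic Bézier evaluation and a linear width interpolation can be written in a few lines of NumPy (an illustrative sketch; function and variable names are our own, not from the cited papers):

```python
import numpy as np

def bezier_point(p0, p1, p2, t):
    """Evaluate a quadratic Bezier curve B(t) = (1-t)^2 P0 + 2(1-t)t P1 + t^2 P2."""
    t = np.asarray(t, dtype=float)[..., None]  # broadcast over samples
    return (1 - t) ** 2 * p0 + 2 * (1 - t) * t * p1 + t ** 2 * p2

def lerp_width(w0, w1, t):
    """Linearly interpolate stroke width along the curve parameter t."""
    return (1 - t) * w0 + t * w1

# sample a stroke centerline at 5 points
p0, p1, p2 = np.array([0.0, 0.0]), np.array([0.5, 1.0]), np.array([1.0, 0.0])
ts = np.linspace(0.0, 1.0, 5)
pts = bezier_point(p0, p1, p2, ts)  # (5, 2) points along the centerline
```

Because the evaluation is polynomial in the control points, gradients with respect to $\mathbf{P}_0, \mathbf{P}_1, \mathbf{P}_2$ exist everywhere, which is what the differentiable pipelines exploit.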

Differentiable renderers map the set of stroke parameters to raster images by computing per-pixel coverage either via signed-distance functions or stamp-based composition. Soft assignment functions (based on sigmoid, softmin, or softmax) ensure gradients exist almost everywhere with respect to all stroke parameters, enabling direct use in gradient descent frameworks (Jiang et al., 17 Nov 2025, Mihai et al., 2021, Kotovenko et al., 2021).
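A minimal NumPy sketch of such a soft assignment, combining a softened minimum distance to the centerline with a sigmoid at the stroke boundary (the function name and the `tau` temperature are illustrative assumptions, not the cited renderers):

```python
import numpy as np

def soft_coverage(pixels, samples, width, tau=0.05):
    """Soft per-pixel coverage of a stroke, in the spirit of sigmoid/softmin
    assignments (an illustrative sketch, not any paper's exact renderer).

    pixels:  (P, 2) pixel coordinates
    samples: (S, 2) points sampled along the stroke centerline
    """
    d = np.linalg.norm(pixels[:, None, :] - samples[None, :, :], axis=-1)  # (P, S)
    # soft minimum distance to the centerline (smooth, unlike a hard min)
    m = (-d / tau).max(axis=1, keepdims=True)  # shift for numerical stability
    softmin = -tau * (m[:, 0] + np.log(np.exp(-d / tau - m).sum(axis=1)))
    # sigmoid of (half-width - distance): smooth falloff at the stroke edge
    return 1.0 / (1.0 + np.exp(-(width / 2.0 - softmin) / tau))

# a horizontal stroke: centerline samples along y = 0
samples = np.stack([np.linspace(0.0, 1.0, 16), np.zeros(16)], axis=1)
pixels = np.array([[0.5, 0.0], [0.5, 2.0]])  # one pixel on the stroke, one far away
cov = soft_coverage(pixels, samples, width=0.4)
```

The sigmoid ensures coverage varies smoothly as control points, width, or sample positions move, so gradients flow to all stroke parameters.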

In advanced systems, each stroke may be endowed with additional latent style codes and textural parameters to model material or medium-specific appearance, and may be composited with custom operators to capture further effects, such as smudging and dry-brushing (Jiang et al., 17 Nov 2025).

2. End-to-End Optimization Pipelines

A canonical differentiable brushstroke reconstruction pipeline comprises the following sequence:

  • Initialization: The canvas is seeded with an initial set of stroke parameters, which may be placed randomly, derived from edge maps, or initialized via superpixel grouping.
  • Differentiable Rendering: The renderer computes an RGB(A) image by compositing all active strokes, using soft assignment and differentiable local blending.
  • Loss Construction: Losses incorporate pixel-wise differences, perceptual metrics (e.g., VGG feature or LPIPS distances), regularization on stroke geometry and count, and, optionally, style- and flow-guided components.
  • Gradient Backpropagation and Update: Automatic differentiation propagates gradients through the renderer into the stroke parameters, which are iteratively updated via optimizers such as Adam or RMSprop.
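The loop above can be sketched end to end with a toy differentiable renderer: a single Gaussian "stamp" whose center is fitted to a synthetic target by analytic gradient descent (all names and values are illustrative; real pipelines use automatic differentiation, perceptual losses, and thousands of strokes):

```python
import numpy as np

H = W = 32
ys, xs = np.mgrid[0:H, 0:W]
coords = np.stack([xs, ys], axis=-1).astype(float)  # (H, W, 2), [x, y] order

def render(center, sigma=3.0):
    """One soft Gaussian 'stamp' -- a toy stand-in for a stroke renderer."""
    d2 = ((coords - center) ** 2).sum(axis=-1)
    return np.exp(-d2 / (2 * sigma ** 2))

sigma = 3.0
target = render(np.array([20.0, 12.0]), sigma)  # synthetic target image
center = np.array([14.0, 10.0])                 # initial stroke placement

losses = []
for _ in range(300):
    img = render(center, sigma)
    resid = img - target
    losses.append(float((resid ** 2).sum()))
    # analytic gradient of the L2 loss w.r.t. the stroke center, via the
    # chain rule through the renderer (what autodiff computes automatically)
    grad = ((2.0 * resid * img)[..., None] * (coords - center) / sigma ** 2).sum(axis=(0, 1))
    center -= 0.2 * grad  # plain gradient descent update
```

In this toy setting the stamp center converges onto the target's, and the loss drops by orders of magnitude, mirroring the initialization / render / loss / update cycle described above.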

Specific architectures—such as autoencoders (TrajVAE (Chen et al., 30 Nov 2024)), conditional GANs (Nakano, 2019), or hybrid analytic-neural decoders—may be employed depending on the dataset and application. Coarse-to-fine strategies increase fidelity, typically by incrementally raising the number or complexity of strokes at each reconstruction level (Jiang et al., 17 Nov 2025).

3. Extensions: Texturing, Smudging, and Human Style Reproduction

Moving beyond simple parametric curves, recent frameworks synthesize geometry-conditioned textures and simulate complex physical effects. Style generation modules, implemented as conditional StyleGANs or neural brushstroke engines, generate detailed stroke textures that reflect both geometry and latent style vectors (Jiang et al., 17 Nov 2025). Differentiable smudge operators implement stroke-wise pigment transfer on the canvas, parameterized by trajectory, per-stamp radii, and brush-canvas blending coefficients. These operators unroll non-local brush interactions using normalized length-aware blending kernels, ensuring full differentiability (Jiang et al., 17 Nov 2025).

Reconstructing humanlike painting dynamics relies on methods such as motion capture (to obtain real-world trajectory ground-truth) and variational autoencoders over trajectories (e.g., TrajVAE in Spline-FRIDA), which compactly encode a manifold of plausible, human-style brushstrokes. Backpropagation through a differentiable renderer enables both style fitting and semantic planning by optimizing in a trajectory latent space (Chen et al., 30 Nov 2024).
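A toy sketch of optimizing in a trajectory latent space, with a fixed sinusoidal basis standing in for a trained TrajVAE decoder (the basis, dimensions, and learning rate are illustrative assumptions, not the Spline-FRIDA model):

```python
import numpy as np

rng = np.random.default_rng(0)
T_len, K = 32, 4  # trajectory samples, latent dimension
t = np.linspace(0.0, 1.0, T_len)
# fixed smooth basis as a stand-in decoder (a real system decodes with a VAE)
basis = np.stack([np.sin((k + 1) * np.pi * t) for k in range(K)])  # (K, T_len)

def decode(z):
    """Map a latent code to a 1D trajectory (one coordinate of a brush path)."""
    return z @ basis

target = decode(rng.normal(size=K))  # an 'observed' smooth trajectory

z = np.zeros(K)
for _ in range(500):
    resid = decode(z) - target
    # analytic gradient of the L2 trajectory loss through the linear decoder
    z -= 0.01 * 2.0 * (resid @ basis.T)
```

The point is structural: because the decoder is differentiable, the loss gradient propagates into the compact latent code `z`, so planning and style fitting can operate on a few latent dimensions instead of raw control points.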

4. Losses and Training Regimes

Differentiable brushstroke reconstruction employs multi-scale losses balanced by empirical weights:

  • Pixel alignment: $\mathcal{L}_\text{pixel} = \|I_\text{recon} - I_\text{target}\|_1$ or an $L_2$ equivalent
  • Perceptual similarity: $\mathcal{L}_\text{perc} = \sum_\ell \|F_\ell(I_\text{recon}) - F_\ell(I_\text{target})\|_1$, where $F_\ell$ are deep features from pretrained networks
  • Gradient magnitude/direction: Enforces alignment of image gradients in feature space
  • Structural guidance: E.g., segmentation-aware losses or directional flow constraints
  • Optimal transport: Regularizes color and structure distribution with entropy penalty
  • Area/compactness: Encourages regularity in stroke footprint
  • Total variation and curvature penalties: Favor geometrically smooth, regular strokes (Jiang et al., 17 Nov 2025, Kotovenko et al., 2021, Mihai et al., 2021)
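Two of the simpler terms, pixel alignment and total variation, combine into a weighted loss along these lines (the weights are illustrative; the cited papers tune theirs empirically per training phase):

```python
import numpy as np

def pixel_l1(recon, target):
    """Mean absolute pixel difference between reconstruction and target."""
    return np.abs(recon - target).mean()

def total_variation(img):
    """Anisotropic total variation: penalizes non-smooth canvases."""
    return np.abs(np.diff(img, axis=0)).mean() + np.abs(np.diff(img, axis=1)).mean()

def total_loss(recon, target, w_pix=1.0, w_tv=0.1):
    # illustrative empirical weights balancing the two terms
    return w_pix * pixel_l1(recon, target) + w_tv * total_variation(recon)
```

Perceptual, optimal-transport, and curvature terms would be added to the same weighted sum; each must itself be differentiable in the rendered image.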

Training optimizers and schedules are adapted to each phase (appearance, texture, smudge) and may employ learning rate warmups, cosine decay, or batch-specific adjustments.

5. Quantitative and Qualitative Evaluation

Frameworks are benchmarked using:

  • Low-level metrics: PSNR, SSIM between reconstruction and reference images
  • Perceptual similarity: LPIPS or feature distances extracted from pretrained vision networks
  • Style metrics: Feature distances in deep style spaces or Fréchet Distance (FD)
  • Human studies: Forced-choice or side-by-side comparison for human-likeness, semantic alignment, and overall quality (e.g., Spline-FRIDA reports 73–84% preference for its outputs versus baselines) (Chen et al., 30 Nov 2024, Jiang et al., 17 Nov 2025).
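For instance, PSNR follows directly from the mean squared error (a standard definition, not specific to any cited framework):

```python
import numpy as np

def psnr(recon, target, peak=1.0):
    """Peak signal-to-noise ratio in dB for images with values in [0, peak]."""
    mse = np.mean((recon - target) ** 2)
    if mse == 0:
        return float("inf")  # identical images
    return 10.0 * np.log10(peak ** 2 / mse)
```

SSIM and LPIPS are typically taken from library implementations (e.g., scikit-image or the reference LPIPS code) rather than reimplemented.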

Ablation studies contrast the impact of component losses, pipeline phases, and stroke count. Increased stroke complexity generally yields more precise content and shading representation, while advanced stylization and smudging modules contribute to realism and expressivity (Jiang et al., 17 Nov 2025).

6. Robotic and Interactive Applications

Differentiable brushstroke models are integral to robotic art systems. The Spline-FRIDA architecture employs arbitrary polyline splines and differentiable dynamics (Traj2Stroke) to close the sim-to-real gap for robot painting, achieving greater accuracy and human-style representation than CNN- or hand-designed rendering methods (Chen et al., 30 Nov 2024). By training and fine-tuning on motion-captured human trajectories, robot agents can generalize semantic and aesthetic brush behavior with minimal data.

Interactive extensions allow for user-guided flow constraints, where input curves or gestural data are incorporated via projection losses on stroke direction, enabling controllable, semantically guided digital painting (Kotovenko et al., 2021, Mihai et al., 2021).

7. Practical Considerations, Limitations, and Future Directions

Efficient implementations leverage vectorized, batch GPU operations. Stamp-based parallel renderers and nearest-stamp assignments replace sequential alpha compositing, yielding significant speedups on modern hardware (Jiang et al., 17 Nov 2025). Limitations remain in modeling highly irregular, fluid, or diffusive media; future work includes displacement-map stroke boundaries and physically based fluid models.
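A vectorized sketch of stamp-parallel rendering with nearest-stamp assignment, in the spirit described above (Gaussian stamps and a hard per-pixel argmax are simplifications of our own; the papers' renderers differ in detail):

```python
import numpy as np

def render_stamps_parallel(centers, colors, sigma, H, W):
    """Composite many Gaussian stamps in one vectorized pass.

    Instead of sequential alpha-over blending, each pixel takes the color of
    its highest-coverage ('nearest') stamp -- a sketch of the stamp-parallel
    strategy, not any paper's exact renderer.
    """
    ys, xs = np.mgrid[0:H, 0:W]
    coords = np.stack([xs, ys], axis=-1).astype(float)               # (H, W, 2)
    d2 = ((coords[None] - centers[:, None, None, :]) ** 2).sum(-1)   # (N, H, W)
    cov = np.exp(-d2 / (2 * sigma ** 2))     # all stamps rendered at once
    idx = cov.argmax(axis=0)                 # nearest stamp per pixel
    alpha = cov.max(axis=0)[..., None]
    return alpha * colors[idx]               # (H, W, 3)

# two stamps, red and blue, on a 16x16 canvas
canvas = render_stamps_parallel(
    np.array([[4.0, 4.0], [12.0, 12.0]]),
    np.array([[1.0, 0.0, 0.0], [0.0, 0.0, 1.0]]),
    sigma=2.0, H=16, W=16)
```

The single broadcasted distance computation replaces an N-step sequential compositing loop, which is where the GPU speedup comes from; the hard argmax would be softened (e.g., with a softmax over coverage) when gradients through the assignment are needed.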

Other practical considerations include batch-wise memory usage (mixed-precision, polyline segmentation), initialization heuristics (edge maps or superpixels), and postprocessing (hard rasterization at inference for crispness). Maintaining renderer faithfulness at high resolution presents an ongoing challenge, with hybrid analytic-neural decoders proposed as one potential solution (Nakano, 2019).


In summary, differentiable brushstroke reconstruction provides a mathematically rigorous and computationally tractable pathway to modeling, reconstructing, and synthesizing painted imagery. These methods support applications in robotic painting, neural style transfer, analytical reconstruction, and creative interactive systems, driven by continuous advancements in differentiable rendering, trajectory encoding, and style-aware generative modeling (Jiang et al., 17 Nov 2025, Chen et al., 30 Nov 2024, Kotovenko et al., 2021, Mihai et al., 2021, Nakano, 2019).
