Fine-grained spatial editing, layout extrapolation, and typography in instruction-guided image editing

Investigate and develop robust methods for fine-grained spatial editing, layout extrapolation, and typography in instruction-guided image editing, all three of which remain open challenges.

Background

The paper analyzes per-edit-type success rates and finds that global appearance and style operations (e.g., artistic style transfer, vintage filters) are relatively reliable, while semantically targeted but coarse edits are moderately successful.

In contrast, edits requiring precise geometry and spatial control (e.g., relocating objects, changing size/shape/orientation), layout extrapolation such as outpainting, and text-related typography edits exhibit the lowest reliability (e.g., success rates of 0.5923 for relocate object and 0.5759 for change font/style). Based on this evidence, the authors explicitly identify fine-grained spatial editing, layout extrapolation, and typography as open problems.

References

Nano-Banana is well suited for global photometric/stylistic transformations; in contrast, fine-grained spatial editing, layout extrapolation, and typography remain open problems.

Pico-Banana-400K: A Large-Scale Dataset for Text-Guided Image Editing (2510.19808 - Qian et al., 22 Oct 2025) in Implications, Section “Dataset Analysis”