Uni-Layout: Integrating Human Feedback in Unified Layout Generation and Evaluation (2508.02374v1)

Published 4 Aug 2025 in cs.CV, cs.IR, and cs.LG

Abstract: Layout generation plays a crucial role in enhancing both user experience and design efficiency. However, current approaches suffer from task-specific generation capabilities and perceptually misaligned evaluation metrics, leading to limited applicability and ineffective measurement. In this paper, we propose Uni-Layout, a novel framework that achieves unified generation, human-mimicking evaluation and alignment between the two. For universal generation, we incorporate various layout tasks into a single taxonomy and develop a unified generator that handles background or element contents constrained tasks via natural language prompts. To introduce human feedback for the effective evaluation of layouts, we build Layout-HF100k, the first large-scale human feedback dataset with 100,000 expertly annotated layouts. Based on Layout-HF100k, we introduce a human-mimicking evaluator that integrates visual and geometric information, employing a Chain-of-Thought mechanism to conduct qualitative assessments alongside a confidence estimation module to yield quantitative measurements. For better alignment between the generator and the evaluator, we integrate them into a cohesive system by adopting Dynamic-Margin Preference Optimization (DMPO), which dynamically adjusts margins based on preference strength to better align with human judgments. Extensive experiments show that Uni-Layout significantly outperforms both task-specific and general-purpose methods. Our code is publicly available at https://github.com/JD-GenX/Uni-Layout.

Summary

The paper presents a unified framework that integrates multi-task layout generation with human-mimicking evaluation using DMPO.
It introduces a multimodal instruction-based generator and builds a large-scale human feedback dataset, Layout-HF100k, for robust assessments.
Experimental results and ablation studies demonstrate that the approach outperforms existing methods, setting a new performance benchmark.

Introduction to Layout Generation Challenges

Layout generation is pivotal in enhancing both user experience and design efficiency. Existing approaches are typically built for specific task categories, which limits their applicability, and they rely on evaluation metrics that may not align with human perception. The paper "Uni-Layout: Integrating Human Feedback in Unified Layout Generation and Evaluation" addresses these challenges by proposing a unified framework: Uni-Layout.

Uni-Layout encompasses three core components:

Unified Generation: Incorporates various layout tasks into a single taxonomy and utilizes natural language prompts for universal generation.
Human-Mimicking Evaluation: Builds a large-scale human feedback dataset—Layout-HF100k—to facilitate effective evaluation aligned with human perception.
Alignment Mechanism: Adopts Dynamic-Margin Preference Optimization (DMPO) to bridge the gap between generation outputs and human preferences.

Figure 1: Taxonomy of layout generation tasks and illustration of motivation. Diverse layout generation tasks can be divided into four categories: (a) BFEF, (b) BCEF, (c) BFEC, and (d) BCEC.

Framework Architecture

Unified Generation

Uni-Layout uses a multimodal instruction-based approach to generate layouts across different tasks. A single layout generator handles all four task categories: Background-Free and Element-Free (BFEF), Background-Constrained and Element-Free (BCEF), Background-Free and Element-Constrained (BFEC), and Background-Constrained and Element-Constrained (BCEC). A scalable instruction function converts task-specific constraints into prompts, and a multimodal LLM (MLLM) generates the layout from them.
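The four-way taxonomy and the idea of an instruction function can be illustrated with a minimal Python sketch. It is not the paper's implementation: the names `LayoutTask`, `classify_task`, and `build_instruction`, the prompt wording, and the `[x, y, w, h]` output format are all assumptions made for illustration.

```python
from enum import Enum
from typing import Optional, Sequence


class LayoutTask(Enum):
    """The four task categories named in the taxonomy."""
    BFEF = "background-free, element-free"
    BCEF = "background-constrained, element-free"
    BFEC = "background-free, element-constrained"
    BCEC = "background-constrained, element-constrained"


def classify_task(has_background: bool, has_element_content: bool) -> LayoutTask:
    """Map the two constraint flags onto the four-way taxonomy."""
    if has_background:
        return LayoutTask.BCEC if has_element_content else LayoutTask.BCEF
    return LayoutTask.BFEC if has_element_content else LayoutTask.BFEF


def build_instruction(
    element_types: Sequence[str],
    background_ref: Optional[str] = None,
    element_contents: Optional[Sequence[str]] = None,
) -> str:
    """Hypothetical instruction function: constraints in, text prompt out.

    The prompt template below is invented for this sketch.
    """
    task = classify_task(background_ref is not None, bool(element_contents))
    parts = [f"Task: {task.name} ({task.value})."]
    parts.append("Element types: " + ", ".join(element_types) + ".")
    if background_ref is not None:
        parts.append(f"Background image: {background_ref}.")
    if element_contents:
        parts.append("Element contents: " + "; ".join(element_contents) + ".")
    parts.append("Return one bounding box per element as [x, y, w, h].")
    return " ".join(parts)


if __name__ == "__main__":
    print(build_instruction(["text", "logo"], background_ref="poster.png"))
```

Because only the prompt changes between categories, a single generator can in principle serve all four; adding a new constraint type means extending the instruction function, not the model.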

Human-Mimicking Evaluation

Using Layout-HF100k, the framework trains an evaluator with a dual-branch learning strategy that integrates visual content and geometric features for human-like assessment. A Chain-of-Thought mechanism supports qualitative evaluation, while a confidence estimation module provides the quantitative score.
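One simple way to turn a qualified/unqualified judgment into a quantitative score is sketched below. This is an assumption for illustration, not the paper's module: it supposes the evaluator exposes two raw scores (logits), one per verdict, and takes the softmax probability of "qualified" as the confidence. The function name `qualified_confidence` is hypothetical.

```python
import math


def qualified_confidence(logit_qualified: float, logit_unqualified: float) -> float:
    """Probability of the 'qualified' verdict under a two-way softmax.

    A two-way softmax equals the sigmoid of the logit difference; the
    branches below avoid overflow for large-magnitude differences.
    """
    diff = logit_qualified - logit_unqualified
    if diff >= 0:
        return 1.0 / (1.0 + math.exp(-diff))
    e = math.exp(diff)
    return e / (1.0 + e)


if __name__ == "__main__":
    # Equal logits mean the evaluator is undecided.
    print(qualified_confidence(0.0, 0.0))
    # A higher 'qualified' logit pushes the score toward 1.
    print(qualified_confidence(2.0, 0.0))
```

A continuous score of this kind is what lets an evaluator rank two layouts, rather than only label each one, which is what a preference-based alignment stage needs.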

Figure 2: Layout-HF100k examples. The top row shows qualified examples, while the bottom row shows unqualified ones.

Alignment Strategy

The generator is aligned with the evaluator using DMPO. Instead of a fixed margin between preferred and rejected layouts, DMPO adjusts the margin dynamically according to preference strength, which improves consistency with human judgments. Fine-tuning the layout generator with this objective steers its outputs toward human-annotated preferences.
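The idea of a margin that grows with preference strength can be sketched as a DPO-style pairwise loss with an extra margin term. The exact DMPO objective is not given in this summary, so the form below is an assumption: the margin is taken to be proportional to the gap between the evaluator's scores for the chosen and rejected layouts, and `beta` and `margin_scale` are hypothetical hyperparameters.

```python
import math


def log_sigmoid(x: float) -> float:
    """Numerically stable log(sigmoid(x))."""
    if x >= 0:
        return -math.log1p(math.exp(-x))
    return x - math.log1p(math.exp(x))


def dmpo_style_loss(
    policy_logp_chosen: float,
    policy_logp_rejected: float,
    ref_logp_chosen: float,
    ref_logp_rejected: float,
    score_chosen: float,
    score_rejected: float,
    beta: float = 0.1,
    margin_scale: float = 1.0,
) -> float:
    """DPO-style loss with a preference-dependent margin (illustrative only).

    implicit_gap: how much more the policy favors the chosen layout over
    the rejected one, relative to a frozen reference model.
    margin: assumed proportional to the evaluator score gap, so strongly
    preferred pairs must be separated by more than weakly preferred ones.
    """
    implicit_gap = beta * (
        (policy_logp_chosen - ref_logp_chosen)
        - (policy_logp_rejected - ref_logp_rejected)
    )
    margin = margin_scale * (score_chosen - score_rejected)
    return -log_sigmoid(implicit_gap - margin)


if __name__ == "__main__":
    # Same policy state, different preference strength.
    weak = dmpo_style_loss(-1.0, -1.0, -1.0, -1.0, 0.55, 0.45)
    strong = dmpo_style_loss(-1.0, -1.0, -1.0, -1.0, 0.90, 0.10)
    print(weak, strong)
```

With a zero margin this reduces to the standard DPO loss. In this sketch, a pair with a large score gap incurs a higher loss until the policy separates the two layouts by a correspondingly larger amount.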

Figure 3: Overview of the Uni-Layout framework.

Experimental Analysis

Uni-Layout was compared against state-of-the-art task-specific and general-purpose methods. According to the paper, it outperforms both groups in task-specific evaluations and in human-mimicking evaluations. Figure 4 compares the methods by Layout Reward and Human Pass Rate.

Figure 4: Layout Reward and Human Pass Rate across different methods.

Ablation Studies

A series of ablation studies confirmed the contribution of each framework component and underlined the importance of DMPO in aligning generated layouts with human preferences. Visual comparisons before and after DMPO alignment show noticeable improvements in layout coherence and structure.

Figure 5: Comparison of effects before and after alignment.

Conclusion

Uni-Layout combines unified generation, human-mimicking evaluation, and DMPO-based alignment in a single framework for integrating human feedback into layout generation and evaluation. By addressing the limits of task-specific solutions and misaligned evaluation metrics, it opens a path for future research on unified frameworks built around human-centered design principles. Future work will explore extensions to three-dimensional layout generation, further bridging technological capabilities with complex application domains.