Papers
Topics
Authors
Recent
Gemini 2.5 Flash
Gemini 2.5 Flash
167 tokens/sec
GPT-4o
7 tokens/sec
Gemini 2.5 Pro Pro
42 tokens/sec
o3 Pro
4 tokens/sec
GPT-4.1 Pro
38 tokens/sec
DeepSeek R1 via Azure Pro
28 tokens/sec
2000 character limit reached

Ultra-High-Resolution Image Synthesis with Pyramid Diffusion Model (2403.12915v1)

Published 19 Mar 2024 in cs.CV

Abstract: We introduce the Pyramid Diffusion Model (PDM), a novel architecture designed for ultra-high-resolution image synthesis. PDM utilizes a pyramid latent representation, providing a broader design space that enables more flexible, structured, and efficient perceptual compression which enable AutoEncoder and Network of Diffusion to equip branches and deeper layers. To enhance PDM's capabilities for generative tasks, we propose the integration of Spatial-Channel Attention and Res-Skip Connection, along with the utilization of Spectral Norm and Decreasing Dropout Strategy for the Diffusion Network and AutoEncoder. In summary, PDM achieves the synthesis of images with a 2K resolution for the first time, demonstrated on two new datasets comprising images of sizes 2048x2048 pixels and 2048x1024 pixels respectively. We believe that this work offers an alternative approach to designing scalable image generative models, while also providing incremental reinforcement for existing frameworks.

Summary

We haven't generated a summary for this paper yet.

X Twitter Logo Streamline Icon: https://streamlinehq.com