Papers
Topics
Authors
Recent
Assistant
AI Research Assistant
Well-researched responses based on relevant abstracts and paper content.
Custom Instructions Pro
Preferences or requirements that you'd like Emergent Mind to consider when generating responses.
Gemini 2.5 Flash
Gemini 2.5 Flash 134 tok/s
Gemini 2.5 Pro 41 tok/s Pro
GPT-5 Medium 31 tok/s Pro
GPT-5 High 25 tok/s Pro
GPT-4o 57 tok/s Pro
Kimi K2 190 tok/s Pro
GPT OSS 120B 435 tok/s Pro
Claude Sonnet 4.5 37 tok/s Pro
2000 character limit reached

Subjective Camera: Bridging Human Cognition and Visual Reconstruction through Sequence-Aware Sketch-Guided Diffusion (2506.23711v1)

Published 30 Jun 2025 in cs.CV

Abstract: We propose Subjective Camera, a human-as-imaging-device paradigm that reconstructs real-world scenes from mental impressions through synergistic use of verbal descriptions and progressive rough sketches. This approach overcomes dual limitations of language ambiguity and sketch abstraction by treating the user's drawing sequence as priors, effectively translating subjective perceptual expectations into photorealistic images. Existing approaches face three fundamental barriers: (1) user-specific subjective input biases, (2) huge modality gap between planar sketch and 3D priors in diffusion, and (3) sketch quality-sensitive performance degradation. Current solutions either demand resource-intensive model adaptation or impose impractical requirements on sketch precision. Our framework addresses these challenges through concept-sequential generation. (1) We establish robust appearance priors through text-reward optimization, and then implement sequence-aware disentangled generation that processes concepts in sketching order; these steps accommodate user-specific subjective expectation in a train-free way. (2) We employ latent optimization that effectively bridges the modality gap between planar sketches and 3D priors in diffusion. (3) Our hierarchical reward-guided framework enables the use of rough sketches without demanding artistic expertise. Comprehensive evaluation across diverse datasets demonstrates that our approach achieves state-of-the-art performance in maintaining both semantic and spatial coherence.

Summary

We haven't generated a summary for this paper yet.

Dice Question Streamline Icon: https://streamlinehq.com

Open Problems

We haven't generated a list of open problems mentioned in this paper yet.

Lightbulb Streamline Icon: https://streamlinehq.com

Continue Learning

We haven't generated follow-up questions for this paper yet.

List To Do Tasks Checklist Streamline Icon: https://streamlinehq.com

Collections

Sign up for free to add this paper to one or more collections.