Visual Anagrams: Generating Multi-View Optical Illusions with Diffusion Models (2311.17919v2)
Abstract: We address the problem of synthesizing multi-view optical illusions: images that change appearance upon a transformation, such as a flip or rotation. We propose a simple, zero-shot method for obtaining these illusions from off-the-shelf text-to-image diffusion models. During the reverse diffusion process, we estimate the noise from different views of a noisy image, and then combine these noise estimates together and denoise the image. A theoretical analysis suggests that this method works precisely for views that can be written as orthogonal transformations, of which permutations are a subset. This leads to the idea of a visual anagram--an image that changes appearance under some rearrangement of pixels. This includes rotations and flips, but also more exotic pixel permutations such as a jigsaw rearrangement. Our approach also naturally extends to illusions with more than two views. We provide both qualitative and quantitative results demonstrating the effectiveness and flexibility of our method. Please see our project webpage for additional visualizations and results: https://dangeng.github.io/visual_anagrams/
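The noise-combination step described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation. It makes several assumptions: the noise estimates are combined by a plain average, each estimate is mapped back through the inverse view before averaging, and `toy_noise_estimator` is a hypothetical stand-in for a text-conditioned diffusion model. All function names here are illustrative.

```python
import numpy as np

def identity(x):
    return x

def rot180(x):
    # A 180-degree rotation only rearranges pixels, and it is its own inverse.
    return np.rot90(x, k=2, axes=(0, 1))

def make_permutation_view(height, width, seed=0):
    # Build a random pixel permutation (a crude stand-in for a jigsaw
    # rearrangement) together with its inverse.
    perm = np.random.default_rng(seed).permutation(height * width)
    inv_perm = np.argsort(perm)

    def forward(x):
        return x.reshape(height * width, -1)[perm].reshape(x.shape)

    def inverse(x):
        return x.reshape(height * width, -1)[inv_perm].reshape(x.shape)

    return forward, inverse

def toy_noise_estimator(x_noisy, prompt_seed):
    # Hypothetical placeholder for a diffusion model's noise prediction
    # conditioned on a prompt; here the "prompt" is just a random seed.
    rng = np.random.default_rng(prompt_seed)
    return 0.1 * x_noisy + 0.01 * rng.standard_normal(x_noisy.shape)

def combined_noise_estimate(x_noisy, views, prompt_seeds, estimator):
    # For each view: transform the noisy image, estimate the noise in that
    # view, then undo the transform so all estimates share one pixel layout.
    estimates = []
    for (forward, inverse), seed in zip(views, prompt_seeds):
        eps_in_view = estimator(forward(x_noisy), seed)
        estimates.append(inverse(eps_in_view))
    # Assumption: combine by simple averaging.
    return np.mean(estimates, axis=0)

def is_norm_preserving(forward, shape, seed=0):
    # Orthogonal transformations preserve the Euclidean norm; this is a
    # numerical spot check on one random input, not a proof.
    x = np.random.default_rng(seed).standard_normal(shape)
    return bool(np.isclose(np.linalg.norm(forward(x)), np.linalg.norm(x)))

if __name__ == "__main__":
    x_t = np.random.default_rng(42).standard_normal((8, 8, 3))
    views = [(identity, identity), (rot180, rot180)]
    eps = combined_noise_estimate(x_t, views, [0, 1], toy_noise_estimator)
    print(eps.shape, is_norm_preserving(rot180, x_t.shape))
```

In a real sampler, the combined estimate would replace the single noise prediction at each reverse-diffusion step. Views that are pixel permutations, such as `rot180` or the output of `make_permutation_view`, pass the norm check because permutations are orthogonal transformations.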