Text2QR: Harmonizing Aesthetic Customization and Scanning Robustness for Text-Guided QR Code Generation (2403.06452v2)
Abstract: In the digital era, QR codes serve as a linchpin connecting virtual and physical realms. Their pervasive integration across various applications highlights the demand for aesthetically pleasing codes without compromised scannability. However, prevailing methods grapple with the intrinsic challenge of balancing customization and scannability. Notably, stable-diffusion models have ushered in an epoch of high-quality, customizable content generation. This paper introduces Text2QR, a pioneering approach leveraging these advancements to address a fundamental challenge: concurrently achieving user-defined aesthetics and scanning robustness. To ensure stable generation of aesthetic QR codes, we introduce the QR Aesthetic Blueprint (QAB) module, which generates a blueprint image that exerts control over the entire generation process. Subsequently, the Scannability Enhancing Latent Refinement (SELR) process refines the output iteratively in the latent space, enhancing scanning robustness. This approach harnesses the potent generation capabilities of stable-diffusion models, navigating the trade-off between image aesthetics and QR code scannability. Our experiments demonstrate the seamless fusion of visual appeal with the practical utility of aesthetic QR codes, markedly outperforming prior methods. Code is available at https://github.com/mulns/Text2QR.
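The trade-off the abstract describes, changing an image as little as possible while making it decode reliably, can be illustrated with a small toy. The sketch below is not the authors' SELR procedure, which refines a diffusion model's latent code. It is a pixel-space stand-in in which every name (`module_means`, `is_scannable`, `refine`) and every parameter (`margin`, `step`, `iters`, the 0.5 binarization threshold) is a hypothetical choice made for illustration. It assumes a grayscale image with values in [0, 1], a square grid of modules, and the convention that a code bit of 1 means a dark module.

```python
def module_means(image, n_modules):
    """Average luminance of each square module in a square grayscale image."""
    cell = len(image) // n_modules
    means = []
    for r in range(n_modules):
        row = []
        for c in range(n_modules):
            total = sum(image[r * cell + i][c * cell + j]
                        for i in range(cell) for j in range(cell))
            row.append(total / (cell * cell))
        means.append(row)
    return means


def is_scannable(image, code, threshold=0.5):
    """True if thresholding every module mean reproduces the code bits.

    Assumed convention: bit 1 = dark module (mean luminance below threshold).
    """
    means = module_means(image, len(code))
    n = len(code)
    return all((means[r][c] < threshold) == (code[r][c] == 1)
               for r in range(n) for c in range(n))


def refine(image, code, margin=0.1, step=0.5, iters=20, threshold=0.5):
    """Iteratively nudge only the modules that are not yet safely decodable.

    Modules whose mean already clears the threshold by `margin` are left
    untouched, which keeps the change to the original image small.
    """
    out = [row[:] for row in image]  # work on a copy, never mutate the input
    n = len(code)
    cell = len(out) // n
    for _ in range(iters):
        means = module_means(out, n)
        for r in range(n):
            for c in range(n):
                dark = code[r][c] == 1
                target = threshold - margin if dark else threshold + margin
                mean = means[r][c]
                if (mean <= target) if dark else (mean >= target):
                    continue  # this module is already robust
                shift = step * (target - mean)  # partial step toward target
                for i in range(cell):
                    for j in range(cell):
                        y, x = r * cell + i, c * cell + j
                        out[y][x] = min(1.0, max(0.0, out[y][x] + shift))
    return out


# Usage: a flat mid-gray 4x4 image cannot encode a 2x2 checkerboard code
# until refinement separates the dark and light modules.
code = [[1, 0], [0, 1]]
gray = [[0.5] * 4 for _ in range(4)]
refined = refine(gray, code)
```

The partial step (`step` below 1) and the clipping to [0, 1] are why the loop runs more than once: a single update may not move a module's mean all the way to its target, so the refinement repeats until the margin is approached.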