
LAKE-RED: Camouflaged Images Generation by Latent Background Knowledge Retrieval-Augmented Diffusion (2404.00292v4)

Published 30 Mar 2024 in cs.CV

Abstract: Camouflaged vision perception is an important vision task with numerous practical applications. Because collecting and labeling camouflaged data is expensive, the community faces a major bottleneck: its datasets cover only a small number of object species. However, existing camouflaged image generation methods require the background to be specified manually, and therefore cannot extend the diversity of camouflaged samples at low cost. In this paper, we propose a Latent Background Knowledge Retrieval-Augmented Diffusion (LAKE-RED) for camouflaged image generation. To our knowledge, our main contributions are: (1) We propose, for the first time, a camouflaged generation paradigm that requires no background input. (2) LAKE-RED is the first knowledge retrieval-augmented method with interpretability for camouflaged generation; it explicitly separates knowledge retrieval from reasoning enhancement to alleviate task-specific challenges. Moreover, our method is not restricted to specific foreground targets or backgrounds, offering the potential to extend camouflaged vision perception to more diverse domains. (3) Experimental results demonstrate that our method outperforms existing approaches, generating more realistic camouflaged images.
