SDDGR: Stable Diffusion-based Deep Generative Replay for Class Incremental Object Detection (2402.17323v2)
Abstract: In the field of class incremental learning (CIL), generative replay has become increasingly prominent as a method to mitigate catastrophic forgetting, alongside continuous improvements in generative models. However, its application in class incremental object detection (CIOD) has been significantly limited, primarily due to the complexity of scenes involving multiple labels. In this paper, we propose a novel approach called stable diffusion deep generative replay (SDDGR) for CIOD. Our method employs a diffusion-based generative model with pre-trained text-to-image diffusion networks to generate realistic and diverse synthetic images. SDDGR incorporates an iterative refinement strategy to produce high-quality images encompassing old classes. Additionally, we adopt an L2 knowledge distillation technique to improve the retention of prior knowledge in synthetic images. Furthermore, our approach includes pseudo-labeling of old-class objects within new-task images, preventing their misclassification as background. Extensive experiments on the COCO 2017 dataset demonstrate that SDDGR significantly outperforms existing algorithms, achieving a new state of the art in various CIOD scenarios. The source code will be made publicly available.
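Two ingredients described in the abstract are easy to picture in code: generating replay images of old classes with a pre-trained text-to-image diffusion model, and constraining the new detector with an L2 (MSE) distillation loss against the frozen old detector on those synthetic images. The sketch below is illustrative only, assuming a Hugging Face `diffusers` Stable Diffusion pipeline and a generic detector backbone; the model id, prompts, helper names, and loss weight `lambda_l2` are assumptions and do not come from the paper's released code.

```python
# Minimal sketch of diffusion-based generative replay with L2 distillation.
# Assumptions (not from the paper): the model id, prompt templates, loss weight,
# and the detector interface are placeholders chosen for illustration.
import torch
import torch.nn.functional as F
from diffusers import StableDiffusionPipeline

# 1) Generate synthetic replay images for old classes via text prompts.
pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

old_classes = ["bus", "dog", "bicycle"]  # classes from previous tasks
prompts = [f"a photo of a {c} in a street scene" for c in old_classes]
replay_images = [
    pipe(p, num_inference_steps=50, guidance_scale=7.5).images[0]
    for p in prompts
]

# 2) L2 (MSE) knowledge distillation on the synthetic images: the frozen
#    old-task detector acts as teacher, the new detector as student.
def l2_distillation_loss(student_feats: torch.Tensor,
                         teacher_feats: torch.Tensor) -> torch.Tensor:
    """Penalize deviation of student features from the frozen teacher."""
    return F.mse_loss(student_feats, teacher_feats.detach())

lambda_l2 = 1.0  # illustrative weight, not the paper's value
# Inside the training loop (detector and batching omitted):
#   teacher_feats = old_detector.backbone(synthetic_batch)
#   student_feats = new_detector.backbone(synthetic_batch)
#   loss = detection_loss + lambda_l2 * l2_distillation_loss(student_feats, teacher_feats)
```

The pseudo-labeling step from the abstract, annotating old-class objects in new-task images so they are not treated as background, would run alongside this loop and is omitted from the sketch.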
- Junsu Kim
- Hoseong Cho
- Jihyeon Kim
- Yihalem Yimolal Tiruneh
- Seungryul Baek