
HiKER-SGG: Hierarchical Knowledge Enhanced Robust Scene Graph Generation (2403.12033v1)

Published 18 Mar 2024 in cs.CV

Abstract: Being able to understand visual scenes is a precursor for many downstream tasks, including autonomous driving, robotics, and other vision-based approaches. A common approach enabling the ability to reason over visual data is Scene Graph Generation (SGG); however, many existing approaches assume undisturbed vision, i.e., the absence of real-world corruptions such as fog, snow, and smoke, as well as non-uniform perturbations like sun glare or water drops. In this work, we propose a novel SGG benchmark containing procedurally generated weather corruptions and other transformations over the Visual Genome dataset. Further, we introduce a corresponding approach, Hierarchical Knowledge Enhanced Robust Scene Graph Generation (HiKER-SGG), providing a strong baseline for scene graph generation in such challenging settings. At its core, HiKER-SGG utilizes a hierarchical knowledge graph to refine its predictions from coarse initial estimates to detailed predictions. In our extensive experiments, we show that HiKER-SGG not only demonstrates superior performance on corrupted images in a zero-shot manner, but also outperforms current state-of-the-art methods on uncorrupted SGG tasks. Code is available at https://github.com/zhangce01/HiKER-SGG.
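
The coarse-to-fine refinement mentioned in the abstract can be illustrated with a minimal sketch. This is not the HiKER-SGG implementation: the two-level predicate hierarchy, the function names, and the rule of multiplying a superclass probability by a within-superclass probability are all assumptions chosen for illustration. The actual method is in the repository linked above.

```python
import math

# Hypothetical two-level predicate hierarchy (illustrative only,
# not the hierarchy used in the paper).
HIERARCHY = {
    "geometric": ["on", "under", "near"],
    "possessive": ["has", "wearing"],
    "semantic": ["riding", "eating"],
}


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def coarse_to_fine(coarse_logits, fine_logits):
    """Combine a coarse superclass estimate with per-superclass fine estimates.

    coarse_logits: dict mapping superclass -> logit
    fine_logits:   dict mapping superclass -> list of logits, one per child
    Returns a dict mapping each fine predicate to
    p(superclass) * p(predicate | superclass).
    """
    supers = list(HIERARCHY)
    p_super = dict(zip(supers, softmax([coarse_logits[s] for s in supers])))
    joint = {}
    for s in supers:
        p_child = softmax(fine_logits[s])
        for name, p in zip(HIERARCHY[s], p_child):
            joint[name] = p_super[s] * p
    return joint


def predict(coarse_logits, fine_logits):
    """Return the fine predicate with the highest joint probability."""
    joint = coarse_to_fine(coarse_logits, fine_logits)
    return max(joint, key=joint.get)
```

In this sketch a confident coarse estimate narrows the candidate set, so a noisy fine-grained score in an unlikely superclass is down-weighted rather than selected outright. Whether this matches how the paper uses its hierarchical knowledge graph is not stated in the abstract.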

