
Fine-Grained is Too Coarse: A Novel Data-Centric Approach for Efficient Scene Graph Generation (2305.18668v2)

Published 30 May 2023 in cs.CV

Abstract: Learning to compose visual relationships from raw images in the form of scene graphs is a highly challenging task due to contextual dependencies, but it is essential in computer vision applications that depend on scene understanding. However, no current approach in Scene Graph Generation (SGG) aims at providing useful graphs for downstream tasks. Instead, the focus has primarily been on unbiasing the data distribution in order to predict more fine-grained relations. That being said, not all fine-grained relations are equally relevant, and at least some of them are of no use for real-world applications. In this work, we introduce the task of Efficient SGG, which prioritizes the generation of relevant relations and thereby facilitates the use of scene graphs in downstream tasks such as Image Generation. To support further approaches, we present a new dataset, VG150-curated, based on the annotations of the popular Visual Genome dataset. We show through a set of experiments that this dataset contains more high-quality and diverse annotations than the one usually used in SGG. Finally, we show the efficiency of this dataset in the task of Image Generation from Scene Graphs.

Authors (4)
  1. Neau Maëlic
  2. Paulo E. Santos
  3. Anne-Gwenn Bosser
  4. Cédric Buche