Unbiased Heterogeneous Scene Graph Generation with Relation-aware Message Passing Neural Network (2212.00443v4)
Abstract: Recent scene graph generation (SGG) frameworks have focused on learning complex relationships among multiple objects in an image. Because message passing neural networks (MPNNs) model high-order interactions between objects and their neighboring objects, they have become the dominant representation learning modules for SGG. However, existing MPNN-based frameworks treat the scene graph as a homogeneous graph, which restricts the context-awareness of visual relations between objects. That is, they overlook the fact that relations tend to depend heavily on the objects with which they are associated. In this paper, we propose an unbiased heterogeneous scene graph generation (HetSGG) framework that captures relation-aware context using message passing neural networks. We devise a novel message passing layer, called the relation-aware message passing neural network (RMP), that aggregates the contextual information of an image while considering the predicate type between objects. Our extensive evaluations demonstrate that HetSGG outperforms state-of-the-art methods, particularly on tail predicate classes.
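To make the contrast between homogeneous and relation-aware aggregation concrete, the following is a minimal sketch of a generic relation-typed message passing step, in the spirit of relational graph convolutions. It is not the paper's RMP layer: the update rule (a self transform plus a per-type mean of transformed neighbor features, followed by ReLU), the function names, the edge type labels `r0` and `r1`, and the toy weights are all assumptions chosen for illustration.

```python
from collections import defaultdict


def matvec(W, x):
    """Multiply matrix W (list of rows) by vector x."""
    return [sum(w * v for w, v in zip(row, x)) for row in W]


def relation_aware_layer(h, edges, W_rel, W_self):
    """One hypothetical relation-typed message passing step.

    h:      dict mapping node name -> feature vector (list of floats)
    edges:  list of (src, dst, rel_type) triples; messages flow src -> dst
    W_rel:  dict mapping rel_type -> weight matrix (one matrix per edge type)
    W_self: weight matrix applied to a node's own features
    """
    # Group incoming neighbors of each node by edge type.
    incoming = defaultdict(lambda: defaultdict(list))
    for src, dst, rel in edges:
        incoming[dst][rel].append(src)

    out = {}
    for node, feat in h.items():
        agg = matvec(W_self, feat)
        for rel, srcs in incoming[node].items():
            for src in srcs:
                # A type-specific transform is what distinguishes this from
                # a homogeneous layer, which would share one matrix.
                msg = matvec(W_rel[rel], h[src])
                agg = [a + m / len(srcs) for a, m in zip(agg, msg)]
        out[node] = [max(0.0, a) for a in agg]  # ReLU
    return out


# Toy scene with three objects and two (assumed) edge types.
h = {"man": [1.0, 0.0], "horse": [0.0, 1.0], "hat": [1.0, 1.0]}
edges = [("man", "horse", "r0"), ("hat", "man", "r1")]
W_self = [[1.0, 0.0], [0.0, 1.0]]
W_rel = {
    "r0": [[2.0, 0.0], [0.0, 2.0]],
    "r1": [[0.0, 1.0], [1.0, 0.0]],
}
out = relation_aware_layer(h, edges, W_rel, W_self)
```

A homogeneous layer would apply the same matrix to every message. Here the transform applied to a neighbor's features depends on the edge type, so the same neighbor contributes different context depending on how it relates to the receiving object.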
- Kanghoon Yoon
- Kibum Kim
- Jinyoung Moon
- Chanyoung Park