Towards Data-centric Graph Machine Learning: Review and Outlook (2309.10979v1)
Abstract: Data-centric AI, with its primary focus on the collection, management, and utilization of data to drive AI models and applications, has attracted increasing attention in recent years. In this article, we conduct an in-depth and comprehensive review of current efforts in data-centric AI pertaining to graph data, the fundamental data structure for representing and capturing intricate dependencies among massive and diverse real-life entities, and offer a forward-looking outlook. We introduce a systematic framework, Data-centric Graph Machine Learning (DC-GML), that encompasses all stages of the graph data lifecycle, including graph data collection, exploration, improvement, exploitation, and maintenance. A thorough taxonomy of each stage is presented to answer three critical graph-centric questions: (1) how to enhance graph data availability and quality; (2) how to learn from graph data with limited availability and low quality; (3) how to build graph MLOps systems from the graph data-centric view. Lastly, we pinpoint the future prospects of the DC-GML domain, providing insights to navigate its advancements and applications.