Novel Representation Learning Technique using Graphs for Performance Analytics (2401.10799v1)
Abstract: The performance analytics domain in High Performance Computing (HPC) uses tabular data to solve regression problems, such as predicting execution time. Existing Machine Learning (ML) techniques leverage the correlations among features in tabular datasets but do not directly exploit the relationships between samples. Moreover, since high-quality embeddings derived from raw features improve the fidelity of downstream predictive models, existing methods rely on extensive feature engineering and pre-processing, costing time and manual effort. To fill these two gaps, we propose transforming tabular performance data into graphs so that Graph Neural Network (GNN) techniques can capture complex relationships between both features and samples. In contrast to other ML application domains, such as social networks, the graph is not given; instead, it must be constructed. To address this challenge, we propose graph-building methods in which nodes represent samples and edges are inferred automatically and iteratively from the similarity between the samples' features. We evaluate the effectiveness of the generated embeddings by how well they enable even a simple feed-forward neural network to perform on regression tasks, compared with other state-of-the-art representation learning techniques. Our evaluation demonstrates that, even with up to 25% of values missing at random in each dataset, our method outperforms commonly used graph- and Deep Neural Network (DNN)-based approaches, reducing MSE loss over the DNN baseline by up to 61.67% on the HPC datasets and 78.56% on the Machine Learning datasets.
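To make the core idea concrete, the sketch below builds a graph over tabular samples (nodes are samples; edges connect each sample to its most similar neighbors) and learns node embeddings with a GraphSAGE-style mean-aggregation layer, whose output feeds a simple feed-forward regressor. This is a minimal illustration under assumptions: the k-nearest-neighbor edge rule, the layer sizes, and the synthetic data are stand-ins, not the paper's exact iterative edge-inference procedure, which the abstract does not fully specify.

```python
# Minimal sketch, assuming a k-NN similarity rule for edge inference and a
# hand-rolled GraphSAGE-style layer; these are illustrative choices, not the
# paper's published algorithm.
import numpy as np
import torch
import torch.nn as nn
from sklearn.neighbors import NearestNeighbors

def build_knn_graph(X, k=5):
    """Connect each sample (node) to its k most similar samples."""
    nbrs = NearestNeighbors(n_neighbors=k + 1).fit(X)
    _, idx = nbrs.kneighbors(X)           # idx[:, 0] is (normally) the node itself
    src = np.repeat(np.arange(len(X)), k)
    dst = idx[:, 1:].reshape(-1)          # drop the self-neighbor column
    return torch.tensor(src), torch.tensor(dst)

class SAGELayer(nn.Module):
    """Mean-aggregation GraphSAGE-style layer in plain PyTorch."""
    def __init__(self, in_dim, out_dim):
        super().__init__()
        self.lin = nn.Linear(2 * in_dim, out_dim)

    def forward(self, h, src, dst):
        agg = torch.zeros_like(h)
        agg.index_add_(0, src, h[dst])                 # sum neighbor features
        deg = torch.bincount(src, minlength=h.size(0)).clamp(min=1)
        agg = agg / deg.unsqueeze(1)                   # mean aggregation
        return torch.relu(self.lin(torch.cat([h, agg], dim=1)))

# Toy end-to-end run: GNN embeddings feed a simple feed-forward regressor,
# mirroring the evaluation setup described in the abstract.
X = np.random.rand(100, 8).astype(np.float32)          # 100 samples, 8 features
y = torch.tensor(X.sum(axis=1, keepdims=True))         # placeholder target
src, dst = build_knn_graph(X, k=5)
h = torch.tensor(X)
model = nn.ModuleDict({"gnn": SAGELayer(8, 16), "head": nn.Linear(16, 1)})
opt = torch.optim.Adam(model.parameters(), lr=1e-2)
for _ in range(200):
    opt.zero_grad()
    emb = model["gnn"](h, src, dst)                    # per-sample embeddings
    loss = nn.functional.mse_loss(model["head"](emb), y)
    loss.backward()
    opt.step()
```

In this setup the graph construction is decoupled from the model, so the k-NN rule could be replaced by any similarity-based (or iterative) edge-inference scheme without changing the embedding layer or the regression head.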