Graph Enabled Cross-Domain Knowledge Transfer (2304.03452v2)
Abstract: To leverage machine learning in any decision-making process, one must convert the given knowledge (for example, natural language or unstructured text) into representation vectors that machine learning models can understand and process in a compatible format. A frequently encountered difficulty, however, is that the given knowledge is not rich or reliable enough in the first place. In such cases, one seeks to fuse side information from a separate domain to close the gap between good representation learning and the scarce knowledge in the domain of interest. This approach is named Cross-Domain Knowledge Transfer. It is crucial to study this problem because scarce knowledge is common in many scenarios, from online healthcare platform analyses to financial market risk quantification, and it stands as an obstacle to benefiting from automated decision making. From the machine learning perspective, the paradigm of semi-supervised learning takes advantage of large amounts of data without ground truth and achieves impressive improvements in learning performance. This dissertation adopts it for cross-domain knowledge transfer. (to be continued)
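The graph-based semi-supervised paradigm the abstract refers to can be illustrated with a minimal sketch of label propagation on a similarity graph. The graph, node indices, and iteration count below are hypothetical toy values, not taken from the dissertation; the idea is only that labels from a few annotated nodes spread to unlabeled ones along graph edges.

```python
import numpy as np

# Toy similarity graph: a 5-node path 0-1-2-3-4.
# Only nodes 0 and 4 are labeled (classes 0 and 1); the rest are unlabeled.
W = np.array([
    [0, 1, 0, 0, 0],
    [1, 0, 1, 0, 0],
    [0, 1, 0, 1, 0],
    [0, 0, 1, 0, 1],
    [0, 0, 0, 1, 0],
], dtype=float)

# Row-normalize the adjacency matrix into a transition matrix P = D^{-1} W.
P = W / W.sum(axis=1, keepdims=True)

# One-hot label matrix; rows of zeros for unlabeled nodes.
Y = np.zeros((5, 2))
Y[0, 0] = 1.0  # node 0 -> class 0
Y[4, 1] = 1.0  # node 4 -> class 1

F = Y.copy()
for _ in range(100):
    F = P @ F              # diffuse label mass along edges
    F[[0, 4]] = Y[[0, 4]]  # clamp the known labels after each step

pred = F.argmax(axis=1)
print(pred)  # nodes nearer node 0 inherit class 0; nodes nearer node 4, class 1
```

The clamped iteration converges to the harmonic solution of Zhu's label propagation, so on this path graph the predicted classes split at the midpoint between the two labeled endpoints.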