More Discriminative Sentence Embeddings via Semantic Graph Smoothing (2402.12890v1)
Published 20 Feb 2024 in cs.CL and cs.LG
Abstract: This paper explores an empirical approach to learning more discriminative sentence representations in an unsupervised fashion. Leveraging semantic graph smoothing, we enhance sentence embeddings obtained from pretrained models to improve results on text clustering and classification tasks. Our method, validated on eight benchmarks, demonstrates consistent improvements, showcasing the potential of semantic graph smoothing for improving sentence embeddings in both supervised and unsupervised document categorization tasks.
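The abstract does not spell out the smoothing operator, so the following is only a minimal sketch of what graph smoothing of embeddings can look like in general, not the authors' method. It assumes a cosine k-nearest-neighbour graph over the sentences and a symmetrically normalised adjacency with self-loops applied a few times as a low-pass filter. The function name `smooth_embeddings` and the parameters `k` and `power` are hypothetical choices made for illustration.

```python
import numpy as np


def smooth_embeddings(X, k=5, power=2):
    """Illustrative low-pass smoothing of embeddings over a cosine k-NN graph.

    Assumption: this is a generic graph-filter sketch, not the paper's exact filter.
    """
    X = np.asarray(X, dtype=float)
    n = X.shape[0]
    if n < 2 or power < 1:
        return X.copy()

    # Cosine similarity between L2-normalised rows.
    norms = np.linalg.norm(X, axis=1, keepdims=True)
    Xn = X / np.clip(norms, 1e-12, None)
    sim = Xn @ Xn.T
    np.fill_diagonal(sim, -np.inf)  # never pick a row as its own neighbour

    # Indices of the k most similar rows for each row.
    k = min(k, n - 1)
    nbrs = np.argpartition(-sim, k - 1, axis=1)[:, :k]

    # Binary k-NN adjacency, symmetrised, with self-loops added.
    A = np.zeros((n, n))
    A[np.repeat(np.arange(n), k), nbrs.ravel()] = 1.0
    A = np.maximum(A, A.T)
    A += np.eye(n)

    # Symmetric normalisation: S = D^{-1/2} (A + I) D^{-1/2}.
    d = A.sum(axis=1)
    S = A / np.sqrt(np.outer(d, d))

    # Apply the filter `power` times: H = S^power X.
    H = X.copy()
    for _ in range(power):
        H = S @ H
    return H
```

In a sketch like this, the smoothed matrix would then be passed to a clustering algorithm or a classifier in place of the raw embeddings. Larger `power` values smooth more aggressively and may blur distinct groups together.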