Teacher-Student Learning on Complexity in Intelligent Routing (2402.15665v1)
Abstract: Customer service is often the most time-consuming aspect for e-commerce websites, with each contact typically taking 10-15 minutes. Effectively routing customers to appropriate agents without transfers is therefore crucial for e-commerce success. To this end, we have developed a machine learning framework that predicts the complexity of customer contacts and routes them to appropriate agents accordingly. The framework consists of two parts. First, we train a teacher model to score the complexity of a contact based on the post-contact transcripts. Then, we use the teacher model as a data annotator to provide labels to train a student model that predicts the complexity based on pre-contact data only. Our experiments show that such a framework is successful and can significantly improve customer experience. We also propose a useful metric called complexity AUC that evaluates the effectiveness of customer service at a statistical level.
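The two-part framework described in the abstract can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and does not come from the paper: the field names (`turns`, `transfers`, `prior_contacts`), the hand-written teacher scoring rule, the one-feature linear student, the synthetic contacts, and the 0.5 routing threshold. The sketch shows only the data flow: the teacher labels contacts from post-contact information, and the student learns to predict those labels from pre-contact information alone.

```python
# Hypothetical teacher: scores complexity from post-contact signals.
# In the paper the teacher is a trained model over transcripts; this fixed
# rule is only a stand-in so the example stays self-contained.
def teacher_score(post):
    raw = 0.02 * post["turns"] + 0.3 * post["transfers"]
    return min(1.0, raw)


def fit_student(xs, ys):
    """Fit a one-feature least-squares line y = a * x + b (stand-in student)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    var = sum((x - mean_x) ** 2 for x in xs)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a = cov / var
    b = mean_y - a * mean_x
    return a, b


def predict(student, prior_contacts):
    a, b = student
    return a * prior_contacts + b


def route(student, prior_contacts, threshold=0.5):
    """Route on the student's pre-contact prediction (threshold is assumed)."""
    if predict(student, prior_contacts) >= threshold:
        return "specialist"
    return "generalist"


# Synthetic historical contacts: each has pre-contact and post-contact fields.
contacts = []
for i in range(15):
    prior = i % 5
    contacts.append({
        "pre": {"prior_contacts": prior},
        "post": {"turns": 5 + 4 * prior + (i % 3),
                 "transfers": 1 if prior >= 3 else 0},
    })

# Step 1: the teacher annotates each contact using post-contact data.
labels = [teacher_score(c["post"]) for c in contacts]

# Step 2: the student is trained on pre-contact data against teacher labels.
features = [c["pre"]["prior_contacts"] for c in contacts]
student = fit_student(features, labels)

# At routing time only pre-contact data is available.
print(route(student, 0), route(student, 4))
```

The point of the split is that the student never sees transcripts, so it can run before the contact starts, while the teacher supplies labels that would otherwise require manual annotation.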