Temporal Knowledge Distillation for Time-Sensitive Financial Services Applications (2312.16799v1)

Published 28 Dec 2023 in cs.LG and cs.AI

Abstract: Detecting anomalies has become an increasingly critical function in the financial services industry. Anomaly detection is frequently used in key compliance and risk functions such as financial crime detection, fraud, and cybersecurity. The dynamic nature of the underlying data patterns, especially in adversarial environments like fraud detection, poses serious challenges to machine learning models. Keeping up with rapid changes by retraining models on the latest data patterns creates pressure to balance historical and current patterns while managing the training data size. Furthermore, model retraining times raise problems in time-sensitive, high-volume deployment systems, where the retraining period directly impacts a model's ability to respond to ongoing attacks in a timely manner. In this study, we propose a temporal knowledge distillation-based label augmentation approach (TKD) which uses the learning of older models to rapidly boost the latest model, effectively reducing model retraining times and improving agility. Experimental results show that the proposed approach reduces retraining times while improving model performance.
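
The core idea described in the abstract, using an older model's knowledge to augment the labels seen by a freshly retrained model, can be illustrated with a minimal sketch. The details below are illustrative assumptions, not the paper's exact method: the scikit-learn gradient boosting models, the single previous-generation "teacher", and the fixed blending weight ALPHA are all stand-ins chosen to make the label-augmentation step concrete.

```python
# Minimal sketch of temporal knowledge distillation as label augmentation.
# Assumptions (not from the paper): scikit-learn models, a single older
# "teacher" model, and a fixed blending weight ALPHA are all illustrative.

import numpy as np
from sklearn.ensemble import GradientBoostingClassifier, GradientBoostingRegressor

ALPHA = 0.3  # hypothetical weight on the teacher's soft labels


def distill_retrain(teacher, X_new, y_new):
    """Train a fresh student on the latest data window, augmenting the
    hard fraud labels with the older teacher's predicted probabilities."""
    # Teacher's soft view of the new data (probability of the fraud class).
    soft = teacher.predict_proba(X_new)[:, 1]
    # Blend hard labels with the teacher's soft labels to form targets.
    targets = (1 - ALPHA) * y_new + ALPHA * soft
    # Regress on the blended targets; predictions act as fraud scores.
    student = GradientBoostingRegressor(n_estimators=100)
    student.fit(X_new, targets)
    return student


# Example usage with synthetic data standing in for transaction features.
rng = np.random.default_rng(0)
X_old, y_old = rng.normal(size=(1000, 8)), rng.integers(0, 2, 1000)
X_new, y_new = rng.normal(size=(1000, 8)), rng.integers(0, 2, 1000)

teacher = GradientBoostingClassifier(n_estimators=100).fit(X_old, y_old)
student = distill_retrain(teacher, X_new, y_new)
scores = student.predict(X_new)  # continuous fraud scores, roughly in [0, 1]
```

Because the student trains only on the latest window while the blended targets carry the teacher's view of historical patterns, the retraining set stays small, which is the mechanism the abstract credits for the reduced retraining times.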

