Privacy-Enhancing Collaborative Information Sharing through Federated Learning -- A Case of the Insurance Industry (2402.14983v1)
Abstract: The report demonstrates the benefits (in terms of improved claims loss modeling) of using Federated Learning (FL) to learn a single model across multiple insurance industry datasets without requiring the datasets themselves to be shared from one company to another. The application of FL addresses two of the most pressing concerns: limited data volume and limited data variety, which are caused by privacy concerns, the rarity of claim events, the lack of informative rating factors, etc. During each round of FL, collaborators compute improvements on the model using their local private data, and these insights are combined to update a global model. Such aggregation of insights allows claims losses to be forecast more effectively than with models trained individually at each collaborator. Critically, this approach enables machine learning collaboration without the need for raw data to leave the compute infrastructure of each respective data owner. Additionally, the open-source framework used in our experiments, OpenFL, is designed so that it can be run using confidential computing as well as with additional algorithmic protections against leakage of information via the shared model updates. In this way, FL is implemented as a privacy-enhancing collaborative learning technique that addresses the challenges posed by the sensitivity and privacy of data in traditional machine learning solutions. This paper's application of FL can also be extended to other areas, including fraud detection and catastrophe modeling, that have a similar need to incorporate data privacy into machine learning collaborations. Our framework and empirical results provide a foundation for future collaborations among insurers, regulators, academic researchers, and InsurTech experts.
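The round structure described in the abstract (local improvement on private data, followed by aggregation into a global model) can be illustrated with a minimal sketch in the style of federated averaging. This is not the OpenFL API and not the model used in the paper: the function names, the toy linear model with squared-error loss, the learning rate, and the two synthetic "insurer" datasets are all assumptions made purely for illustration.

```python
from typing import List, Tuple

Weights = List[float]
Dataset = List[Tuple[List[float], float]]  # (feature vector, target) pairs


def predict(w: Weights, x: List[float]) -> float:
    """Linear prediction; the first feature is a constant 1.0 acting as the bias."""
    return sum(wi * xi for wi, xi in zip(w, x))


def local_update(global_w: Weights, data: Dataset,
                 lr: float = 0.05, epochs: int = 5) -> Weights:
    """Hypothetical collaborator step: start from the global weights and run a few
    epochs of gradient descent on mean squared error using only local data."""
    w = list(global_w)
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for x, y in data:
            err = predict(w, x) - y
            for j, xj in enumerate(x):
                grad[j] += 2.0 * err * xj / len(data)
        w = [wj - lr * gj for wj, gj in zip(w, grad)]
    return w


def federated_average(updates: List[Weights], sizes: List[int]) -> Weights:
    """Hypothetical aggregator step: average collaborator weights,
    weighting each collaborator by its number of local records."""
    total = sum(sizes)
    dim = len(updates[0])
    return [sum(u[j] * n for u, n in zip(updates, sizes)) / total
            for j in range(dim)]


def run_federated_rounds(datasets: List[Dataset], dim: int,
                         rounds: int = 200) -> Weights:
    """Only model weights move between parties; raw records never leave a dataset."""
    global_w = [0.0] * dim
    for _ in range(rounds):
        updates = [local_update(global_w, d) for d in datasets]
        global_w = federated_average(updates, [len(d) for d in datasets])
    return global_w


# Synthetic, noiseless toy data (an assumption): target = 1 + 2 * x.
# Each "insurer" sees a different range of x, mimicking limited data variety.
insurer_a: Dataset = [([1.0, x], 1.0 + 2.0 * x) for x in (0.0, 0.5, 1.0)]
insurer_b: Dataset = [([1.0, x], 1.0 + 2.0 * x) for x in (1.5, 2.0)]

global_weights = run_federated_rounds([insurer_a, insurer_b], dim=2)
print("global weights (bias, slope):", global_weights)
```

In this toy setting the global weights approach the bias and slope that generated the data, even though neither party's records are ever pooled. A real deployment would replace the linear model with the chosen claims loss model and would rely on a framework such as OpenFL for communication and for any additional protections of the shared updates.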
Authors:
- Panyi Dong
- Zhiyu Quan
- Brandon Edwards
- Runhuan Feng
- Tianyang Wang
- Patrick Foley
- Prashant Shah
- Shih-Han Wang