Measure Contribution of Participants in Federated Learning (1909.08525v1)

Published 17 Sep 2019 in cs.LG and stat.ML

Abstract: Federated Machine Learning (FML) creates an ecosystem for multiple parties to collaborate on building models while protecting data privacy for the participants. A measure of the contribution for each party in FML enables fair credits allocation. In this paper we develop simple but powerful techniques to fairly calculate the contributions of multiple parties in FML, in the context of both horizontal FML and vertical FML. For Horizontal FML we use deletion method to calculate the grouped instance influence. For Vertical FML we use Shapley Values to calculate the grouped feature importance. Our methods open the door for research in model contribution and credit allocation in the context of federated machine learning.

Citations (176)

Summary

Analyzing Participant Contribution in Federated Learning

Overview

The paper "Measure Contribution of Participants in Federated Learning" presents methodologies for attributing value to participant contributions within Federated Machine Learning (FML). This contribution measurement is critical for fostering collaboration while maintaining privacy—a core tenet of Federated Learning. The proposed methods focus on calculating contributions in both Horizontal and Vertical FML scenarios, utilizing deletion methods for instance groups and Shapley Values for feature importance, respectively.

Horizontal Federated Learning

In Horizontal FML, participants contribute training instances. The deletion method, an intuitive and straightforward approach, gauges the influence of instance groups on model predictions. This method involves retraining models with a specific party's data omitted and measuring deviations from the original model's outputs, thus quantifying the data's importance. The paper presents an approximation algorithm for this method, which leverages batch deletion to enhance computational efficiency. This enables scalable calculation of contribution metrics across potentially large datasets, offering a practical implementation for Horizontal FML frameworks.
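The retrain-and-compare idea behind the deletion method can be sketched as follows. This is a minimal illustration, not the paper's implementation: the logistic-regression model, the accuracy metric, and the centralized retraining loop are assumptions chosen for brevity (a real FML deployment would retrain federatedly, and the paper's approximation uses batch deletion rather than full retraining per party).

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score

def deletion_influence(party_data, X_test, y_test):
    """Contribution of each party, measured as the drop in test accuracy
    when that party's instance group is deleted and the model retrained.

    party_data: list of (X, y) arrays, one per participant.
    Returns a list of influence scores, one per participant.
    """
    # Baseline: model trained on all parties' pooled instances.
    X_all = np.vstack([X for X, _ in party_data])
    y_all = np.concatenate([y for _, y in party_data])
    full_model = LogisticRegression(max_iter=1000).fit(X_all, y_all)
    full_acc = accuracy_score(y_test, full_model.predict(X_test))

    influences = []
    for i in range(len(party_data)):
        # Retrain with party i's instance group deleted.
        X_rest = np.vstack([X for j, (X, _) in enumerate(party_data) if j != i])
        y_rest = np.concatenate([y for j, (_, y) in enumerate(party_data) if j != i])
        model = LogisticRegression(max_iter=1000).fit(X_rest, y_rest)
        acc = accuracy_score(y_test, model.predict(X_test))
        # Positive influence: deleting this party's data hurt the model.
        influences.append(full_acc - acc)
    return influences
```

A party whose removal degrades accuracy receives a positive score; a party holding noisy or redundant instances scores near zero or below, which is the signal the method uses to discriminate contributions.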

Vertical Federated Learning

Vertical FML engages participants through feature space sharing, making attribution of value to features critical. The paper adopts Shapley Values, rooted in cooperative game theory, to fairly distribute contributions among features. This approach traditionally demands high computational resources due to its exponential time complexity. However, the authors propose a method using federated feature aggregation and Shapley group interaction indices as an approximation, aiming to reduce computational load while maintaining effective contribution analysis. This facilitates multi-party collaboration without exposing individual feature values, adhering to privacy constraints inherent in FML.
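The game-theoretic core can be illustrated with an exact Shapley computation over a small set of players (here, parties or feature groups). This sketch is generic cooperative-game code, not the paper's federated approximation: the characteristic function `value_fn` is an assumed black box (e.g., model performance achieved by a coalition's features), and the exponential enumeration is only tractable for the handful of parties typical of vertical FML.

```python
from itertools import combinations
from math import factorial

def shapley_values(players, value_fn):
    """Exact Shapley value for each player.

    players:  list of hashable player identifiers.
    value_fn: maps a sorted tuple of players (a coalition) to its worth.
    Runs in O(2^n) coalition evaluations, hence the paper's interest
    in grouped-feature approximations for larger settings.
    """
    n = len(players)
    shapley = {p: 0.0 for p in players}
    for p in players:
        others = [q for q in players if q != p]
        for k in range(n):
            for S in combinations(others, k):
                # Weight of coalitions of size k in the Shapley average.
                weight = factorial(k) * factorial(n - k - 1) / factorial(n)
                with_p = value_fn(tuple(sorted(S + (p,))))
                without_p = value_fn(tuple(sorted(S)))
                shapley[p] += weight * (with_p - without_p)
    return shapley
```

For an additive game (each coalition's worth is the sum of its members' standalone worths), the Shapley value of a player is exactly its standalone worth, which makes a convenient sanity check; the federated variant in the paper replaces raw feature access with aggregated feature constructs so this marginal-contribution accounting can proceed without exposing individual values.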

Experimental Validation

The paper details experiments using the Cervical cancer (Risk Factors) Data Set to validate the proposed methodologies. For Horizontal FML, the deletion method successfully discriminated participant contributions by evaluating each instance group's influence on the model's predictive performance. The Vertical FML experiments applied Shapley Values to quantify the importance of participant features, demonstrating adaptability to real-world scenarios through federated feature constructs. Both approaches are shown to be effective, highlighting their practical applicability in incentivizing participation and ensuring fair credit allocation.

Implications and Future Work

The methodologies introduced offer notable implications for the development of incentives and reward systems in FML environments. By enabling fair and privacy-preserving contribution measurement, these techniques support sustainable and collaborative Federated Learning ecosystems, especially crucial in industries like insurance where data privacy is paramount.

Looking forward, the paper suggests the integration and refinement of influence-function-based approaches and optimized Shapley value sampling to bolster accuracy and efficiency. These advancements would further cement the approach's utility as a foundational component in comprehensive FML toolkits, supporting broader industrial application and standardized contribution measurement frameworks.

Conclusion

While the paper represents a significant step towards a systematic method of evaluating participant contributions in Federated Learning, the continued refinement and exploration of advanced algorithms hold promise for enhancing the computational efficiency and practicality of these methods. Ultimately, the ability to fairly measure contributions is pivotal to the widespread adoption and success of Federated Learning models across diverse sectors.