The Price of Privacy in Collaborative Learning
B. Pejo, Q. Tang, and G. Biczok
in the Proceedings of the 2018 ACM SIGSAC Conference on Computer and Communications Security (CCS'18), Toronto, Canada, 15-19 October 2018, ISBN:978-1-4503-5693-0, pp. 2261-2263, 2018
Machine learning algorithms have reached mainstream status and are widely deployed in many applications. The accuracy of such algorithms depends significantly on the size of the underlying training dataset; in reality a small or medium sized organization often does not have enough data to train a reasonably accurate model. For such organizations, a realistic solution is to train machine learning models based on a joint dataset (which is a union of the individual ones). Unfortunately, privacy concerns prevent them from straightforwardly doing so. While a number of privacy-preserving solutions exist for collaborating organizations to securely aggregate the parameters in the process of training the models, we are not aware of any work that provides a rational framework for the participants to precisely balance the privacy loss and accuracy gain in their collaboration.
In this paper, we model the collaborative training process as a two-player game where each player aims to achieve higher accuracy while preserving the privacy of its own dataset. We introduce the notion of Price of Privacy, a novel approach for measuring the impact of privacy protection on the accuracy in the proposed framework. Furthermore, we develop a game-theoretical model for different player types, and then either find or prove the existence of a Nash Equilibrium with regard to the strength of privacy protection for each player.