Incentivizing Collaboration in Machine Learning via Synthetic Data Rewards

Sebastian Shenghong Tay; Xinyi Xu; Chuan Sheng Foo; Bryan Kian Hsiang Low

[AAAI-22] Main Track

Abstract: This paper presents a novel collaborative generative modeling (CGM) framework that incentivizes self-interested parties to collaborate by contributing data to a pool for training a generative model (e.g., a GAN), from which synthetic data are drawn and distributed to the parties as rewards commensurate with their contributions. Distributing synthetic data as rewards (instead of trained models or money) offers task- and model-agnostic benefits for downstream learning tasks and is less likely to violate data privacy regulations. To realize the framework, we first propose a data valuation function based on maximum mean discrepancy (MMD) that values data by its quantity and by its quality, measured as its closeness to the true data distribution, and we provide theoretical results guiding the kernel choice in this MMD-based valuation function. Then, we formulate the reward scheme as a linear optimization problem that, when solved, guarantees certain incentives such as fairness in the CGM framework. Finally, we devise a weighted sampling algorithm for generating the synthetic data distributed to each party as its reward, such that the combined value of the party's own data and its synthetic data matches the reward value assigned by the reward scheme. We show empirically, using simulated and real-world datasets, that the parties' synthetic data rewards are commensurate with their contributions.
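
For readers unfamiliar with MMD, the following is a minimal sketch of how the squared MMD between two samples can be estimated, which is the quantity the abstract's valuation function builds on. It is not the paper's valuation function, whose exact form the abstract does not give. The RBF kernel, the `lengthscale` parameter, the biased (V-statistic) estimator, and the function names `rbf_kernel` and `mmd_squared` are all assumptions chosen for illustration.

```python
import numpy as np


def rbf_kernel(x, y, lengthscale=1.0):
    """RBF kernel matrix between rows of x and rows of y (assumed kernel choice)."""
    # Pairwise squared Euclidean distances, clipped at 0 to absorb rounding error.
    sq_dists = (
        np.sum(x**2, axis=1)[:, None]
        + np.sum(y**2, axis=1)[None, :]
        - 2.0 * x @ y.T
    )
    return np.exp(-np.maximum(sq_dists, 0.0) / (2.0 * lengthscale**2))


def mmd_squared(x, y, lengthscale=1.0):
    """Biased (V-statistic) estimate of the squared MMD between samples x and y."""
    k_xx = rbf_kernel(x, x, lengthscale).mean()
    k_yy = rbf_kernel(y, y, lengthscale).mean()
    k_xy = rbf_kernel(x, y, lengthscale).mean()
    return k_xx + k_yy - 2.0 * k_xy


# Illustrative data only: a reference sample standing in for the true
# distribution, one party whose data matches it, and one whose data is shifted.
rng = np.random.default_rng(0)
reference = rng.normal(0.0, 1.0, size=(500, 2))
party_close = rng.normal(0.0, 1.0, size=(200, 2))
party_far = rng.normal(3.0, 1.0, size=(200, 2))

mmd_close = mmd_squared(party_close, reference)
mmd_far = mmd_squared(party_far, reference)
```

In this toy setup, `mmd_close` comes out smaller than `mmd_far`, reflecting that the first party's data lies closer to the reference distribution. How such a discrepancy is turned into a data value that also accounts for quantity is specified in the paper, not here.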

Sessions where this paper appears

Poster Session 4

Fri, February 25 5:00 PM - 6:45 PM (+00:00)

Red 3

Poster Session 8

Sun, February 27 12:45 AM - 2:30 AM (+00:00)

Red 3

Oral Session 4

Fri, February 25 6:45 PM - 8:00 PM (+00:00)

Red 3
