Recovering The Propensity Score From Biased Positive Unlabeled Data

Walter Gerych; Thomas Hartvigsen; Luke Buquicchio; Emmanuel Agu; Elke Rundensteiner

[AAAI-22] Main Track

Abstract: Positive-Unlabeled (PU) learning methods train a classifier to distinguish between the positive and negative classes given only positive and unlabeled data. While traditional PU methods require the labeled positive samples to be an unbiased sample of the positive distribution, in practice the labeled sample is often a biased draw from the true distribution. Prior work shows that if we know the likelihood that each positive instance will be selected for labeling, referred to as the propensity score, then the biased sample can be used for PU learning. Unfortunately, no prior work has proposed an inference strategy for which the propensity score is identifiable. In this work, we propose two sets of assumptions under which the propensity score can be uniquely determined: one in which no assumption is made on the functional form of the propensity score (requiring assumptions on the data distribution), and a second that loosens the data assumptions while assuming a functional form for the propensity score. We then propose inference strategies for each case. Our empirical study shows that our approach significantly outperforms the state-of-the-art propensity estimation methods on a rich variety of benchmark datasets.
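
The abstract notes that a known propensity score makes a biased labeled sample usable. The Python sketch below illustrates that prior-work idea with inverse propensity weighting on synthetic data. It is not the paper's inference method: the propensity function, the data distribution, and every name here are hypothetical, and the propensity score is assumed to be known rather than estimated.

```python
import math
import random


def propensity(x):
    # Hypothetical propensity score e(x) = P(labeled | positive, x).
    # Assumed known here; estimating it is the problem the paper addresses.
    return 0.1 + 0.8 / (1.0 + math.exp(-x))


def simulate(n, prior, seed=0):
    # Synthetic data: positives centered at +1, negatives at -1.
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        y = 1 if rng.random() < prior else 0
        x = rng.gauss(1.0 if y else -1.0, 1.0)
        # Only positives can be labeled, each with probability e(x),
        # so the labeled set is a biased draw from the positives.
        s = 1 if (y == 1 and rng.random() < propensity(x)) else 0
        data.append((x, y, s))
    return data


def naive_positive_mean(data):
    # Treats the labeled positives as if they were an unbiased sample.
    labeled = [x for x, _, s in data if s == 1]
    return sum(labeled) / len(labeled)


def weighted_positive_mean(data):
    # Inverse propensity weighting: a labeled positive at x
    # stands in for 1 / e(x) positives.
    pairs = [(x, 1.0 / propensity(x)) for x, _, s in data if s == 1]
    total_weight = sum(w for _, w in pairs)
    return sum(x * w for x, w in pairs) / total_weight


def weighted_class_prior(data):
    # The sum of 1 / e(x) over labeled samples estimates the number of positives.
    return sum(1.0 / propensity(x) for x, _, s in data if s == 1) / len(data)


data = simulate(n=100_000, prior=0.4, seed=0)
# Ground truth is available only because the data is synthetic.
true_mean = sum(x for x, y, _ in data if y == 1) / sum(y for _, y, _ in data)

print("true positive mean:    ", round(true_mean, 3))
print("naive labeled mean:    ", round(naive_positive_mean(data), 3))
print("weighted labeled mean: ", round(weighted_positive_mean(data), 3))
print("weighted class prior:  ", round(weighted_class_prior(data), 3))
```

In this toy setup the naive mean of the labeled positives drifts toward high-propensity regions, while the weighted estimates should land close to the true positive mean and the class prior of 0.4. The weights are only as good as the propensity score, which is why its identifiability matters.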

Sessions where this paper appears

Poster Session 2

Fri, February 25 12:45 AM - 2:30 AM (+00:00)

Blue 2

Poster Session 9

Sun, February 27 8:45 AM - 10:30 AM (+00:00)

Blue 2

Oral Session 2

Fri, February 25 2:30 AM - 3:45 AM (+00:00)

Blue 2
