Abstract
Consider semi-supervised learning for classification, where both labelled and unlabelled data are available for training. The goal is to exploit both datasets to achieve higher prediction accuracy than using labelled data alone. We develop a semi-supervised logistic learning method based on exponential tilt mixture models by extending a statistical equivalence between logistic regression and exponential tilt modelling. We study maximum nonparametric likelihood estimation and derive novel objective functions that are shown to be Fisher probability consistent. We also propose regularized estimation and construct simple and highly interpretable expectation–maximization (EM) algorithms. Finally, we present numerical results that demonstrate the advantage of the proposed methods compared with existing methods.
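The equivalence between logistic regression and exponential tilt modelling mentioned in the abstract can be sketched for the two-class case as follows. The notation here ($\pi$, $\rho$, $g_0$, $g_1$, $\beta_0^c$, $\beta_0$, $\beta_1$) is assumed for illustration and is not taken from the article; this is the standard textbook argument, not necessarily the formulation the authors use.

```latex
% Assumed notation: y in {0,1}, P(y=1) = \pi, and x | y=j has density g_j.
\begin{align*}
  % Logistic regression model for the class probability:
  P(y=1 \mid x) &= \frac{\exp(\beta_0^c + \beta_1^{\mathrm{T}} x)}
                        {1 + \exp(\beta_0^c + \beta_1^{\mathrm{T}} x)} . \\
  % Bayes' rule expresses the odds through the class-conditional densities:
  \frac{P(y=1 \mid x)}{P(y=0 \mid x)} &= \frac{\pi \, g_1(x)}{(1-\pi) \, g_0(x)} . \\
  % Combining the two gives an exponential tilt relation between g_1 and g_0:
  g_1(x) &= \exp(\beta_0 + \beta_1^{\mathrm{T}} x) \, g_0(x),
  \qquad \beta_0 = \beta_0^c - \log\frac{\pi}{1-\pi} . \\
  % Unlabelled x then follows a mixture of g_0 and its tilt,
  % with an assumed mixing proportion \rho:
  g_{\mathrm{unlab}}(x) &= \bigl\{ (1-\rho) + \rho \exp(\beta_0 + \beta_1^{\mathrm{T}} x) \bigr\} \, g_0(x) .
\end{align*}
```

Under these assumptions the slope $\beta_1$ is shared by the logistic model and the tilt model, and only the intercept shifts by the log prior odds. That is why unlabelled data, which depend on $\beta_1$ through the mixture density, can in principle contribute information about the classifier.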
Original language | English (US) |
---|---|
Article number | e312 |
Journal | Stat |
Volume | 9 |
Issue number | 1 |
State | Published - 2020 |
All Science Journal Classification (ASJC) codes
- Statistics and Probability
- Statistics, Probability and Uncertainty
Keywords
- Fisher consistency
- empirical likelihood
- expectation–maximization algorithm
- exponential tilt model
- logistic regression
- semi-supervised learning