TY - JOUR
T1 - Cross-domain learning from multiple sources
T2 - A consensus regularization perspective
AU - Zhuang, Fuzhen
AU - Luo, Ping
AU - Xiong, Hui
AU - Xiong, Yuhong
AU - He, Qing
AU - Shi, Zhongzhi
N1 - Funding Information:
This work was supported by the National Science Foundation of China (No. 60675010, 60933004, 60975039), the 863 National High-Tech Program (No. 2007AA01Z132), the National Basic Research Priorities Programme (No. 2007CB311004), and the National Science and Technology Support Plan (No. 2006BAC08B06). Also, this research was supported in part by the US National Science Foundation (NSF) via grant number CNS 0831186 and the Rutgers Seed Funding for Collaborative Computing Research. Finally, the authors are grateful to the anonymous referees for their constructive comments on the paper. A preliminary version of this work was published in the Proceedings of the 17th ACM Conference on Information and Knowledge Management (CIKM) [1]. Ping Luo was the corresponding author for this paper.
PY - 2010
Y1 - 2010
N2 - Classification across different domains studies how to adapt a learning model from one domain to another domain that shares similar data characteristics. While there are a number of existing works along this line, many of them focus only on learning from a single source domain to a target domain. In particular, a remaining challenge is how to apply the knowledge learned from multiple source domains to a target domain. Indeed, data from multiple source domains can be semantically related, yet have different data distributions. It is not clear how to exploit the distribution differences among multiple source domains to boost the learning performance in a target domain. To that end, in this paper we propose a consensus regularization framework for learning from multiple source domains to a target domain. In this framework, a local classifier is trained by considering both the local data available in one source domain and the prediction consensus with the classifiers learned from other source domains. Moreover, we provide a theoretical analysis as well as an empirical study of the proposed consensus regularization framework. The experimental results on text categorization and image classification problems show the effectiveness of this consensus regularization learning method. Finally, to deal with the situation in which the multiple source domains are geographically distributed, we also develop a distributed version of the proposed algorithm, which avoids the need to upload all the data to a centralized location and helps to mitigate privacy concerns.
AB - Classification across different domains studies how to adapt a learning model from one domain to another domain that shares similar data characteristics. While there are a number of existing works along this line, many of them focus only on learning from a single source domain to a target domain. In particular, a remaining challenge is how to apply the knowledge learned from multiple source domains to a target domain. Indeed, data from multiple source domains can be semantically related, yet have different data distributions. It is not clear how to exploit the distribution differences among multiple source domains to boost the learning performance in a target domain. To that end, in this paper we propose a consensus regularization framework for learning from multiple source domains to a target domain. In this framework, a local classifier is trained by considering both the local data available in one source domain and the prediction consensus with the classifiers learned from other source domains. Moreover, we provide a theoretical analysis as well as an empirical study of the proposed consensus regularization framework. The experimental results on text categorization and image classification problems show the effectiveness of this consensus regularization learning method. Finally, to deal with the situation in which the multiple source domains are geographically distributed, we also develop a distributed version of the proposed algorithm, which avoids the need to upload all the data to a centralized location and helps to mitigate privacy concerns.
KW - Classification
KW - consensus regularization
KW - cross-domain learning
KW - multiple source domains
UR - http://www.scopus.com/inward/record.url?scp=78149253779&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=78149253779&partnerID=8YFLogxK
U2 - 10.1109/TKDE.2009.205
DO - 10.1109/TKDE.2009.205
M3 - Review article
AN - SCOPUS:78149253779
SN - 1041-4347
VL - 22
SP - 1664
EP - 1678
JO - IEEE Transactions on Knowledge and Data Engineering
JF - IEEE Transactions on Knowledge and Data Engineering
IS - 12
M1 - 5342420
ER -