Abstract
Transfer learning aims to learn robust classifiers for a target domain by leveraging knowledge from a source domain. Since the source and target domains usually come from different distributions, existing methods mainly focus on adapting the cross-domain marginal or conditional distributions. In real applications, however, the marginal and conditional distributions usually contribute differently to the domain discrepancy. Existing methods cannot quantitatively evaluate the relative importance of these two distributions, which results in unsatisfactory transfer performance. In this article, we propose a novel concept called Dynamic Distribution Adaptation (DDA), which quantitatively evaluates the relative importance of each distribution. DDA can be easily incorporated into the framework of structural risk minimization to solve transfer learning problems. Building on DDA, we propose two novel learning algorithms: (1) Manifold Dynamic Distribution Adaptation (MDDA) for traditional transfer learning, and (2) Dynamic Distribution Adaptation Network (DDAN) for deep transfer learning. Extensive experiments demonstrate that MDDA and DDAN significantly improve transfer learning performance and establish a strong baseline against the latest deep and adversarial methods on digit recognition, sentiment analysis, and image classification. More importantly, we show that the marginal and conditional distributions contribute differently to the domain divergence, and that DDA provides a good quantitative evaluation of their relative importance, which leads to better performance. We believe this observation can be helpful for future research in transfer learning.
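To make the central idea concrete, the sketch below illustrates one plausible form of dynamic distribution adaptation: the cross-domain discrepancy is written as a convex combination (1 − μ)·D_marginal + μ·D_conditional, where μ is estimated from the data rather than fixed. This is a minimal illustration, not the authors' implementation — the paper's estimator is more involved (e.g., proxy A-distances), while here `mmd_linear` (a linear-kernel MMD proxy), the per-class conditional term, and the ratio-based estimate of μ are simplifying assumptions. Target pseudo-labels are assumed to come from a classifier trained on the source domain.

```python
import numpy as np

def mmd_linear(Xs, Xt):
    """Squared distance between empirical feature means
    (a simple linear-kernel MMD proxy for marginal discrepancy)."""
    return float(np.sum((Xs.mean(axis=0) - Xt.mean(axis=0)) ** 2))

def dynamic_adaptation_distance(Xs, ys, Xt, yt_pseudo):
    """Combine marginal and conditional discrepancies with a
    dynamically estimated factor mu (hypothetical estimator)."""
    # Marginal discrepancy between whole-domain distributions.
    d_marg = mmd_linear(Xs, Xt)
    # Conditional discrepancy: average per-class mean distance, using
    # target pseudo-labels supplied by a source-trained classifier.
    classes = np.unique(ys)
    d_cond = float(np.mean([
        mmd_linear(Xs[ys == c], Xt[yt_pseudo == c])
        for c in classes
        if np.any(yt_pseudo == c)
    ]))
    # Dynamic factor: relative weight of the conditional discrepancy.
    # mu near 0 => marginal dominates; mu near 1 => conditional dominates.
    mu = d_cond / (d_marg + d_cond)
    combined = (1.0 - mu) * d_marg + mu * d_cond
    return combined, mu
```

In an iterative algorithm, μ would be re-estimated each round as the feature representation and pseudo-labels improve, so the adaptation emphasis shifts between the two distributions automatically.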
Index Terms
- Transfer Learning with Dynamic Distribution Adaptation