ABSTRACT
Interdisciplinary collaborations have had a huge impact on society. However, it is often hard for researchers to establish such cross-domain collaborations. What are the patterns of cross-domain collaborations? How do those collaborations form? Can we predict this type of collaboration?
Cross-domain collaborations exhibit very different patterns from traditional collaborations within a single domain: 1) sparse connection: cross-domain collaborations are rare; 2) complementary expertise: cross-domain collaborators often have different expertise and interests; 3) topic skewness: cross-domain collaborations concentrate on a subset of topics. All of these patterns violate fundamental assumptions of traditional recommendation systems.
In this paper, we analyze cross-domain collaboration data from research publications and confirm the above patterns. We propose the Cross-domain Topic Learning (CTL) model to address these challenges. To handle sparse connections, CTL consolidates the existing cross-domain collaborations at the topic layer instead of the author layer, which alleviates the sparseness issue. To handle complementary expertise, CTL models the topic distributions of the source and target domains separately, as well as the correlation across domains. To handle topic skewness, CTL models only the topics relevant to cross-domain collaboration.
We compare CTL with several baseline approaches on large publication datasets from different domains. CTL outperforms the baselines significantly on multiple recommendation metrics. In addition to its recommendation accuracy, CTL is insensitive to parameter settings, as the sensitivity analysis confirms.
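The abstract does not give CTL's equations, so the following is only a toy illustration of the topic-layer idea, not the authors' model. It assumes each author is already summarized by a topic distribution in their own domain, and that a hypothetical cross-domain topic correlation matrix is available; the names `topic_level_score`, `rank_candidates`, and all numbers are made up for illustration. Scoring at the topic level means a source author can be matched with target-domain authors even when no direct co-authorship link exists.

```python
from typing import Dict, List, Sequence, Tuple


def topic_level_score(
    source_topics: Sequence[float],
    target_topics: Sequence[float],
    topic_correlation: Sequence[Sequence[float]],
) -> float:
    """Bilinear score: sum over (i, j) of source[i] * corr[i][j] * target[j].

    Illustrative assumption only: the abstract does not state this formula.
    """
    return sum(
        s * topic_correlation[i][j] * t
        for i, s in enumerate(source_topics)
        for j, t in enumerate(target_topics)
    )


def rank_candidates(
    source_topics: Sequence[float],
    candidates: Dict[str, Sequence[float]],
    topic_correlation: Sequence[Sequence[float]],
) -> List[Tuple[str, float]]:
    """Rank target-domain authors by topic-level score, highest first."""
    scored = [
        (name, topic_level_score(source_topics, topics, topic_correlation))
        for name, topics in candidates.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical data: two source-domain topics, two target-domain topics.
source_author = [0.8, 0.2]
target_authors = {"author_a": [0.7, 0.3], "author_b": [0.1, 0.9]}
# Zero column for target topic 1 mimics topic skewness: only target topic 0
# is treated as relevant to cross-domain collaboration.
correlation = [[0.9, 0.0], [0.1, 0.0]]

for name, score in rank_candidates(source_author, target_authors, correlation):
    print(name, round(score, 3))
```

In this sketch the source and target distributions are kept separate and tied together only through the correlation matrix, which loosely mirrors the complementary-expertise and topic-skewness points above.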
Index Terms
- Cross-domain collaboration recommendation