Published in: Knowledge and Information Systems 1/2016

01.10.2016 | Regular Paper

Exploring demographic information in social media for product recommendation

Authors: Wayne Xin Zhao, Sui Li, Yulan He, Liwei Wang, Ji-Rong Wen, Xiaoming Li

Abstract

On many e-commerce Web sites, product recommendation is essential to improve user experience and boost sales. Most existing product recommender systems rely on consumers’ historical transaction records or Web-site-browsing history to predict online users’ preferences. As such, they are constrained by the limited information available on specific e-commerce Web sites. With the prolific use of social media platforms, it has become possible to extract product demographics from online product reviews and social networks built from microblogs. Moreover, users’ public profiles on social media often reveal their demographic attributes such as age, gender, and education. In this paper, we propose to leverage the demographic information of both products and users extracted from social media for product recommendation. Specifically, we frame recommendation as a learning-to-rank problem that takes as input the features derived from both product and user demographics. An ensemble method based on gradient-boosting regression trees is extended to make it suitable for our recommendation task. We have conducted extensive experiments to obtain both quantitative and qualitative evaluation results. We have also conducted a user study to gauge the performance of our proposed recommender system in a real-world deployment. All the results show that our system generates recommendations that match users’ preferences better than those of the competitive baselines.

Footnotes
7
Given an attribute, we collect all the unique values filled in by users in our data collection and keep only the values shared by many users. We then manually group similar values. Finally, we discretize attribute values based on the customer segmentation in [11] (chapter five) from marketing, and ensure that the probability mass is balanced across the different discretization intervals.
 
8
As a result, \(\phi ^{(u,a)}\) is no longer a valid probability distribution. As will be shown later, however, this does not affect the construction of demographic feature vectors.
 
9
For example, we can sum the corresponding demographic-based probabilities for each attribute: user 1 is assigned a value of 2.52, computed as \(1\times 1 + 0.9 \times 1 + 0.7\times 0.8 + 0.3\times 0.2\), and user 2 is similarly assigned a value of 1.44.
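
The arithmetic for user 1 can be checked with a few lines of Python. This is only a sketch: `user1_terms` copies the four products exactly as they appear in the footnote, and the names `user1_terms` and `summed_score` are hypothetical. The terms behind the value for user 2 are not given here, so they are not reproduced.

```python
# The four products from the footnote for user 1, written as pairs of factors.
# Names are illustrative and not taken from the paper.
user1_terms = [(1.0, 1.0), (0.9, 1.0), (0.7, 0.8), (0.3, 0.2)]


def summed_score(terms):
    """Sum the pairwise products, one product per term."""
    return sum(a * b for a, b in terms)


score_user1 = summed_score(user1_terms)
# Round for display only; floating-point addition is not exact.
print(round(score_user1, 2))
```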
 
10
We distinguish normal users from spam users using the following three conditions: (1) a normal user should have a balanced number of tweets and retweets; (2) a normal user should not include any keywords relating to products or brands in her nickname or profile description; (3) a normal user should not publish many tweets containing keywords relating to products or brands.
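
The three conditions could be expressed as a simple rule-based filter, sketched below. The footnote does not say how "balanced" or "many" are quantified, so every threshold, the keyword set, and the `User` record are hypothetical placeholders chosen only for illustration.

```python
from dataclasses import dataclass

# All constants below are assumptions; the footnote gives no concrete values.
PRODUCT_KEYWORDS = {"phone", "laptop", "camera"}
MIN_RATIO, MAX_RATIO = 0.2, 5.0     # assumed bounds on tweets / retweets
MAX_PRODUCT_TWEET_SHARE = 0.5       # assumed cap on product-related tweets


@dataclass
class User:
    nickname: str
    description: str
    tweets: list      # texts of original tweets
    retweets: list    # texts of retweeted posts


def mentions_product(text):
    """True if the text contains any product or brand keyword."""
    lowered = text.lower()
    return any(keyword in lowered for keyword in PRODUCT_KEYWORDS)


def is_normal_user(user):
    # (1) balanced number of tweets and retweets
    if not user.tweets or not user.retweets:
        return False
    ratio = len(user.tweets) / len(user.retweets)
    if not (MIN_RATIO <= ratio <= MAX_RATIO):
        return False
    # (2) no product or brand keywords in the nickname or profile description
    if mentions_product(user.nickname) or mentions_product(user.description):
        return False
    # (3) not too many tweets that contain product or brand keywords
    share = sum(mentions_product(t) for t in user.tweets) / len(user.tweets)
    return share <= MAX_PRODUCT_TWEET_SHARE
```

For instance, `is_normal_user(User("alice", "student", ["nice day", "new phone arrived"], ["rt a", "rt b"]))` passes all three checks under these assumed thresholds, whereas a user whose nickname contains "phone" is rejected by condition (2).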
 
11
More specifically, the values of y must be given in training, whereas at test time we obtain the values of y from the predicted output of the learnt ranking function f. An item with a larger value of y is ranked in a higher position, i.e., it is considered more important for recommendation.
 
12
On Sina Weibo, all the tweets from a user can be publicly seen by other registered users. The judges log into their own Weibo accounts and check the validity of each candidate query–product pair online. Each user’s public profile is also checked, and spam users are removed. The workload for each judge is about 5–7 times the number of qd pairs in Table 2, i.e., only 1/7–1/5 of the originally detected qd pairs are finally kept as training data.
 
13
https://sourceforge.net/p/lemur/wiki/RankLib/. RankLib might assign equal scores to items during ranking. In this case, we further sort the items with equal scores by their sales volume.
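
A minimal sketch of this tie-breaking step is shown below. It does not call RankLib; the scores are assumed to have been produced already, and the `Candidate` record and its fields are hypothetical. The footnote does not state the direction of the secondary sort, so placing the higher sales volume first is an assumption.

```python
from typing import NamedTuple


class Candidate(NamedTuple):
    product_id: str
    score: float        # predicted ranking score (assumed already computed)
    sales_volume: int   # hypothetical field used only to break ties


def rank_with_tiebreak(candidates):
    """Sort by score (descending); break ties by sales volume (descending)."""
    return sorted(candidates, key=lambda c: (-c.score, -c.sales_volume))


candidates = [
    Candidate("p1", 0.8, 120),
    Candidate("p2", 0.8, 300),
    Candidate("p3", 0.9, 50),
    Candidate("p4", 0.1, 999),
]
ranked_ids = [c.product_id for c in rank_with_tiebreak(candidates)]
print(ranked_ids)
```

Here `p1` and `p2` share the same score, so `p2` is placed ahead of `p1` because of its larger sales volume, while `p4` stays last despite having the largest sales volume.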
 
14
For the listwise approach, each training instance is an ordered list. However, our training data do not provide the relative order between non-relevant products.
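
To illustrate the point, the sketch below derives the only preferences that binary labels can support: a relevant product is preferred over a non-relevant one, and nothing is said about two non-relevant products. The binary label encoding and the function name `preference_pairs` are assumptions made for this example, not the construction used in the paper.

```python
from itertools import product


def preference_pairs(labels):
    """Return (preferred, other) pairs for one query.

    `labels` maps a product id to 1 (relevant) or 0 (non-relevant).
    Only relevant-versus-non-relevant pairs are emitted, because the labels
    give no order among the non-relevant products themselves.
    """
    relevant = [p for p, y in labels.items() if y == 1]
    non_relevant = [p for p, y in labels.items() if y == 0]
    return list(product(relevant, non_relevant))


labels = {"p1": 1, "p2": 0, "p3": 0}
pairs = preference_pairs(labels)
print(pairs)
```

No pair relating `p2` and `p3` is produced, which is exactly the information a full ordered list would require.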
 
15
The balanced interleaving method reflects the intuition that the results of the two rankings A and B should be interleaved into a single ranking I in a balanced way, which ensures that, for any k, the top k results in I always contain the top \(k_a\) results from A and the top \(k_b\) results from B, where \(k_a\) and \(k_b\) differ by at most 1.
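
The procedure can be sketched as follows, using the common formulation of balanced interleaving from the interleaved-evaluation literature (e.g., [33]). Whether the deployed system used exactly this variant is an assumption, and the function name and the `a_first` parameter are hypothetical; `a_first` replaces the usual coin flip so that the example is reproducible.

```python
import random


def balanced_interleave(ranking_a, ranking_b, a_first=None):
    """Merge two rankings by alternating between them and skipping duplicates."""
    if a_first is None:
        a_first = random.random() < 0.5   # coin flip decides which ranking leads
    interleaved, seen = [], set()
    ka = kb = 0                           # next unread position in A and in B
    while ka < len(ranking_a) and kb < len(ranking_b):
        # Read from whichever ranking is behind; on a tie, read from the leader.
        if ka < kb or (ka == kb and a_first):
            item = ranking_a[ka]
            ka += 1
        else:
            item = ranking_b[kb]
            kb += 1
        if item not in seen:              # an item shared by A and B appears once
            seen.add(item)
            interleaved.append(item)
    return interleaved


ranking_a = ["a", "b", "c", "d"]
ranking_b = ["b", "e", "a", "f"]
merged = balanced_interleave(ranking_a, ranking_b, a_first=True)
print(merged)
```

Because the two read positions never differ by more than one, any prefix of the merged list covers nearly equal-length prefixes of A and B, which is the balance property stated above.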
 
References
1.
Wang J, Zhang Y (2013) Opportunity model for e-commerce recommendation: right product; right time. In: Ser. SIGIR ’13
2.
von Reischach F, Michahelles F, Schmidt A (2009) The design space of ubiquitous product recommendation systems. In: Ser. MUM ’09
3.
Giering M (2008) Retail sales prediction and item recommendations using customer demographics at store level. SIGKDD Explor Newsl 10(2):84–89
4.
Xiao B, Benbasat I (2007) E-commerce product recommendation agents: use, characteristics, and impact. MIS Q 31:137–209
5.
Linden G, Smith B, York J (2003) Amazon.com recommendations: item-to-item collaborative filtering. IEEE Internet Comput 7(1):76–80
6.
Hollerit B, Kröll M, Strohmaier M (2013) Towards linking buyers and sellers: detecting commercial intent on twitter. In: Ser. WWW ’13 companion
7.
Zhao X-W, Guo Y, He Y, Jiang H, Wu Y, Li X (2014) We know what you want to buy: a demographic-based system for product recommendation on microblogs. In: Proceedings of the 20th ACM SIGKDD international conference on knowledge discovery and data mining, ser. KDD ’14, 2014, pp 1935–1944
8.
Baker M, Hart S (2007) The marketing book, 6th edn. Routledge, London
9.
Sridhar G (2007) Consumer involvement in product choice–a demographic analysis. XIMB J Manag 3:131–148
10.
Zeithaml VA (1985) The new demographics and market fragmentation. J Mark 49:64–75
11.
Tsiptsis K, Chorianopoulos A (2010) Data mining techniques in CRM: inside customer segmentation. Wiley, London
12.
Dong Y, Yang Y, Tang J, Yang Y, Chawla N-V (2014) Inferring user demographics and social strategies in mobile social networks. In: Proceedings of the 20th ACM SIGKDD international conference on knowledge discovery and data mining, ser. KDD ’14, 2014, pp 15–24
13.
Mislove A, Viswanath B, Gummadi K-P, Druschel P (2010) You are who you know: inferring user profiles in online social networks. In: Ser. WSDM ’10
14.
Bi B, Shokouhi M, Kosinski M, Graepel T (2013) Inferring the demographics of search users: social data meets search queries. In: Ser. WWW ’13
15.
Zou B, Zhou G, Zhu Q (2014) Negation focus identification with contextual discourse information. In: Proceedings of the 52nd annual meeting of the Association for Computational Linguistics (vol 1: long papers). Association for Computational Linguistics, Baltimore, Maryland, pp 522–530
16.
(2012) US demographic and business summary data. Product guide
17.
Zhai C, Lafferty JD (2004) A study of smoothing methods for language models applied to information retrieval. ACM Trans Inf Syst 22(2):179–214
18.
Pang B, Lee L (2004) A sentimental education: sentiment analysis using subjectivity summarization based on minimum cuts. In: Ser. ACL ’04
19.
Liu T-Y (2009) Learning to rank for information retrieval. Found Trends Inf Retr 3(3):225–331
20.
Turney P-D (2002) Thumbs up or thumbs down? Semantic orientation applied to unsupervised classification of reviews. In: Proceedings of the 40th annual meeting on Association for Computational Linguistics, ser. ACL ’02, 2002, pp 417–424
21.
Ganjisaffar Y, Caruana R, Lopes C-V (2011) Bagging gradient-boosted trees for high precision, low variance ranking models. In: Proceedings of the 34th international ACM SIGIR conference on research and development in information retrieval, ser. SIGIR ’11, 2011, pp 85–94
22.
Zhang H, Riedl E, Petrushin V-A, Pal S, Spoelstra J (2012) Committee based prediction system for recommendation: KDD cup 2011, track2. In: Proceedings of KDD cup 2011 competition, San Diego, CA, USA, 2011, pp 215–229
23.
Breiman L, Friedman J, Olshen R, Stone C (1984) Classification and regression trees. Wadsworth and Brooks, Monterey
27.
Ho TK, Hull JJ, Srihari SN (1994) Decision combination in multiple classifier systems. IEEE Trans Pattern Anal Mach Intell 16(1):66–75
28.
Joachims T (2006) Training linear svms in linear time. In: Ser. KDD ’06
29.
Freund Y, Iyer R, Schapire R-E, Singer Y (2003) An efficient boosting algorithm for combining preferences. J Mach Learn Res 4:933–969
30.
Cao Z, Qin T, Liu T-Y, Tsai M-F, Li H (2007) Learning to rank: from pairwise approach to listwise approach. In: Ser. ICML ’07
31.
Xu J, Li H (2007) Adarank: a boosting algorithm for information retrieval. In: Ser. SIGIR ’07
32.
Weng J, Lim E-P, Jiang J, He Q (2010) Twitterrank: finding topic-sensitive influential twitterers. In: WSDM
33.
Chapelle O, Joachims T, Radlinski F, Yue Y (2012) Large-scale validation and analysis of interleaved search evaluation. ACM Trans Inf Syst 30(1):6:1–6:41
34.
Sarwar B, Karypis G, Konstan J, Riedl J (2001) Item-based collaborative filtering recommendation algorithms. In: Ser. WWW ’01
35.
Adomavicius G, Tuzhilin A (2005) Toward the next generation of recommender systems: a survey of the state-of-the-art and possible extensions. IEEE Trans Knowl Data Eng 17(6):734–749
36.
Symeonidis P, Tiakas E, Manolopoulos Y (2011) Product recommendation and rating prediction based on multi-modal social networks. In: Ser. RecSys ’11
37.
Korfiatis N, Poulos M (2013) Using online consumer reviews as a source for demographic recommendations: a case study using online travel reviews. Expert Syst Appl 40(14):5507–5515
38.
Qiu L, Benbasat I (2010) A study of demographic embodiments of product recommendation agents in electronic commerce. Int J Hum Comput Stud 68(10):669–688
39.
Pang B, Lee L (2008) Opinion mining and sentiment analysis. Found Trends Inf Retr 2(1–2):1–135
40.
Liu Y, Huang J, An A, Yu X (2007) ARSA: a sentiment-aware model for predicting sales performance using blogs. In: SIGIR
41.
McGlohon M, Glance NS, Reiter Z (2010) Star quality: aggregating reviews to rank products and merchants. In: ICWSM
42.
Ganu G, Kakodkar Y, Marian A (2013) Improving the quality of predictions using textual information in online user reviews. Inf Syst 38(1):1–15
43.
Zhang Y, Lai G, Zhang M, Zhang Y, Liu Y, Ma S (2014) Explicit factor models for explainable recommendation based on phrase-level sentiment analysis. In: SIGIR
44.
Zhang Y, Zhang H, Zhang M, Liu Y, Ma S (2014) Do users rate or review? Boost phrase-level sentiment labeling with review-level sentiment classification. In: SIGIR
45.
Pazzani M-J (1999) A framework for collaborative, content-based and demographic filtering. Artif Intell Rev 13(5–6):393–408
46.
Seroussi Y, Bohnert F, Zukerman I (2011) Personalised rating prediction for new users using latent factor models. In: ACM HH
47.
Dai HK, Zhao L, Nie Z, Wen J-R, Wang L, Li Y (2006) Detecting online commercial intention (oci). In: WWW ’06
Metadata
Title
Exploring demographic information in social media for product recommendation
Authors
Wayne Xin Zhao
Sui Li
Yulan He
Liwei Wang
Ji-Rong Wen
Xiaoming Li
Publication date
01.10.2016
Publisher
Springer London
Published in
Knowledge and Information Systems / Issue 1/2016
Print ISSN: 0219-1377
Electronic ISSN: 0219-3116
DOI
https://doi.org/10.1007/s10115-015-0897-5