Published in: Knowledge and Information Systems 3/2017

17-04-2017 | Regular Paper

Mining collective knowledge: inferring functional labels from online review for business

Authors: Feifan Fan, Wayne Xin Zhao, Ji-Rong Wen, Ge Xu, Edward Y. Chang

Abstract

With the increasing popularity of online e-commerce services, users continually generate a large volume of online reviews. In this paper, we study the problem of inferring functional labels from online review text. Functional labels summarize and highlight the main characteristics of a business and can serve as a bridge between consumption needs and service functions. We consider two kinds of semantic similarity, lexical similarity and embedding similarity, which characterize relatedness from two different perspectives. To measure lexical similarity, we use the classic probabilistic ranking formula BM25; to measure embedding similarity, we propose an extended embedding model that can incorporate weakly supervised information derived from review text. The two kinds of similarity complement each other and capture semantic relatedness more comprehensively. We construct a test collection covering four different domains based on a Yelp dataset and compare against multiple baseline methods. Extensive experiments show that the proposed methods are effective.
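As a rough illustration of the lexical-similarity component, the sketch below scores a candidate label against review text with BM25. It is not the authors' implementation: the function name `bm25_scores`, the parameter values `k1=1.2` and `b=0.75`, the smoothed IDF variant, the toy reviews, and the choice to treat the label as the query and each business's pooled reviews as one document are all assumptions made for this example.

```python
import math
from collections import Counter


def bm25_scores(query_tokens, docs, k1=1.2, b=0.75):
    """Score each tokenized document in `docs` against `query_tokens` with BM25.

    k1 and b are common default values, assumed here; the abstract does not
    state which settings the paper uses.
    """
    n_docs = len(docs)
    avg_len = sum(len(d) for d in docs) / n_docs
    # Document frequency: number of documents that contain each term.
    df = Counter(term for d in docs for term in set(d))
    scores = []
    for doc in docs:
        tf = Counter(doc)
        score = 0.0
        for term in query_tokens:
            if term not in tf:
                continue
            # Smoothed IDF; the "+ 1.0" keeps it positive for very common terms.
            idf = math.log((n_docs - df[term] + 0.5) / (df[term] + 0.5) + 1.0)
            # Term-frequency saturation with document-length normalization.
            norm = tf[term] + k1 * (1 - b + b * len(doc) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


# Hypothetical toy data: pooled review tokens per business.
reviews_by_business = {
    "cafe": "great coffee and free wifi good place to study".split(),
    "bar": "loud music cheap beer open late".split(),
    "diner": "cheap breakfast and coffee served all day".split(),
}
label = "free wifi".split()
names = list(reviews_by_business)
scores = bm25_scores(label, [reviews_by_business[n] for n in names])
ranking = sorted(zip(names, scores), key=lambda pair: -pair[1])
```

Here `ranking` places the business whose reviews mention the label terms first. A purely lexical score like this gives zero to reviews that express the same function in different words, which is the kind of gap the embedding similarity is meant to cover.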

Footnotes
5. We assume that the information supported by review writers is important. Thus, we discard all reviews with a rating of less than three stars.
6. For simplicity, we present the objective function for a single business only; it is easy to extend to multiple businesses.
7. Our current model requires attribute values to be discretized.
9. A label can be considered as a phrase in this method.

Metadata
Title: Mining collective knowledge: inferring functional labels from online review for business
Authors: Feifan Fan, Wayne Xin Zhao, Ji-Rong Wen, Ge Xu, Edward Y. Chang
Publication date: 17-04-2017
Publisher: Springer London
Published in: Knowledge and Information Systems, Issue 3/2017
Print ISSN: 0219-1377
Electronic ISSN: 0219-3116
DOI: https://doi.org/10.1007/s10115-017-1050-4
