Published in: Information Systems Frontiers 3/2013

1 July 2013

Storing and analysing voice of the market data in the corporate data warehouse

Authors: Lisette García-Moya, Shahad Kudama, María José Aramburu, Rafael Berlanga



Abstract

Web opinion feeds have become one of the most popular information sources that users consult before buying products or contracting services. Negative opinions about a product can have a strong impact on its sales figures. As a consequence, companies are increasingly concerned with how to integrate opinion data into their business intelligence models so that they can predict sales figures or define new strategic goals. After analysing the requirements of this new application, this paper proposes a multidimensional data model to integrate sentiment data extracted from opinion posts into a traditional corporate data warehouse. It then presents a new sentiment data extraction method that applies semantic annotation to facilitate the integration of both types of data. In this method, Wikipedia is used as the main knowledge resource, together with some well-known lexicons of opinion words and other corporate data and metadata stores describing the company's products, such as technical specifications and user manuals. The resulting information system allows users to perform new analysis tasks using traditional OLAP-based data warehouse operators. We have developed a case study over a set of real opinions about digital devices offered by a wholesale dealer. On this case study, the quality of the extracted sentiment data is evaluated, and some query examples that illustrate the potential uses of the integrated model are provided.
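
To make the idea of integrating sentiment data with warehouse data more concrete, the following is a minimal sketch, not the authors' model. It assumes a hypothetical sentiment fact table with product, feature, and month dimensions and a polarity measure in {-1, 0, +1}, plus a hypothetical sales fact table. All product names, values, and function names are invented for illustration. The sketch rolls sentiment up from feature level to product level and pairs it with sales for the same product and month, which is the kind of OLAP-style operation the abstract alludes to.

```python
from collections import defaultdict

# Hypothetical sentiment facts extracted from opinion posts:
# (product, feature, month, polarity), with polarity in {-1, 0, +1}.
SENTIMENT_FACTS = [
    ("camera-A", "battery", "2012-01", -1),
    ("camera-A", "lens", "2012-01", 1),
    ("camera-A", "battery", "2012-02", -1),
    ("camera-B", "screen", "2012-01", 1),
    ("camera-B", "battery", "2012-01", 1),
]

# Hypothetical sales facts from the corporate warehouse:
# (product, month, units_sold).
SALES_FACTS = [
    ("camera-A", "2012-01", 120),
    ("camera-A", "2012-02", 90),
    ("camera-B", "2012-01", 200),
]


def roll_up_sentiment(facts):
    """Aggregate polarity from (product, feature, month) up to (product, month)."""
    totals = defaultdict(lambda: [0, 0])  # key -> [polarity sum, opinion count]
    for product, _feature, month, polarity in facts:
        cell = totals[(product, month)]
        cell[0] += polarity
        cell[1] += 1
    # Average polarity per (product, month) cell.
    return {key: total / count for key, (total, count) in totals.items()}


def join_with_sales(avg_sentiment, sales):
    """Pair each sales cell with the average sentiment of the same product and month."""
    return [
        (product, month, units, avg_sentiment.get((product, month)))
        for product, month, units in sales
    ]


rows = join_with_sales(roll_up_sentiment(SENTIMENT_FACTS), SALES_FACTS)
for row in rows:
    print(row)
```

Sharing the product and time dimensions between the two fact tables is what lets a single query relate opinions to sales. A real warehouse would express the same roll-up and join in its own query language rather than in application code.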

Metadata
Title
Storing and analysing voice of the market data in the corporate data warehouse
Authors
Lisette García-Moya
Shahad Kudama
María José Aramburu
Rafael Berlanga
Publication date
1 July 2013
Publisher
Springer US
Published in
Information Systems Frontiers / Issue 3/2013
Print ISSN: 1387-3326
Electronic ISSN: 1572-9419
DOI
https://doi.org/10.1007/s10796-012-9400-y
