Published in: Information Systems and e-Business Management 4/2013

01-12-2013 | Original Article

Using statistics, visualization and data mining for monitoring the quality of meta-data in web portals

Authors: Marcos Aurélio Domingues, Carlos Soares, Alípio Mário Jorge

Abstract

The goal of many web portals is to select, organize and distribute content in order to satisfy their users and customers. This process is usually based on meta-data that represent and describe the content. In this paper we describe a methodology and a system to monitor the quality of the meta-data used to describe content in web portals. The methodology is based on the analysis of the meta-data using statistics, visualization and data mining tools. It enables the site’s editor to detect and correct problems in the description of content, thus improving the quality of the web portal and the satisfaction of its users. We also define a general architecture for a system to support the proposed methodology. We have implemented this system and tested it on a Portuguese portal for management executives. The results validate the proposed methodology.
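As a rough illustration of the statistical part of such an analysis, the sketch below computes a few simple indicators over keyword meta-data. The abstract does not specify which metrics the authors use, so the indicators chosen here (keywords used only once, items with no keywords), the function name `keyword_stats`, and the input format are all assumptions made for illustration, not the paper's method.

```python
from collections import Counter


def keyword_stats(items):
    """Compute simple, illustrative quality indicators for keyword meta-data.

    `items` maps an item id to its list of keywords (a hypothetical format,
    not taken from the paper).
    """
    # Count how many times each keyword is used, ignoring case and blanks.
    usage = Counter(
        kw.strip().lower()
        for kws in items.values()
        for kw in kws
        if kw.strip()
    )
    # Keywords used exactly once may point to typos or overly specific terms.
    singletons = sorted(kw for kw, n in usage.items() if n == 1)
    # Items with no usable keyword are not described at all.
    undescribed = sorted(
        item for item, kws in items.items() if not any(kw.strip() for kw in kws)
    )
    return {
        "distinct_keywords": len(usage),
        "singleton_keywords": singletons,
        "items_without_keywords": undescribed,
    }


# Made-up sample data; "marketting" stands in for a misspelled keyword.
sample = {
    "article-1": ["Finance", "banking"],
    "article-2": ["finance", "marketing"],
    "article-3": [],
    "article-4": ["marketting"],
}
report = keyword_stats(sample)
print(report)
```

An editor could review the singleton keywords and the undescribed items as candidates for correction. Which indicators are actually informative would depend on the portal and on the meta-data fields it uses.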

Metadata
Title: Using statistics, visualization and data mining for monitoring the quality of meta-data in web portals
Authors: Marcos Aurélio Domingues, Carlos Soares, Alípio Mário Jorge
Publication date: 01-12-2013
Publisher: Springer Berlin Heidelberg
Published in: Information Systems and e-Business Management, Issue 4/2013
Print ISSN: 1617-9846
Electronic ISSN: 1617-9854
DOI: https://doi.org/10.1007/s10257-012-0209-5
