Published in: Knowledge and Information Systems 2/2014

1 August 2014 | Regular Paper

A theoretical model for the automatic generation of tag clouds

Authors: Ursula Torres-Parejo, Jesús R. Campaña, M. Amparo Vila, Miguel Delgado

Abstract

This paper presents a new approach to information retrieval from non-structured attributes in databases, which involves the processing of text attributes. To make retrieval more effective, frequent text sequences are extracted and mathematically represented as intermediate forms which permit a clearer and more precise definition of operations on texts. These intermediate forms appear to users in the form of tag clouds to facilitate content identification, exploration, and querying. In this sense, tag cloud visualization is a simple, user-friendly visual interface to data. This paper proposes a theoretical model for the representation of frequent text sequences and their operations as well as a general procedure for generating tag clouds from text attributes in databases. The tag clouds thus obtained were compared with conventional tag clouds composed of single terms. Our study showed that automatically generated multi-term tag clouds provide better results than mono-term tag clouds.
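
The general idea of building a multi-term tag cloud from text attributes can be illustrated with a minimal sketch. It is not the authors' model: the abstract does not specify the intermediate forms or the mining algorithm, so the code below assumes contiguous word sequences (n-grams), a support count per text value, and linear font scaling. The names `frequent_sequences`, `tag_cloud`, `min_support`, and `max_len` are hypothetical.

```python
from collections import Counter


def frequent_sequences(texts, min_support=2, max_len=3):
    """Return contiguous word sequences that occur in at least min_support texts.

    Assumption: a "frequent text sequence" is approximated here by an n-gram
    of up to max_len words; the paper's own definition may differ.
    """
    support = Counter()
    for text in texts:
        words = text.lower().split()
        seen = set()
        for n in range(1, max_len + 1):
            for i in range(len(words) - n + 1):
                seen.add(tuple(words[i:i + n]))
        support.update(seen)  # each text counts at most once per sequence
    return {seq: c for seq, c in support.items() if c >= min_support}


def tag_cloud(freq, min_size=10, max_size=30):
    """Map each frequent sequence to a font size scaled linearly by its support."""
    if not freq:
        return {}
    lo, hi = min(freq.values()), max(freq.values())
    span = (hi - lo) or 1  # avoid division by zero when all supports are equal
    return {
        " ".join(seq): min_size + (c - lo) * (max_size - min_size) // span
        for seq, c in freq.items()
    }


# Hypothetical values of a text attribute in a database table.
texts = [
    "tag cloud generation",
    "automatic tag cloud",
    "tag cloud visualization in databases",
    "automatic tag generation",
]
cloud = tag_cloud(frequent_sequences(texts))
```

In this sketch the multi-term tag "tag cloud" appears alongside the single terms "tag" and "cloud", which is the kind of contrast between multi-term and mono-term clouds that the abstract refers to.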

Metadata
Title: A theoretical model for the automatic generation of tag clouds
Authors: Ursula Torres-Parejo, Jesús R. Campaña, M. Amparo Vila, Miguel Delgado
Publication date: 1 August 2014
Publisher: Springer London
Published in: Knowledge and Information Systems, Issue 2/2014
Print ISSN: 0219-1377
Electronic ISSN: 0219-3116
DOI: https://doi.org/10.1007/s10115-013-0651-9