Published in: Knowledge and Information Systems 3/2014

1 September 2014 | Regular Paper

Introducing semantic variables in mixed distance measures: Impact on hierarchical clustering

Authors: Karina Gibert, Aïda Valls, Montserrat Batet

Abstract

It is well established that taking into account the semantic information available for categorical variables appreciably improves the meaningfulness of the final results of any analysis. This paper generalizes Gibert's mixed metrics, which originally handled numerical and categorical variables, so that they also handle semantic variables. Semantic variables are defined as categorical variables linked to a reference ontology (an ontology is a formal structure that models the semantic relationships between the concepts of a given domain). The superconcept-based distance (SCD) is introduced to compare semantic variables using the information provided by the reference ontology. A benchmark shows that SCD performs well against other measures from the literature for comparing semantic features. Gibert's mixed metrics are then generalized by incorporating SCD. Finally, two real applications based on tourism data show the impact of the generalized Gibert's metrics on clustering procedures and, consequently, the impact of taking the reference ontology into account in clustering. The main conclusion is that the reference ontology, when available, can appreciably improve the meaningfulness of the final clusters.
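
To make the idea of comparing semantic variables through a reference ontology more concrete, the following is a minimal sketch and not the paper's definition. The abstract does not state the SCD formula, so the code assumes a distance built from the overlap of superconcept sets: the square root of the share of superconcepts that two concepts do not have in common. The toy taxonomy `PARENTS` and the function names `superconcepts` and `superconcept_distance` are hypothetical and chosen only for illustration.

```python
from math import sqrt

# Hypothetical toy taxonomy (concept -> direct parents); not taken from the paper.
PARENTS = {
    "activity": [],
    "sport": ["activity"],
    "culture": ["activity"],
    "hiking": ["sport"],
    "sailing": ["sport"],
    "museum": ["culture"],
}


def superconcepts(concept, parents=PARENTS):
    """Return the concept itself together with all of its ancestors."""
    seen = set()
    stack = [concept]
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(parents[node])
    return seen


def superconcept_distance(c1, c2, parents=PARENTS):
    """Assumed form: sqrt(non-shared superconcepts / all superconcepts)."""
    a1 = superconcepts(c1, parents)
    a2 = superconcepts(c2, parents)
    union = a1 | a2
    shared = a1 & a2
    return sqrt((len(union) - len(shared)) / len(union))


if __name__ == "__main__":
    for pair in [("hiking", "hiking"), ("hiking", "sailing"), ("hiking", "museum")]:
        print(pair, round(superconcept_distance(*pair), 3))
```

Under these assumptions, two concepts that share most of their ancestors (such as the two sports in the toy taxonomy) end up closer than concepts that only meet at the root, which is the kind of information a plain categorical comparison would miss.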

Metadata
Title
Introducing semantic variables in mixed distance measures: Impact on hierarchical clustering
Authors
Karina Gibert
Aïda Valls
Montserrat Batet
Publication date
1 September 2014
Publisher
Springer London
Published in
Knowledge and Information Systems / Issue 3/2014
Print ISSN: 0219-1377
Electronic ISSN: 0219-3116
DOI
https://doi.org/10.1007/s10115-013-0663-5
