Published in: Data Mining and Knowledge Discovery, Issue 4/2018

22 February 2018

Interpretation of text patterns

Authors: Md Abul Bashar, Yuefeng Li

Abstract

Patterns are used as a fundamental means to analyse data in many text mining applications. Many efficient techniques have been developed to discover patterns. However, the excessive number of discovered patterns and the lack of grounded (e.g. a priori defined) semantics have made it difficult for a user to interpret and explore the patterns. An insight into the meanings of the patterns can benefit users in the process of exploring them. In this regard, this paper presents a model to automatically interpret patterns by achieving two goals: (1) providing the meanings of patterns in terms of ontology concepts and (2) providing a new method for generating and extracting features from an ontology to describe the relevant information more effectively. Taking advantage of a domain ontology and a set of relevant statistics (e.g. term frequency in a document and inverse term frequency in a domain ontology), the proposed model can give an insight into the hidden meanings of the patterns. The model is evaluated by comparing it with different baseline models on three standard datasets. The results show that the performance of the proposed model is significantly better than that of the baseline models.

Literatur
Zurück zum Zitat Afrati F, Gionis A, Mannila H (2004) Approximating a collection of frequent sets. In: Proceedings of the tenth ACM SIGKDD international conference on Knowledge discovery and data mining, ACM, Seattle, WA, USA, pp 12–19 Afrati F, Gionis A, Mannila H (2004) Approximating a collection of frequent sets. In: Proceedings of the tenth ACM SIGKDD international conference on Knowledge discovery and data mining, ACM, Seattle, WA, USA, pp 12–19
Zurück zum Zitat Agrawal R, Imieliński T, Swami A (1993) Mining association rules between sets of items in large databases. In: Proceedings of the 1993 ACM SIGMOD international conference on management of data, vol 22. ACM, Washington, DC, USA, pp 207–216 Agrawal R, Imieliński T, Swami A (1993) Mining association rules between sets of items in large databases. In: Proceedings of the 1993 ACM SIGMOD international conference on management of data, vol 22. ACM, Washington, DC, USA, pp 207–216
Zurück zum Zitat Anderson JR (1983) A spreading activation theory of memory. J Verbal Learn Verbal Behav 22(3):261–295CrossRef Anderson JR (1983) A spreading activation theory of memory. J Verbal Learn Verbal Behav 22(3):261–295CrossRef
Zurück zum Zitat Banko M, Cafarella MJ, Soderland S, Broadhead M, Etzioni O (2007) Open information extraction from the web. In: Proceedings of the 20th international joint conference on artifical intelligence, vol 7. Morgan Kaufmann Publishers Inc., San Francisco, CA, USA, pp 2670–2676 Banko M, Cafarella MJ, Soderland S, Broadhead M, Etzioni O (2007) Open information extraction from the web. In: Proceedings of the 20th international joint conference on artifical intelligence, vol 7. Morgan Kaufmann Publishers Inc., San Francisco, CA, USA, pp 2670–2676
Zurück zum Zitat Bayardo Jr RJ (1998) Efficiently mining long patterns from databases. In: ACM Sigmod record, vol 27. ACM, Seattle, Washington, USA, pp 85–93 Bayardo Jr RJ (1998) Efficiently mining long patterns from databases. In: ACM Sigmod record, vol 27. ACM, Seattle, Washington, USA, pp 85–93
Zurück zum Zitat Bengio Y, Ducharme R, Vincent P, Jauvin C (2003) A neural probabilistic language model. J Mach Learn Res 3(Feb):1137–1155MATH Bengio Y, Ducharme R, Vincent P, Jauvin C (2003) A neural probabilistic language model. J Mach Learn Res 3(Feb):1137–1155MATH
Zurück zum Zitat Blei DM, Ng AY, Jordan MI (2003) Latent dirichlet allocation. J Mach Learn Res 3:993–1022MATH Blei DM, Ng AY, Jordan MI (2003) Latent dirichlet allocation. J Mach Learn Res 3:993–1022MATH
Zurück zum Zitat Bloehdorn S, Cimiano P, Hotho A (2006) Learning ontologies to improve text clustering and classification. From data and information analysis to knowledge engineering. Springer, Magdeburg, pp 334–341CrossRef Bloehdorn S, Cimiano P, Hotho A (2006) Learning ontologies to improve text clustering and classification. From data and information analysis to knowledge engineering. Springer, Magdeburg, pp 334–341CrossRef
Zurück zum Zitat Brewster C, Alani H, Dasmahapatra S, Wilks Y (2004) Data driven ontology evaluation. In: International conference on language resources and evaluation (LREC 2004). Lisbon, Portugal Brewster C, Alani H, Dasmahapatra S, Wilks Y (2004) Data driven ontology evaluation. In: International conference on language resources and evaluation (LREC 2004). Lisbon, Portugal
Zurück zum Zitat Buckley C, Voorhees EM (2000) Evaluating evaluation measure stability. In: Proceedings of the 23rd annual international ACM SIGIR conference on research and development in information retrieval. ACM, Athens, Greece, pp 33–40 Buckley C, Voorhees EM (2000) Evaluating evaluation measure stability. In: Proceedings of the 23rd annual international ACM SIGIR conference on research and development in information retrieval. ACM, Athens, Greece, pp 33–40
Zurück zum Zitat Bunescu R, Mooney RJ (2006) Subsequence kernels for relation extraction. Advances in neural information processing systems. MIT Press, Cambridge, pp 171–178 Bunescu R, Mooney RJ (2006) Subsequence kernels for relation extraction. Advances in neural information processing systems. MIT Press, Cambridge, pp 171–178
Zurück zum Zitat Calegari S, Pasi G (2013) Personal ontologies: generation of user profiles based on the yago ontology. Inf Process Manag 49(3):640–658CrossRef Calegari S, Pasi G (2013) Personal ontologies: generation of user profiles based on the yago ontology. Inf Process Manag 49(3):640–658CrossRef
Zurück zum Zitat Caropreso MF, Matwin S, Sebastiani F (2001) A learner-independent evaluation of the usefulness of statistical phrases for automated text categorization. Text Databases Doc Manag Theory Pract 5478:78–102 Caropreso MF, Matwin S, Sebastiani F (2001) A learner-independent evaluation of the usefulness of statistical phrases for automated text categorization. Text Databases Doc Manag Theory Pract 5478:78–102
Zurück zum Zitat Chemudugunta C, Holloway A, Smyth P, Steyvers M (2008a) Modeling documents by combining semantic concepts with unsupervised statistical learning. In: International semantic web conference. Springer, Karlsruhe, pp 229–244 Chemudugunta C, Holloway A, Smyth P, Steyvers M (2008a) Modeling documents by combining semantic concepts with unsupervised statistical learning. In: International semantic web conference. Springer, Karlsruhe, pp 229–244
Zurück zum Zitat Chemudugunta C, Smyth P, Steyvers M (2008b) Combining concept hierarchies and statistical topic models. In: Proceedings of the 17th ACM conference on information and knowledge management, ACM, Napa Valley, California, USA, pp 1469–1470 Chemudugunta C, Smyth P, Steyvers M (2008b) Combining concept hierarchies and statistical topic models. In: Proceedings of the 17th ACM conference on information and knowledge management, ACM, Napa Valley, California, USA, pp 1469–1470
Zurück zum Zitat Collins AM, Loftus EF (1975) A spreading-activation theory of semantic processing. Psychol Rev 82(6):407CrossRef Collins AM, Loftus EF (1975) A spreading-activation theory of semantic processing. Psychol Rev 82(6):407CrossRef
Zurück zum Zitat Collobert R, Weston J (2008) A unified architecture for natural language processing: Deep neural networks with multitask learning. In: Proceedings of the 25th international conference on machine learning, ACM, pp 160–167 Collobert R, Weston J (2008) A unified architecture for natural language processing: Deep neural networks with multitask learning. In: Proceedings of the 25th international conference on machine learning, ACM, pp 160–167
Zurück zum Zitat Crestani F (1997) Application of spreading activation techniques in information retrieval. Artif Intell Rev 11(6):453–482CrossRef Crestani F (1997) Application of spreading activation techniques in information retrieval. Artif Intell Rev 11(6):453–482CrossRef
Zurück zum Zitat Deerwester SC, Dumais ST, Landauer TK, Furnas GW, Harshman RA (1990) Indexing by latent semantic analysis. J Am Soc Inf Sci 41(6):391–407CrossRef Deerwester SC, Dumais ST, Landauer TK, Furnas GW, Harshman RA (1990) Indexing by latent semantic analysis. J Am Soc Inf Sci 41(6):391–407CrossRef
Zurück zum Zitat Del Corro L, Gemulla R (2013) Clausie: clause-based open information extraction. In: Proceedings of the 22nd international conference on world wide web. ACM, pp 355–366 Del Corro L, Gemulla R (2013) Clausie: clause-based open information extraction. In: Proceedings of the 22nd international conference on world wide web. ACM, pp 355–366
Zurück zum Zitat Egozi O, Gabrilovich E, Markovitch S (2008) Concept-based feature generation and selection for information retrieval. In: AAAI conference on artificial intelligence, vol 8. Chicago, Illinois, pp 1132–1137 Egozi O, Gabrilovich E, Markovitch S (2008) Concept-based feature generation and selection for information retrieval. In: AAAI conference on artificial intelligence, vol 8. Chicago, Illinois, pp 1132–1137
Zurück zum Zitat Egozi O, Markovitch S, Gabrilovich E (2011) Concept-based information retrieval using explicit semantic analysis. ACM Trans Inf Syst (TOIS) 29(2):1–38CrossRef Egozi O, Markovitch S, Gabrilovich E (2011) Concept-based information retrieval using explicit semantic analysis. ACM Trans Inf Syst (TOIS) 29(2):1–38CrossRef
Zurück zum Zitat Fader A, Soderland S, Etzioni O (2011) Identifying relations for open information extraction. In: Proceedings of the conference on empirical methods in natural language processing, Association for Computational Linguistics, pp 1535–1545 Fader A, Soderland S, Etzioni O (2011) Identifying relations for open information extraction. In: Proceedings of the conference on empirical methods in natural language processing, Association for Computational Linguistics, pp 1535–1545
Zurück zum Zitat Gabrilovich E, Markovitch S (2005) Feature generation for text categorization using world knowledge. In: Proceedings of the 19th international joint conference on artificial intelligence, vol 5. Edinburgh, Scotland, pp 1048–1053 Gabrilovich E, Markovitch S (2005) Feature generation for text categorization using world knowledge. In: Proceedings of the 19th international joint conference on artificial intelligence, vol 5. Edinburgh, Scotland, pp 1048–1053
Zurück zum Zitat Gabrilovich E, Markovitch S (2007a) Computing semantic relatedness using wikipedia-based explicit semantic analysis. In: Proceedings of the 20th international joint conference on artificial intelligence, vol 6. Hyderabad, India, pp 1606–1611 Gabrilovich E, Markovitch S (2007a) Computing semantic relatedness using wikipedia-based explicit semantic analysis. In: Proceedings of the 20th international joint conference on artificial intelligence, vol 6. Hyderabad, India, pp 1606–1611
Zurück zum Zitat Gabrilovich E, Markovitch S (2007b) Harnessing the expertise of 70, 000 human editors: knowledge-based feature generation for text categorization. J Mach Learn Res 8(10):2297–2345 Gabrilovich E, Markovitch S (2007b) Harnessing the expertise of 70, 000 human editors: knowledge-based feature generation for text categorization. J Mach Learn Res 8(10):2297–2345
Zurück zum Zitat Gabrilovich E, Markovitch S (2009) Wikipedia-based semantic interpretation for natural language processing. J Artif Intell Res 34(2):443–498CrossRefMATH Gabrilovich E, Markovitch S (2009) Wikipedia-based semantic interpretation for natural language processing. J Artif Intell Res 34(2):443–498CrossRefMATH
Zurück zum Zitat Gallo A, De Bie T, Cristianini N (2007) Mini: mining informative non-redundant itemsets. In: European conference on principles of data mining and knowledge discovery. Springer, pp 438–445 Gallo A, De Bie T, Cristianini N (2007) Mini: mining informative non-redundant itemsets. In: European conference on principles of data mining and knowledge discovery. Springer, pp 438–445
Zurück zum Zitat Gauch S, Chaffee J, Pretschner A (2003) Ontology-based personalized search and browsing. Web Intell Agent Syst 1(3):219–234 Gauch S, Chaffee J, Pretschner A (2003) Ontology-based personalized search and browsing. Web Intell Agent Syst 1(3):219–234
Zurück zum Zitat Glorot X, Bordes A, Bengio Y (2011) Domain adaptation for large-scale sentiment classification: a deep learning approach. In: Proceedings of the 28th international conference on machine learning (ICML-11), pp 513–520 Glorot X, Bordes A, Bengio Y (2011) Domain adaptation for large-scale sentiment classification: a deep learning approach. In: Proceedings of the 28th international conference on machine learning (ICML-11), pp 513–520
Zurück zum Zitat Goutsias J, Mahler RP, Nguyen HT (2012) Random sets: theory and applications, vol 97. Springer, Berlin Goutsias J, Mahler RP, Nguyen HT (2012) Random sets: theory and applications, vol 97. Springer, Berlin
Zurück zum Zitat Grossman DA (2004) Information retrieval: algorithms and heuristics, vol 15. Springer, BerlinCrossRefMATH Grossman DA (2004) Information retrieval: algorithms and heuristics, vol 15. Springer, BerlinCrossRefMATH
Zurück zum Zitat Guns T, Nijssen S, De Raedt L (2013) k-pattern set mining under constraints. IEEE Trans Knowl Data Eng 25(2):402–418CrossRef Guns T, Nijssen S, De Raedt L (2013) k-pattern set mining under constraints. IEEE Trans Knowl Data Eng 25(2):402–418CrossRef
Zurück zum Zitat Han J, Wang J, Lu Y, Tzvetkov P (2002) Mining top-k frequent closed patterns without minimum support. In: IEEE international conference on data mining (ICDM), IEEE, Maebashi City, Japan, pp 211–218 Han J, Wang J, Lu Y, Tzvetkov P (2002) Mining top-k frequent closed patterns without minimum support. In: IEEE international conference on data mining (ICDM), IEEE, Maebashi City, Japan, pp 211–218
Zurück zum Zitat Hennig L, Umbrath W, Wetzker R (2008) An ontology-based approach to text summarization. In: IEEE/WIC/ACM international joint conference on web intelligence (WI) and intelligent agent technology (IAT), vol 3. IEEE. Sydney, NSW, Australia, pp 291–294 Hennig L, Umbrath W, Wetzker R (2008) An ontology-based approach to text summarization. In: IEEE/WIC/ACM international joint conference on web intelligence (WI) and intelligent agent technology (IAT), vol 3. IEEE. Sydney, NSW, Australia, pp 291–294
Zurück zum Zitat Hofmann T (1999) Probabilistic latent semantic indexing. In: Proceedings of the 22nd annual international ACM SIGIR conference on research and development in information retrieval, ACM, pp 50–57 Hofmann T (1999) Probabilistic latent semantic indexing. In: Proceedings of the 22nd annual international ACM SIGIR conference on research and development in information retrieval, ACM, pp 50–57
Zurück zum Zitat Hopfield JJ (1984) Neurons with graded response have collective computational properties like those of two-state neurons. Proc Nat Acad Sci 81(10):3088–3092CrossRefMATH Hopfield JJ (1984) Neurons with graded response have collective computational properties like those of two-state neurons. Proc Nat Acad Sci 81(10):3088–3092CrossRefMATH
Zurück zum Zitat Hotho A, Nürnberger A, Paaß G (2005) A brief survey of text mining. Ldv Forum 20:19–62 Hotho A, Nürnberger A, Paaß G (2005) A brief survey of text mining. Ldv Forum 20:19–62
Zurück zum Zitat Hulpus I, Hayes C, Karnstedt M, Greene D (2013) Unsupervised graph-based topic labelling using dbpedia. In: Proceedings of the sixth ACM international conference on Web search and data mining, ACM, Rome, Italy, pp 465–474 Hulpus I, Hayes C, Karnstedt M, Greene D (2013) Unsupervised graph-based topic labelling using dbpedia. In: Proceedings of the sixth ACM international conference on Web search and data mining, ACM, Rome, Italy, pp 465–474
Zurück zum Zitat Ingaramo D, Pinto D, Rosso P, Errecalde M (2008) Evaluation of internal validity measures in short-text corpora. In: Computational linguistics and intelligent text processing, Springer, Haifa, Israel, pp 555–567 Ingaramo D, Pinto D, Rosso P, Errecalde M (2008) Evaluation of internal validity measures in short-text corpora. In: Computational linguistics and intelligent text processing, Springer, Haifa, Israel, pp 555–567
Zurück zum Zitat Karp RM (1972) Reducibility among combinatorial problems. Complexity of computer computations. Springer, Berlin, pp 85–103CrossRef Karp RM (1972) Reducibility among combinatorial problems. Complexity of computer computations. Springer, Berlin, pp 85–103CrossRef
Zurück zum Zitat Knobbe AJ, Ho EK (2006) Pattern teams. In: European conference on principles of data mining and knowledge discovery, Springer, pp 577–584 Knobbe AJ, Ho EK (2006) Pattern teams. In: European conference on principles of data mining and knowledge discovery, Springer, pp 577–584
Zurück zum Zitat Kriegel HP, Borgwardt KM, Kröger P, Pryakhin A, Schubert M, Zimek A (2007) Future trends in data mining. Data Min Knowl Disc 15(1):87–97MathSciNetCrossRef Kriegel HP, Borgwardt KM, Kröger P, Pryakhin A, Schubert M, Zimek A (2007) Future trends in data mining. Data Min Knowl Disc 15(1):87–97MathSciNetCrossRef
Zurück zum Zitat Kruse R, Schwecke E, Heinsohn J (1991) Uncertainty and vagueness in knowledge based systems. Springer, New York Inc., New YorkCrossRefMATH Kruse R, Schwecke E, Heinsohn J (1991) Uncertainty and vagueness in knowledge based systems. Springer, New York Inc., New YorkCrossRefMATH
Zurück zum Zitat Kruse R, Schwecke E, Heinsohn J (2012) Uncertainty and vagueness in knowledge based systems: numerical methods. Springer, BerlinMATH Kruse R, Schwecke E, Heinsohn J (2012) Uncertainty and vagueness in knowledge based systems: numerical methods. Springer, BerlinMATH
Zurück zum Zitat Lau JH, Newman D, Karimi S, Baldwin T (2010) Best topic word selection for topic labelling. In: Proceedings of the 23rd international conference on computational linguistics: Posters, Association for Computational Linguistics, Beijing, China, pp 605–613 Lau JH, Newman D, Karimi S, Baldwin T (2010) Best topic word selection for topic labelling. In: Proceedings of the 23rd international conference on computational linguistics: Posters, Association for Computational Linguistics, Beijing, China, pp 605–613
Zurück zum Zitat Lau JH, Grieser K, Newman D, Baldwin T (2011) Automatic labelling of topic models. In: Proceedings of the 49th annual meeting of the association for computational linguistics: human language technologies, vol 1. Association for Computational Linguistics, Portland, Oregon, USA, pp 1536–1545 Lau JH, Grieser K, Newman D, Baldwin T (2011) Automatic labelling of topic models. In: Proceedings of the 49th annual meeting of the association for computational linguistics: human language technologies, vol 1. Association for Computational Linguistics, Portland, Oregon, USA, pp 1536–1545
Zurück zum Zitat Lewis DD, Yang Y, Rose TG, Li F (2004) Rcv1: a new benchmark collection for text categorization research. J Mach Learn Res 5:361–397 Lewis DD, Yang Y, Rose TG, Li F (2004) Rcv1: a new benchmark collection for text categorization research. J Mach Learn Res 5:361–397
Zurück zum Zitat Li G, Zaki MJ (2016) Sampling frequent and minimal boolean patterns: theory and application in classification. Data Min Knowl Disc 30(1):181–225MathSciNetCrossRef Li G, Zaki MJ (2016) Sampling frequent and minimal boolean patterns: theory and application in classification. Data Min Knowl Disc 30(1):181–225MathSciNetCrossRef
Zurück zum Zitat Li Y, Zhong N (2006) Mining ontology for automatically acquiring web user information needs. IEEE Trans Knowl Data Eng 18(4):554–568CrossRef Li Y, Zhong N (2006) Mining ontology for automatically acquiring web user information needs. IEEE Trans Knowl Data Eng 18(4):554–568CrossRef
Zurück zum Zitat Li Y, Algarni A, Zhong N (2010) Mining positive and negative patterns for relevance feature discovery. In: Proceedings of the 16th ACM SIGKDD international conference on Knowledge discovery and data mining, ACM, Washington, DC, USA, pp 753–762 Li Y, Algarni A, Zhong N (2010) Mining positive and negative patterns for relevance feature discovery. In: Proceedings of the 16th ACM SIGKDD international conference on Knowledge discovery and data mining, ACM, Washington, DC, USA, pp 753–762
Zurück zum Zitat Liu B, Zhao K, Benkler J, Xiao W (2006) Rule interestingness analysis using olap operations. In: Proceedings of the 12th ACM SIGKDD international conference on Knowledge discovery and data mining, ACM, Philadelphia, PA, USA, pp 297–306 Liu B, Zhao K, Benkler J, Xiao W (2006) Rule interestingness analysis using olap operations. In: Proceedings of the 12th ACM SIGKDD international conference on Knowledge discovery and data mining, ACM, Philadelphia, PA, USA, pp 297–306
Zurück zum Zitat Liu J, Shang J, Wang C, Ren X, Han J (2015) Mining quality phrases from massive text corpora. In: Proceedings of the 2015 ACM SIGMOD international conference on management of data, ACM, pp 1729–1744 Liu J, Shang J, Wang C, Ren X, Han J (2015) Mining quality phrases from massive text corpora. In: Proceedings of the 2015 ACM SIGMOD international conference on management of data, ACM, pp 1729–1744
Zurück zum Zitat Liu J, Ren X, Shang J, Cassidy T, Voss CR, Han J (2016) Representing documents via latent keyphrase inference. In: Proceedings of the 25th international conference on world wide web, International World Wide Web Conferences Steering Committee, pp 1057–1067 Liu J, Ren X, Shang J, Cassidy T, Voss CR, Han J (2016) Representing documents via latent keyphrase inference. In: Proceedings of the 25th international conference on world wide web, International World Wide Web Conferences Steering Committee, pp 1057–1067
Zurück zum Zitat Mao XL, Ming ZY, Zha ZJ, Chua TS, Yan H, Li X (2012) Automatic labeling hierarchical topics. In: Proceedings of the 21st ACM international conference on Information and knowledge management, ACM, pp 2383–2386 Mao XL, Ming ZY, Zha ZJ, Chua TS, Yan H, Li X (2012) Automatic labeling hierarchical topics. In: Proceedings of the 21st ACM international conference on Information and knowledge management, ACM, pp 2383–2386
Zurück zum Zitat Mei Q, Liu C, Su H, Zhai C (2006a) A probabilistic approach to spatiotemporal theme pattern mining on weblogs. In: Proceedings of the 15th international conference on world wide web, ACM, Edinburgh, Scotland, pp 533–542 Mei Q, Liu C, Su H, Zhai C (2006a) A probabilistic approach to spatiotemporal theme pattern mining on weblogs. In: Proceedings of the 15th international conference on world wide web, ACM, Edinburgh, Scotland, pp 533–542
Zurück zum Zitat Mei Q, Xin D, Cheng H, Han J, Zhai C (2006b) Generating semantic annotations for frequent patterns with context analysis. In: Proceedings of the 12th ACM SIGKDD international conference on Knowledge discovery and data mining, ACM, Philadelphia, PA, USA, pp 337–346 Mei Q, Xin D, Cheng H, Han J, Zhai C (2006b) Generating semantic annotations for frequent patterns with context analysis. In: Proceedings of the 12th ACM SIGKDD international conference on Knowledge discovery and data mining, ACM, Philadelphia, PA, USA, pp 337–346
Zurück zum Zitat Mei Q, Shen X, Zhai C (2007a) Automatic labeling of multinomial topic models. In: Proceedings of the 13th ACM SIGKDD international conference on Knowledge discovery and data mining, ACM, San Jose, California, USA, pp 490–499 Mei Q, Shen X, Zhai C (2007a) Automatic labeling of multinomial topic models. In: Proceedings of the 13th ACM SIGKDD international conference on Knowledge discovery and data mining, ACM, San Jose, California, USA, pp 490–499
Zurück zum Zitat Mei Q, Xin D, Cheng H, Han J, Zhai C (2007b) Semantic annotation of frequent patterns. ACM Trans Knowl Discov Data (TKDD) 1(3):11:1–11:30 Mei Q, Xin D, Cheng H, Han J, Zhai C (2007b) Semantic annotation of frequent patterns. ACM Trans Knowl Discov Data (TKDD) 1(3):11:1–11:30
Zurück zum Zitat Michelson M, Macskassy SA (2010) Discovering users’ topics of interest on twitter: a first look. In: Proceedings of the fourth workshop on analytics for noisy unstructured text data, ACM, pp 73–80 Michelson M, Macskassy SA (2010) Discovering users’ topics of interest on twitter: a first look. In: Proceedings of the fourth workshop on analytics for noisy unstructured text data, ACM, pp 73–80
Zurück zum Zitat Mielikäinen T, Mannila H (2003) The pattern ordering problem. In: European conference on principles of data mining and knowledge discovery, Springer, pp 327–338 Mielikäinen T, Mannila H (2003) The pattern ordering problem. In: European conference on principles of data mining and knowledge discovery, Springer, pp 327–338
Zurück zum Zitat Mikolov T (2012) Statistical language models based on neural networks. Presentation at google, mountain view, 2nd April Mikolov T (2012) Statistical language models based on neural networks. Presentation at google, mountain view, 2nd April
Zurück zum Zitat Mikolov T, Chen K, Corrado G, Dean J (2013a) Efficient estimation of word representations in vector space. In: International conference on learning representations (ICLR) workshop Mikolov T, Chen K, Corrado G, Dean J (2013a) Efficient estimation of word representations in vector space. In: International conference on learning representations (ICLR) workshop
Zurück zum Zitat Mikolov T, Sutskever I, Chen K, Corrado GS, Dean J (2013b) Distributed representations of words and phrases and their compositionality. In: Advances in neural information processing systems, pp 3111–3119 Mikolov T, Sutskever I, Chen K, Corrado GS, Dean J (2013b) Distributed representations of words and phrases and their compositionality. In: Advances in neural information processing systems, pp 3111–3119
Zurück zum Zitat Mikolov T, Yih Wt, Zweig G (2013c) Linguistic regularities in continuous space word representations. In: Proceedings of the 2013 conference of the North American chapter of the Association for Computational Linguistics: human language technologies (NAACL-HLT), vol 13, pp 746–751 Mikolov T, Yih Wt, Zweig G (2013c) Linguistic regularities in continuous space word representations. In: Proceedings of the 2013 conference of the North American chapter of the Association for Computational Linguistics: human language technologies (NAACL-HLT), vol 13, pp 746–751
Zurück zum Zitat Molchanov I (2006) Theory of random sets. Springer, Berlin Molchanov I (2006) Theory of random sets. Springer, Berlin
Zurück zum Zitat Navigli R, Velardi P, Gangemi A (2003) Ontology learning and its application to automated terminology translation. IEEE Intell Syst 18(1):22–31CrossRef Navigli R, Velardi P, Gangemi A (2003) Ontology learning and its application to automated terminology translation. IEEE Intell Syst 18(1):22–31CrossRef
Zurück zum Zitat Parida L, Ramakrishnan N (2005) Redescription mining: structure theory and algorithms. In: AAAI, vol 5, pp 837–844 Parida L, Ramakrishnan N (2005) Redescription mining: structure theory and algorithms. In: AAAI, vol 5, pp 837–844
Zurück zum Zitat Parthasarathy S, Zaki MJ, Ogihara M, Dwarkadas S (1999) Incremental and interactive sequence mining. In: Proceedings of the eighth international conference on information and knowledge management, ACM, Kansas City, Missouri, USA, pp 251–258 Parthasarathy S, Zaki MJ, Ogihara M, Dwarkadas S (1999) Incremental and interactive sequence mining. In: Proceedings of the eighth international conference on information and knowledge management, ACM, Kansas City, Missouri, USA, pp 251–258
Zurück zum Zitat Pasquier N, Bastide Y, Taouil R, Lakhal L (1999) Discovering frequent closed itemsets for association rules. In: Proceedings of the 7th international conference on database theory. Springer, London, UK, pp 398–416 Pasquier N, Bastide Y, Taouil R, Lakhal L (1999) Discovering frequent closed itemsets for association rules. In: Proceedings of the 7th international conference on database theory. Springer, London, UK, pp 398–416
Zurück zum Zitat Pennington J, Socher R, Manning CD (2014) Glove: Global vectors for word representation. In: Proceedings of the 2014 conference on empirical methods in natural language processing (EMNLP), vol 14, pp 1532–1543 Pennington J, Socher R, Manning CD (2014) Glove: Global vectors for word representation. In: Proceedings of the 2014 conference on empirical methods in natural language processing (EMNLP), vol 14, pp 1532–1543
Zurück zum Zitat Porter MF (1980) An algorithm for suffix stripping. Program Electron Libr Inf Syst 14(3):130–137CrossRef Porter MF (1980) An algorithm for suffix stripping. Program Electron Libr Inf Syst 14(3):130–137CrossRef
Zurück zum Zitat Quillan MR (1966) Semantic memory. Technical report, DTIC Document Quillan MR (1966) Semantic memory. Technical report, DTIC Document
Zurück zum Zitat Raedt LD, Zimmermann A (2007) Constraint-based pattern set mining. In: Proceedings of the 2007 SIAM international conference on data mining, SIAM, pp 237–248 Raedt LD, Zimmermann A (2007) Constraint-based pattern set mining. In: Proceedings of the 2007 SIAM international conference on data mining, SIAM, pp 237–248
Zurück zum Zitat Ramakrishnan N, Kumar D, Mishra B, Potts M, Helm RF (2004) Turning cartwheels: an alternating algorithm for mining redescriptions. In: Proceedings of the tenth ACM SIGKDD international conference on knowledge discovery and data mining, ACM, pp 266–275 Ramakrishnan N, Kumar D, Mishra B, Potts M, Helm RF (2004) Turning cartwheels: an alternating algorithm for mining redescriptions. In: Proceedings of the tenth ACM SIGKDD international conference on knowledge discovery and data mining, ACM, pp 266–275
Zurück zum Zitat Robertson SE, Soboroff I (2002) The trec 2002 filtering track report. In: TREC, vol 2002, Gaithersburg, Maryland, USA, pp 27–39 Robertson SE, Soboroff I (2002) The trec 2002 filtering track report. In: TREC, vol 2002, Gaithersburg, Maryland, USA, pp 27–39
Zurück zum Zitat Rocchio JJ (1971) Relevance feedback in information retrieval. The smart retrieval system-experiments in automatic document processing, pp 313–323 Rocchio JJ (1971) Relevance feedback in information retrieval. The smart retrieval system-experiments in automatic document processing, pp 313–323
Zurück zum Zitat Rose T, Stevenson M, Whitehead M (2002) The reuters corpus volume 1—from yesterday’s news to tomorrow’s language resources. In: Proceedings of the third international conference on language resources and evaluation (LREC), vol 2, Canary Islands, Spain, pp 827–832 Rose T, Stevenson M, Whitehead M (2002) The reuters corpus volume 1—from yesterday’s news to tomorrow’s language resources. In: Proceedings of the third international conference on language resources and evaluation (LREC), vol 2, Canary Islands, Spain, pp 827–832
Ruggieri S (2010) Frequent regular itemset mining. In: Proceedings of the 16th ACM SIGKDD international conference on Knowledge discovery and data mining, ACM, pp 263–272
Rumelhart DE, Hinton GE, Williams RJ (1988) Learning representations by back-propagating errors. Cognit Model 5(3):1
Salton G (1968) Automatic information organization and retrieval. McGraw-Hill, New York
Schmitz M, Bart R, Soderland S, Etzioni O, et al (2012) Open language learning for information extraction. In: Proceedings of the 2012 joint conference on empirical methods in natural language processing and computational natural language learning, Association for Computational Linguistics, pp 523–534
Schwenk H (2007) Continuous space language models. Comput Speech Lang 21(3):492–518
Shen Y, Li Y, Xu Y (2012) Adopting relevance feature to learn personalized ontologies. In: Australasian joint conference on artificial intelligence, Springer, Sydney, Australia, pp 457–468
Siebes A, Vreeken J, Leeuwen Mv (2006) Item sets that compress. In: Proceedings of the 2006 SIAM international conference on data mining, SIAM, pp 395–406
Sieg A, Mobasher B, Burke R (2007) Web search personalization with ontological user profiles. In: Proceedings of the sixteenth ACM conference on information and knowledge management, ACM, Lisbon, Portugal, pp 525–534
Socher R, Lin CC, Manning C, Ng AY (2011) Parsing natural scenes and natural language with recursive neural networks. In: Proceedings of the 28th international conference on machine learning (ICML-11), pp 129–136
Song Y, Wang H, Wang Z, Li H, Chen W (2011) Short text conceptualization using a probabilistic knowledgebase. In: Proceedings of the twenty-second international joint conference on artificial intelligence, vol 3. AAAI Press, Barcelona, pp 2330–2336
Spasic I, Ananiadou S, McNaught J, Kumar A (2005) Text mining and ontologies in biomedicine: making sense of raw text. Brief Bioinform 6(3):239–251
Sun X, Xiao Y, Wang H, Wang W (2015) On conceptual labeling of a bag of words. In: Proceedings of the 24th international conference on artificial intelligence. AAAI Press, Buenos Aires, pp 1326–1332
Tan AH, et al (1999) Text mining: the state of the art and the challenges. In: Proceedings of the PAKDD 1999 workshop on knowledge discovery from advanced databases, vol 8, pp 65–70
Tao X, Li Y, Zhong N (2011) A personalized ontology model for web information gathering. IEEE Trans Knowl Data Eng 23(4):496–511
Thiel K, Berthold MR (2010) Node similarities from spreading activation. In: 10th international conference on data mining (ICDM). IEEE, pp 1085–1090
Tran T, Cimiano P, Rudolph S, Studer R (2007) Ontology-based interpretation of keywords for semantic search. The semantic web, pp 523–536
Turney PD (2013) Distributional semantics beyond words: supervised learning of analogy and paraphrase. Trans Assoc Comput Linguist 1:353–366
Verma R, Chen P, Lu W (2007) A semantic free-text summarization system using ontology knowledge. In: Proceedings of document understanding conference, Citeseer, Rochester, New York, USA, p 5
Wang P, Domeniconi C (2008) Building semantic kernels for text classification using Wikipedia. In: Proceedings of the 14th ACM SIGKDD international conference on Knowledge discovery and data mining, ACM, Las Vegas, Nevada, USA, pp 713–721
Wang X, McCallum A (2006) Topics over time: a non-Markov continuous-time model of topical trends. In: Proceedings of the 12th ACM SIGKDD international conference on Knowledge discovery and data mining, ACM, Philadelphia, PA, USA, pp 424–433
Weston J, Bengio S, Usunier N (2011) Wsabie: Scaling up to large vocabulary image annotation. In: Proceedings of the twenty-second international joint conference on artificial intelligence, vol 11, pp 2764–2770
Wortsman J, Matsuoka LY, Chen TC, Lu Z, Holick MF (2000) Decreased bioavailability of vitamin D in obesity. Am J Clin Nutr 72(3):690–693
Wu F, Weld DS (2010) Open information extraction using Wikipedia. In: Proceedings of the 48th annual meeting of the association for computational linguistics, Association for Computational Linguistics, pp 118–127
Wu ST (2007) Knowledge discovery using pattern taxonomy model in text mining. PhD thesis, Electrical Engineering and Computer Science, Queensland University of Technology
Wu ST, Li Y, Xu Y (2006) Deploying approaches for pattern refinement in text mining. In: Sixth international conference on data mining, ICDM’06, IEEE, pp 1157–1161
Xin D, Han J, Yan X, Cheng H (2005) Mining compressed frequent-pattern sets. In: Proceedings of the 31st international conference on very large data bases, VLDB Endowment, Trondheim, Norway, pp 709–720
Xue GR, Zeng HJ, Chen Z, Yu Y, Ma WY, Xi W, Fan W (2004) Optimizing web search using web click-through data. In: Proceedings of the thirteenth ACM international conference on Information and knowledge management, ACM, pp 118–126
Yan X, Cheng H, Han J, Xin D (2005) Summarizing itemset patterns: a profile-based approach. In: Proceedings of the eleventh ACM SIGKDD international conference on Knowledge discovery in data mining, ACM, Chicago, Illinois, USA, pp 314–323
Yi K, Chan LM (2009) Linking folksonomy to Library of Congress subject headings: an exploratory study. J Doc 65(6):872–900
Zaki MJ, Ramakrishnan N (2005) Reasoning about sets using redescription mining. In: Proceedings of the eleventh ACM SIGKDD international conference on Knowledge discovery in data mining, ACM, pp 364–373
Zhong N, Li Y, Wu ST (2012) Effective pattern discovery for text mining. IEEE Trans Knowl Data Eng 24(1):30–44
Zhou G, Qian L, Fan J (2010) Tree kernel-based semantic relation extraction with rich syntactic and semantic information. Inf Sci 180(8):1313–1325
Zhu J, Nie Z, Liu X, Zhang B, Wen JR (2009) StatSnowball: a statistical approach to extracting entity relationships. In: Proceedings of the 18th international conference on World Wide Web, ACM, pp 101–110
Metadata
Title
Interpretation of text patterns
Authors
Md Abul Bashar
Yuefeng Li
Publication date
22 February 2018
Publisher
Springer US
Published in
Data Mining and Knowledge Discovery / Issue 4/2018
Print ISSN: 1384-5810
Electronic ISSN: 1573-756X
DOI
https://doi.org/10.1007/s10618-018-0556-z
