Published in: International Journal of Machine Learning and Cybernetics 1/2017

18.03.2016 | Original Article

Name identification and extraction with formal concept analysis

Written by: Kazem Taghva

Abstract

One application of formal concept analysis (FCA) is extracting structured information from textual documents. Typically, one defines a set of attributes that characterize the objects of interest, and standard FCA algorithms then extract those objects. In this paper, we describe how FCA identifies and extracts personal names as units of thought, similar to the way the Viterbi algorithm decodes text sequences with Hidden Markov Models. We further show how FCA mimics the thought process that goes into a rule-based information extraction system. We then observe that the formal approach of FCA, combined with established computational techniques such as the bottom-up intersection algorithm, avoids the difficulties associated with hand coding and maintaining rule-based systems.
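As a rough illustration of the idea of describing objects by attributes and then deriving concepts from them, the following minimal sketch enumerates the formal concepts of a tiny context by repeatedly intersecting object intents. The tokens, the attribute names, and the function names are hypothetical and chosen only for this example; the abstract does not specify the attributes used in the paper, and this simple intersection procedure is not claimed to be the exact bottom-up intersection algorithm the paper refers to.

```python
# Hypothetical toy context: tokens are the objects, and each token is
# described by a hand-picked set of attributes (all names are assumptions).
CONTEXT = {
    "Dr.":    frozenset({"capitalized", "title"}),
    "Kazem":  frozenset({"capitalized", "first_name"}),
    "Taghva": frozenset({"capitalized", "last_name"}),
    "wrote":  frozenset(),
}


def concept_intents(context):
    """Return every concept intent of the context.

    The intents are exactly the full attribute set plus all intersections
    of object intents, so we start from the full set and intersect each
    object's attribute set with every intent found so far.
    """
    all_attributes = frozenset().union(*context.values())
    intents = {all_attributes}
    for row in context.values():
        # The set comprehension is fully built before it is merged in.
        intents |= {row & known for known in intents}
    return intents


def concepts(context):
    """Map each intent to its extent (the objects having all its attributes)."""
    return {
        intent: frozenset(obj for obj, row in context.items() if intent <= row)
        for intent in concept_intents(context)
    }


if __name__ == "__main__":
    # Print concepts from the most general (largest extent) downwards.
    for intent, extent in sorted(concepts(CONTEXT).items(),
                                 key=lambda pair: -len(pair[1])):
        print(sorted(extent), "share", sorted(intent))
```

In this toy context, the concept whose intent is just `capitalized` groups the three name-like tokens together, which hints at how a concept lattice can play the role of hand-written extraction rules. A real system would need a much richer attribute set.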

Metadata
Title
Name identification and extraction with formal concept analysis
Written by
Kazem Taghva
Publication date
18.03.2016
Publisher
Springer Berlin Heidelberg
Published in
International Journal of Machine Learning and Cybernetics / Issue 1/2017
Print ISSN: 1868-8071
Electronic ISSN: 1868-808X
DOI
https://doi.org/10.1007/s13042-016-0514-2