
2017 | OriginalPaper | Chapter

On the Helmholtz Principle for Data Mining

Authors: Alexander Balinsky, Helen Balinsky, Steven Simske

Published in: Uncertainty Modeling

Publisher: Springer International Publishing


Abstract

Keyword and feature extraction is a fundamental problem in text data mining and document processing. Most document processing applications depend directly on the quality and speed of keyword extraction algorithms. In this article, an approach to rapid change detection in data streams and documents, introduced in [1], is developed and analysed. It is based on ideas from image processing, and especially on the Helmholtz Principle from the Gestalt Theory of human perception. Applied to the problem of keyword extraction, it delivers fast and effective tools for identifying meaningful keywords using parameter-free methods. We also define a level of meaningfulness for keywords, which can be used to adjust the set of keywords depending on application needs.
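The abstract does not state the formulas behind the method, so the following is only a minimal sketch under stated assumptions. It assumes the "number of false alarms" formulation usually associated with the approach of [1]: for a word occurring m times in one part of a corpus and K times in the whole corpus, with N the ratio of corpus length to part length, NFA = C(K, m) / N^(m-1), and the level of meaningfulness is -(1/m) log NFA. The function names `log_nfa` and `meaningful_words`, the tokenised input format, and the `epsilon` threshold are hypothetical and chosen for illustration.

```python
import math
from collections import Counter


def log_nfa(m, K, N):
    """Natural log of the assumed NFA = C(K, m) / N**(m - 1).

    m: occurrences of the word in the part under inspection
    K: occurrences of the word in the whole corpus
    N: corpus length divided by the length of the part
    """
    # log C(K, m) via lgamma to avoid huge intermediate integers
    log_binom = math.lgamma(K + 1) - math.lgamma(m + 1) - math.lgamma(K - m + 1)
    return log_binom - (m - 1) * math.log(N)


def meaningful_words(parts, epsilon=1.0):
    """Return {part_index: {word: meaning}} for words with NFA < epsilon.

    parts: list of token lists (one list per document or paragraph).
    The meaning score is assumed to be -(1/m) * log(NFA).
    """
    corpus_counts = Counter(w for part in parts for w in part)
    total_len = sum(len(part) for part in parts)
    result = {}
    for i, part in enumerate(parts):
        if not part:
            continue
        N = total_len / len(part)
        scores = {}
        for word, m in Counter(part).items():
            ln = log_nfa(m, corpus_counts[word], N)
            if ln < math.log(epsilon):
                scores[word] = -ln / m
        result[i] = scores
    return result


# Toy corpus: "the" is spread evenly, the other words are concentrated.
parts = [
    ["helmholtz"] * 5 + ["the"] * 5,
    ["gestalt"] * 5 + ["the"] * 5,
    ["stream"] * 5 + ["the"] * 5,
    ["mining"] * 5 + ["the"] * 5,
]
keywords = meaningful_words(parts)
```

In this toy corpus the evenly spread word "the" is rejected in every part, while each concentrated word is kept in the part where it clusters. Lowering `epsilon` below 1 makes the selection stricter, which is one way to read the abstract's remark about adjusting the keyword set to application needs.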


Literature
1. A. Balinsky, H. Balinsky, and S. Simske, On Helmholtz's principle for documents processing, Proc. 10th ACM symposium on Document engineering, Sep. 2010.
2. A. N. Srivastava and M. Sahami (editors), Text Mining: classification, clustering, and applications, CRC Press, 2009.
3. D. Lowe, Perceptual Organization and Visual Recognition, Amsterdam: Kluwer Academic Publishers, 1985.
4. K. Spärck Jones, A statistical interpretation of term specificity and its application in retrieval, Journal of Documentation, vol. 28, no. 1, pp. 11-21, 1972.
5. S. Robertson, Understanding inverse document frequency: On theoretical arguments for idf, Journal of Documentation, vol. 60, no. 5, pp. 503-520, 2004.
6. J. Kleinberg, Bursty and hierarchical structure in streams, Proc. 8th ACM SIGKDD Intl. Conf. on Knowledge Discovery and Data Mining, 2002.
7. A. Desolneux, L. Moisan, and J.-M. Morel, From Gestalt Theory to Image Analysis: A Probabilistic Approach, ser. Interdisciplinary Applied Mathematics, vol. 34, Springer, 2008.

Metadata
Title: On the Helmholtz Principle for Data Mining
Authors: Alexander Balinsky, Helen Balinsky, Steven Simske
Copyright Year: 2017
DOI: https://doi.org/10.1007/978-3-319-51052-1_2