Published in: Pattern Analysis and Applications 2/2017

8 February 2015 | Theoretical Advances

A texture-based pixel labeling approach for historical books

Authors: Maroua Mehri, Petra Gomez-Krämer, Pierre Héroux, Alain Boucher, Rémy Mullot

Abstract

Over the last few years, there has been tremendous growth in the automatic processing of digitized historical documents. Finding reliable systems for interpreting ancient documents has become a topic of major interest for many libraries and a prime research issue in the document analysis community. One important challenge is to refine well-known approaches that rely on strong a priori knowledge (e.g., the document image content, layout, typography, font size and type, scanning resolution, and image size). When information on document structure and content is lacking, however, texture analysis has consistently been chosen to segment a page layout. Thus, in this article, a framework is proposed to investigate the use of texture as a tool for automatically determining homogeneous regions in a digitized historical book and segmenting its contents by extracting and analyzing texture features independently of the page layout. The proposed framework is parameter-free and applicable to a large variety of ancient books. It does not assume a priori information regarding document image content and structure. It consists of two phases: a texture-based feature extraction step, and an unsupervised clustering and labeling task based on consensus clustering, hierarchical ascendant classification, and nearest neighbor search algorithms. The novelty of this work lies in clustering the extracted texture descriptors to automatically find homogeneous regions, i.e., graphic and textual regions, by applying the clustering approach to an entire book instead of processing each page individually. Our framework has been evaluated on a large variety of historical books and achieved promising results.
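
The two-phase idea described in the abstract can be illustrated with a small, self-contained sketch. It is not the authors' implementation, and several parts are assumptions made only for illustration: the per-block mean and standard deviation stand in for real texture descriptors, the synthetic pages from the hypothetical `make_page` helper stand in for scanned pages, centroid linkage is one possible choice for the hierarchical ascendant step, and the number of clusters is fixed at two rather than estimated by consensus clustering. What the sketch does keep is the book-level view: features from all pages are pooled and clustered together, and each block is then labeled by nearest neighbor search against the cluster centroids.

```python
import math


def make_page(rows, cols, split):
    """Hypothetical synthetic page: checkerboard 'text' left of `split`, flat background right."""
    return [[(255 if (r + c) % 2 else 0) if c < split else 230 for c in range(cols)]
            for r in range(rows)]


def window_features(page, size):
    """Phase 1 (stand-in): mean and standard deviation of each size x size block."""
    feats = []
    for r in range(0, len(page) - size + 1, size):
        for c in range(0, len(page[0]) - size + 1, size):
            block = [page[r + i][c + j] for i in range(size) for j in range(size)]
            mean = sum(block) / len(block)
            var = sum((v - mean) ** 2 for v in block) / len(block)
            feats.append((mean, math.sqrt(var)))
    return feats


def dist(a, b):
    """Euclidean distance between two feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def centroid(points):
    """Component-wise mean of a non-empty list of feature vectors."""
    return tuple(sum(p[d] for p in points) / len(points) for d in range(len(points[0])))


def agglomerate(points, k):
    """Phase 2a: bottom-up (ascendant) clustering, centroid linkage (assumed), stopped at k."""
    clusters = [[p] for p in points]
    while len(clusters) > k:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = dist(centroid(clusters[i]), centroid(clusters[j]))
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        merged = clusters.pop(j)  # j > i, so index i is unaffected
        clusters[i].extend(merged)
    return [centroid(c) for c in clusters]


def nearest_label(feature, centroids):
    """Phase 2b: label a block with the index of its nearest cluster centroid."""
    return min(range(len(centroids)), key=lambda idx: dist(feature, centroids[idx]))


# A two-page "book": page 1 is half text, half background; page 2 is all text.
book = [make_page(8, 8, 4), make_page(8, 8, 8)]
per_page = [window_features(page, 4) for page in book]

# Cluster the features of the whole book at once, not page by page.
pooled = [f for feats in per_page for f in feats]
centroids = agglomerate(pooled, k=2)

# Label every block of every page against the shared, book-level centroids.
labels = [[nearest_label(f, centroids) for f in feats] for feats in per_page]
print(labels)
```

Because the centroids are computed once for the whole book, a block of text on the second page receives the same label as a block of text on the first page, which is the consistency that page-by-page clustering would not guarantee.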

89.
Zurück zum Zitat Rousseeuw PJ (1987) Silhouettes: a graphical aid to the interpretation and validation of cluster analysis. J Comput Appl Math 20:53–65CrossRefMATH Rousseeuw PJ (1987) Silhouettes: a graphical aid to the interpretation and validation of cluster analysis. J Comput Appl Math 20:53–65CrossRefMATH
90.
Zurück zum Zitat Lletía R, Ortiza MC, Sarabiab LA, Sánchez MS (2004) Selecting variables for k-means cluster analysis by using a genetic algorithm that optimises the silhouettes. In: Colloquim Chemiometricum Mediterraneum. Elsevier Science, Analytica Chimica Acta, pp 87–100 Lletía R, Ortiza MC, Sarabiab LA, Sánchez MS (2004) Selecting variables for k-means cluster analysis by using a genetic algorithm that optimises the silhouettes. In: Colloquim Chemiometricum Mediterraneum. Elsevier Science, Analytica Chimica Acta, pp 87–100
92.
Zurück zum Zitat Q. Zhao, M. Xu, P. Fränti (2011) Extending external validity measures for determining the number of clusters. In: International conference on intelligent systems design and applications. IEEE, New York, pp 931–936 Q. Zhao, M. Xu, P. Fränti (2011) Extending external validity measures for determining the number of clusters. In: International conference on intelligent systems design and applications. IEEE, New York, pp 931–936
93.
Zurück zum Zitat Kryszczuk K, Hurley P (2010) Estimation of the number of clusters using multiple clustering validity indices. In: International conference on multiple classifier systems. Springer, Berlin, pp 114–123 Kryszczuk K, Hurley P (2010) Estimation of the number of clusters using multiple clustering validity indices. In: International conference on multiple classifier systems. Springer, Berlin, pp 114–123
94.
Zurück zum Zitat Bolshakova N, Azuaje F (2006) Estimating the number of clusters in DNA microarray data. In: Methods of information in medicine, pp 153–157 Bolshakova N, Azuaje F (2006) Estimating the number of clusters in DNA microarray data. In: Methods of information in medicine, pp 153–157
95.
Zurück zum Zitat Cote M, Albu AB (2014) Texture sparseness for pixel classification of business document images. Int J Doc Anal Recognit, 1–17 Cote M, Albu AB (2014) Texture sparseness for pixel classification of business document images. Int J Doc Anal Recognit, 1–17
96.
Zurück zum Zitat Mehri M, Kieu VC, Mhiri M, Héroux P, Gomez-Krämer P, Mahjoub MA, Mullot R (2014) Robustness assessment of texture features for the segmentation of ancient documents. In: International workshop on document analysis systems. IEEE, New York, pp 293–297 Mehri M, Kieu VC, Mhiri M, Héroux P, Gomez-Krämer P, Mahjoub MA, Mullot R (2014) Robustness assessment of texture features for the segmentation of ancient documents. In: International workshop on document analysis systems. IEEE, New York, pp 293–297
97.
Zurück zum Zitat Otsu N (1979) A threshold selection method from gray-level histograms. Syst Man Cybern 62–66 Otsu N (1979) A threshold selection method from gray-level histograms. Syst Man Cybern 62–66
98.
Zurück zum Zitat Shijian L, Tan CL (2008) Script and language identification in noisy and degraded document images. Pattern Anal Mach Intell, 14–24 Shijian L, Tan CL (2008) Script and language identification in noisy and degraded document images. Pattern Anal Mach Intell, 14–24
99.
Zurück zum Zitat He J, Do QDM, Downton AC, Kim JH (2005) A comparison of binarization methods for historical archive documents. In: International conference on document analysis and recognition. IEEE, New York, pp 538–542 He J, Do QDM, Downton AC, Kim JH (2005) A comparison of binarization methods for historical archive documents. In: International conference on document analysis and recognition. IEEE, New York, pp 538–542
100.
Zurück zum Zitat Lasmar AG, Kricha A, Amara NEB (2006) A segmentation text/background method for degraded ancient Arabic manuscript. In: International conference on information & communication technologies. IEEE, New York, pp 1327–1331 Lasmar AG, Kricha A, Amara NEB (2006) A segmentation text/background method for degraded ancient Arabic manuscript. In: International conference on information & communication technologies. IEEE, New York, pp 1327–1331
101.
Zurück zum Zitat Li J, Wang JZ, Wiederhold G (2000) Classification of textured and non-textured images using region segmentation. Image Process, 754–757 Li J, Wang JZ, Wiederhold G (2000) Classification of textured and non-textured images using region segmentation. Image Process, 754–757
102.
Zurück zum Zitat Cinque L, Lombardi L, Manzini G (1998) A multiresolution approach for page segmentation. Pattern Recognit Lett, 217–225 Cinque L, Lombardi L, Manzini G (1998) A multiresolution approach for page segmentation. Pattern Recognit Lett, 217–225
103.
Zurück zum Zitat Tan C, Ng P (1998) Text extraction using pyramid. Pattern Recognit, 63–72 Tan C, Ng P (1998) Text extraction using pyramid. Pattern Recognit, 63–72
104.
Zurück zum Zitat Tan C, Zhang Z (2000) Text block segmentation using pyramid structure. In: Document recognition and retrieval. SPIE, UK, pp 297–306 Tan C, Zhang Z (2000) Text block segmentation using pyramid structure. In: Document recognition and retrieval. SPIE, UK, pp 297–306
105.
Zurück zum Zitat Lemaitre A, Camillerapp J, Coüasnon B (2008) Multiresolution cooperation improves document structure recognition. Int J Doc Anal Recognit, 97–109 Lemaitre A, Camillerapp J, Coüasnon B (2008) Multiresolution cooperation improves document structure recognition. Int J Doc Anal Recognit, 97–109
106.
Zurück zum Zitat Greenspan H (1994) Multi-resolution image processing and learning for texture recognition and image enhancement. Ph.D. dissertation, California Institute of Technology Greenspan H (1994) Multi-resolution image processing and learning for texture recognition and image enhancement. Ph.D. dissertation, California Institute of Technology
107.
Zurück zum Zitat Contassot-Vivier S, Bosco GL, Dao NC (1996) Multiresolution approach for image processing. In: Erasmus ICP-A-2007 Contassot-Vivier S, Bosco GL, Dao NC (1996) Multiresolution approach for image processing. In: Erasmus ICP-A-2007
108.
Zurück zum Zitat Kricha A, Amara NEB (2011) Exploring textural analysis for historical documents characterization. J comput, 24–30 Kricha A, Amara NEB (2011) Exploring textural analysis for historical documents characterization. J comput, 24–30
109.
Zurück zum Zitat Ketchen DJ, Shook CL (1996) The application of cluster analysis in strategic management research: an analysis and critique. Strateg Manag J, 441–458 Ketchen DJ, Shook CL (1996) The application of cluster analysis in strategic management research: an analysis and critique. Strateg Manag J, 441–458
110.
Zurück zum Zitat Simpson T, Armstrong J, Jarman A (2010) Merged consensus clustering to assess and improve class discovery with microarray data. Boston Med Center Bioinf, 1471–1482 Simpson T, Armstrong J, Jarman A (2010) Merged consensus clustering to assess and improve class discovery with microarray data. Boston Med Center Bioinf, 1471–1482
111.
Zurück zum Zitat Monti S, Tamayo P, Mesirov J, Golub T (2003) Consensus clustering: a resampling-based method for class discovery and visualization of gene expression microarray data. Mach Learn, 91–118 Monti S, Tamayo P, Mesirov J, Golub T (2003) Consensus clustering: a resampling-based method for class discovery and visualization of gene expression microarray data. Mach Learn, 91–118
112.
Zurück zum Zitat Nguyen G, Coustaty M, Ogier JM (2010) Stroke feature extraction for lettrine indexing. In: International conference on image processing theory tools and applications. IEEE, New York, pp 355–360 Nguyen G, Coustaty M, Ogier JM (2010) Stroke feature extraction for lettrine indexing. In: International conference on image processing theory tools and applications. IEEE, New York, pp 355–360
113.
Zurück zum Zitat Ward J (1963) Hierarchical grouping to optimize an objective function. J Am Stat Assoc, 236–244 Ward J (1963) Hierarchical grouping to optimize an objective function. J Am Stat Assoc, 236–244
114.
Zurück zum Zitat Lalys F, Haegelen C, Mehri M, Drapier S, Vérin M, Jannin P (2013) Anatomo-clinical atlases correlate clinical data and electrode contact coordinates: application to subthalamic deep brain stimulation. J Neurosci, 297–307 Lalys F, Haegelen C, Mehri M, Drapier S, Vérin M, Jannin P (2013) Anatomo-clinical atlases correlate clinical data and electrode contact coordinates: application to subthalamic deep brain stimulation. J Neurosci, 297–307
115.
Zurück zum Zitat Knuth DE (1997) The art of computer programming, sorting and searching, vol 3, 2nd edn. Addison Wesley Longman Publishing Co, UK Knuth DE (1997) The art of computer programming, sorting and searching, vol 3, 2nd edn. Addison Wesley Longman Publishing Co, UK
116.
Zurück zum Zitat Mahalanobis P (1936) On the generalised distance in statistics. In: Proceedings of the National Institute of Sciences of India, NISI, pp 49–55 Mahalanobis P (1936) On the generalised distance in statistics. In: Proceedings of the National Institute of Sciences of India, NISI, pp 49–55
117.
Zurück zum Zitat Doermann D, Zotkina E, Li H (2010) GEDI—a groundtruthing environment for document images. In: International workshop on document analysis systems. ACM, New York Doermann D, Zotkina E, Li H (2010) GEDI—a groundtruthing environment for document images. In: International workshop on document analysis systems. ACM, New York
118.
Zurück zum Zitat Ge F, Wang S, Liu T (2007) New benchmark for image segmentation evaluation. J Electron Imag, 1–16 Ge F, Wang S, Liu T (2007) New benchmark for image segmentation evaluation. J Electron Imag, 1–16
119.
Zurück zum Zitat Zhang H, Fritts J, Goldman S (2008) Image segmentation evaluation: a survey of unsupervised methods. Comput Vis Image Understanding, 260–280 Zhang H, Fritts J, Goldman S (2008) Image segmentation evaluation: a survey of unsupervised methods. Comput Vis Image Understanding, 260–280
120.
Zurück zum Zitat Wontaek S, Agrawal M, Doermann D (2010) Performance evaluation tools for zone segmentation and classification (PETS). In: International conference on pattern recognition. IEEE, New York, pp 503–506 Wontaek S, Agrawal M, Doermann D (2010) Performance evaluation tools for zone segmentation and classification (PETS). In: International conference on pattern recognition. IEEE, New York, pp 503–506
121.
Zurück zum Zitat Rendón E, Abundez I, Arizmendi A, Quiroz EM (2011) Internal versus external cluster validation indexes. Int J Comput Commun, 27–34 Rendón E, Abundez I, Arizmendi A, Quiroz EM (2011) Internal versus external cluster validation indexes. Int J Comput Commun, 27–34
122.
Zurück zum Zitat Rendón E, Abundez I, Gutierrez C, Zagal SD, Arizmendi A, Quiroz EM, Arzate HE (2011) A comparison of internal and external cluster validation indexes. In: Applications of mathematics and computer engineering (AMERICAN-MATH/CEA. World Scientific and Engineering Academy and Society (WSEAS), pp 158–163 Rendón E, Abundez I, Gutierrez C, Zagal SD, Arizmendi A, Quiroz EM, Arzate HE (2011) A comparison of internal and external cluster validation indexes. In: Applications of mathematics and computer engineering (AMERICAN-MATH/CEA. World Scientific and Engineering Academy and Society (WSEAS), pp 158–163
123.
Zurück zum Zitat Silva A (2011) Metrics for evaluating performance in document analysis: application to tables. Int J Doc Anal Recognit, 101–109 Silva A (2011) Metrics for evaluating performance in document analysis: application to tables. Int J Doc Anal Recognit, 101–109
124.
Zurück zum Zitat Jensen JR (1986) Introductory digital image processing. Prentice-Hall, Englewood Cliffs Jensen JR (1986) Introductory digital image processing. Prentice-Hall, Englewood Cliffs
125.
Zurück zum Zitat Mather PM (1999) Computer processing of remotely-sensed images: an introduction, 2nd edn. Wiley, New York Mather PM (1999) Computer processing of remotely-sensed images: an introduction, 2nd edn. Wiley, New York
126.
Zurück zum Zitat Makhoul J, Kubala F, Schwartz R, Weischedel R (1999) Performance measures for information extraction. In: DARPA Broadcast News Workshop. Morgan Kaufmann Publishers Inc, Burlington, pp 249–252 Makhoul J, Kubala F, Schwartz R, Weischedel R (1999) Performance measures for information extraction. In: DARPA Broadcast News Workshop. Morgan Kaufmann Publishers Inc, Burlington, pp 249–252
127.
Zurück zum Zitat Wei JM, Yuan XJ, Hub QH, Wang SQ (2010) A novel measure for evaluating classifiers. Exp Syst Appl, 3799–3809 Wei JM, Yuan XJ, Hub QH, Wang SQ (2010) A novel measure for evaluating classifiers. Exp Syst Appl, 3799–3809
128.
Zurück zum Zitat Powers DMW (2011) Evaluation: from precision, recall and F-factor to ROC, informedness, markedness & correlation. J Mach Learn Technol, pp 37–63 Powers DMW (2011) Evaluation: from precision, recall and F-factor to ROC, informedness, markedness & correlation. J Mach Learn Technol, pp 37–63
129.
Zurück zum Zitat Liu B (2011) Web data mining: exploring hyperlinks, contents, and usage data. Springer, Berlin Liu B (2011) Web data mining: exploring hyperlinks, contents, and usage data. Springer, Berlin
130.
Zurück zum Zitat Santra AK, Christy CJ (2012) Genetic algorithm and confusion matrix for document clustering. Int J Comput Sci, 322–328 Santra AK, Christy CJ (2012) Genetic algorithm and confusion matrix for document clustering. Int J Comput Sci, 322–328
131.
Zurück zum Zitat Saxena PC, Navaneetham K (1991) The effect of cluster size, dimensionality, and number of clusters on recovery of true cluster structure through Chernoff-type faces. J R Stat Soc Stat, 415–425 Saxena PC, Navaneetham K (1991) The effect of cluster size, dimensionality, and number of clusters on recovery of true cluster structure through Chernoff-type faces. J R Stat Soc Stat, 415–425
132.
Zurück zum Zitat Fowlkes EB, Mallows CL (1983) A method for comparing two hierarchical clusterings. J Am Stat Assoc, 553–569 Fowlkes EB, Mallows CL (1983) A method for comparing two hierarchical clusterings. J Am Stat Assoc, 553–569
133.
Zurück zum Zitat Zhao Y, Karypis G (2001) Criterion functions for document clustering: experiments and analysis. Department of Computer Science, University of Minnesota, Tech. Rep. Technical report TR 0140 Zhao Y, Karypis G (2001) Criterion functions for document clustering: experiments and analysis. Department of Computer Science, University of Minnesota, Tech. Rep. Technical report TR 0140
134.
Zurück zum Zitat Krzanowski WJ, Lai YT (1988) A criterion for determining the number of groups in a data set using sum-of-squares clustering. International Biometric Society, JSTOR, pp 23–34 Krzanowski WJ, Lai YT (1988) A criterion for determining the number of groups in a data set using sum-of-squares clustering. International Biometric Society, JSTOR, pp 23–34
135.
Zurück zum Zitat Hartigan JA (1975) Clustering algorithms. Wiley, New York Hartigan JA (1975) Clustering algorithms. Wiley, New York
136.
Zurück zum Zitat Calinski RB, Harabasz J (1974) A dendrite method for cluster analysis. Commun Stat, 1–27 Calinski RB, Harabasz J (1974) A dendrite method for cluster analysis. Commun Stat, 1–27
137.
Zurück zum Zitat Sarle WS (1983) The cubic clustering criterion. SAS Institute, Tech. Rep. SAS technical report A-108: the cubic clustering criterion Sarle WS (1983) The cubic clustering criterion. SAS Institute, Tech. Rep. SAS technical report A-108: the cubic clustering criterion
138.
Zurück zum Zitat Scott AJ, Symons MJ (1971) Clustering methods based on likelihood ratio criteria. Biometrics, 387–397 Scott AJ, Symons MJ (1971) Clustering methods based on likelihood ratio criteria. Biometrics, 387–397
139.
Zurück zum Zitat Marriott FH (1971) Practical problems in a method of cluster analysis. Biometrics, 501–514 Marriott FH (1971) Practical problems in a method of cluster analysis. Biometrics, 501–514
140.
Zurück zum Zitat Milligan GW, Cooper M (1985) An examination of procedures for determining the number of clusters in a data set. Psychometrika, 159–179 Milligan GW, Cooper M (1985) An examination of procedures for determining the number of clusters in a data set. Psychometrika, 159–179
141.
Zurück zum Zitat Friedman HP, Rubin J (1967) On some invariant criteria for grouping data. J Am Stat Assoc, pp 1159–1178 Friedman HP, Rubin J (1967) On some invariant criteria for grouping data. J Am Stat Assoc, pp 1159–1178
142.
Zurück zum Zitat Rubin J (1967) Optimal classification into groups: an approach for solving the taxonomy problem. J Theor Biol, 103–144 Rubin J (1967) Optimal classification into groups: an approach for solving the taxonomy problem. J Theor Biol, 103–144
143.
Zurück zum Zitat Hubert LJ, Levin JR (1976) A general statistical framework for assessing categorical clustering in free recall. Psychol Bull, 1072–1080 Hubert LJ, Levin JR (1976) A general statistical framework for assessing categorical clustering in free recall. Psychol Bull, 1072–1080
144.
Zurück zum Zitat Davies DL, Bouldin DW (1979) A cluster separation measure. Pattern Anal Mach Intell, 224–227 Davies DL, Bouldin DW (1979) A cluster separation measure. Pattern Anal Mach Intell, 224–227
145.
Zurück zum Zitat Ratkowsky DA, Lance GN (1978) A criterion for determining the number of groups in a classification. Aust Comput J, 115–117 Ratkowsky DA, Lance GN (1978) A criterion for determining the number of groups in a classification. Aust Comput J, 115–117
146.
Zurück zum Zitat Ball GH, Hall DJ (1965) ISODATA, a novel method of data analysis and pattern classification. Menlo Park: Stanford Research Institute, Tech. Rep. AD0699616 Ball GH, Hall DJ (1965) ISODATA, a novel method of data analysis and pattern classification. Menlo Park: Stanford Research Institute, Tech. Rep. AD0699616
147.
Zurück zum Zitat Milligan GW (1980) An examination of the effect of six types of error perturbation on fifteen clustering algorithms. Psychometrika, 325–342 Milligan GW (1980) An examination of the effect of six types of error perturbation on fifteen clustering algorithms. Psychometrika, 325–342
148.
Zurück zum Zitat Frey T, Groenewoud HV (1972) A cluster analysis of the d-squared matrix of white spruce stands in saskatchewan based on the maximum-minimum principle. J Ecol, 873–886 Frey T, Groenewoud HV (1972) A cluster analysis of the d-squared matrix of white spruce stands in saskatchewan based on the maximum-minimum principle. J Ecol, 873–886
149.
Zurück zum Zitat McClain JO, Rao VR (1975) CLUSTISZ: a program to test for the quality of clustering of a set of objects. J Market Res, 456–460 McClain JO, Rao VR (1975) CLUSTISZ: a program to test for the quality of clustering of a set of objects. J Market Res, 456–460
150.
Zurück zum Zitat Dunn J (1974) Well separated clusters and optimal fuzzy partitions. J Cybern, 95–104 Dunn J (1974) Well separated clusters and optimal fuzzy partitions. J Cybern, 95–104
151.
Zurück zum Zitat Halkidi M, Vazirgiannis M, Batistakis I (2000) Quality scheme assessment in the clustering process. In: Principles and practice of knowledge in databases. Springer, Berlin, pp 265–276 Halkidi M, Vazirgiannis M, Batistakis I (2000) Quality scheme assessment in the clustering process. In: Principles and practice of knowledge in databases. Springer, Berlin, pp 265–276
152.
Zurück zum Zitat Halkidi M, Batistakis I, Vazirgiannis M (2001) On clustering validation techniques. J Intell Inf Syst, 107–145 Halkidi M, Batistakis I, Vazirgiannis M (2001) On clustering validation techniques. J Intell Inf Syst, 107–145
153.
Zurück zum Zitat Deza E, Deza MM (2013) Encyclopedia of distances. Springer, Berlin Deza E, Deza MM (2013) Encyclopedia of distances. Springer, Berlin
154.
Zurück zum Zitat Rand WM (1971) Objective criteria for the evaluation of clustering methods. J Am Stat Assoc, 846–850 Rand WM (1971) Objective criteria for the evaluation of clustering methods. J Am Stat Assoc, 846–850
155.
Zurück zum Zitat Hubert L, Arabic P (1985) Comparing partitions. J Classif, 193–218 Hubert L, Arabic P (1985) Comparing partitions. J Classif, 193–218
156.
Zurück zum Zitat Kraskov A, Stögbauer H, Andrzejak RG, Grassberger P (2003) Hierarchical clustering based on mutual information. In: Quantitative methods (q-bio.QM). CoRR q-bio.QM/0311039, 2003, pp 193–218 Kraskov A, Stögbauer H, Andrzejak RG, Grassberger P (2003) Hierarchical clustering based on mutual information. In: Quantitative methods (q-bio.QM). CoRR q-bio.QM/0311039, 2003, pp 193–218
157.
Zurück zum Zitat Vinh NX, Epps J, Bailey J (2010) Information theoretic measures for clusterings comparison: variants, properties, normalization and correction for chance. J Mach Learn Res, 2837–2854 Vinh NX, Epps J, Bailey J (2010) Information theoretic measures for clusterings comparison: variants, properties, normalization and correction for chance. J Mach Learn Res, 2837–2854
158.
Zurück zum Zitat Wei H, Chen K, Ingold R, Liwicki M (2014) Hybrid feature selection for historical document layout analysis. In: International conference on frontiers in handwriting recognition. IEEE, New York, pp 87–92 Wei H, Chen K, Ingold R, Liwicki M (2014) Hybrid feature selection for historical document layout analysis. In: International conference on frontiers in handwriting recognition. IEEE, New York, pp 87–92
Metadata
Title
A texture-based pixel labeling approach for historical books
Authors
Maroua Mehri
Petra Gomez-Krämer
Pierre Héroux
Alain Boucher
Rémy Mullot
Publication date
8 February 2015
Publisher
Springer London
Published in
Pattern Analysis and Applications / Issue 2/2017
Print ISSN: 1433-7541
Electronic ISSN: 1433-755X
DOI
https://doi.org/10.1007/s10044-015-0451-9
