Abstract
We develop the classification part of a system that analyses transmitted light microscope images of dispersed kerogen preparation. The system automatically extracts kerogen pieces from the image and labels each piece as either inertinite or vitrinite. Image pre-processing consists of background removal, identification of kerogen material, object segmentation, object extraction (producing an individual image of each piece of kerogen) and feature calculation for each object. An expert palynologist labelled the objects as inertinite or vitrinite, which provided the ground truth for the classification experiment. Ten state-of-the-art classifiers and classifier ensembles were compared: Naïve Bayes, decision tree, nearest neighbour, the logistic classifier, multilayered perceptron (MLP), support vector machines (SVM), AdaBoost, Bagging, LogitBoost and Random Forest. The logistic classifier was singled out as the most accurate, with an accuracy greater than 90%. Using the 10 times 10-fold cross-validation provided within the Weka software, we found that the logistic classifier was significantly better than five of the other classifiers (p<0.05) and statistically indistinguishable from the remaining four. The initial set of 32 features was subsequently reduced to 6 features without compromising the classification accuracy. A further evaluation of the system alerted us to the possible sensitivity of the classification to the ground truth, which may vary from one human expert to another. The analysis also revealed that the logistic classifier made most of its correct classifications with high certainty.
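The evaluation protocol named in the abstract, 10 times 10-fold cross-validation, can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the study used Weka and real kerogen features, whereas this sketch uses plain Python, synthetic two-dimensional data, and a 1-nearest-neighbour stand-in classifier. The function names (`stratified_folds`, `repeated_cv_accuracy`, `make_synthetic_data`) are hypothetical and do not come from the paper.

```python
import random
from statistics import mean

def stratified_folds(labels, k, rng):
    """Split sample indices into k folds with roughly equal class proportions."""
    folds = [[] for _ in range(k)]
    by_class = {}
    for i, label in enumerate(labels):
        by_class.setdefault(label, []).append(i)
    offset = 0
    for indices in by_class.values():
        rng.shuffle(indices)
        # Deal the shuffled indices of each class round-robin into the folds.
        for j, i in enumerate(indices):
            folds[(offset + j) % k].append(i)
        offset += len(indices)
    return folds

def nearest_neighbour_predict(train_X, train_y, x):
    """Stand-in classifier: 1-NN with squared Euclidean distance."""
    best = min(
        range(len(train_X)),
        key=lambda i: sum((a - b) ** 2 for a, b in zip(train_X[i], x)),
    )
    return train_y[best]

def repeated_cv_accuracy(X, y, k=10, repeats=10, seed=0):
    """Return one accuracy per fold, k * repeats values in total."""
    rng = random.Random(seed)
    scores = []
    for _ in range(repeats):
        # Each repetition reshuffles the data into a fresh set of k folds.
        for test_idx in stratified_folds(y, k, rng):
            held_out = set(test_idx)
            train_idx = [i for i in range(len(X)) if i not in held_out]
            train_X = [X[i] for i in train_idx]
            train_y = [y[i] for i in train_idx]
            correct = sum(
                nearest_neighbour_predict(train_X, train_y, X[i]) == y[i]
                for i in test_idx
            )
            scores.append(correct / len(test_idx))
    return scores

def make_synthetic_data(n_per_class=50, seed=1):
    """Synthetic two-class data; NOT the kerogen features used in the study."""
    rng = random.Random(seed)
    X, y = [], []
    for label, centre in (("inertinite", 0.0), ("vitrinite", 3.0)):
        for _ in range(n_per_class):
            X.append([rng.gauss(centre, 1.0), rng.gauss(centre, 1.0)])
            y.append(label)
    return X, y

X, y = make_synthetic_data()
scores = repeated_cv_accuracy(X, y)
print(f"{len(scores)} fold scores, mean accuracy {mean(scores):.3f}")
```

Running the protocol for each candidate classifier yields one list of 100 fold accuracies per classifier, and such lists are the input to a paired significance test of the kind reported above. The sketch assumes every fold receives at least one sample, which holds when the data set has at least `k` objects.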
Cite this article
Kuncheva, L.I., Charles, J.J., Miles, N. et al. Automated Kerogen Classification in Microscope Images of Dispersed Kerogen Preparation. Math Geosci 40, 639–652 (2008). https://doi.org/10.1007/s11004-008-9163-7
DOI: https://doi.org/10.1007/s11004-008-9163-7