Multiscale recognition of legume varieties based on leaf venation images
Introduction
Many works in the current literature deal with the problem of automatically identifying plants by means of foliar image analysis. One of the most common approaches consists of performing shape analysis of the leaves (Agarwal et al., 2006, Camargo Neto et al., 2006, Chaki and Parekh, 2012, Du et al., 2007, Im et al., 1998, Solé-Casals et al., 2008). Leaf color and texture can also be taken into consideration. In the work by Pydipati, Burks, and Lee (2006), color texture features of the leaves are used in combination with discriminant analysis to detect citrus diseases. A combination of shape, texture and color features is also used in the papers by Golzarian and Frick (2011) and Bama, Valli, Raju, and Kumar (2011).
However, in some practical situations there are no evident differences in the shape, size, color or texture of the leaves of the plants under study. This is the case, for example, for plants that belong to different cultivars of the same species.
Since leaf venation characteristics are correlated with leaf properties (such as damage and drought tolerance, among others) (Sack et al., 2008, Scoffoni et al., 2011), some works in the recent literature highlight the importance of analyzing the structure of the venation system as a means to perform leaf-based plant identification. In the paper by Park, Hwang, and Nam (2008), a content-based image retrieval system is proposed which analyzes the venation of a leaf sketch drawn by the user as an initial categorization, and then uses shape features to find similar leaves in the database. Clarke et al. (2006) and Valliammal and Geethalakshmi (2011), in turn, propose new methods for leaf vein segmentation. However, neither work includes any characterization or recognition tasks. More recently, Du, Zhai, and Wang (2013) proposed a method based on fractal dimension features computed both on the veins and the leaf outline, and employed k-nearest neighbors to classify different leaves. However, these leaves are visually very different and belong to very different families. Additionally, the computed features do not provide the experts with a simple vein description and may not lend themselves to direct human interpretation.
In a recent work, Price, Symonova, Mileyko, Hilley, and Weitz (2011) developed an interactive graphic tool named LEAF GUI, aimed at thresholding, cleaning and segmenting stained leaf vein and areole images in a user-assisted way. In addition, the software allows the extraction of several measures which are automatically computed on these structures. The segmentation algorithms require that the visibility of the veins be previously enhanced by means of X-ray techniques, chemical or biological clearing, or back-lit scanning. LEAF GUI does not include any feature selection algorithm or any plant classification/recognition procedure.
Agricultural specialists often need to identify which species a certain batch of plants corresponds to. Depending on the species and growth stage of the plants, leaves, flowers, fruits and/or seeds can be used in conjunction to recognize the species. In many other situations, however, this is not possible. For example, if the goal is to differentiate cultivars/varieties of the same species, all the previously mentioned characteristics may be visually the same. One possibility is to perform DNA analysis to accurately determine the variety, but this method is expensive. As an alternative, we propose to investigate the possibility of searching for distinctive venation patterns that could uniquely identify the varieties using a low-cost procedure based on an image analysis and machine learning system.
We showed in a previous work (Larese et al., 2014) that it is possible to recognize different species using exclusively information from the leaf veins. The motivation of the present work is to extend the analysis to the more difficult problem of recognizing varieties of the same species. We search for distinctive leaf vein patterns for different cultivars when all the other leaf characteristics (e.g., shape, color and texture) are similar. If the plants under study have different physiological characteristics (e.g., drought tolerance), there is a chance that these properties are reflected in their veins even if the leaves look similar. In this work we propose an automatic low-cost procedure aimed at segmenting and characterizing the leaf veins of plants from the same family. An automatic procedure is desirable since it provides reliability, reproducibility and economy, besides providing a solution to a problem that is not easily solved by human experts, such as cultivar recognition.
Since this problem is more difficult than separating different species, we propose to measure vein traits from images at different scales. We first try the procedure on the simpler problem of species recognition, showing that the new approach improves the results reported in our previous work (Larese et al., 2014). Next, we analyze the cultivar recognition problem.
We use three legume species and three soybean cultivars in order to perform the species and variety recognition, respectively. The leaves are acquired using a standard flatbed scanner. We perform the automatic plant recognition by means of measuring and classifying morphological traits from central patches extracted from the previously segmented venation system, i.e., no leaf shape, color or texture information is considered. We also analyze the distinctive vein characteristics for each class.
For this purpose, we start by performing segmentation using the Unconstrained Hit-or-Miss Transform (UHMT) and adaptive image thresholding in order to extract the veins at several image scale levels. The UHMT is a mathematical morphology operator useful for template matching. It extracts all the pixels that match a given configuration of foreground and background neighbors.
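The template-matching behavior of a gray-scale hit-or-miss operator can be illustrated with a minimal sketch. It assumes the usual unconstrained definition (the minimum over the foreground neighbors minus the maximum over the background neighbors, clipped at zero), flat structuring elements given as lists of pixel offsets, and edge-replicating padding. The function names, the offsets and the toy image are hypothetical and are not taken from the paper, whose actual templates and scales are described in its segmentation sections.

```python
import numpy as np

def shifted_stack(image, offsets):
    """Stack one shifted copy of `image` per (dy, dx) offset, so that
    entry k at [y, x] equals image[y + dy_k, x + dx_k].
    Borders are handled by replicating edge pixels (an assumption)."""
    r = max(max(abs(dy), abs(dx)) for dy, dx in offsets)
    padded = np.pad(image, r, mode="edge")
    h, w = image.shape
    return np.stack([
        padded[r + dy: r + dy + h, r + dx: r + dx + w]
        for dy, dx in offsets
    ])

def uhmt(image, fg_offsets, bg_offsets):
    """Gray-scale unconstrained hit-or-miss response (illustrative)."""
    image = np.asarray(image, dtype=float)
    fg_min = shifted_stack(image, fg_offsets).min(axis=0)  # erosion by the foreground template
    bg_max = shifted_stack(image, bg_offsets).max(axis=0)  # dilation by the background template
    return np.maximum(fg_min - bg_max, 0.0)

# Hypothetical template: a vertical line one pixel wide, brighter than
# its left and right neighbors.
FG = [(-1, 0), (0, 0), (1, 0)]
BG = [(-1, -1), (0, -1), (1, -1), (-1, 1), (0, 1), (1, 1)]

toy = np.zeros((5, 5))
toy[:, 2] = 10.0          # a bright vertical line on a dark background
response = uhmt(toy, FG, BG)
```

In this toy case the response is nonzero only along the bright line. In a vein segmentation setting, such a response would typically be thresholded afterwards, which is the role the adaptive thresholding step plays in the text above.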
After segmentation, we compute several morphological measures on the segmented veins at the different scales, and use them as features in the classification process. The recognition is performed with three different classifiers, namely, Random Forests (Breiman, 2001), Support Vector Machines with Gaussian kernel (Vapnik, 1995) and Penalized Discriminant Analysis (Hastie, Buja, & Tibshirani, 1995). Recursive Feature Elimination (Guyon, Weston, Barnhill, & Vapnik, 2002) is also used in combination with the three classifiers in order to estimate the importance of the input variables in the classification process for the different species and varieties.
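The general idea of Recursive Feature Elimination can be sketched as follows: fit a model, rank the features, discard the lowest-ranked one, and repeat. The sketch below deliberately swaps in a simple ridge-regularized least-squares scorer on a two-class problem to keep the example short. It is not one of the three classifiers used in the paper, and the function names, the penalty value and the synthetic data are assumptions made only for illustration.

```python
import numpy as np

def ridge_weights(X, y, lam=1e-3):
    """Least-squares weights for labels in {-1, +1} with a small ridge
    penalty `lam` (a stand-in ranking model, not the paper's classifiers)."""
    n_features = X.shape[1]
    gram = X.T @ X + lam * np.eye(n_features)
    return np.linalg.solve(gram, X.T @ y)

def recursive_feature_elimination(X, y, n_keep):
    """Repeatedly refit on the surviving features and drop the one with
    the smallest absolute weight. Returns (kept, eliminated) as lists of
    original column indices; `eliminated` is in order of removal."""
    remaining = list(range(X.shape[1]))
    eliminated = []
    while len(remaining) > n_keep:
        w = ridge_weights(X[:, remaining], y)
        worst = int(np.argmin(np.abs(w)))
        eliminated.append(remaining.pop(worst))
    return remaining, eliminated

# Synthetic data: only columns 0 and 3 carry class information.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 6))
y = np.sign(3.0 * X[:, 0] - 2.0 * X[:, 3])

kept, eliminated = recursive_feature_elimination(X, y, n_keep=2)
```

The order in which features are eliminated gives a ranking of their relevance, which is the kind of information used in the text to discuss which vein traits matter for each class.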
The analysis is performed on two different problems. First, we consider the discrimination between three classes of legumes, namely soybean (Glycine max (L.) Merr.), and red and white beans (Phaseolus vulgaris). Red and white beans have very similar leaves, which are slightly darker for the former. However, in this work we do not consider color information, but only morphological features of the veins obtained from the gray scale images.
The second problem consists of identifying three different cultivars of soybean. This task is far more challenging, since the differences in the veins are not obvious to human experts. Automatic classification would address this issue in an inexpensive way. Additionally, the procedure would highlight relevant distinctive vein features for each cultivar, and possibly help to relate these differences to variety adaptation.
The rest of the paper is organized as follows. In Section 2.1 we describe the leaf image dataset. Sections 2.2 and 2.3 summarize the segmentation procedure that we employed to extract the leaf venation system. We detail the measures computed on the segmented veins in Section 2.4. We briefly describe the classification and feature selection algorithms in Section 2.5. We present and discuss the results in Section 3, where we assess the performance of the procedure and analyze the relevant features. Finally, we draw some conclusions in Section 4.
Leaf images dataset
The dataset used in this paper is composed of a total of 866 color leaf images provided by Instituto Nacional de Tecnología Agropecuaria (INTA, Oliveros, Argentina). The dataset is divided as follows: 422 images correspond to soybean leaves (198 belong to cultivar 1, 176 belong to cultivar 2, and 48 belong to cultivar 3), 272 images are from red bean leaves and 172 from white bean leaves. They are the images of the first foliage leaves (pre-formed in the seed) of 433 specimens
Results and discussion
The total number of features computed per leaf amounts to 208, i.e., 52 features × 4 patches (combined veins and 3 scales). As a preprocessing step, all the features exhibiting near zero variance across the examples were discarded. Also, the data were normalized (centered and scaled). For each one of the three classifiers described in Section 2.5, both the whole set of features and a subset composed of the optimal number of relevant features (according to RFE) were considered. We also compared the
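The preprocessing described above (discarding near zero variance features, then centering and scaling) can be sketched as follows. The exact near-zero-variance criterion is not given in this excerpt, so the plain variance threshold `var_tol`, the function name and the toy matrix are assumptions made only for illustration.

```python
import numpy as np

N_FEATURES = 52 * 4  # 52 measures on each of 4 patches, i.e. 208 per leaf

def preprocess(features, var_tol=1e-8):
    """Drop columns whose variance does not exceed `var_tol`, then center
    and scale the surviving columns to zero mean and unit standard
    deviation. Returns the standardized matrix and the boolean mask of
    kept columns."""
    features = np.asarray(features, dtype=float)
    keep = features.var(axis=0) > var_tol
    kept = features[:, keep]
    standardized = (kept - kept.mean(axis=0)) / kept.std(axis=0)
    return standardized, keep

# Toy matrix with 5 examples and 3 features; the middle one is constant.
toy = np.column_stack([
    np.arange(5.0),
    np.full(5, 7.0),
    np.array([2.0, 4.0, 4.0, 4.0, 6.0]),
])
standardized, keep = preprocess(toy)
```

When the data are split for evaluation, the mask, means and standard deviations would normally be estimated on the training portion only and then applied to the held-out portion, so that no information leaks from the test examples.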
Conclusions
In this work we show how an automatic image analysis and machine learning system can be implemented to classify leaves from different species and varieties. This system can provide a reliable, repeatable and economical means of recognizing plants, outperforming manual expert classification. By focusing on vein features only, we attempt to deal with the problem of visually similar leaves from different cultivars or varieties, which would otherwise need to be processed by expensive methods, such as
Acknowledgments
MGL, AEB and PMG acknowledge grant support from ANPCyT PICT 2012-0181.
References (30)
- et al. (2007). Leaf shape based plant species recognition. Applied Mathematics and Computation.
- et al. (2013). Recognition of plant leaf image based on fractal dimension features. Neurocomputing.
- et al. (1997). Wrappers for feature subset selection. Artificial Intelligence.
- et al. (2014). Automatic classification of legumes using leaf vein image features. Pattern Recognition.
- et al. (2008). Utilizing venation features for efficient leaf image retrieval. Journal of Systems and Software.
- et al. (2006). Identification of citrus disease using color texture features and discriminant analysis. Computers and Electronics in Agriculture.
- et al. (2006). First steps toward an electronic field guide for plants. Taxon, Journal of the International Association for Plant Taxonomy.
- et al. (2011). Content based leaf image retrieval (CBLIR) using shape, color and texture features. Indian Journal of Computer Science and Engineering.
- (2001). Random forests. Machine Learning.
- et al. (2006). Plant species identification using Elliptic Fourier leaf shape analysis. Computers and Electronics in Agriculture.
- Designing an automated system for plant leaf recognition. International Journal of Advances in Engineering & Technology.
- Classification of images of wheat, ryegrass and brome grass species at early growth stages using principal component analysis. Plant Methods.
- Gene selection for cancer classification using support vector machines. Machine Learning.
- Penalized discriminant analysis. Annals of Statistics.