Multiscale recognition of legume varieties based on leaf venation images

doi:10.1016/j.eswa.2014.01.029

Expert Systems with Applications

Volume 41, Issue 10, August 2014, Pages 4638-4647

https://doi.org/10.1016/j.eswa.2014.01.029

Highlights

• We develop an automatic low-cost procedure to classify legume varieties.
• The method is based on multiscale feature analysis of leaf venation images.
• We use modern automatic classifiers and feature selection techniques.
• We improve previous results in the recent literature.
• The proposed method outperforms human expert classification.

Abstract

In this work we propose an automatic low-cost procedure aimed at classifying legume species and varieties based exclusively on the characterization and analysis of the leaf venation network. Identifying leaf venation patterns that are characteristic of each species or variety is not an easy task, since in some situations (especially for cultivars of the same species) the vein differences are visually indistinguishable to humans. The proposed procedure takes as input leaf images acquired using a standard scanner, processes the images in order to segment the veins at different scales, and measures different traits on them. We use these features in combination with modern automatic classifiers and feature selection techniques in order to perform recognition. The process was initially applied to recognize three different legumes in order to evaluate the improvements over previous works in the literature, and then it was employed to distinguish three diverse soybean cultivars. The results show the improvements achieved by using the multiscale features. Cultivar recognition is a more challenging problem, since experts cannot discern evident differences with the naked eye. Nevertheless, we achieve acceptable classification results. We also analyze the feature relevance and identify, for each classifier, a small set of distinctive traits to differentiate the species and varieties.

Introduction

Many works in the current literature deal with the problem of automatically identifying plants by means of foliar image analysis. One of the most common approaches consists of performing shape analysis of the leaves (Agarwal et al., 2006, Camargo Neto et al., 2006, Chaki and Parekh, 2012, Du et al., 2007, Im et al., 1998, Solé-Casals et al., 2008). Leaf color and texture can also be taken into consideration. In the work by Pydipati, Burks, and Lee (2006), color texture features of the leaves are used in combination with discriminant analysis to detect citrus diseases. Also, a combination of shape, texture and color features is used in the papers by Golzarian and Frick (2011) and Bama, Valli, Raju, and Kumar (2011).

However, in some practical situations there are no evident differences in the shape, size, color or texture features of the leaves of the plants under study. This is the case, for example, for plants that belong to several cultivars of the same species.

Since there is a correlation between leaf venation characteristics and leaf properties (such as damage and drought tolerance, among others) (Sack et al., 2008, Scoffoni et al., 2011), some works in the recent literature highlight the importance of analyzing the structure of the venation system as a means to perform leaf-based plant identification. In the paper by Park, Hwang, and Nam (2008), a content-based image retrieval system is proposed which analyzes the venation of a leaf sketch drawn by the user as an initial categorization, and then uses shape features to find similar leaves existing in the database. On the other hand, Clarke et al. (2006) and Valliammal and Geethalakshmi (2011) propose new methods for leaf vein segmentation. However, neither work includes any characterization or recognition tasks. More recently, Du, Zhai, and Wang (2013) proposed a method based on fractal dimension features computed both on the veins and the leaf outline, and employed k-nearest neighbors to perform classification on different leaves. However, these leaves are visually very different and belong to very different families. Additionally, the computed features do not provide the experts with a simple vein description and may not lend themselves to direct human interpretation.

In a recent work, Price, Symonova, Mileyko, Hilley, and Weitz (2011) developed an interactive graphic tool named LEAF GUI, aimed at thresholding, cleaning and segmenting stained leaf vein and areole images in a user-assisted way. In addition, the software allows the extraction of several measures which are automatically computed on these structures. The segmentation algorithms require that the visibility of the veins be previously enhanced by means of X-ray techniques, chemical or biological clearing, or back-lit scanning. LEAF GUI does not include any feature selection algorithm or any plant classification/recognition procedure.

Agricultural specialists often need to identify which species a certain batch of plants corresponds to. Depending on the species and growth stage of the plants, leaves, flowers, fruits and/or seeds can be used in conjunction to recognize the species. But in many other situations, this is not possible. For example, if the goal is to differentiate diverse cultivars/varieties of the same species, all the previously mentioned characteristics may be visually the same. One possibility is to perform DNA analysis to accurately determine the variety, but this method is expensive. As an alternative, we propose to investigate the possibility of searching for distinctive venation patterns that could uniquely identify the varieties using a low-cost procedure based on an image analysis and machine learning system.

We showed in a previous work (Larese et al., 2014) that it is possible to recognize different species using exclusively information from the leaf veins. The motivation of the present work is to extend the analysis to the more difficult problem of recognizing varieties of the same species. We search for the existence of distinctive leaf vein patterns for different cultivars, when all the other leaf characteristics (e.g., shape, color and texture) are similar. If the plants under study have different physiological characteristics (e.g., drought tolerance), there is a chance that these properties are reflected in their veins even if the leaves look similar. In this work we propose an automatic low-cost procedure aimed at segmenting and characterizing the leaf veins of plants from the same family. An automatic procedure is desirable since it provides reliability, reproducibility and economy, besides providing a solution to a problem which is not easily solved by human experts, such as cultivar recognition.

Since this problem is more difficult than separating different species, we propose to measure vein traits from images at different scales. We first try the procedure on the simpler problem of species recognition, showing that the new approach improves the results reported in our previous work (Larese et al., 2014). Next, we analyze the cultivar recognition problem.

We use three legume species and three soybean cultivars in order to perform the species and variety recognition, respectively. The leaves are acquired using a standard flatbed scanner. We perform the automatic plant recognition by means of measuring and classifying morphological traits from central patches extracted from the previously segmented venation system, i.e., no leaf shape, color or texture information is considered. We also analyze the distinctive vein characteristics for each class.

For this purpose, we start by performing segmentation using the Unconstrained Hit-or-Miss Transform (UHMT) and adaptive image thresholding in order to extract the veins at several image scale levels. The UHMT is a mathematical morphology operator useful to perform template matching. It extracts all the pixels which follow a certain foreground and background neighboring configuration.
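
The template-matching behaviour described above can be illustrated with a minimal sketch. It assumes the usual grey-scale definition of the UHMT, in which the response at a pixel is the erosion by the foreground template minus the dilation by the background template, clipped at zero. The function name `uhmt`, the structuring elements and the toy image are illustrative assumptions, not the configurations used in the paper.

```python
def uhmt(image, fg_offsets, bg_offsets):
    """Grey-scale unconstrained hit-or-miss transform (illustrative sketch).

    image: list of rows of grey levels.
    fg_offsets / bg_offsets: (row, col) offsets of the foreground and
    background templates (hypothetical shapes chosen by the caller).
    """
    rows, cols = len(image), len(image[0])
    out = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            fg = [(r + dr, c + dc) for dr, dc in fg_offsets]
            bg = [(r + dr, c + dc) for dr, dc in bg_offsets]
            # Leave the pixel at 0 when the template does not fit in the image.
            if any(not (0 <= y < rows and 0 <= x < cols) for y, x in fg + bg):
                continue
            erosion = min(image[y][x] for y, x in fg)   # darkest foreground pixel
            dilation = max(image[y][x] for y, x in bg)  # brightest background pixel
            out[r][c] = max(erosion - dilation, 0)
    return out


# Toy image: a one-pixel-wide vertical "vein", assumed brighter than its background.
image = [[50, 50, 200, 50, 50] for _ in range(5)]
vertical_fg = [(-1, 0), (0, 0), (1, 0)]  # three stacked vein pixels
side_bg = [(0, -1), (0, 1)]              # left and right neighbours
response = uhmt(image, vertical_fg, side_bg)
```

Only pixels whose whole foreground neighbourhood is brighter than every background neighbour obtain a positive response, so a template of a given orientation and width responds to veins of that orientation and width. Repeating the operation with several templates and image scales is one plausible way to obtain a multiscale vein map.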

After segmentation, we compute several morphological measures on the segmented veins at the different scales, and use them as features in the classification process. Recognition is performed using three different classifiers, namely, Random Forests (Breiman, 2001), Support Vector Machines with Gaussian kernel (Vapnik, 1995) and Penalized Discriminant Analysis (Hastie, Buja, & Tibshirani, 1995). Recursive Feature Elimination (Guyon, Weston, Barnhill, & Vapnik, 2002) is also used in combination with the three classifiers in order to estimate the importance of the input variables in the classification process for the different species and varieties.
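
The idea behind Recursive Feature Elimination can be sketched in a few lines: train a model, rank the features by the magnitude of their weights, discard the weakest one, and repeat. The sketch below is a simplification under stated assumptions: a plain logistic regression stands in for the three classifiers named above, the data are synthetic, and the helper names `train_logistic` and `rfe` are hypothetical.

```python
import math
import random


def train_logistic(X, y, epochs=100, lr=0.5):
    """Fit a logistic regression (no bias term) by batch gradient descent."""
    n, d = len(X), len(X[0])
    w = [0.0] * d
    for _ in range(epochs):
        grad = [0.0] * d
        for xi, yi in zip(X, y):
            z = sum(wj * xj for wj, xj in zip(w, xi))
            z = max(-30.0, min(30.0, z))  # keep exp() in a safe numeric range
            p = 1.0 / (1.0 + math.exp(-z))
            for j in range(d):
                grad[j] += (p - yi) * xi[j]
        w = [wj - lr * gj / n for wj, gj in zip(w, grad)]
    return w


def rfe(X, y, n_keep):
    """Retrain on the surviving features and drop the weakest one each round."""
    active = list(range(len(X[0])))
    while len(active) > n_keep:
        X_sub = [[row[j] for j in active] for row in X]
        w = train_logistic(X_sub, y)
        weakest = min(range(len(active)), key=lambda k: w[k] ** 2)
        del active[weakest]
    return active


# Synthetic data (an assumption): only features 0 and 1 determine the label,
# features 2 to 5 are pure noise.
random.seed(0)
X, y = [], []
for _ in range(200):
    row = [random.gauss(0, 1) for _ in range(6)]
    X.append(row)
    y.append(1 if row[0] + row[1] > 0 else 0)
selected = rfe(X, y, n_keep=2)
```

Because the model is retrained after every elimination, the ranking can change from round to round, which is what distinguishes this wrapper approach from a one-shot filter on feature scores.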

The analysis is performed on two different problems. First, we consider the discrimination between three classes of legumes, namely soybean (Glycine max (L.) Merr.) and red and white beans (Phaseolus vulgaris). Red and white beans have very similar leaves, which are slightly darker for the former. However, in this work we do not consider color information, but only morphological features of the veins obtained from the grayscale images.

The second problem consists of identifying three different cultivars of soybean. This task is far more challenging, since the differences in the veins are not obvious to human experts. Automatic classification would solve this issue in an inexpensive way. Additionally, the procedure would highlight relevant distinctive vein features for each cultivar, and possibly help to relate these differences to variety adaptation.

The rest of the paper is organized as follows. In Section 2.1 we describe the leaf image dataset. Sections 2.2 (Unconstrained Hit-or-Miss Transform, UHMT) and 2.3 (Vein segmentation) summarize the segmentation procedure that we employed to extract the leaf venation system. We detail the measures computed on the segmented veins in Section 2.4. We briefly describe the classification and feature selection algorithms in Section 2.5. We present and discuss the results in Section 3, where we assess the performance of the procedure and analyze the relevant features. Finally, we draw some conclusions in Section 4.

Section snippets

Leaf images dataset

The dataset used in this paper is composed of a total of 866 color leaf images provided by Instituto Nacional de Tecnología Agropecuaria (INTA, Oliveros, Argentina). The dataset is divided as follows: 422 images correspond to soybean leaves (198 belong to cultivar 1, 176 belong to cultivar 2, and 48 belong to cultivar 3), 272 images are from red bean leaves and 172 from white bean leaves. They are the images of the first foliage leaves (pre-formed in the seed) of 433 specimens

Results and discussion

The total number of features computed per leaf amounts to 208, i.e., 52 features × 4 patches (combined veins and 3 scales). As a preprocessing step, all the features exhibiting near-zero variance across the examples were discarded. Also, the data were normalized (centered and scaled). For each of the three classifiers described in Section 2.5, both the whole set of features and a subset composed of the optimal number of relevant features (according to RFE) were considered. We also compared the
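
The preprocessing described in this snippet (discarding near-constant features, then centering and scaling) can be sketched as follows. The exact near-zero-variance criterion is not given in this excerpt, so a plain variance threshold is assumed here; the function name `preprocess`, the threshold value and the toy matrix are illustrative only.

```python
import statistics


def preprocess(rows, var_threshold=1e-8):
    """Drop near-constant columns, then centre and scale the remaining ones."""
    columns = list(zip(*rows))
    # Assumed criterion: keep a feature only if its variance exceeds the threshold.
    kept = [j for j, col in enumerate(columns)
            if statistics.pvariance(col) > var_threshold]
    scaled_cols = []
    for j in kept:
        mean = statistics.mean(columns[j])
        std = statistics.pstdev(columns[j])
        scaled_cols.append([(v - mean) / std for v in columns[j]])
    # Transpose back to one row per leaf.
    return kept, [list(r) for r in zip(*scaled_cols)]


N_FEATURES = 52 * 4  # 52 traits x 4 patches = 208 features per leaf

# Toy feature matrix (4 leaves x 3 features); the middle feature is constant.
toy = [
    [1.0, 5.0, 10.0],
    [2.0, 5.0, 20.0],
    [3.0, 5.0, 30.0],
    [4.0, 5.0, 40.0],
]
kept, scaled = preprocess(toy)
```

In this toy case the constant middle column is removed and the two surviving columns end up with zero mean and unit standard deviation, which keeps features measured in different units from dominating scale-sensitive classifiers such as a Gaussian-kernel SVM.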

Conclusions

In this work we show how an automatic image analysis and machine learning system can be implemented to classify leaves from different species and varieties. This system can provide a reliable, repeatable and economical means of recognizing plants, outperforming manual expert classification. By focusing on vein features only, we attempt to deal with the problem of visually similar leaves from different cultivars or varieties, which would otherwise need to be processed by expensive methods, such as

Acknowledgments

MGL, AEB and PMG acknowledge grant support from ANPCyT PICT 2012-0181.

References (30)

J.-X. Du et al. Leaf shape based plant species recognition. Applied Mathematics and Computation (2007)
J.-X. Du et al. Recognition of plant leaf image based on fractal dimension features. Neurocomputing (2013)
R. Kohavi et al. Wrappers for feature subset selection. Artificial Intelligence (1997)
M.G. Larese et al. Automatic classification of legumes using leaf vein image features. Pattern Recognition (2014)
J. Park et al. Utilizing venation features for efficient leaf image retrieval. Journal of Systems and Software (2008)
R. Pydipati et al. Identification of citrus disease using color texture features and discriminant analysis. Computers and Electronics in Agriculture (2006)
G. Agarwal et al. First steps toward an electronic field guide for plants. Taxon, Journal of the International Association for Plant Taxonomy (2006)
B.S. Bama et al. Content based leaf image retrieval (CBLIR) using shape, color and texture features. Indian Journal of Computer Science and Engineering (2011)
L. Breiman. Random forests. Machine Learning (2001)
J. Camargo Neto et al. Plant species identification using Elliptic Fourier leaf shape analysis. Computers and Electronics in Agriculture (2006)
J. Chaki et al. Designing an automated system for plant leaf recognition. International Journal of Advances in Engineering & Technology (2012)
Clarke, J., Barman, S., Remagnino, P., Bailey, K., Kirkup, D., Mayo, S., Wilkin, P. (2006). Venation pattern analysis...
M.R. Golzarian et al. Classification of images of wheat, ryegrass and brome grass species at early growth stages using principal component analysis. Plant Methods (2011)
I. Guyon et al. Gene selection for cancer classification using support vector machines. Machine Learning (2002)
T. Hastie et al. Penalized discriminant analysis. Annals of Statistics (1995)

Cited by (49)

Symmetry-constrained linear sliding co-occurrence LBP for fine-grained leaf image retrieval
2024, Computers and Electronics in Agriculture
Fine-grained leaf image retrieval (FGLIR) is a new and challenging issue that has not yet been well studied in the research community of content-based image retrieval (CBIR). It focuses on similarity retrieval of leaf images at the level of subspecies (cultivar) instead of species, which has been extensively researched in CBIR. In this paper, we propose a novel co-occurrence texture feature representation, named Symmetry-Constrained Linear Sliding Co-occurrence LBP (SCLS-CoLBP), to address this issue. Unlike the existing co-occurrence local binary patterns (CoLBPs) which use a circular neighborhood structure and spatially adjacent correlation to produce co-occurrence texture features, the proposed SCLS-CoLBP is designed to slide a pair of axisymmetric lines over a wings-like adaptive local patch to capture the spatial symmetry context, gray-scale correlation, and directional information in texture patterns. By incorporating multi-scale, multi-position and multi-orientation information, the SCLS-CoLBP statistical features are extended to an overall and dense local description. To achieve compact representation and efficient matching, we use the common bag-of-words (BoW) model with the proposed smoothing-coding scheme to aggregate the SCLS-CoLBP local descriptors into an image-level signature. Extensive experiments have been performed on several challenging FGLIR tasks. All the experimental results show that the proposed method has superior performance over the state of the art, excellent complementarity to deep features, and promising generalization ability to different FGLIR tasks.
CUDU-Net: Collaborative up-sampling decoder U-Net for leaf vein segmentation
2024, Digital Signal Processing: A Review Journal
Leaf vein is a common visual pattern in nature which provides potential clues for species identification, health evaluation, and variety selection of plants. However, as a critical step in leaf vein pattern analysis, segmenting veins from leaf images remains unaddressed due to their hierarchical curvilinear structure and busy background. In this study, we design, for the first time, a deep model which is tailored to address the segmentation of the overall leaf vein structure. The proposed deep model, termed Collaborative Up-sampling Decoder U-Net (CUDU-Net), is an improved U-Net structure consisting of a fine-tuned ResNet extractor and a collaborative up-sampling decoder. The ResNet extractor utilizes a residual module to explore high-dimensional features that are representative and abstract in the hidden layers of the network. The core of CUDU-Net is the collaborative up-sampling decoder, which utilizes the complementarity of bilinear interpolation and deconvolution to enhance the decoding capability of the model. The bilinear interpolation can recover key veins while the deconvolution actively learns to supplement more fine-grained features of the tertiary veins. In addition, we embed strip pooling in the skip-connection to distill the vein-related semantics for performance boosting. Two leaf vein segmentation datasets, termed SoyVein500 and CottVein20, are built for model validation and generalization ability tests. The extensive experimental results show that our proposed CUDU-Net outperforms the state-of-the-art methods in both segmentation accuracy and generalization ability.
Leaf classification on Flavia dataset: A detailed review
2023, Sustainable Computing: Informatics and Systems
For decades, vision scientists have contemplated the topic of plant species classification. As plants are of great importance to medicinal research, they are utilized in a wide range of medications. Plants are required in a variety of ways in order to save the species from extinction and provide an abundance of food through agriculture. Therefore, botanists and computer scientists must conduct extensive plant species research. The plant resources are necessary for the survival of the world’s nations. The purpose of this paper is to examine the frequently utilized and publicly accessible dataset for plant classification in the past. We explored over 200 research papers for a deep understanding of the area. Briefly described are the procedural advancements and developments in the field of leaf classification. All the major techniques with significant advancements, the new effective approaches, and the novel techniques are discussed in this research. For the benefit of future researchers, the findings, research gap and transition, and coherence of algorithms in terms of several measurements are underlined. The hundreds of publications on a single benchmark dataset illustrate the progression of the recognition process, improvements, and innovations.
SPARE: Self-supervised part erasing for ultra-fine-grained visual categorization
2022, Pattern Recognition
This paper presents SPARE, a self-supervised part erasing framework for ultra-fine-grained visual categorization. The key insight of our model is to learn discriminative representations by encoding a self-supervised module that performs random part erasing and prediction on the contextual position of the erased parts. This drives the network to exploit intrinsic structure of data, i.e., understanding and recognizing the contextual information of the objects, thus facilitating more discriminative part-level representation. This also enhances the learning capability of the model by introducing more diversified training part segments with semantic meaning. We demonstrate that our approach is able to achieve strong performance on seven publicly available datasets covering ultra-fine-grained visual categorization and fine-grained visual categorization tasks.
Fusing deep learning features of triplet leaf image patterns to boost soybean cultivar identification
2022, Computers and Electronics in Agriculture
Citation Excerpt :
Recently, there is an increasing concern whether leaf image patterns can also provide powerful discriminative information for cultivar-level recognition. Larese et al. (2014a) proposed to characterize and analyze the leaf venation network for distinguishing legume varieties. They reported an average accuracy of 60.20% on distinguishing three diverse soybean cultivars using the best version of their proposed methods.
Soybean cultivar recognition plays a vital role in cultivar evaluation, selection and production. Recently, there has been an increasing interest in taking leaf image patterns as clues for distinguishing soybean cultivars. However, due to the higher inter-class similarity of soybean cultivars compared with plant species, the cultivar classification accuracies reported by the existing methods are far lower than those published on plant species recognition, which raises concern in the computer vision community about whether leaf image patterns can provide sufficient discriminative information for identifying soybean cultivars. In this paper, we explore fusing deep learning features of leaves from different parts of soybean plants to achieve accurate cultivar recognition. In our method, the deep learning features of triplet leaf image patterns, which consist of leaves from the lower, middle, and upper parts of soybean plants, are fused by two methods, distance fusion and classifier fusion. In the former, the L₁ distance measurements defined on the deep feature spaces of triplet leaf image patterns are fused prior to using a 1NN classifier for classification. In the latter, the SVM classifiers trained on the deep features of triplet leaf image patterns are combined by the sum rule for cultivar prediction. We use the SoyCultivar200 leaf dataset, which consists of 6000 samples from 200 soybean cultivars, as benchmark. Our method achieves a classification rate of 83.55%, which demonstrates that our proposed fusion of deep features of triplet leaf image patterns can provide strong discriminative information for accurately identifying soybean cultivars.
MaskCOV: A random mask covariance network for ultra-fine-grained visual categorization
2021, Pattern Recognition
Ultra-fine-grained visual categorization (ultra-FGVC) categorizes objects with more similar patterns between classes than those in fine-grained visual categorization (FGVC), e.g., where the spectrum of granularity significantly moves down from classifying species to classifying cultivars within the same species. It is considered an open research problem mainly due to the following challenges. First, the inter-class differences among images are much smaller (e.g., cultivars in the same species) than those in current FGVC tasks (e.g., species). Second, there are only a few samples per category, which is beyond the ability of most convolutional neural network methods that favor large training data. To address these problems, we propose a novel random mask covariance network (MaskCOV), which integrates an auxiliary self-supervised learning module with a powerful in-image data augmentation scheme for the ultra-FGVC. Specifically, we first uniformly partition input images into patches and then augment data by randomly shuffling and masking these patches. On top of that, we introduce an auxiliary self-supervised learning module of predicting the spatial covariance context of these patches to increase discriminability of our network for classification. Very encouraging experimental results of the proposed method in comparison with the state-of-the-art benchmarks demonstrate its superiority and the potential of the MaskCOV concept, which pushes the research boundary forward from fine-grained to ultra-fine-grained visual categorization.
