An effective approach for improving the accuracy of a random forest classifier in the classification of Hyperion data

  • Original Paper
Applied Geomatics

Abstract

Random forest (RF) is one of the most powerful ensemble classifiers and is widely used in machine learning applications. It has performed well on many benchmark datasets. However, the performance of an RF model depends strongly on the calibration of its parameters. Two parameters must be optimized: (i) the size of the forest (the number of trees) and (ii) the number of features. RF is based on the principle of bagging and random selection of relevant features. This paper presents an effective method for improving the classification accuracy of RF. Principal component analysis (PCA) was used to reduce the dimensionality of the spectral bands, whereas correlation-based feature selection (CFS) was used to identify the optimal set of features. The RF was initialized with 10 random trees and grown in increments of 10 trees, with a variable number of features, until the model achieved its highest accuracy. The model was tested with variable sample sizes to assess its effectiveness. The investigation was carried out on Hyperion sensor data from the Earth Observing-1 (EO-1) satellite. With the optimized set of features and number of random trees as base classifiers, the performance of RF improved significantly in terms of both predictive ability and computational cost. Compared with other advanced classifiers such as a support vector machine (SVM), a multilayer perceptron (MLP), and a maximum likelihood classifier (MLC), the optimized RF outperformed all of them.
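
The tuning procedure described in the abstract (start at 10 trees, add 10 at a time, vary the number of features, keep the most accurate configuration) can be sketched roughly as follows. This is an illustrative sketch only, not the authors' implementation. The names `tune_forest`, `evaluate`, and `toy_accuracy` are hypothetical, the upper limit `max_trees` is an assumption (the abstract does not state a stopping rule beyond reaching the highest accuracy), and `toy_accuracy` is a made-up stand-in for real model training, not a reported result.

```python
from typing import Callable, Iterable, Tuple


def tune_forest(
    evaluate: Callable[[int, int], float],
    feature_counts: Iterable[int],
    start_trees: int = 10,
    step: int = 10,
    max_trees: int = 200,  # assumed cap; not specified in the abstract
) -> Tuple[int, int, float]:
    """Return (n_trees, n_features, accuracy) for the best configuration.

    Ties keep the configuration found first, so a smaller forest is
    preferred when a larger one is no more accurate.
    """
    candidates = list(feature_counts)
    best = (0, 0, float("-inf"))
    for n_trees in range(start_trees, max_trees + 1, step):
        for n_features in candidates:
            accuracy = evaluate(n_trees, n_features)
            if accuracy > best[2]:
                best = (n_trees, n_features, accuracy)
    return best


def toy_accuracy(n_trees: int, n_features: int) -> float:
    # Hypothetical scoring function that peaks at 60 trees and 4 features.
    # A real evaluate() would train an RF and return held-out accuracy.
    return 0.9 - abs(n_trees - 60) / 1000 - abs(n_features - 4) / 100


print(tune_forest(toy_accuracy, range(1, 9)))  # → (60, 4, 0.9)
```

In practice, `evaluate` might wrap a library classifier, for example scikit-learn's `RandomForestClassifier` with its `n_estimators` and `max_features` parameters, and return cross-validated accuracy on the PCA-reduced, CFS-selected features. That pairing is a suggestion for illustration; the acknowledgements indicate the study itself relied on WEKA.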

Notes

  1. National Aeronautics and Space Administration

  2. ODs: training datasets with all the features

  3. DRs: training datasets with the optimal set of features

Acknowledgements

The authors thank the North Eastern Space Applications Centre, Department of Space, Government of India, Umiam, Meghalaya, India, for providing guidance and support during the study. The authors also acknowledge the teams behind the WEKA and ImageJ software, which played an important role in the investigation.

Author information

Corresponding author

Correspondence to Dibyajyoti Chutia.

About this article

Cite this article

Chutia, D., Borah, N., Baruah, D. et al. An effective approach for improving the accuracy of a random forest classifier in the classification of Hyperion data. Appl Geomat 12, 95–105 (2020). https://doi.org/10.1007/s12518-019-00281-8

  • DOI: https://doi.org/10.1007/s12518-019-00281-8
