Wetland vegetation distribution modelling for the identification of constraining environmental variables

  • Research Article
  • Landscape Ecology

Abstract

Wetland ecosystems are of primary concern for nature conservation and restoration. Adequate conservation and restoration strategies emerge from a scientific understanding of wetland properties and processes. In this context, understanding how plant species and vegetation patterns relate to environmental gradients is an important issue. The modelling approaches in this study statistically relate vegetation patterns to measured environmental gradients in a lowland wetland ecosystem. The measured gradients covered groundwater quantity and quality, soil properties and vegetation management. The objective was to identify, among these, the key environmental gradients constraining the vegetation, using recently developed methodologies within the modelling approaches. Comparison of the results indicated that different methodologies considered different environmental gradients to be important.

Abbreviations

MLR: Multiple logistic regression

RF: Random forest

Acknowledgements

The authors wish to thank the special research fund (BOF, project nr 011/015/04) of Ghent University, and the Fund for Scientific Research-Flanders (operating and equipment grant 1.5.108.03). We are grateful to Willy Huybrechts and Piet De Becker from the Institute of Nature Conservation, Belgium, for providing the data gathered through the Flemish Research Programme on Nature Development (projects VLINA 96/03 and VLINA 00/16), and to Rudi Hoeben for computer assistance.

Author information

Corresponding author

Correspondence to J. Peters.

Appendix A: Variable importance measure

The algorithm for estimating the importance of each predictor variable within random forests is as follows:

  (i) For i = 1 to k do (growing a random forest consisting of k classification trees):

    (1) apply tree i to its n out-of-bag (oob) elements and count the number of correct classifications over the n oob elements (\(C_{i,\mathrm{untouched}}\));

    (2) for j = 1 to p (with p the total number of variables) do:

      (a) take the n untouched oob elements;

      (b) randomly permute the values of variable j in the n oob elements;

      (c) apply tree i to the variable-j-permuted oob elements;

      (d) count the number of correct classifications (\(C_{i,j\text{-permuted}}\));

      (e) subtract the number of correct classifications of the variable-j-permuted oob elements from the number of correct classifications of the untouched oob elements and divide by the number of oob elements (\(\Delta C_{i,j} = (C_{i,\mathrm{untouched}} - C_{i,j\text{-permuted}})/n\)).

The results from these iterations are p groups (one per variable, j = 1 to p) of k values \(\Delta C_{i,j}\) (one per tree, i = 1 to k). Since the trees are independent, correlations among the \(\Delta C_{i,j}\) values within each of the p groups are generally low. Finally:

  (ii) For each of the j = 1 to p groups, the mean \(\Delta C_{i,j}\) over all i = 1 to k trees is calculated \((\overline{\Delta C_j}=\sum^{k}_{i=1}{\Delta C_{i,j}/{k}}).\) The value \(\overline{\Delta C_j}\times 100\) is referred to as the ‘mean importance score’ of variable j. A single \(\Delta C_{i,j}\) is positive when \(C_{i,\mathrm{untouched}} > C_{i,j\text{-permuted}}\) and negative when \(C_{i,\mathrm{untouched}} < C_{i,j\text{-permuted}}\). Mean importance scores have high values when the classification error increases by permuting the values of variable j.

  (iii) Since correlations of the \(\Delta C_{i,j}\) scores are generally low within the j = 1 to p groups, a standard error can be calculated for each of the p groups of i = 1 to k \(\Delta C_{i,j}\) scores. Divide \(\overline{\Delta C_j}\) by the standard error (se) to obtain a z-score for variable j, and assign a significance level assuming normality.
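Steps (i) to (iii) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `permutation_importance`, the toy data, and the hand-made threshold "trees" (simple callables instead of grown classification trees) are all hypothetical. The standard error is taken as the sample standard deviation of the k values divided by \(\sqrt{k}\), which the text implies but does not spell out.

```python
import math
import random
from statistics import mean, stdev


def permutation_importance(trees, oob_sets, X, y, seed=0):
    """Mean importance score and z-score per variable, following steps (i)-(iii).

    trees    : list of k callables, each mapping a row (list) to a class label
    oob_sets : list of k index lists, the out-of-bag rows of each tree
    """
    rng = random.Random(seed)
    p = len(X[0])
    deltas = [[] for _ in range(p)]  # deltas[j] collects Delta C_{i,j} over trees
    for predict, oob in zip(trees, oob_sets):
        n = len(oob)
        rows = [list(X[r]) for r in oob]
        labels = [y[r] for r in oob]
        # Step (i)(1): correct classifications on the untouched oob elements.
        c_untouched = sum(predict(row) == lab for row, lab in zip(rows, labels))
        for j in range(p):
            # Steps (a)-(b): permute variable j within the oob elements only.
            column = [row[j] for row in rows]
            rng.shuffle(column)
            permuted = [row[:j] + [v] + row[j + 1:] for row, v in zip(rows, column)]
            # Steps (c)-(e): recount and store the normalised difference.
            c_permuted = sum(predict(row) == lab for row, lab in zip(permuted, labels))
            deltas[j].append((c_untouched - c_permuted) / n)
    k = len(trees)  # needs k >= 2 so that a standard deviation exists
    results = []
    for j in range(p):
        mean_delta = mean(deltas[j])                    # step (ii)
        se = stdev(deltas[j]) / math.sqrt(k)            # assumed definition of se
        z = mean_delta / se if se > 0 else float("nan")  # step (iii)
        results.append({"score": 100 * mean_delta, "z": z})
    return results


# Toy data (hypothetical): the class depends on variable 0 only; variable 1 is noise.
demo_rng = random.Random(42)
X = [[demo_rng.random(), demo_rng.random()] for _ in range(200)]
y = [int(row[0] > 0.5) for row in X]


def make_stump(threshold):
    # Stand-in for a grown tree: a single split on variable 0.
    return lambda row: int(row[0] > threshold)


trees, oob_sets = [], []
for _ in range(25):
    in_bag = {demo_rng.randrange(len(X)) for _ in range(len(X))}  # bootstrap sample
    oob_sets.append([r for r in range(len(X)) if r not in in_bag])
    trees.append(make_stump(demo_rng.uniform(0.4, 0.6)))

importance = permutation_importance(trees, oob_sets, X, y)
for j, res in enumerate(importance):
    print(f"variable {j}: score={res['score']:.1f}, z={res['z']:.2f}")
```

In this toy setup the stand-in trees never look at variable 1, so permuting it cannot change any prediction: its score is exactly zero and its z-score is undefined (reported as NaN). Variable 0 drives every prediction and receives a clearly positive score.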

About this article

Cite this article

Peters, J., Verhoest, N.E.C., Samson, R. et al. Wetland vegetation distribution modelling for the identification of constraining environmental variables. Landscape Ecol 23, 1049–1065 (2008). https://doi.org/10.1007/s10980-008-9261-4
