Robustness and sensitivity of weighting and aggregation in constructing composite indices
Highlights
► We describe the five key steps required in constructing a composite index.
► Weighting and aggregating indicators is a critical step in index construction.
► We simulated data to explore this step for four extant stream health indices.
► The robustness and sensitivity of the weighting and aggregating step were tested.
► The study helped assess the advantages and disadvantages of each approach.
Introduction
A composite index combines multiple sources of information measured in or of a system to summarize a property of that system that is not itself directly measurable. For example, the Environmental Performance Index or EPI (Emerson et al., 2010) is a composite index used to quantify and benchmark the environmental performance of a country's policies. In 2008, the EPI comprised 25 environmental health and ecosystem viability indicators.
Fig. 1 illustrates the five key steps to building a composite index, namely (1) defining a theoretical framework to support the index composition; (2) selecting, cleaning and manipulating the raw indicator data; (3) standardizing the data; (4) weighting and aggregating the indicators; and (5) assessing robustness and sensitivity of the index. By addressing all five steps, there is a better chance of producing an objective, informative and defensible summary measure or index.
The most comprehensive source of information about the construction of composite indices generally, including their design, development and dissemination, is provided in OECD (2008). For a more specific environmental context, see Karr and Chu (1999) and Dobbie and Dail (2012). For the purpose of motivating the objective of our article, we present a brief overview of each key step and use the construction of environmental indices to provide some context.
Large-scale natural ecosystems are dynamic and complex, so defining a theoretical framework that describes what and how interactions occur in an ecosystem may not be trivial. The objectives of the study or monitoring program and current knowledge of the ecosystem of interest should both contribute to defining a theoretical framework that supports indicator selection and establishes relationships between indicators. A theoretical framework could also be informed from historical studies/data or elicited expert opinion. It should not be defined around what indicators are available as it then loses objectivity, transparency and, potentially, relevancy.
Several high-priority actions are needed to prepare environmental data before proceeding with subsequent steps in constructing an index. It would be rare that a set of environmental data could be immediately summarized or analyzed without having insight into how the data were generated or without performing pre-processing such as checking and cleaning. Neglecting this step could have serious ramifications for the index construction and subsequent inferences.
Three important data preparation concerns are selection of indicators, imputation of missing values and multivariate exploratory analyses. The indicator selection process has traditionally been quite subjective as it has often been informed by expert opinion, but if a theoretical framework exists for the study of interest, then this should be used to guide the selection process. Indicators should be selected based on relevance, importance, accessibility, the potential risk associated with a marked change in an indicator's magnitude, representativeness (of perhaps a region or a range of indicators), and operational considerations, among other criteria. There may also be justification in some studies to select indicators that are common and universally accepted or that align/facilitate comparison with existing programs in the area.
Missing values arise in environmental data sets for numerous reasons. They can reduce the representativeness of the sample, which may lead to misleading inferences about the population. To avoid this, one approach is to “fill-in” or impute the missing values before undertaking analyses. Understanding how the missing data arise will help with selecting an appropriate data imputation approach.
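As a minimal sketch of what "filling in" can mean, the snippet below replaces missing entries with the median of the observed values. This is a deliberately naive technique chosen only for illustration; it is not the approach of any index discussed here, and the function name `impute_median` and the sample values are hypothetical. Whether such a simple rule is defensible depends on how the values came to be missing.

```python
from statistics import median


def impute_median(values):
    """Replace None entries with the median of the observed (non-None) values."""
    observed = [v for v in values if v is not None]
    if not observed:
        raise ValueError("cannot impute: no observed values")
    fill = median(observed)
    return [fill if v is None else v for v in values]


# Hypothetical indicator readings at five sites, one of them missing.
readings = [7.1, None, 6.8, 7.4, 7.0]
print(impute_median(readings))
```

Median imputation preserves the centre of the observed data but understates its variability, which is one reason the mechanism behind the missing values matters.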
The use of multivariate analyses as part of preparing the data seeks to inform and guide subsequent methodological choices regarding weighting and aggregation of indicators. In addition, multivariate analyses provide a statistically determined structure of the data set, which can be used in conjunction with the theoretical framework to provide support for making sound inferences.
Indicators may be measured in different units and on different scales, so prior to any data aggregation, standardization (which may also be referred to as normalization) of the data is required. An appropriate standardization technique should also inform the treatment of highly skewed indicators, outliers, and disparate scales (Freudenberg, 2003). By standardizing, there is a compromise between information loss and robustness against data particularities. Since different standardization techniques can produce different overall results, sensitivity analysis and robustness tests might be needed to assess their impact on the composite index.
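Two widely used standardization techniques, min-max rescaling and z-scores, can be sketched as follows. The function names and the sample data are assumptions made for illustration and are not taken from the indices studied in this article.

```python
from statistics import mean, pstdev


def min_max(values):
    """Rescale values linearly onto the interval [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("min-max scaling is undefined for a constant indicator")
    return [(v - lo) / (hi - lo) for v in values]


def z_score(values):
    """Centre values on their mean and scale by the population standard deviation."""
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        raise ValueError("z-scores are undefined for a constant indicator")
    return [(v - mu) / sigma for v in values]


# Hypothetical indicator with one extreme value.
turbidity = [2.0, 3.0, 4.0, 5.0, 36.0]
print(min_max(turbidity))
print(z_score(turbidity))
```

With an outlier present, min-max rescaling squeezes the remaining values towards zero, whereas z-scores leave the outlier several standard deviations from the rest. This is one way the choice of technique can change the resulting index.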
In the weighting and aggregation step (step 4), the individual (standardized) indicators are combined. The weight assigned to an indicator in an index reflects its relative importance or contribution to the index. A number of weighting techniques exist and can lead to different overall results (OECD, 2008). Some weighting techniques are derived statistically, such as through factor analysis (Child, 1990), whilst others are derived by eliciting expert opinion (such as for the Bradley–Terry model; see Bradley and Terry, 1952).
Equal weighting implies all indicators are given the same weight for the aggregation, but this can disguise the absence of a statistical or empirical basis for determining the weights. Additionally, in a two-step aggregation case, equal weighting to all indicators may imply unequal weighting to the sub-indices as the sub-index with the largest number of indicators will be given more weight in the overall index calculation; conversely, equal weighting to the sub-indices implies unequal weighting to the individual indicators which may or may not be desirable.
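The arithmetic behind the two-step case can be made explicit with a small sketch. The sub-index sizes below (two indicators in one sub-index, six in the other) and the function names are hypothetical.

```python
def implied_sub_index_weights(sub_index_sizes):
    """Weight each sub-index receives when every indicator is weighted equally."""
    total = sum(sub_index_sizes)
    return [n / total for n in sub_index_sizes]


def implied_indicator_weights(sub_index_sizes):
    """Weight of one indicator in each sub-index when sub-indices are weighted
    equally and indicators are weighted equally within their sub-index."""
    k = len(sub_index_sizes)
    return [1.0 / (k * n) for n in sub_index_sizes]


# Hypothetical index: a sub-index with 2 indicators and one with 6.
sizes = [2, 6]
print(implied_sub_index_weights(sizes))  # [0.25, 0.75]
print(implied_indicator_weights(sizes))  # 1/4 per indicator vs 1/12 per indicator
```

In this example, equal indicator weights give the larger sub-index three times the influence of the smaller one, while equal sub-index weights give each indicator in the smaller sub-index three times the influence of an indicator in the larger one.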
The correlation between variables is an important consideration when using equal weighting. However, high correlation between two indicators may not imply double counting, as the two indicators could contribute to different sub-indices in the index, and might therefore justify equal weighting in the index construction (OECD, 2008).
The aggregation of standardized indicators typically uses one of two methods: linear or geometric. Linear (additive) aggregation is the sum of the weighted indicators, and geometric aggregation is the product of the indicators raised to weighted exponents. Linear aggregation is more compensatory: one remarkably good indicator score can offset poor scores on the other indicators. By contrast, geometric aggregation is less compensatory, and small gains in poor-performing indicators lead to a greater marginal improvement in the composite index (Munda and Nardo, 2005).
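The difference in compensability can be illustrated with a short sketch. It assumes indicator scores already standardized to the interval (0, 1] and weights that sum to one; the function names and the scores are hypothetical.

```python
import math


def linear_aggregate(scores, weights):
    """Weighted sum of the indicator scores."""
    return sum(w * s for s, w in zip(scores, weights))


def geometric_aggregate(scores, weights):
    """Product of the indicator scores raised to their weights."""
    if any(s <= 0 for s in scores):
        raise ValueError("geometric aggregation needs strictly positive scores")
    return math.prod(s ** w for s, w in zip(scores, weights))


weights = [0.25, 0.25, 0.25, 0.25]
balanced = [0.5, 0.5, 0.5, 0.5]
lopsided = [0.95, 0.35, 0.35, 0.35]  # one very good score, three poor ones

# Both profiles have (up to rounding) the same linear index, but the
# geometric index penalizes the lopsided profile.
print(linear_aggregate(balanced, weights), linear_aggregate(lopsided, weights))
print(geometric_aggregate(balanced, weights), geometric_aggregate(lopsided, weights))

# Marginal gains: raise the poor score or the good score by the same amount.
half = [0.5, 0.5]
base = geometric_aggregate([0.9, 0.2], half)
gain_poor = geometric_aggregate([0.9, 0.3], half) - base
gain_good = geometric_aggregate([1.0, 0.2], half) - base
print(gain_poor > gain_good)
```

Under linear aggregation the two improvements in the last example would raise the index by the same amount, whereas under geometric aggregation improving the poorly performing indicator is rewarded more.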
A third aggregation method known as non-compensatory multi-criteria aggregation (Munda and Nardo, 2007) seeks to find a compromise between two or more monitoring objectives, such as is often the case in stream monitoring programs, e.g. sustained biodiversity, increased economic value, and decreased levels of toxins may all be objectives of a particular monitoring program.
It is best to keep the weights and the aggregation approach unchanged across time when interest is in comparing environmental indices over time. However, if the monitoring objective is to define best practice or set priorities, then weights should necessarily change over time (OECD, 2008).
All decisions made up until this point in constructing indices, especially the key steps of imputing missing values (as part of data preparation), standardization, and weighting and aggregation, should be subject to sensitivity analyses to assess the robustness of the composite index to the particular choices made. Through a sensitivity analysis we may be able to derive a measure of variability for the index that incorporates information related to scale (e.g. a measurement of a streambed encompassing 0.2% of the total population versus a measurement of the bank vegetation encompassing 45% of the population of banks) or relates back to the theoretical framework and captures the reason the indicators were of interest in the first place.
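One simple form of sensitivity analysis for the weighting choice is to recompute the index under many randomly drawn weight vectors and examine the spread of the results. The sketch below does this for a linear index; it is an illustrative assumption rather than the simulation design used in this study, and the function names, the number of draws and the scores are hypothetical.

```python
import random


def linear_index(scores, weights):
    """Weighted sum of standardized indicator scores."""
    return sum(w * s for s, w in zip(scores, weights))


def random_weights(n, rng):
    """Draw n non-negative weights summing to one, uniformly over the simplex
    (normalized exponential draws)."""
    draws = [rng.expovariate(1.0) for _ in range(n)]
    total = sum(draws)
    return [d / total for d in draws]


def weight_sensitivity(scores, n_draws=1000, seed=1):
    """Return (minimum, median, maximum) of the index over random weightings."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    values = sorted(
        linear_index(scores, random_weights(len(scores), rng))
        for _ in range(n_draws)
    )
    return values[0], values[n_draws // 2], values[-1]


# Hypothetical standardized scores for one site.
site_scores = [0.2, 0.5, 0.9]
low, mid, high = weight_sensitivity(site_scores)
print(low, mid, high)
```

A narrow range indicates that the index value for that site is robust to the weighting choice; a wide range suggests that conclusions depend heavily on the weights.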
Additionally, the uncertainty of a composite index should be considered, focusing on how quantifying uncertainty in the indicator scores filters through the structure of the composite index, thereby making the final score more meaningful. Sometimes robustness and uncertainty are treated separately, but considering them iteratively during the development of a composite index could improve its structure (Saisana et al., 2005). Few studies discuss the presence or detailed consideration of uncertainty inherent in the development of a composite index (OECD, 2008).
Assessing the health of an environmental system such as a stream network, an estuary or a particular ecoregion is often a key objective of large-scale monitoring studies. However, no single environmental/ecological indicator paints a complete picture of a system's health; that is to say, environmental health cannot be directly measured. It is usually quantified through the combination or composition of multiple relevant indicators measured in or of the environmental system of interest, otherwise known as an environmental (composite) index (Dobbie and Dail, 2012).
There are numerous extant indices commonly used for assessing the condition or health of streams in a given spatial domain (e.g. region, network, catchment). The definition of stream health is subjective, which is why numerous indices have been developed, and it should be clarified as part of the monitoring objectives. To help gauge the appropriateness and applicability of existing stream health indices for assessing the health of a given set of stream data, we need to understand how they were constructed and what changes could improve their diagnostic and inferential capabilities.
Of the five steps described in Section 1.1 for constructing composite indices, the first three are essentially concerned with the justification, selection and manipulation of indicator data and so are very much context and data dependent. The last step (step 5) is very important for ensuring that there is confidence in the construction, that the index is adequately representative, and that inferences about the index are statistically meaningful.
The essence of the weighting and aggregation step (step 4) is how to best combine the individual indicator data. This is typically driven by the objective for constructing the index in the first place and may or may not be dependent on the actual data. This step is critical in index construction, yet there seems to be no documented evidence about making objective weighting and aggregation choices in constructing indices so that they are robust. Thus we designed a simulation study to test the robustness and sensitivity of the weighting and aggregation choices in four existing indices for assessing stream health.
Section snippets
Method
Four existing stream health indices were selected for comparison due to their popularity with practitioners and to provide coverage of different motivating objectives. Each index is appraised according to whether it addresses each of the five key construction steps, and details of a simulation study for empirically comparing the performance of their weighting and aggregation step are described. Three criteria were devised for assessing the robustness and sensitivity of each index and objectively
Results
A summary of the statistics for the three criteria (total absolute bias, false difference detection rate and true difference detection rate) is provided in tabular form (Table 4, Table 5, Table 6). In each table the mean, root mean square error (RMSE), first quartile (Q1), median, and third quartile (Q3) are presented for each index and criterion.
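A summary of this kind can be computed from a vector of simulated criterion values as sketched below. The reference value about which the RMSE is taken is not stated in this excerpt, so the `target` argument is an assumption, as are the function name and the sample values. Quartiles follow the default ("exclusive") method of Python's `statistics.quantiles`, which may differ from the convention used for the tables.

```python
import math
from statistics import mean, quantiles


def summarize(values, target=0.0):
    """Mean, RMSE about `target`, and quartiles of simulated criterion values."""
    q1, med, q3 = quantiles(values, n=4)
    rmse = math.sqrt(mean([(v - target) ** 2 for v in values]))
    return {"mean": mean(values), "rmse": rmse, "q1": q1, "median": med, "q3": q3}


# Hypothetical simulated values of one criterion for one index.
simulated = [1, 2, 3, 4, 5, 6, 7]
print(summarize(simulated, target=4.0))
```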
The simulation results have also been presented graphically. The (empirical) probability density function for each method is graphed in Fig. 3.
Discussion
In some fields of science, there are multiple existing approaches for constructing indices. It may be overwhelming or unclear how to choose one that is fit for purpose, performs well and can easily be applied to one's own data. As such there is a need for objective discrimination between existing approaches in terms of their construction and performance, to help users make an appropriate selection or to guide them in constructing a new index.
The construction of a composite index requires careful
Acknowledgements
We thank the CSIRO Office of the Chief Executive for providing the funds that brought DD to Australia to undertake a 12-week internship on this work. We are also grateful for the constructive suggestions from two anonymous referees, which have helped to improve the focus and clarity of the article.
References (19)
- et al. Rank analysis of incomplete block designs. I. The method of paired comparisons. Biometrika (1952)
- Bunea, F., Guttorp, P., Richardson, T., 1999. Ecological indices and graphical modeling of factors influencing benthic...
- Canadian Council of Ministers of the Environment (CCME), 2001. Canadian water quality guidelines for the protection of...
- The Essentials of Factor Analysis (1990)
- et al. Stream health index for the Puget Sound Lowland. Environmetrics (2006)
- et al. Latent health factor index: a statistical modelling approach for ecological health assessment. Environmetrics (2011)
- et al. Environmental Indices
- et al. 2010 Environmental Performance Index (2010)
- Freudenberg, M., 2003. Composite Indicators of Country Performance: A Critical Assessment. OECD Science, Technology and...