Robustness and sensitivity of weighting and aggregation in constructing composite indices

doi:10.1016/j.ecolind.2012.12.025

Ecological Indicators

Volume 29, June 2013, Pages 270-277

Abstract

A composite index is a combination of various sources of information known as indicators, measured in or of a system in order to provide a summary of the system that is itself not directly measurable. For the index to be useful and meaningful, its construction requires careful consideration of several important aspects of the potentially disparate and multiple indicators that help convey its meaning. In general there are five key steps that should be considered when constructing a composite index, one of which is how the indicators should be weighted and aggregated to form the index. This step is critical in index construction, yet there seems to be no documented evidence about making objective weighting and aggregation choices in constructing indices so that they are robust. This sort of evidence would be particularly helpful for deciding whether any of the existing indices for assessing stream health would suffice for assessing the health of a given set of stream data or whether developing a new index is warranted. Thus we designed a simulation study to test the robustness and sensitivity of the weighting and aggregation choices in four existing stream health indices. The four indices differed mainly in their choices of standardization, weighting and aggregation techniques. The three main general conclusions about these existing approaches are that bootstrapping should be used to approximate the distribution of the stream health index; that the standardization technique employed should use all of the available indicators; and that the use of reference (or pristine) sites as a standardization tool is not essential. Since the study is based on artificial data, the findings should be applicable and relevant to indices in other fields of study such as economics, social sciences, finance and medicine.

Highlights

► We describe the five key steps required in constructing a composite index.
► Weighting and aggregating indicators is a critical step in index construction.
► We simulated data to explore this step for 4 extant stream health indices.
► The robustness and sensitivity of the weighting and aggregating step was tested.
► The study helped assess the advantages and disadvantages of each approach.

Introduction

A composite index is a combination of multiple sources of information measured in or of a system in order to provide a summary of the system that is itself not directly measurable. For example, the Environmental Performance Index or EPI (Emerson et al., 2010) is a composite index used to quantify and benchmark the environmental performance of a country's policies. In 2008, the EPI comprised 25 environmental health and ecosystem viability indicators.

Fig. 1 illustrates the five key steps to building a composite index, namely (1) defining a theoretical framework to support the index composition; (2) selecting, cleaning and manipulating the raw indicator data; (3) standardizing the data; (4) weighting and aggregating the indicators; and (5) assessing robustness and sensitivity of the index. By addressing all five steps, there is a better chance of producing an objective, informative and defensible summary measure or index.

The most comprehensive source of information about the construction of composite indices generally, including their design, development and dissemination, is provided in OECD (2008). For a more specific environmental context, see Karr and Chu (1999) and Dobbie and Dail (2012). For the purpose of motivating the objective of our article, we present a brief overview of each key step and use the construction of environmental indices to provide some context.

Large-scale natural ecosystems are dynamic and complex, so defining a theoretical framework that describes what and how interactions occur in an ecosystem may not be trivial. The objectives of the study or monitoring program and current knowledge of the ecosystem of interest should both contribute to defining a theoretical framework that supports indicator selection and establishes relationships between indicators. A theoretical framework could also be informed from historical studies/data or elicited expert opinion. It should not be defined around what indicators are available as it then loses objectivity, transparency and, potentially, relevancy.

Numerous high-priority actions are involved in preparing environmental data before proceeding with subsequent steps in constructing an index. It would be rare that a set of environmental data could be immediately summarized or analyzed without insight into how the data were generated or without pre-processing such as checking and cleaning. Neglecting this step could have serious ramifications for the index construction and subsequent inferences.

Three important data preparation concerns are selection of indicators, imputation of missing values and multivariate exploratory analyses. The indicator selection process can traditionally be quite subjective as it has often been informed by expert opinion, but if a theoretical framework exists for the study of interest, then this should be used to guide the selection process. Indicators should be selected based on relevance, importance, accessibility, the potential risk associated with a marked change in its magnitude, representativeness (of perhaps a region or a range of indicators), and operational considerations, to name a few reasons. There may also be justification in some studies to select indicators that are common and universally accepted or that align/facilitate comparison with existing programs in the area.

Missing values arise in environmental data sets for numerous reasons. They can reduce the representativeness of the sample and as a result, this may lead to misleading inferences about the population. To avoid this, one approach is to “fill-in” or impute the missing values before undertaking analyses. Understanding how the missing data arise will help with selecting an appropriate data imputation approach.

The use of multivariate analyses as part of preparing the data seeks to inform and guide subsequent methodological choices regarding weighting and aggregation of indicators. In addition, multivariate analyses provide a statistically determined structure of the data set, which can be used in conjunction with the theoretical framework to provide support for making sound inferences.

Indicators may be measured in different units and on different scales, so prior to any data aggregation, standardization (which may also be referred to as normalization) of the data is required. An appropriate standardization technique should also inform the treatment of highly skewed indicators, outliers, and disparate scales (Freudenberg, 2003). By standardizing, there is a compromise between information loss and robustness against data particularities. Since different standardization techniques can produce different overall results, sensitivity analysis and robustness tests might be needed to assess their impact on the composite index.
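As a minimal sketch of how the choice of standardization technique can change the result, the following compares two commonly used techniques, min-max rescaling and z-scores, on a single indicator containing an outlier. The function names and the `turbidity` values are hypothetical and chosen only for illustration; they are not taken from the study.

```python
from statistics import mean, pstdev


def min_max(values):
    """Rescale values linearly onto the interval [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant indicator carries no information; map it to zero.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def z_score(values):
    """Centre on the mean and divide by the population standard deviation."""
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


# Hypothetical indicator measured at five sites; the last value is an outlier.
turbidity = [2.0, 3.0, 4.0, 5.0, 50.0]

print("min-max:", [round(v, 3) for v in min_max(turbidity)])
print("z-score:", [round(v, 3) for v in z_score(turbidity)])
```

Under min-max rescaling the outlier compresses the remaining four sites into a narrow band near zero, whereas z-scores keep them closer together relative to the mean but leave the scale unbounded. This is one way in which the trade-off between information loss and robustness to data particularities may show up in practice.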

In the weighting and aggregation step, the individual (standardized) indicators are combined. The weight assigned to an indicator in an index reflects its relative importance or contribution to the index. A number of weighting techniques exist and can lead to different overall results (OECD, 2008). Some weighting techniques are derived statistically, such as through factor analysis (Child, 1990), whilst others are derived by eliciting expert opinion (such as for the Bradley–Terry model; see Bradley and Terry, 1952).

Equal weighting implies all indicators are given the same weight for the aggregation, but this can disguise the absence of a statistical or empirical basis for determining the weights. Additionally, in a two-step aggregation case, equal weighting to all indicators may imply unequal weighting to the sub-indices as the sub-index with the largest number of indicators will be given more weight in the overall index calculation; conversely, equal weighting to the sub-indices implies unequal weighting to the individual indicators which may or may not be desirable.
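The two-step aggregation point can be made concrete with a small worked example. The sketch below assumes a hypothetical index with two sub-indices, `water_quality` (four indicators) and `habitat` (two indicators); the names and sizes are illustrative only.

```python
from fractions import Fraction


def implied_subindex_weights(groups):
    """Equal weights on all indicators: total weight carried by each sub-index."""
    n_indicators = sum(groups.values())
    return {name: Fraction(size, n_indicators) for name, size in groups.items()}


def implied_indicator_weights(groups):
    """Equal weights on sub-indices (and equal weights within each):
    weight of a single indicator inside each sub-index."""
    n_groups = len(groups)
    return {name: Fraction(1, n_groups * size) for name, size in groups.items()}


# Hypothetical structure: sub-index name -> number of indicators it contains.
groups = {"water_quality": 4, "habitat": 2}

print(implied_subindex_weights(groups))
print(implied_indicator_weights(groups))
```

With equal indicator weights of 1/6, the sub-indices implicitly receive 2/3 and 1/3 of the total weight. With equal sub-index weights of 1/2, each water quality indicator receives 1/8 while each habitat indicator receives 1/4.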

The correlation between variables is an important consideration when using equal weighting. However, high correlation between two indicators may not imply double counting, as the two indicators could contribute to different sub-indices in the index, and might therefore justify equal weighting in the index construction (OECD, 2008).

The aggregation of standardized indicators typically uses one of two methods: linear or geometric. Linear or additive aggregation is the sum of the weighted indicators, and geometric aggregation is the product of indicators with weighted exponents. Linear aggregation is more compensatory, so that one remarkably good indicator score compensates more fully for poor scores on other indicators. By contrast, geometric aggregation is less compensatory, but small gains in poor-performing indicators lead to a greater marginal improvement in the composite index (Munda and Nardo, 2005).
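The difference in compensability can be illustrated with a short sketch. The functions below implement the two definitions just given (weighted sum, and product of indicators raised to their weights); the scores and weights are hypothetical, and geometric aggregation is assumed to be applied to strictly positive standardized scores.

```python
import math


def linear_aggregate(scores, weights):
    """Weighted sum of indicator scores."""
    return sum(w * s for s, w in zip(scores, weights))


def geometric_aggregate(scores, weights):
    """Product of indicator scores raised to their weights.

    Assumes every score is strictly positive.
    """
    return math.prod(s ** w for s, w in zip(scores, weights))


weights = [0.5, 0.5]
balanced = [0.5, 0.5]    # two moderate scores
unbalanced = [0.9, 0.1]  # one very good score, one very poor score

for label, scores in [("balanced", balanced), ("unbalanced", unbalanced)]:
    print(label,
          round(linear_aggregate(scores, weights), 3),
          round(geometric_aggregate(scores, weights), 3))

# Marginal effect of improving the poor indicator from 0.1 to 0.2.
improved = [0.9, 0.2]
print("linear gain:",
      round(linear_aggregate(improved, weights) - linear_aggregate(unbalanced, weights), 3))
print("geometric gain:",
      round(geometric_aggregate(improved, weights) - geometric_aggregate(unbalanced, weights), 3))
```

In this example linear aggregation scores both sites at 0.5, so the very good indicator fully offsets the very poor one, whereas geometric aggregation scores the unbalanced site at about 0.3. Raising the poor indicator from 0.1 to 0.2 also increases the geometric index by more than the linear index.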

A third aggregation method known as non-compensatory multi-criteria aggregation (Munda and Nardo, 2007) seeks to find a compromise between two or more monitoring objectives, such as is often the case in stream monitoring programs, e.g. sustained biodiversity, increased economic value, and decreased levels of toxins may all be objectives of a particular monitoring program.

It is best to keep the weights and the aggregation approach unchanged across time when interest is in comparing environmental indices over time. However, if the monitoring objective is to define best practice or set priorities, then weights should necessarily change over time (OECD, 2008).

All decisions made up until this point in constructing indices, especially the key steps of imputing missing values (as part of data preparation), standardization, and weighting and aggregation, should be subject to sensitivity analyses to assess the robustness of the composite index to the particular choices made. Through a sensitivity analysis we may be able to derive a measure of variability for the index that incorporates information related to scale (e.g. a measurement of a streambed encompassing 0.2% of the total population versus a measurement of the bank vegetation encompassing 45% of the population of banks) or relates back to the theoretical framework and captures the reason the indicators were of interest in the first place.
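One simple form of sensitivity analysis, sketched below, is to recompute a linearly aggregated index under many alternative weight vectors and record how far the index can move. This is an illustrative assumption rather than the procedure used in this study: the helper names, the use of normalized exponential draws to generate weights, and the example scores are all hypothetical.

```python
import random


def random_weights(k, rng):
    """Draw k non-negative weights that sum to 1 (normalized exponential draws)."""
    draws = [rng.expovariate(1.0) for _ in range(k)]
    total = sum(draws)
    return [d / total for d in draws]


def score_range_under_weights(site_scores, n_draws=1000, seed=0):
    """Smallest and largest linear index value over random weight vectors."""
    rng = random.Random(seed)
    k = len(site_scores)
    lo, hi = float("inf"), float("-inf")
    for _ in range(n_draws):
        w = random_weights(k, rng)
        value = sum(wi * si for wi, si in zip(w, site_scores))
        lo, hi = min(lo, value), max(hi, value)
    return lo, hi


# Hypothetical standardized indicator scores for one site.
site = [0.2, 0.6, 0.9]
equal = sum(site) / len(site)
lo, hi = score_range_under_weights(site)
print(f"equal weights: {equal:.3f}, range over random weights: [{lo:.3f}, {hi:.3f}]")
```

A wide range relative to the differences between sites would suggest that conclusions drawn from the index depend heavily on the weighting choice.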

Additionally, the uncertainty of a composite index should be considered, focusing on how quantifying uncertainty in the indicator scores filters through the structure of the composite index, thereby making the final score more meaningful. Sometimes robustness and uncertainty are treated separately, but considering them iteratively during the development of a composite index could improve its structure (Saisana et al., 2005). Few studies discuss the presence or detailed consideration of uncertainty inherent in the development of a composite index (OECD, 2008).
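Bootstrapping is one way to approximate the distribution of an index and hence attach a measure of uncertainty to it. The following is a minimal sketch under stated assumptions: sites are resampled with replacement, each site's index is a weighted sum of its standardized indicators, and the regional index is the mean over sites. The data, function names and percentile interval are hypothetical and do not reproduce the simulation design of this study.

```python
import random
from statistics import mean


def site_index(scores, weights):
    """Linear (weighted-sum) index for one site."""
    return sum(w * s for s, w in zip(scores, weights))


def bootstrap_index(sites, weights, n_boot=2000, seed=0):
    """Sorted bootstrap replicates of the mean site index (sites resampled with replacement)."""
    rng = random.Random(seed)
    n = len(sites)
    replicates = []
    for _ in range(n_boot):
        resample = [sites[rng.randrange(n)] for _ in range(n)]
        replicates.append(mean(site_index(s, weights) for s in resample))
    return sorted(replicates)


def percentile_interval(sorted_values, level=0.95):
    """Simple percentile interval read off the sorted replicates."""
    alpha = (1 - level) / 2
    last = len(sorted_values) - 1
    return sorted_values[int(alpha * last)], sorted_values[int((1 - alpha) * last)]


# Hypothetical standardized scores: six sites, three indicators each.
sites = [
    [0.8, 0.7, 0.9],
    [0.4, 0.5, 0.3],
    [0.6, 0.6, 0.7],
    [0.2, 0.3, 0.1],
    [0.9, 0.8, 0.8],
    [0.5, 0.4, 0.6],
]
weights = [1 / 3, 1 / 3, 1 / 3]

replicates = bootstrap_index(sites, weights)
low, high = percentile_interval(replicates)
point = mean(site_index(s, weights) for s in sites)
print(f"index = {point:.3f}, 95% percentile interval = [{low:.3f}, {high:.3f}]")
```

The spread of the replicates gives an empirical approximation to the sampling distribution of the index, which can then be compared across weighting and aggregation choices.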

Assessing the health of an environmental system, such as a stream network, an estuary or a particular ecoregion, is often a key objective of large-scale monitoring studies. However, no single environmental or ecological indicator paints a complete picture of a system's health; that is, environmental health cannot be directly measured. It is usually quantified through the combination or composition of multiple relevant indicators measured in or of the environmental system of interest, otherwise known as an environmental (composite) index (Dobbie and Dail, 2012).

There are numerous extant indices commonly used for assessing the condition or health of streams in a given spatial domain (e.g. region, network, catchment). The definition of stream health is subjective, which is why numerous indices have been developed, and it should be clarified as part of the monitoring objectives. To gauge the appropriateness and applicability of existing stream health indices for assessing the health of a given set of stream data, we need to understand how they were constructed and what changes could be made to improve their diagnostic and inferential capabilities.

Of the five steps described in Section 1.1 for constructing composite indices, the first three steps are essentially concerned with the justification, selection and manipulation of indicator data and so are very much context and data dependent. The last step (step 5) is very important for ensuring confidence in the construction, that the index is adequately representative, and that inferences about the index are statistically meaningful.

The essence of the weighting and aggregation step (step 4) is how to best combine the individual indicator data. This is typically driven by the objective for constructing the index in the first place and may or may not be dependent on the actual data. This step is critical in index construction, yet there seems to be no documented evidence about making objective weighting and aggregation choices in constructing indices so that they are robust. Thus we designed a simulation study to test the robustness and sensitivity of the weighting and aggregation choices in four existing indices for assessing stream health.

Method

Four existing stream health indices were selected for comparison due to their popularity with practitioners and to provide coverage of different motivating objectives. Each index is appraised according to whether it addresses each of the five key construction steps, and details of a simulation study for empirically comparing the performance of their weighting and aggregation steps are described. Three criteria were devised for assessing the robustness and sensitivity of each index and objectively

Results

A summary of the statistics for the three criteria, namely total absolute bias, false difference detection rate and true difference detection rate, is provided in tabular form (Table 4, Table 5, Table 6). In each table the mean, root mean square error (RMSE), first quartile (Q1), median, and third quartile (Q3) are presented for each index and criterion.

The simulation results have also been presented graphically. The (empirical) probability density function for each method is graphed in Fig. 3.

Discussion

In some fields of science, there are multiple existing approaches for constructing indices. It may be overwhelming or unclear how to choose one that is fit for purpose, performs well and can easily be applied to one's own data. As such, there is a need for objective discrimination between existing approaches, in both construction and performance, to help users make an appropriate selection or to guide them in constructing a new index.

The construction of a composite index requires careful

Acknowledgements

We thank the CSIRO Office of the Chief Executive for providing the funds that brought DD to Australia to undertake a 12-week internship on this work. We are also grateful for the constructive suggestions from two anonymous referees, which have helped to improve the focus and clarity of the article.

References (19)

Bradley, R.A., et al., 1952. Rank analysis of incomplete block designs. I. The method of paired comparisons. Biometrika.
Bunea, F., Guttorp, P., Richardson, T., 1999. Ecological indices and graphical modeling of factors influencing benthic...
Canadian Council of Ministers of the Environment (CCME), 2001. Canadian water quality guidelines for the protection of...
Child, D., 1990. The Essentials of Factor Analysis.
Chiu, G., et al., 2006. Stream health index for the Puget Sound Lowland. Environmetrics.
Chiu, G., et al., 2011. Latent health factor index: a statistical modelling approach for ecological health assessment. Environmetrics.
Dobbie, M.J., et al. Environmental Indices.
Emerson, J., et al., 2010. 2010 Environmental Performance Index.
Freudenberg, M., 2003. Composite Indicators of Country Performance: A Critical Assessment. OECD Science, Technology and...

There are more references available in the full text version of this article.
