If energy policy-making makes different epistemic demands on research, these should be visible in the pattern of designs deployed to inform a policy. For clarity, these demands will be referred to from here on as the ‘energy policy epistemology’, as the arguments made here focus largely on the particular demands of evaluating energy policy. The one proviso of such a deductive approach is that any such pattern may be subject to the risk aversion to negative findings identified by Vine et al. and others, revealing what might be classified as a ‘defensive epistemology’—that is, undertaking ‘policy-based evidence making’ in the pejorative sense of the phrase (Torriti
2010). Clearly, on the basis of good democratic standards, any policy institution that subscribes to a ‘defensive epistemology’ is effectively attempting to bypass accountability to citizens; its position is therefore neither legitimate nor defensible. As such, it is important to identify what a legitimate epistemic position might be for policy institutions with respect to evaluation (i.e. to use an inductive approach). In so doing, any failures to deploy stronger tests of policy effectiveness can be identified and new prescriptions for improvement offered, tailored to the energy policy epistemology. Below, the ex ante and ex post evaluative research and data collection approaches of the Green Deal policy (developed by DECC between 2012 and 2015) are examined to shed light on what might constitute an ‘energy policy epistemology’.
The Green Deal evaluation programme
The Green Deal policy aimed to increase the uptake of insulation and other energy efficiency measures by providing first an official assessment of need (a ‘Green Deal assessment’), which then identified what measures could be funded by a Green Deal loan. The loan was to be paid back through savings on energy bills generated by the installed measures. Specific suppliers of Green Deal measures were accredited, as were the Green Deal assessors. The Energy Company Obligation (ECO) placed an obligation on energy companies to supply energy efficiency measures to low-income homes at no cost to the occupant; while important, little will be made of this element here due to space constraints.
The UK’s gov.uk website collates analytic outputs centrally, making the discovery of evaluative research and data collection on a policy straightforward. There is a single webpage dedicated to the Green Deal and ECO evaluation programme4 which collates 9 separate projects, each of which has published between one and 12 reports. The project names are presented in Table 2 as an indication of the important questions asked in the evaluation activity.
Table 2
Evaluative research and data collection related to the Green Deal and ECO policy, showing study design, methods, number of waves and scale of data collection where appropriate

Project | Design | Method | Waves | Scale |
Green Deal assessment research | Survey | Quantitative | 3 | 1500 (500/wave) |
Green Deal customer journey surveys | Survey | Quantitative | 5 | Ca. 4100 (400–900/wave) |
Energy Companies Obligation (ECO) customer journey research | Mixed (Survey/interviews) | Quantitative and qualitative | 1 | 571/28 |
Green Deal household tracker survey | Survey | Quantitative | 6 | 15,000 (4 waves of 3000 + 2 small waves of 1500) |
Research into businesses that were not certified Green Deal suppliers | Mixed (Survey/interviews) | Quantitative and qualitative | 1 | 400/15 |
Green Deal assessment mystery shopping research | Mystery shopper | Qualitative | 1 | 46 |
Green Deal pre-assessment customer journey qualitative research | Focus group | Qualitative | 1 | 6 groups |
Green Deal provider market report | Interviews | Qualitative | 1 | 39 |
Evaluation of the Green Deal Communities Private Rented Sector funding | Mixed | Qualitative and quantitative | 1 | 44 interviews, Administrative data covering 23,000 properties |
Green Deal and ECO statistics | Administrative data | Quantitative | 78 | Population statistics of Green Deal installations |
In addition, a dedicated statistics web portal5 collates all official statistical outputs. A search in the webpage’s text box labelled ‘Contains’, using ‘Green Deal’ as a search term and ‘Policy area’ set to ‘Energy’, returns 79 hits (as at 31 October 2017), each a monthly statistical return on the deployment of the policy. A wider search for ‘Green Deal’ on the publications section of gov.uk (‘Contains’ set to Green Deal, ‘Publication type’ set to ‘Research and Analysis’, ‘Policy area’ set to ‘Energy’) returns 43 hits, of which 28 are Green Deal related. Fourteen of these are part of the Green Deal and ECO evaluation page, with 11 associated with the Green Deal assessment research and 4 part of a ‘Household tracker’ survey. Table 2 summarises these studies with an indication of the study design type, main method classification (quantitative, qualitative or mixed) and an indication of scale where fieldwork is involved.
The first point to make is that there are no RCT-based experimental designs here. The closest RCT in this general policy area is a loft clearance ‘behavioural trial’ (DECC 2013), linked to the Green Deal insofar as one aspect of the policy was support for loft insulation. This project was carried out independently of the main Green Deal evaluation programme and so is not listed in Table 2. Volumetrically, the greatest emphasis is on the monthly statistical collection that documents Green Deal activity across the UK. What is notable is that these are published publicly and badged as ‘national statistics’, rather than being treated solely as internal monitoring data without any such badging. The ‘national statistics’ badging is a way of quality assuring government data, in order to ensure they are ‘trustworthy’ (i.e. valid). It is clear that these data are more than simply monitoring data: they perform the function of demonstrating that the policy exists and is growing or, as ultimately happened in this case, tailing off. Ultimately, these data exist as a form of accountability—to Parliament, the media, the opposition parties and the general public. Without these data, it would be difficult for the government to show that it has done what it said it would do, and it would therefore lose the legitimacy necessary to govern.
The central importance of these data, as indicated by the scale of data collection and waves of publication, means the logic outlined above merits further inspection. First, do these statistics provide a useful and valid form of accountability? Or might they hide such a level of deadweight in the policy (i.e. installations that would have happened without the policy) that they are effectively useless? While the data show the amount of Green Deal-branded activity, there are no contextual data showing the number of similar installations that are not Green Deal-driven, making it difficult to know whether the Green Deal was adding to, or simply ‘rebranding’, normal activity that would have happened anyway. Despite this, there are some reasons why these figures alone could provide a reasonable, if sub-optimal, indication of policy success or failure. The first is whether the rate of installation—the number per month or year—is increasing or decreasing compared to previous data on related prior policy (as indicated above, this was a critical element of the evaluation of policy success). If one assumes that such installations would not otherwise take place on account of market failure (e.g. split incentives or co-ordination failure), then the likelihood of high deadweight might be seen as low. In that context, what also becomes important is whether the rate of installation indicates sufficient pace of delivery to meet the legally binding targets for greenhouse gas emission reductions set in a previous ‘Carbon Plan’ (DECC 2011a). In addition, impact assessments are published prior to the launch of any new policy in the UK, identifying the anticipated cost-benefit ratios based on assumptions about rates of installation and the value of benefits attributable to those installations (DECC 2011b). As a consequence, should the rate of installations under the Green Deal fall short of the volume of activity anticipated at the low end of estimates in the impact assessment, the policy is likely to be killed off. In addition to the deadweight issue above, the other major epistemic challenge is knowing whether a Green Deal (or indeed any equivalent) installation leads to energy demand reductions (and consequent carbon emission reductions) of the scale assumed in the impact assessments.
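The deadweight problem can be made concrete with a minimal arithmetic sketch (all figures hypothetical, for illustration only). The counterfactual term below is precisely the quantity the monthly statistics cannot supply without contextual data on non-Green Deal installations:

```python
# Illustrative sketch of policy 'deadweight' (all figures hypothetical).
# Deadweight: the share of observed policy-branded installations that
# would have happened anyway, without the policy.

def net_additionality(observed: int, counterfactual: int) -> int:
    """Installations attributable to the policy under a given counterfactual."""
    return observed - counterfactual

def deadweight_share(observed: int, counterfactual: int) -> float:
    """Fraction of observed activity that would have occurred anyway."""
    return counterfactual / observed

# Hypothetical monthly figures: 1200 Green Deal-branded installations
# observed; an estimate (unobservable in practice) of 900 installations
# that would have occurred without the policy.
observed, counterfactual = 1200, 900

print(net_additionality(observed, counterfactual))            # 300
print(round(deadweight_share(observed, counterfactual), 2))   # 0.75
```

If deadweight is high (here, a hypothetical 75%), the headline statistics substantially overstate the policy’s contribution; the practical difficulty, as noted above, is that the counterfactual is never directly observed.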
The next obvious focus is on surveys: half of the research listed in Table 2 uses a survey design collecting quantitative self-report data. This maintains the focus on quantitative data but adds a subjective element to the representation of the public. Some of these surveys examine the experience of receiving the policy, others the perspectives of those who have no connection to it. The scale of the surveying (around 21,500 participants in total) reflects a desire to represent a population, and to do so in enough detail that different sub-groups’ issues and perspectives are taken into account. In the list in Table 2, four populations are being represented: consumers who have had contact with the Green Deal, consumers who have not, businesses involved in delivering the Green Deal, and businesses that are not. This has both a democratic function, enabling different voices to inform the policy, and a practical function: to improve the policy by addressing potential barriers such as lack of awareness, lack of supply and so on. This practical function is visible in the final obvious feature of the list of studies—the qualitative nature of the designs.
Qualitative data feature in six of the studies and are thus in some sense more common (as a design choice) than quantitative data. Three studies are exclusively qualitative, covering both the consumer and provider sides of the market. This reflects a clear privileging of subjectivity, in the form of personal perspectives, with regard to how and whether the policy is working. In a sense, this perspective can be classed as ‘useful subjectivity’. The goal here is clearly not to represent the populations of interest but to gather narrative accounts of, and views on, how the policy operates. It is here where a causal mechanism is in part understood, and it is important to note that qualitative inquiry is the preferred mechanism for capturing it, alongside significant quantitative data collection.
The pattern of Green Deal research designs, methods and scale elucidates some features of an ‘energy policy epistemology’. In democratic states such as the UK and USA, three specific epistemological drivers promote the use of certain designs. Table 3 sets out these drivers and the kinds of designs and methods they promote.
Table 3
Epistemological drivers for an ‘energy policy epistemology’ and the kinds of designs and methods they promote

Driver | Description | Designs and methods promoted |
Accountability | Being able to demonstrate that the promises of action represented in policy announcements have been undertaken is key to retaining legitimacy and therefore power. The checks and balances built into democratic systems demand a summative demonstration—have you done what you said you would do? | Complete and systematic administrative data collection (census), quantitative, independently quality assured. |
Representation | Ensuring an understanding of how a policy affects different groups is critical to retaining political legitimacy and credibility. This asks the question, how true is the problem/impact generally and for whom? | Large-scale surveys (in the 1000s); mainly quantitative data; self-report via questionnaire. Likely to involve some form of random probability sampling. |
Useful subjectivity | Recognition that systems are constituted of sentient actors whose perspectives are both important (democratically) and informative (pragmatically) regarding the action of the policy. This asks the question, is the policy working well? If not, how can it be improved? | Various forms of qualitative inquiry mainly including interview methods, but also observational methods (e.g. mystery shopper). Often purposively sampled in relation to sub-groups identified via the quantitative survey. |
What is absent from Table 3 is a driver that would actively promote (or prevent) an RCT, or experimental designs more generally. The likely home for such a driver would be under accountability—to demonstrate to Parliament, the media and the public at large that the government did actually cause the observed outcome. Yet the clear indication here is that accountability stops short of that kind of epistemic demand, implying that other features must be invoked if the set of epistemic drivers is to be both necessary and sufficient to explain a legitimate lack of interest in RCTs. There are potentially two explanations. The first is the role and presence of Chief Scientific Advisers in UK government departments (and associated engineering research teams), potentially leading to a problematic over-reliance on pure physics assumptions that such interventions work in all circumstances, as Vine et al. suggest. The second is that further epistemic drivers are at work: two inter-related drivers are proposed below that would actively reduce interest in RCTs, namely limited agency and negotiated certainty. Further critical analysis, and potentially new empirical research, will of course be required to determine whether these concepts carry any useful explanatory power.
Given the particularly open nature of energy systems in democratic states, where actors within these systems are not under the direct control of the state (in contrast, for example, to the health or education sectors in the UK) but are in some sense partners delivering and receiving services with the support and oversight of the state, certain legitimate positions can be held. These comprise the following:
-
Limited agency—the recognition, especially within the energy system, that policy institutions have limited agency on account of the open nature of the system. This leads to a focus on policy-specific outputs (e.g. Green Deal assessments, installations and so on, linked to accountability) rather than outcomes (e.g. energy bill savings, increased thermal comfort, reduced fuel poverty), even though outcomes are the guiding goals of policy. Outcomes are affected by a range of external factors which policy institutions in free-market economies are not expected to try to control (indeed it is preferred that they do not), at least in domains like energy supply. This limits the degree to which demanding that policy institutions strongly demonstrate they directly caused certain outcomes is a relevant or fair question. The concept of limited agency is not about accepting that the world is complex (and therefore unknowable)—RCTs are designed to help manage that complexity via random assignment—it is about power and control, and the way in which energy policymakers in democratic states see their role in shaping society, and therefore the ways of knowing that are privileged.
-
Negotiated certainty—given that state actors like policymakers have limited agency in the energy sector, policy actions must be negotiated with stakeholders in order to generate causal outcomes. This means negotiating future regulatory environments (e.g. the range of conditions under which a householder can install insulation) with business and citizen actors to ensure causal conditions can occur (that is, at the very least, that the business community is amenable to supplying insulation measures via the proposed programme in the case of the Green Deal). Attempting to unilaterally impose a particular way of doing things on a setting where there is widespread acceptance of limited state agency is likely to count directly against policy effectiveness (and is therefore a key external validity threat to any RCT in this area). A good example outside the energy sector was the attempt by the UK Coalition government to sell off publicly owned forests without negotiating the policy with stakeholders.6 Following widespread criticism, the policy was dropped. This negotiation of certainty is less about imposing strict conditions in a top-down way than about generating policy impact via bilateral relations and, through that negotiation, in effect creating causal effects by agreement.
Both these reasons would count against promoting RCTs in policy research design. The acceptance of limited agency is related to, but different from, the inability of RCTs to deal effectively with threats to external validity (Allcott and Mullainathan 2010). It reflects a political, normative choice about how the state should interact with citizens. In the UK energy system, which is privatised in delivery and private in consumption, state interference in (for instance) the provision of home insulation is expected to be relatively limited (compared to the provision of health or justice systems in the UK). As such, any research which fails to take account of the limited agency of the state in this respect goes directly against this approach. Likewise, when policy-making is seen as a means of generating negotiated certainty, it is understandable that a mode of research such as the RCT, which is in effect about imposed certainty (insofar as the control of treatment groups and the exact enactment of the treatment itself goes), is not a preferred way of knowing about causal effects—especially if the act of imposition itself kills off the very causal mechanism intended to be studied.
It is worth noting two important caveats surrounding the presumed existence of these concepts. The first is evidential: no direct empirical data are presented here to support the existence of these concepts; they are inferred from a combination of the inability of the epistemic drivers in Table 3 to fully explain the lack of RCTs, and the author’s personal experience of working in the UK government. Clearly, if these concepts survive initial critical inspection, targeted data collection and analysis will be needed to determine whether they are real. The second is that the impact of these concepts, if they are important, may be policy-area specific. As implied above, they may arise specifically in democratic states and, in particular, in domains deemed to be delivered by non-state actors, such as energy, environment, agriculture and transport. This implies that, if they hold as explanatory factors, there ought to be more RCTs in domains with heavy state involvement or control, such as education, health or criminal justice (in the UK).
The ‘energy policy epistemology’ no doubt has roots in many other scholars’ attempts to classify epistemic positions in science. There are clear signs of a critical realist perspective in the privileging of ‘useful subjectivity’ as a major source for understanding causal mechanisms and in the implicit attempts to understand what works for whom (Pawson and Tilley 1997). Others have created categories that could place this description more widely as part polling democracy, part critical pragmatism (Tapio and Hietanen 2002). Negotiated certainty has echoes of social constructionism (Berger and Luckmann 1991). Either way, it is a long way from the positivist position underpinning RCTs. For now, the goal here is simply to elucidate some of these issues to enable a more fruitful dialogue between academics and policy makers.