This section reports the results of a regression analysis based on the data and method choices presented in the previous section. Variable choices, survey data sampling, survey years and the method of aggregating light and survey data all have an impact on the results. To keep the focus, we present only the results that matter most for the conclusions of our comparison between light correlations based on the full census and on DHS data. Furthermore, we show results with and without control variables.
5.1 Level of wealth and light
The replication of the DHS buffer specification in Namibia, and its comparison to results based on actual enumeration area borders, is provided in appendix A2. The association between wealth and light is not as strong as that found in the literature across developing countries. In Namibia, the regression R2 is 42.1 % (A2.2, column 1) when using the logarithmically transformed mean of stable light as the only explanatory variable for the wealth index. Using the same specification in other developing countries, Bruederle and Hodler (2018) find an R2 of 52.7 %. The lower-than-average explanatory power in Namibia is supported by the findings in Weidmann and Schutte (2017). They rank DHS clusters by light and wealth for each country, and then compare the rank correlations. The correlation in the 2007 Namibian DHS ranked 46th (the highest rank being 1) out of 57 surveys.
When we switch from randomly sampled DHS data to the full census data, and still aggregate light using the simulated buffers from enumeration area centroids, the association is slightly stronger, with an R2 of 43.9 % (A2.2, column 3). This is to be expected because the census data is closer to the ground truth and eliminates the measurement error introduced in the DHS sampling process. Furthermore, when we use the census data and aggregate light within the enumeration area borders instead of the buffers, the R2 increases to 48.3 % (A2.2, column 5). This is also to be expected because the borders aggregate the light that is most relevant for the survey respondents. Controlling for population density does not have a large impact on the results. Altogether, the association between wealth and light is significant, and slightly stronger when using the full census than the DHS data.
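The single-regressor specification discussed above (wealth index on the log of mean stable light) can be sketched as follows. This is a minimal illustration, not the estimation code used for the reported results: the function name `fit_wealth_on_log_light` and the example values are hypothetical, and the sketch assumes strictly positive light values because it does not address how zero-light observations are handled.

```python
import math

def fit_wealth_on_log_light(wealth, mean_light):
    """OLS of wealth on log(mean_light) with an intercept.

    Returns (slope, intercept, r_squared). All light values must be > 0.
    """
    x = [math.log(v) for v in mean_light]
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(wealth) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, wealth))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    ss_res = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, wealth))
    ss_tot = sum((yi - y_bar) ** 2 for yi in wealth)
    return slope, intercept, 1.0 - ss_res / ss_tot

# Made-up illustrative values, not data from the study.
example_light = [0.2, 0.5, 1.0, 3.0, 9.0]
example_wealth = [1.9, 2.4, 2.3, 3.1, 3.4]
slope, intercept, r2 = fit_wealth_on_log_light(example_wealth, example_light)
print(f"slope={slope:.3f}, R2={r2:.3f}")
```

The same function can be applied to any spatial unit (buffers, borders or grid cells); only the way light and wealth are aggregated beforehand changes.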
Next, we apply the point and polygon aggregation methods to 0.5 degree grid cells. This allows us to compare results from the DHS to the census step by step to see where the differences arise. Table 1 shows descriptive statistics for the different surveys and methods. The first notable difference is that the number of grid cells considered is lower in the point method. This is because enumeration area centroids do not fall in every grid cell. In DHS, the number is even lower due to random sampling. One full grid cell is 3600 pixels, but note that cells at country borders are cut smaller. The higher mean and sum of light in DHS most likely reflect the exclusion of remote areas, development between the two years or differences in the satellite light gain settings. In columns two and three, we can see that changing from the point method to the polygon method lowers the population count and density. This happens even though the numbers are derived from the same underlying data. The difference comes down to how we assume the population is distributed in each enumeration area (all at a central point or divided equally across pixels). We cannot know for sure which one is closer to the actual distribution, but we can check whether the choice influences our conclusions.
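The two population assumptions described above can be illustrated with a small sketch. The data structure (`centroid_cell`, `pixels_per_cell`) and the numbers are hypothetical and chosen only to show the mechanics: the point method puts an enumeration area's whole population in the grid cell that contains its centroid, while the polygon method spreads it equally over the area's pixels and then sums by grid cell.

```python
from collections import defaultdict

def point_method(areas):
    """Assign each area's whole population to the grid cell of its centroid."""
    totals = defaultdict(float)
    for area in areas:
        totals[area["centroid_cell"]] += area["population"]
    return dict(totals)

def polygon_method(areas):
    """Spread each area's population equally over its pixels, then sum by cell."""
    totals = defaultdict(float)
    for area in areas:
        n_pixels = sum(area["pixels_per_cell"].values())
        per_pixel = area["population"] / n_pixels
        for cell, count in area["pixels_per_cell"].items():
            totals[cell] += per_pixel * count
    return dict(totals)

# Hypothetical enumeration areas: the first one spills over into cells B and C.
areas = [
    {"population": 900, "centroid_cell": "A",
     "pixels_per_cell": {"A": 60, "B": 20, "C": 10}},
    {"population": 100, "centroid_cell": "B",
     "pixels_per_cell": {"B": 10}},
]

by_point = point_method(areas)      # cells A and B only
by_polygon = polygon_method(areas)  # cells A, B and C

for name, result in (("point", by_point), ("polygon", by_polygon)):
    mean_pop = sum(result.values()) / len(result)
    print(name, len(result), "cells, mean population", round(mean_pop, 1))
```

In this toy case both methods allocate the same total population, but the polygon method spreads it over more grid cells, so the mean population per cell is lower. This is the same direction of change as between columns two and three of Table 1.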
Table 1
Descriptive statistics for 0.5 degree grid cells
Variable | Description | DHS 2013 (point) | Census 2011 (point) | Census 2011 (polygon) |
– | Grid cells (count) | 152 | 275 | 338 |
Wealth_A | Wealth index with 15 components | 2.63 | 2.74 | 2.67 |
Elec | Amount of people with access to electricity | 7385 | 2805 | 798 |
Stb_mean | Average stable light | 0.21 | 0.10 | 0.15 |
Stb_sum | Sum stable light | 605 | 296 | 245 |
popD | Population per pixel | – | 3.12 | 1.68 |
Pop | Population | – | 7480 | 4017 |
GPW_popD | GPW population per pixel | 5.76 | 3.54 | 2.99 |
GPW_pop | GPW population (GPW_popD * pixels) | 16777 | 9499 | 7754 |
Pixels | Pixels in Grid cells | 3199 | 3260 | 3047 |
The wealth index does not vary much across the columns. In contrast, the electricity access variable illustrates how large the difference can be when using different data and methods. The large gap is partly explained by the reduction in grid cells and development over two years. However, the bias is amplified when calculating the number of people with access to electricity. For DHS, this number is the product of GPW_pop and the share of people with access to electricity. As the table shows, the GPW_pop variable is probably an overestimate as well. As a result, the calculation compounds the biases in both variables and widens the gap compared to census-based figures that use the actual count. Our analysis demonstrates how important the population data and aggregation method are. These are the key factors when drawing conclusions on the association between wealth and light.
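The compounding of the two biases can be written out explicitly. The following is a sketch in notation introduced here for illustration only: Pop and s are the true population and electricity share, and the relative errors of the GPW population and of the survey-based share are denoted by \delta_p and \delta_s.

```latex
\begin{align*}
\widehat{\mathrm{Elec}}
  &= \widehat{\mathrm{Pop}} \times \widehat{s}
   = \mathrm{Pop}\,(1+\delta_p) \times s\,(1+\delta_s) \\
  &= \mathrm{Elec}\,\bigl(1 + \delta_p + \delta_s + \delta_p \delta_s\bigr),
  \qquad \mathrm{Elec} = \mathrm{Pop} \times s .
\end{align*}
```

When both errors have the same sign, the relative error of the product exceeds either error alone. For example, if both inputs were 20 % too high, the product would be 44 % too high (1.2 × 1.2 = 1.44).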
We start the 0.5 degree grid analysis in Table 2 by using regression specifications similar to those in Bruederle and Hodler (2018). The light variable is significant at the 99 % confidence level in all specifications. In column (1), the magnitude of the light coefficient (0.242) is slightly lower for Namibia than what Bruederle and Hodler (2018) find across developing countries (0.326). The interpretation of the coefficient is that a one-unit increase in the log of mean light is associated with a 0.242 change in the wealth index in a grid cell, which corresponds to roughly 0.0024 for a one percent increase in mean light. In column (1), light alone explains 15.8 % of the variation in the wealth index, which is lower than the 35.7 % found in Bruederle and Hodler (2018). This result also supports the finding in Weidmann and Schutte (2017) that the association between light and wealth is relatively weak in Namibia. In column (2), we control for population density and get a higher light coefficient and R2 than Bruederle and Hodler (2018). In their study, the population density coefficient is negative (−0.042) but not significant. This leads us to believe that population density might play a very different role across countries. In Namibia, which is one of the least densely populated countries in the world, it is highly significant, and its magnitude is almost on the same level as that of light.
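Because the wealth index enters in levels and light enters in logs, the effect of a one percent change in light can be worked out explicitly. The following is a sketch that assumes column (1) has the form shown in the first line; the symbols W (wealth index), L (mean stable light), \alpha, \beta and \varepsilon are notation introduced here for illustration.

```latex
\begin{align*}
W &= \alpha + \beta \ln L + \varepsilon \\
\Delta W &= \beta \bigl[\ln(1.01\,L) - \ln L\bigr]
          = \beta \ln(1.01) \approx 0.00995\,\beta \approx \frac{\beta}{100} \\
\beta = 0.242 \;&\Rightarrow\; \Delta W \approx 0.0024
\end{align*}
```

A change of \beta itself in the wealth index corresponds to a one-unit increase in \ln L, that is, to multiplying mean light by e \approx 2.72.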
Table 2
Regressions using 0.5 degree grid cells and relative wealth as dependent variable
OLS regression | (1) | (2) | (3) | (4) | (5) | (6) |
Survey | DHS | DHS | Census | Census | Census | Census |
Year | 2013 | 2013 | 2011 | 2011 | 2011 | 2011 |
Method | Point | Point | Point | Point | Polygon | Polygon |
Dep. var | Wealth_A | Wealth_A | Wealth_A | Wealth_A | Wealth_A | Wealth_A |
Log_stb_mean | 0.242 | 0.578 | 0.204 | 0.476 | 0.088 | 0.308 |
(std. error) | (0.046)*** | (0.06)*** | (0.030)*** | (0.036)*** | (0.028)*** | (0.0260)*** |
Log_GPW_popD | – | −0.498 | – | −0.329 | – | – |
(std. error) | – | (0.067)*** | – | (0.031)*** | – | – |
Log_popD | – | – | – | – | – | −0.35 |
(std. error) | – | – | – | – | – | (0.022)*** |
Observations | 152 | 152 | 275 | 275 | 338 | 338 |
R2 | 0.158 | 0.387 | 0.147 | 0.394 | 0.029 | 0.437 |
In Table 2, we also compare how the results differ between DHS and census. Columns (3) and (4) use the same point method but switch the data from DHS to census. This changes the year of the survey and the number of observations, but the results are similar.
Columns (5) and (6) consider the polygon method instead of the point method. It seems that aggregating a relative wealth variable by averaging over pixels performs worse than averaging over enumeration area centroid points. However, once we control for the actual population count from the census in column (6), the association is stronger. Altogether, the conclusions that we can draw based on DHS data are well in line with the census-based results when using a relative wealth index, as is done in the literature.
Now that we have established a baseline comparison to the literature, we take a different approach by using stock of wealth indicators. This utilizes the full scope of the census data, where we know the actual number of people with access to electricity, piped water and flush toilets, as well as car ownership. In principle, we can also construct the stock of wealth variables for DHS by using a third source for population density. However, as we demonstrated in Table 1, this leads to biased values. Therefore, it is understandable that this approach is often avoided when only DHS data is used.
Table 3 uses the number of people with access to electricity as a stock dependent variable of wealth. It is the most obvious variable that should show up as a signal in nighttime lights. It is also highly correlated with other household assets. Our primary goal is to determine whether nighttime lights provide a further signal of wealth, compared to what can be concluded from relative wealth-based DHS studies. Specifications (1), (3) and (5) provide an intuition of the relationship, in which we simply sum up the light values in a grid cell. In this case the point method (3) provides the strongest association, but the polygon method (5) is also highly significant. Considering the R2, the association is weaker for DHS in column (1). In columns (2), (4) and (6) we include control variables for the grid cell size and the population count. The light coefficient remains highly significant in all of them, whereas the GPW population control is not significant for DHS in column (2). Furthermore, the difference in the R2 between the DHS and census widens substantially, which shows the advantage of using the actual population data instead of another source.
Table 3
Regressions using 0.5 degree grid cells and the stock of wealth as dependent variable
OLS regression | (1) | (2) | (3) | (4) | (5) | (6) |
Survey | DHS | DHS | Census | Census | Census | Census |
Year | 2013 | 2013 | 2011 | 2011 | 2011 | 2011 |
Method | Point | Point | Point | Point | Polygon | Polygon |
Dep. var | Log_elec | Log_elec | Log_elec | Log_elec | Log_elec | Log_elec |
Log_stb_sum | 0.470 | 0.410 | 0.457 | 0.321 | 0.387 | 0.263 |
(std. error) | (0.06)*** | (0.076)*** | (0.035)*** | (0.042)*** | (0.032)*** | (0.029)*** |
Area_size | – | 0.001 | – | 0.001 | – | 0.001 |
(std. error) | – | (0.000)** | – | (0.000)*** | – | (0.000)*** |
Log_GPW_pop | – | 0.371 | – | 0.668 | – | – |
(std. error) | – | (0.247) | – | (0.122)*** | – | – |
Log_pop | – | – | – | – | – | 0.608 |
(std. error) | – | – | – | – | – | (0.066)*** |
Observations | 152 | 152 | 275 | 275 | 338 | 338 |
R2 | 0.289 | 0.313 | 0.380 | 0.471 | 0.300 | 0.563 |
In contrast to the relative wealth regressions, the population variable has a positive sign. The intuition is that average wealth is lower in densely populated areas, but the absolute amount of wealth can still be higher. In column (6), the R2 climbs as high as 56.3 %. However, this high association might just be due to the choice of electricity access as an indicator of wealth. Therefore, we repeat regression specifications (5) and (6) with different individual stock variables of wealth in appendix A3. The variables piped water, flush toilet and car ownership are not directly linked to electricity, but the results show an equally strong, or even stronger, association with light. Therefore, we conclude that nighttime lights carry a stronger signal of the total stock of wealth in a grid cell than can be concluded from the DHS data.
So far we have aggregated data within enumeration area borders, buffers and 0.5 degree grid cells. The last of these is the favored cell size in the literature, but it is nevertheless arbitrary. Any other cell size could also be justified, especially as we focus on Namibia only. Therefore, we repeat the regressions for grid cell sizes of 0.25, 0.1 and 0.00833 degrees (one light pixel). To give an overview of the results, we collect all of the light coefficients, their significance and R2 in Table 4. The results differ from the earlier regressions because we have excluded observations that have no light emissions. This practice is debatable because it may be that the non-lit areas do emit light that is simply not detected by the satellites, as argued by Bruederle and Hodler (2018). Alternatively, it may be that the non-lit areas are simply uninhabited, and therefore justifiably removed. The truth is probably somewhere in the middle, and therefore we provide analysis with and without the non-lit areas for robustness. In general, excluding non-lit areas leads to a stronger association between nighttime lights and wealth. Furthermore, in the census point method, we control for the population count reported in the census instead of the GPW population. Note that in the previous analysis we aimed to be comparable to the existing literature, but at this stage we are seeking the specifications that provide the best signal of wealth using the nighttime lights data.
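The re-gridding and the exclusion of non-lit cells described above can be sketched as follows. This is an illustrative sketch only: the function names are hypothetical, the pixel size is assumed to be 1/120 degree (about 0.00833), and grid cells are assumed to be aligned to multiples of the cell size, which may differ from the grid actually used.

```python
import math
from collections import defaultdict

PIXELS_PER_DEGREE = 120  # assumption: one light pixel is 1/120 degree (~0.00833)

def pixels_per_full_cell(cell_size_deg):
    """Number of light pixels in a grid cell that is not cut by a border."""
    side = round(cell_size_deg * PIXELS_PER_DEGREE)
    return side * side

def cell_index(lon, lat, cell_size_deg):
    """Index of the grid cell containing a point (cells aligned to multiples of the size)."""
    return (math.floor(lon / cell_size_deg), math.floor(lat / cell_size_deg))

def sum_light_by_cell(pixels, cell_size_deg, drop_unlit=True):
    """Sum pixel light values per grid cell; optionally drop cells with no light."""
    totals = defaultdict(float)
    for lon, lat, light in pixels:
        totals[cell_index(lon, lat, cell_size_deg)] += light
    if drop_unlit:
        return {cell: total for cell, total in totals.items() if total > 0}
    return dict(totals)

# Hypothetical pixels given as (longitude, latitude, stable light value).
pixels = [(17.08, -22.56, 5.0), (17.30, -22.70, 3.0), (17.60, -22.56, 0.0)]

lit_only = sum_light_by_cell(pixels, 0.5)                     # unlit cell dropped
all_cells = sum_light_by_cell(pixels, 0.5, drop_unlit=False)  # unlit cell kept
print(pixels_per_full_cell(0.5), len(lit_only), len(all_cells))
```

Running the regressions on both `lit_only` and `all_cells` style samples corresponds to the robustness check with and without non-lit areas; note that keeping zero-light cells requires a decision on how to take logs, which this sketch does not make.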
Table 4
Regressions with varying spatial units and the stock of wealth as dependent variable
– | Survey | DHS | DHS | Census | Census | Census | Census |
– | Method | Point | Point | Point | Point | Polygon | Polygon |
Enumeration area borders/buffers | Log_stb_sum | 1.734*** | 1.218*** | 0.277*** | 0.262*** | 0.899*** | 0.850*** |
– | Control vars | No | Yes | No | Yes | No | Yes |
– | R2 | 0.159 | 0.198 | 0.046 | 0.063 | 0.059 | 0.070 |
– | N | 370 | 370 | 3215 | 3215 | 3215 | 3215 |
0.5 degree grid | Log_stb_sum | 1.201*** | 0.918*** | 0.894*** | 0.514*** | 0.857*** | 0.318*** |
– | Control vars | No | Yes | No | Yes | No | Yes |
– | R2 | 0.359 | 0.397 | 0.653 | 0.768 | 0.361 | 0.808 |
– | N | 104 | 104 | 129 | 129 | 146 | 146 |
0.25 degree grid | Log_stb_sum | 1.255*** | 1.270*** | 1.013*** | 0.553*** | 0.865*** | 0.318*** |
– | Control vars | No | Yes | No | Yes | No | Yes |
– | R2 | 0.226 | 0.226 | 0.537 | 0.680 | 0.289 | 0.781 |
– | N | 119 | 119 | 175 | 175 | 214 | 214 |
0.1 degree grid | Log_stb_sum | 1.485*** | 1.592*** | 0.973*** | 0.654*** | 0.799*** | 0.361*** |
– | Control vars | No | Yes | No | Yes | No | Yes |
– | R2 | 0.219 | 0.221 | 0.375 | 0.508 | 0.207 | 0.762 |
– | N | 157 | 157 | 278 | 278 | 445 | 445 |
One pixel grid | Log_stb_sum | – | – | – | – | 1.346*** | 0.785*** |
– | Control vars | – | – | – | – | No | Yes |
– | R2 | – | – | – | – | 0.101 | 0.520 |
– | N | – | – | – | – | 9602 | 9602 |
Table 4 shows that light has a highly significant association with the total stock of wealth across all spatial units, surveys and methods. The association tends to get weaker in smaller area units, but the signal remains clear even at the smallest possible pixel level in the nighttime lights data. This is an interesting result when considering the MAUP in the context of nighttime lights. Chen and Nordhaus (2019) find that the association becomes stronger at a smaller spatial scale. However, they move from the US state level to metropolitan areas. Our largest spatial scale is the 0.5 degree grid, which is closer to the metropolitan area size. Therefore, this might be the area size with the strongest association, while moving to larger or smaller units weakens it. However, given the large differences between the countries and light data, this finding remains a point for future work.
In Table 4, the association between lights and wealth is stronger in DHS when the data is aggregated within enumeration area borders or buffers. However, the enumeration area level results are unreliable due to overlapping buffers and blooming, as discussed earlier. When we switch to the grids, the census results are stronger, especially with the control variables. Using the census data, it is not clear whether the point or polygon method should be preferred. The point method is stronger without the controls, while the polygon method is stronger with the controls. We recommend the polygon method if population data and borders are available because it retains more observations and allows for pixel level analysis.
5.2 Change of wealth and light
Now we turn our attention to measuring changes in nighttime lights and wealth. Relating growth in nighttime lights to growth in GDP, as done by Henderson et al. (2012), provided the benchmark for using nighttime lights in economic research. Again, we provide our most detailed results for 0.5 degree grid cells, but we also present an overview for other spatial aggregations. We relate changes in the total stock of wealth to changes in the sum of light, as this is a more intuitive comparison.
For completeness, we provide an analysis of relative wealth change as well (appendix A4). Wealth index B uses a smaller set of asset components, as explained in the data section. The wealth change is constructed simply by subtracting the earlier-year values from the later-year values. The other variables are the differences between the log-transformed values of the later and earlier years. The results are all insignificant, which means that, at least for Namibia between 2000 and 2013 (DHS) or 2001 and 2011 (census), there does not seem to be any evidence of relative wealth change being associated with stable lights change. Grid cells that have no light in both years are excluded. The results did not improve when we included non-lit grid cells. This is an important result since economists often exploit the time series properties of (panel) data. Using DHS would suggest that it is difficult to capture the relevant variation.
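The construction of the change variables described above can be sketched as follows. The field names (`wealth_0`, `light_1`, and so on) are hypothetical, with suffix 0 for the earlier year and 1 for the later year. The text does not state how cells that are lit in only one of the two years are treated, so this sketch simply flags their log change as undefined (`None`) rather than guessing.

```python
import math

def log_change(earlier, later):
    """ln(later) - ln(earlier), or None when either value is not positive."""
    if earlier <= 0 or later <= 0:
        return None
    return math.log(later) - math.log(earlier)

def build_changes(cells):
    """Compute wealth and light changes for cells lit in at least one year."""
    rows = []
    for cell in cells:
        if cell["light_0"] == 0 and cell["light_1"] == 0:
            continue  # no light in both years: excluded from the change analysis
        rows.append({
            "id": cell["id"],
            # Wealth index change: later-year value minus earlier-year value.
            "wealth_change": cell["wealth_1"] - cell["wealth_0"],
            # Light change: difference of log-transformed values.
            "light_log_change": log_change(cell["light_0"], cell["light_1"]),
        })
    return rows

# Made-up illustrative grid cells, not data from the study.
cells = [
    {"id": 1, "wealth_0": 2.0, "wealth_1": 2.5, "light_0": 10.0, "light_1": 20.0},
    {"id": 2, "wealth_0": 1.5, "wealth_1": 1.6, "light_0": 0.0, "light_1": 0.0},
    {"id": 3, "wealth_0": 3.0, "wealth_1": 2.0, "light_0": 0.0, "light_1": 5.0},
]
changes = build_changes(cells)
print(changes)
```

The same `log_change` helper would apply to the stock variables (for example electricity access and population counts), which also enter as log changes.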
Table 5 shows the results for the association between changes in the number of people with access to electricity and changes in stable lights. The association is significant but weak, with the exception of column (6). It seems that the polygon method and the actual population count are required to explain more of the variation in the change of wealth. Furthermore, most of the explanatory power comes from changes in population rather than changes in light. Appendix A5 shows that there is no evidence of a significant association between changes in light and changes in other stock of wealth variables (piped water and flush toilets). Including non-lit areas did not improve the results either.
Table 5
Regressions using 0.5 degree grid cells and the change in the stock of wealth as dependent variable
OLS regression | (1) | (2) | (3) | (4) | (5) | (6) |
Survey | DHS | DHS | Census | Census | Census | Census |
Year | 2000–2013 | 2000–2013 | 2001–2011 | 2001–2011 | 2001–2011 | 2001–2011 |
Method | Point | Point | Point | Point | Polygon | Polygon |
Dep. var | Elec_log_change | Elec_log_change | Elec_log_change | Elec_log_change | Elec_log_change | Elec_log_change |
Stb_sum_log_change | 0.467 | 0.434 | 0.114 | 0.113 | 0.069 | 0.067 |
(std. error) | (0.210)** | (0.208)** | (0.034)*** | (0.034)*** | (0.021)*** | (0.018)*** |
Area_size | – | −0.001 | – | 0.000 | – | 0.000 |
(std. error) | – | (0.001) | – | (0.000) | – | (0.000) |
GPW_Pop_log_change | – | −4.425 | – | −1.816 | – | – |
(std. error) | – | (3.087) | – | (1.028)* | – | – |
Pop_log_change | – | – | – | – | – | 1.138 |
(std. error) | – | – | – | – | – | (0.138)*** |
Observations | 74 | 74 | 146 | 146 | 172 | 172 |
R2 | 0.064 | 0.114 | 0.072 | 0.093 | 0.060 | 0.334 |
Finally, we compare changes in light and wealth across the different spatial units in Table 6. The results based on DHS are not robust across different grid sizes (columns 1 and 2). The results based on the census follow a pattern similar to that found for 0.5 degree cells. The light variable is significant at every level, with or without controls. However, the explanatory power is low unless control variables are included. In this case, it is mostly the population variable that contributes to explaining changes in wealth. Even though the wealth signal in light is weak, it is remarkable that it remains significant even at the single-pixel level. This is good news for researchers who want to apply light data at local levels or draw economic conclusions based on the effects of changes in nighttime lights. We can conclude that there is a significant association between changes in nighttime lights and changes in wealth at the local level. This association does not show in DHS-based studies due to the use of relative wealth variables and third-source population data. However, using the full census of Namibia, we find that light change is associated with wealth change even at highly disaggregated spatial levels.
Table 6
Regressions with varying spatial units and the change in the stock of wealth as dependent variable
– | Survey | DHS | DHS | Census | Census | Census | Census |
– | Year | 2000–2013 | 2000–2013 | 2001–2011 | 2001–2011 | 2001–2011 | 2001–2011 |
– | Method | Point | Point | Point | Point | Polygon | Polygon |
– | Dep. var | Elec_log_change | Elec_log_change | Elec_log_change | Elec_log_change | Elec_log_change | Elec_log_change |
0.5 degree grid | Stb_sum_log_change | 0.467** | 0.434** | 0.114*** | 0.113*** | 0.069*** | 0.067*** |
– | Control vars | No | Yes | No | Yes | No | Yes |
– | R2 | 0.064 | 0.114 | 0.072 | 0.093 | 0.06 | 0.334 |
– | N | 74 | 74 | 146 | 146 | 172 | 172 |
0.25 degree grid | Stb_sum_log_change | 0.223 | 0.157 | 0.204*** | 0.196*** | 0.066*** | 0.075*** |
– | Control vars | No | Yes | No | Yes | No | Yes |
– | R2 | 0.015 | 0.085 | 0.136 | 0.187 | 0.043 | 0.281 |
– | N | 73 | 73 | 192 | 192 | 256 | 256 |
0.1 degree grid | Stb_sum_log_change | 0.450* | 0.337 | 0.226*** | 0.226*** | 0.088*** | 0.085*** |
– | Control vars | No | Yes | No | Yes | No | Yes |
– | R2 | 0.039 | 0.106 | 0.079 | 0.097 | 0.038 | 0.238 |
– | N | 74 | 74 | 251 | 251 | 531 | 531 |
One pixel grid | Stb_sum_log_change | – | – | – | – | 0.058*** | 0.063*** |
– | Control vars | – | – | – | – | No | Yes |
– | R2 | – | – | – | – | 0.008 | 0.371 |
– | N | – | – | – | – | 11706 | 11706 |