## Introduction

## Historical Population Estimates

^{2}Thus, when adjusting Palm’s (2001) figures, Edvinsson (2015:169) acknowledged that a “[c]ertain amount of guesswork is necessary”; Broadberry et al. (2015:22) estimated the medieval population with “significant margins of error”; van Zanden and van Leeuwen (2012) provided a “very tentative” estimate of the historical Dutch population; Cipolla (1965:572) suggested a wide confidence interval for his Italian estimates (±15%); and even the esteemed Wrigley and Schofield (1981:152) described their own estimate as “tolerably reliable.”

^{4}However, much more elaborate methods have also been suggested, especially the procedure laid out in Wrigley and Schofield (1981). A well-known correction method is the one Bourgeois-Pichat (1951) used to correct for underregistered (infant) mortality, later applied by, for example, Henry and Blayo (1975), Wrigley (1977), and Pitkänen and Laakso (1999).

## Data

^{6}The first statute to stipulate that these records should be kept was issued in 1628, but it was not until 1665 and 1673 that proper judicial measures were taken to enforce the practice and not until the late 1680s that parish registering became relatively common practice (Jutikkala 1944; Muroma 1991; Valpas 1967). A few scattered burial records are available from the 1630s, but 1648 is the first year when both variables are simultaneously available—burials for nine parishes, and baptisms for one. Because of the consequent lack of available registers, vital events before the 1680s have to be estimated from data that are mostly incomplete and provided by fewer than 50 parishes. Subsequently, plenty of prior research has addressed the (un)reliability of registers and the population figures they produce (e.g., Eriksson et al. 2008; Jutikkala 1944, 1965; Kaukiainen 2013; Muroma 1991; Pitkänen 1977a, 1979b; Valpas 1967).

^{7}the Finnish Genealogical Society’s Internet data bank, which contains the vast majority of Finnish parish records available. HisKi is an ongoing voluntary project to digitize Finnish parish records that started in the 1990s, in which ecclesiastical events are entered into the database one individual at a time. Local studies that have cross-checked the HisKi database against the original records reveal occasional errors, but the digital versions of the registers are overall of high quality (e.g., Uotila 2014).

^{8}This accounts for the level shift evident in the Tabellverket series, when regions in the southeastern corner that had been annexed to Russia (in 1721 and 1743) were reannexed to Finland in the early nineteenth century.

## Method

_{t}. This formulation assumes zero net migration, which we have to resort to, like Galloway (1994) and Edvinsson (2015), because there are no good quality early modern benchmark population estimates that would allow for residual estimation of net migration.

b_{t,i} and burials d_{t,i} for the years 1648–1850, compiled from all those parishes that had records for both baptisms and burials for at least 10 years (177 parishes). As described earlier, these records are incomplete in three ways: (1) baptisms did not necessarily represent births, nor burials deaths; (2) for most years, observations came from only some parishes; and (3) some burials and baptisms went unregistered. This incompleteness applied especially to the seventeenth century. In the following, we assumed that the missingness was independent of parish size as well as birth and death rates.

^{10}In the following, we define and explain the components of the model, starting from the census data, then moving upward in the figure.

μ_{t} represents the true (unknown) population size we aim to estimate, and σ_{c} is the standard deviation of the measurement noise with a Gamma(1, 0.0001) prior,^{11} translating to a 95% prior interval [253; 36,889]. Because the population increased during the Tabellverket era, a constant variance assumption implies that the relative accuracy of the census data increases over time, consistent with the fact that the first census estimates contain more uncertainty than the latest (e.g., Jutikkala 1944; Pitkänen 1979a; Pitkänen and Laakso 1999).
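The stated prior interval can be checked by hand under one assumption: that the Gamma(1, 0.0001) prior is written in shape-rate form, in which case it reduces to an exponential distribution with rate 0.0001. The helper name `exponential_quantile` is illustrative and does not come from the original analysis.

```python
import math

def exponential_quantile(p: float, rate: float) -> float:
    """Quantile function of an exponential distribution with the given rate."""
    return -math.log(1.0 - p) / rate

# Assumption: Gamma(1, 0.0001) is shape-rate, i.e. Exponential(rate=0.0001).
rate = 0.0001
lower = exponential_quantile(0.025, rate)
upper = exponential_quantile(0.975, rate)
print(round(lower), round(upper))  # 253 36889
```

Under this reading the bounds reproduce the interval [253; 36,889] quoted in the text.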

For μ_{1647}, we used previous literature (Åström 1978; Ignatius 1866; Kilpi 1917; Luukko 1967; Sundquist 1929–1931) to arrive at a mean population estimate of 430,000, with a standard deviation of approximately 30,000.

_{t}as

where β_{t} and δ_{t} are birth and death estimates for year t, and the deterministic part s_{t} corresponds to the estimates of military casualties (Edvinsson 2015; Lappalainen 2001; Palm 2001; Valpas 1967), which we took as given. As a prior for ψ_{μ}, we used a Gamma(2, 4) distribution. The prior for the first year translates to a 95% prior interval of [374,440; 489,348], with a prior mean population of 430,000 and a standard deviation of 29,326.
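The bookkeeping behind these quantities can be illustrated with a minimal sketch of a population balance under zero net migration. This shows only the deterministic accounting step, not the stochastic model itself. The function name `project_population`, the toy inputs, and the choice to subtract military casualties separately from civilian deaths are all assumptions made for illustration.

```python
def project_population(initial, births, deaths, casualties):
    """Roll a population forward one year at a time with zero net migration.

    Each step applies: next = previous + births - deaths - casualties.
    """
    path = [initial]
    for b, d, s in zip(births, deaths, casualties):
        path.append(path[-1] + b - d - s)
    return path

# Hypothetical toy inputs (not estimates from the paper).
path = project_population(430_000, [15_000, 15_500], [12_000, 20_000], [0, 500])
print(path)  # [430000, 433000, 428000]
```

Because no migration term appears, any systematic under-registration of births or deaths accumulates in the projected level, which is why the registration coverage has to be modeled explicitly.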

π = μ_{1697} / μ_{1695}, for which we defined a beta prior with a mean of 0.775 and precision of 200, leading to 99% prior quantiles of [0.695; 0.845]. Based on this, we defined a mortality correction coefficient ϕ_{μ} for the number of deaths in 1697 as

μ_{1697} ∼ Gamma(ψ_{μ} π μ_{1695}, ψ_{μ}).
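To make the beta prior concrete, the sketch below converts a (mean, precision) pair into shape parameters and approximates the 99% prior quantiles by simulation. It assumes that "precision" means the sum of the two shape parameters, which is a common convention but is not stated in the text. The names `beta_shapes` and `empirical_quantile` are illustrative.

```python
import random

def beta_shapes(mean, precision):
    """Convert (mean, precision) to Beta shape parameters.

    Assumption: precision = alpha + beta.
    """
    return mean * precision, (1.0 - mean) * precision

def empirical_quantile(sorted_values, p):
    """Simple empirical quantile from an ascending list."""
    index = min(len(sorted_values) - 1, int(p * len(sorted_values)))
    return sorted_values[index]

alpha, beta = beta_shapes(0.775, 200)  # approximately (155, 45)

rng = random.Random(1)
draws = sorted(rng.betavariate(alpha, beta) for _ in range(100_000))
lower = empirical_quantile(draws, 0.005)
upper = empirical_quantile(draws, 0.995)
print(round(lower, 3), round(upper, 3))
```

Under this parameterization the simulated quantiles should land close to the reported [0.695; 0.845]. Separately, a Gamma distribution with shape ψπμ and rate ψ has mean πμ, so the final expression centers μ_{1697} on π μ_{1695}.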

_{t}at year t; and \( {\uplambda}_t^d \) is defined similarly for burials and deaths.

r_{j} and m_{j} as the growth rate and the midpoint, respectively, of the logistic curve for j = b, d; and then defined the standard logistic curve as

_{j} and Beta(5, 5) as the prior for the scaled midpoint \( \tilde{m}_{j}=\left({m}_j-1647\right)/\left(1830-1647\right) \).
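The midpoint scaling can be illustrated with a short sketch. The two scaling helpers follow the formula in the text. The `logistic` function is a generic textbook form with growth rate r and midpoint m, and it is an assumption: the exact curve used in the analysis is not recoverable from the text here. All function names are illustrative.

```python
import math

START_YEAR, END_YEAR = 1647, 1830  # scaling bounds quoted in the text

def scaled_midpoint(m):
    """Map a calendar-year midpoint to the unit interval."""
    return (m - START_YEAR) / (END_YEAR - START_YEAR)

def unscaled_midpoint(m_tilde):
    """Inverse of scaled_midpoint: map a value in [0, 1] back to a year."""
    return START_YEAR + m_tilde * (END_YEAR - START_YEAR)

def logistic(t, r, m):
    """Generic logistic curve in (0, 1); assumed form, not the paper's equation."""
    return 1.0 / (1.0 + math.exp(-r * (t - m)))

# Beta(5, 5) has mean 0.5, so the prior mean midpoint falls mid-range.
prior_mean_year = unscaled_midpoint(0.5)
print(prior_mean_year)                   # 1738.5
print(logistic(1738.5, 0.07, 1738.5))    # 0.5
```

Placing the prior on the scaled midpoint keeps the curve's inflection point between 1647 and 1830 while still letting the data move it within that range.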

λ_{1850}, translating to a 95% prior interval of [0.66; 1]. Instead of defining a prior directly for the first year, we defined priors for the proportions \( {\uplambda}_{1648}^j/{\uplambda}_{1850} \) with the constraint \( {\uplambda}_{1648}^d<{\uplambda}_{1648}^b \); in other words, we assumed that the burial entries were less accurate than those of baptisms. Based on this, we defined \( {\uplambda}_{1648}^d/{\uplambda}_{1850}={a}_1 \) and \( {\uplambda}_{1648}^b/{\uplambda}_{1850}={a}_1+{a}_2 \), with Dirichlet(10, 5, 5) as a prior for (a_{1}, a_{2}, 1 − a_{1} − a_{2}). This led to prior means of 0.5λ_{1850} for \( {\uplambda}_{1648}^d \) and 0.75λ_{1850} for \( {\uplambda}_{1648}^b \), with 95% prior intervals for the proportions of [0.29; 0.71] and [0.54; 0.91], respectively.
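The implied prior means and intervals can be checked by simulation. The sketch below draws from Dirichlet(10, 5, 5) by normalizing independent Gamma variates (a standard construction) and summarizes a_1 and a_1 + a_2. The helper names and the number of draws are illustrative choices, not part of the original analysis.

```python
import random

def sample_dirichlet(alphas, rng):
    """Draw one Dirichlet sample by normalising independent Gamma variates."""
    gammas = [rng.gammavariate(a, 1.0) for a in alphas]
    total = sum(gammas)
    return [g / total for g in gammas]

def interval_95(values):
    """Empirical 2.5% and 97.5% quantiles."""
    ordered = sorted(values)
    n = len(ordered)
    return ordered[int(0.025 * n)], ordered[int(0.975 * n)]

rng = random.Random(2)
draws = [sample_dirichlet([10, 5, 5], rng) for _ in range(50_000)]

burial_share = [a1 for a1, a2, a3 in draws]        # lambda_1648^d / lambda_1850
baptism_share = [a1 + a2 for a1, a2, a3 in draws]  # lambda_1648^b / lambda_1850

mean_burial = sum(burial_share) / len(burial_share)
mean_baptism = sum(baptism_share) / len(baptism_share)
print(round(mean_burial, 2), round(mean_baptism, 2))
print(interval_95(burial_share), interval_95(baptism_share))
```

Because a_2 is strictly positive in every draw, the ordering constraint between the burial and baptism proportions holds by construction rather than needing to be imposed separately.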

_{d}for the famine as

ψ_{bd} ~ Gamma(2, 4), σ_{j, ν} ~ Gamma(2, 10), and σ_{j, η} ~ Gamma(2, 20), for j = b, d. For prior distributions of year 1648, we used N(3.8, 0.5) for \( {\upnu}_{1648,i}^b \); N(3.2, 0.5) for \( {\nu}_{1648,i}^d \); and N(0, 0.01) for \( {\upeta}_{1648}^j \), for j = b, d, i = 1, . . . , 177.

## Results

Parameter | Mean | MCSE | SD | 2.5% | 25% | 50% | 75% | 97.5% | Effective N | \( \hat{R} \) |
---|---|---|---|---|---|---|---|---|---|---|
σ_{b, η} | 0.15 | <0.01 | 0.01 | 0.13 | 0.14 | 0.15 | 0.16 | 0.17 | 83,601 | 1.00 |
σ_{d, η} | 0.30 | <0.01 | 0.02 | 0.27 | 0.29 | 0.30 | 0.31 | 0.33 | 157,076 | 1.00 |
σ_{b, ν} | 0.09 | <0.01 | <0.01 | 0.09 | 0.09 | 0.09 | 0.09 | 0.09 | 8,092 | 1.00 |
σ_{d, ν} | 0.26 | <0.01 | <0.01 | 0.25 | 0.25 | 0.26 | 0.26 | 0.26 | 14,105 | 1.00 |
ψ_{bd} | 0.33 | <0.01 | <0.01 | 0.32 | 0.33 | 0.33 | 0.33 | 0.34 | 18,522 | 1.00 |
ψ_{μ} | 0.38 | 0.01 | 0.29 | 0.08 | 0.18 | 0.29 | 0.49 | 1.17 | 2,353 | 1.01 |
σ_{c} | 2,358 | 21 | 1,342 | 177 | 1,372 | 2,276 | 3,187 | 5,297 | 3,903 | 1.00 |
μ_{1647} | 440,380 | 265 | 29,236 | 384,595 | 420,322 | 439,830 | 459,869 | 498,951 | 12,137 | 1.00 |
ϕ_{d} | 5.20 | 0.01 | 0.60 | 4.13 | 4.78 | 5.16 | 5.57 | 6.46 | 7,517 | 1.00 |
π | 0.78 | <0.01 | 0.03 | 0.72 | 0.76 | 0.78 | 0.80 | 0.83 | 125,064 | 1.00 |
ϕ_{μ} | 1.41 | <0.01 | 0.38 | 0.75 | 1.15 | 1.38 | 1.65 | 2.24 | 21,918 | 1.00 |
r_{b} | 0.07 | <0.01 | 0.06 | 0.01 | 0.03 | 0.05 | 0.09 | 0.23 | 13,416 | 1.00 |
r_{d} | 0.10 | <0.01 | 0.07 | 0.01 | 0.05 | 0.08 | 0.13 | 0.27 | 22,773 | 1.00 |
m_{b} | 1,740.16 | 0.22 | 23.22 | 1,695.26 | 1,724.50 | 1,739.15 | 1,755.88 | 1,785.45 | 10,881 | 1.00 |
m_{d} | 1,703.17 | 0.25 | 20.88 | 1,670.44 | 1,688.66 | 1,700.29 | 1,714.23 | 1,754.17 | 7,106 | 1.00 |
a_{1} | 0.58 | <0.01 | 0.10 | 0.37 | 0.51 | 0.59 | 0.66 | 0.77 | 19,889 | 1.00 |
a_{2} | 0.18 | <0.01 | 0.07 | 0.06 | 0.13 | 0.17 | 0.22 | 0.34 | 21,053 | 1.00 |
a_{3} | 0.24 | <0.01 | 0.09 | 0.10 | 0.18 | 0.23 | 0.29 | 0.43 | 28,638 | 1.00 |
\( {\uplambda}_{1648}^b \) | 0.64 | <0.01 | 0.08 | 0.48 | 0.59 | 0.65 | 0.70 | 0.79 | 18,582 | 1.00 |
\( {\uplambda}_{1648}^d \) | 0.49 | <0.01 | 0.09 | 0.31 | 0.43 | 0.49 | 0.56 | 0.67 | 16,717 | 1.00 |
λ_{n} | 0.85 | <0.01 | 0.05 | 0.76 | 0.81 | 0.84 | 0.88 | 0.95 | 7,453 | 1.00 |

ψ_{bd}, \( {\uplambda}_t^j \), \( {\nu}_{t,i}^j \), π, ϕ_{δ}, ψ_{μ}, μ_{1647}, and \( {\upsigma}_c^2 \), j = b, d, i = 1, . . . , 177 and t = 1648, . . . , 1850, we simulated new replications of β_{t} and δ_{t} (assuming empty \( {\Omega}_t^j \) for all t and j = b, d). Using these, we sampled new latent population processes, μ_{rep}, and sampled hypothetical census data from N(μ_{rep}, \( {\upsigma}_c^2 \)). Had the model been severely misspecified, these posterior samples would have shown considerably different patterns compared with the real census data; but as shown in Fig. 3, which contains 1,000 replications from the posterior predictive distribution, quite the opposite occurred. While the variation in these posterior predictive samples was, as expected, high when compared with our 95% posterior interval (see Fig. 5), the overall trend had a similar shape to the census observations.
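The last step of this check, drawing hypothetical census data around a replicated latent path, can be sketched as follows. The latent path values are hypothetical placeholders, the standard deviation 2,358 is simply the posterior mean of σ_{c} from the table above used as a fixed illustrative value, and the function name `simulate_census` is an assumption.

```python
import random

def simulate_census(latent_path, sigma_c, rng):
    """Draw one hypothetical census series: observation ~ N(latent, sigma_c^2)."""
    # random.gauss takes the standard deviation, not the variance.
    return [rng.gauss(mu, sigma_c) for mu in latent_path]

rng = random.Random(3)
latent_path = [420_000, 425_000, 431_000]  # hypothetical mu_rep values
replicates = [simulate_census(latent_path, 2_358, rng) for _ in range(1_000)]

first_year_mean = sum(rep[0] for rep in replicates) / len(replicates)
print(round(first_year_mean))
```

In a full posterior predictive check, both the latent path and σ_{c} would vary from draw to draw rather than being held fixed as they are here.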

^{14}This implies that data quality issues during the nineteenth century were primarily related to the availability of parish records, rather than the quality of recorded entries (this is visible in Fig. 6).

μ_{t}, we calculated the probability of various average annual growth rates during the 1647–1690 period. The probability of negative growth (average annual growth <0.0%) was estimated at 11.8%, that of slow positive growth (0.0% to 0.5%) at 79.6%, and that of fast growth (>0.5%) at 8.7%.
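A minimal sketch of how such probabilities can be read off posterior draws is given below. It assumes a geometric (compound) definition of the average annual growth rate, which the text does not specify. The draws are hypothetical toy values rather than the paper's posterior samples, and both function names are illustrative.

```python
def average_annual_growth(start_pop, end_pop, years):
    """Geometric average annual growth rate over the given number of years."""
    return (end_pop / start_pop) ** (1.0 / years) - 1.0

def growth_probabilities(rates, threshold=0.005):
    """Share of draws with negative, slow (0 to threshold), and fast growth."""
    n = len(rates)
    negative = sum(r < 0.0 for r in rates) / n
    fast = sum(r > threshold for r in rates) / n
    return negative, 1.0 - negative - fast, fast

# Hypothetical posterior draws of (mu_1647, mu_1690); not the paper's samples.
draws = [(440_000, 430_000), (440_000, 470_000),
         (430_000, 480_000), (450_000, 600_000)]
rates = [average_annual_growth(a, b, 1690 - 1647) for a, b in draws]
print(growth_probabilities(rates))  # (0.25, 0.5, 0.25)
```

Each probability is just the fraction of posterior draws whose implied growth rate falls in the corresponding band, so the three shares sum to one up to rounding.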

^{15}The most obvious feature of this new series is the correction provided to account for the level shifts due to changes in Finnish geographical area. In addition to this, we introduce the following revisions to existing interpretations of Finnish population history. Space constraints allow us to list only the most obvious of these revisions and the areas where further investigation will be required.

^{16}and was lower than what has been suggested or derived from other studies (e.g., Åström 1978; Ignatius 1866; Luukko 1967; Pitkänen 2007; Sundquist 1929–1931). Furthermore, the growth rate was found to be lower than that of the Swedish mainland for the same period (Edvinsson 2015). Our estimate for the eighteenth century differs from Jutikkala’s (1965), especially during the population drop in the 1740s; the estimate for 1780–1787, rather than corroborating the slow growth apparent from the censuses, justifies the criticisms mentioned earlier that have been leveled at the 1785, 1790, and 1795 census totals.