Introduction

Population genetic principles and methods are increasingly used in the management of harvested species and the conservation of natural populations (Allendorf and Luikart, 2007; Allendorf et al., 2008). Population subdivision results from genome-wide processes, such as reduced migration and genetic drift, that on average affect all genes in the same way, as well as locus-specific processes such as recombination, selection and mutation that are limited to specific genomic regions. Population structure is typically assessed by estimating divergence in the allele frequencies at neutral marker loci, whereas coding gene regions are usually not used for testing demographic independence or for estimating migration rates because selection biases parameter estimation (for example, Whitlock and McCauley, 1999). However, in some cases, such as mixed stock analysis in harvested species, gene loci influenced by selection can be valuable as population markers on ecological time scales where isolated populations have not yet diverged at neutral loci (O’Malley et al., 2007; Westgaard and Fevolden, 2007; Hauser and Carvalho, 2008; Nielsen et al., 2007, 2009a). Moreover, coding loci will help in identifying locally adapted populations (Hemmer-Hansen et al., 2007; Andersen et al., 2009; Gebremedhin et al., 2009; Nielsen et al., 2009a; Árnason et al., 2009).

In marine fishes, population differentiation through drift is typically weak because populations have relatively shallow histories and are often very large (Hauser and Carvalho, 2008). Although some marine fish species show population structure even at small spatial scales (Bekkevold et al., 2005; Jorde et al., 2007; Galarza et al., 2009), a high number of neutral loci and large sample sizes are needed to resolve structure. Genetic differentiation at coding loci requires heterogeneous selection patterns and limited migration, and is expected to be more pronounced in large populations, in which even weak selection may override genetic drift. Temporal stability of divergence, whether selected or neutral, increases the likelihood that the detected structuring is biologically significant, and thus that the loci are useful as population markers (Waples, 1998; Nielsen et al., 2007).

In this study, we compare the performance of three putatively neutral genetic marker types (microsatellites, allozymes and mitochondrial DNA (mtDNA)) with two genetic markers allegedly influenced by selection (the major histocompatibility complex (MHC)-linked locus Clha-DAA-INTR3 and a microsatellite locus Cpa112) to detect structuring in Atlantic herring (Clupea harengus) populations within the North Sea/Baltic Sea region. Although there are several cases in which each of these three classes of neutral markers has been shown to be under the influence of selection (Karl and Avise, 1992; Grant et al., 2006; Nielsen et al., 2006), the expectation is, in the absence of evidence to the contrary, that their evolutionary characteristics follow neutral expectations (Baer, 1999). It is, however, important to apply appropriate tests of neutrality beyond traditional measures of population structure (Rand, 1996).

Atlantic herring has traditionally been divided into different geographical ‘stocks’ based on spawning behavior and meristic or morphological characters. However, there has been little concordance between these characters and genetic differentiation (Ryman et al., 1984; but see Clausen et al., 2007 and Jørgensen et al., 2008). Recent genetic investigations using microsatellite DNA and allozyme markers have found little differentiation among spatially discrete populations in the North Sea, but concordant divergence between the Baltic and North Sea populations (Bekkevold et al., 2005; Mariani et al., 2005; Larsson et al., 2007, 2010). Genetic mixed-stock analyses show that coastal Skagerrak is used as a nursery both for the Baltic and North Sea spawning populations (Ruzzante et al., 2006). Earlier studies of population structure in the NE Atlantic using mtDNA have generally indicated little divergence among the populations (Dahle and Eriksen, 1990).

Using two different methods, Larsson et al. (2007) and Gaggiotti et al. (2009) showed that a microsatellite locus, Cpa112, displayed highly divergent allele frequencies among populations, suggesting that this locus may be affected by hitchhiking selection. The identity and function of the associated gene(s) are unknown, but the divergence pattern was stable over a 25-year time span (Larsson et al., 2010).

MHC genes encode proteins in the vertebrate immune response, and molecular variation among alleles has been linked directly to fitness and survival (Bos et al., 2008). For example, disease challenge trials on Atlantic salmon confirm that polymorphism at the MHC genes is associated with increased disease resistance (Grimholt et al., 2003). Wild populations show evidence for local adaptation (de Eyto et al., 2007), and several studies indicate that MHC genes, as well as a number of linked loci, can be more powerful than neutral microsatellites in detecting population differentiation in spatially structured fish species such as salmonids (Bernatchez and Landry, 2003; Beacham et al., 2005; Hansen et al., 2007). However, with the exception of a precursory characterization of the MHC class IIA gene and an embedded microsatellite locus in Atlantic herring (Stet et al., 2008), no studies exist to date that describe the MHC variation in an abundant and fully marine fish. It is worth noting, however, that divergence in genes under selection is not always expected; if fish populations share similar parasite environments, balancing selection can result in lower levels of differentiation in MHC-linked microsatellites than in neutral microsatellites (Muirhead, 2001; Bernatchez and Landry, 2003).

Different genetic markers reflect different demographic histories (Avise, 2004) and also have varying statistical properties that affect the extent to which they can resolve the patterns of divergence (Ryman et al., 2006). For example, it has been shown that organelle markers such as mtDNA may be more powerful for detecting differentiation in recently diverged populations and especially when migration among populations is relatively high (Larsson et al., 2009).

In this study, we present an investigation of spatial and temporal patterns of divergence and statistical power obtained using three traditional genetic markers, allozymes, mtDNA and microsatellites, and two markers potentially influenced by selection, MHC and the microsatellite locus Cpa112. Using the same individuals for all analyses, we compare fish collected in 2002, and again in 2003, from four spawning grounds spanning the environmental gradient from the Baltic Sea to the North Sea.

Materials and methods

Sampling

A total of eight samples of Atlantic herring were collected in 2002 and 2003 at four locations in the North Sea/Baltic Sea region: autumn-spawning herring from (1) Flamborough and (2) Berwick in the North Sea, (3) a spring-spawning population from Tjøme in the Skagerrak and (4) a second spring-spawning population from Kalix in the inner Baltic Sea (Figure 1; Table 1). These samples cover a salinity gradient from the fully marine environment in the North Sea (35‰) to almost fresh water in the Bothnian Bay of the Baltic Sea (3‰). Each sample consisted of ca 100 fish in full spawning condition. Otolith (sagitta) winter rings were counted as a proxy for age following standard procedures (ICES, 2003). The hatching season of individual fish (spring, autumn or winter) was estimated from otolith central area microstructure according to Clausen et al. (2007).

Figure 1

Sampling locations for Atlantic herring collected in 2002 and 2003. Sea surface salinities are indicated.

Table 1 Sample descriptions including locations, coordinates, dates, spawning seasons, percentage of fish in spawning condition and sample sizes (n) of genetic markers used

Genetic analyses

The fish were genotyped using four genetic marker types: microsatellites, allozymes, mtDNA and MHC. One of the microsatellite loci, Cpa112, has previously been demonstrated to be influenced by selection (Gaggiotti et al., 2009; Larsson et al., 2007, 2010). Parts of the microsatellite, allozyme and MHC data have been presented previously (see Table 1), although here we provide full data sets to allow a direct comparison.

DNA for microsatellite, mitochondrial and MHC analyses was extracted from fin clip or muscle tissue using standard kits or the HotSHOT method (Truett et al., 2000). Nine microsatellite loci, Cha1017, Cha1020, Cha1027, Cha1202 (McPherson et al., 2001), Cpa101, Cpa111, Cpa113, Cpa114 and Cpa112 (Olsen et al., 2002), were amplified and genotyped using a BaseStation (MJ Research, Skovlunde, Denmark) and Pharmacia ALF express (Amersham Pharmacia, Uppsala, Sweden) automated sequencers. Scoring consistency among runs and platforms was ensured by analyzing two heterozygote control individuals spanning the anticipated allelic ranges on all gels, in addition to internal and external size ladders.

MtDNA variation was examined by analysis of restriction fragment length polymorphism in the mitochondrial NADH dehydrogenase ND3/4 subunit. A 2.4 kb sequence was PCR amplified using an existing universal primer (Hauser et al., 2001) and a new primer designed from a sardine (Sardinops melanostictus) mtDNA sequence (Inoue et al., 2000) published on GenBank (accession number NC_002616):

Universal vertebrate primer: 5′-TTTTGGTTCCTAAGACCAA(C/T)GGAT-3′

Newly designed primer: 5′-AAGACAGTACAGGTGGCTTCCAA-3′.

The PCR products were digested using six restriction enzymes: AluI, HaeIII, HinfI, MboI, MspI and RsaI, which produced 26 cut sites in Atlantic herring. Restricted fragments were separated on 6% polyacrylamide gels, together with a pGEM size marker (Promega, Southampton, UK), and stained using a standard silver nitrate protocol. Fragments were sized using NEBCUTTER (http://tools.neb.com/NEBcutter2/index.php), and fragment patterns were expected to add up to a total of 2.4 kb if the PCR product had been fully digested; partial digests were discarded.

MHC polymorphism was investigated using the linked microsatellite Clha-DAA-INTR3, located in the third intron of the MHC class II alpha gene Clha-DAA (Stet et al., 2008). PCR amplification of Clha-DAA was performed using standard cycling conditions with annealing at 50 °C. PCR products were analyzed with an ABI 377 automated sequencer and fragment sizes assessed by the GENOTYPER software (Applied Biosystems, Foster City, CA, USA).

Allozyme starch gel electrophoresis was performed using procedures described in Andersson et al. (1981). The following eleven polymorphic loci were scored, (locus abbreviations used in early studies, enzyme names and EC numbers are given in parentheses): mAAT* (AAT-2, aspartate aminotransferase, EC 2.6.1.1), G3PDH* (GPD-1, glycerol-3-phosphate dehydrogenase, EC 1.1.1.8), GPI* (glucosephosphate isomerase, EC 5.3.1.9), mIDHP* (IDH-2, isocitrate dehydrogenase, EC 1.1.1.42), LDH-1*, LDH-2* (lactate dehydrogenase, EC 1.1.1.27), MDH-4* (malate dehydrogenase, EC 1.1.1.37), sMEP* (ME-1, malic enzyme, EC 1.1.1.40), mMEP* (ME-2, malic enzyme, EC 1.1.1.40), PGM-1* (phosphoglucomutase, EC 5.4.2.2, formerly EC 2.7.5.1) and SOD* (superoxide dismutase, EC 1.15.1.1). These loci have previously been shown to be variable in herring from this region (Andersson et al., 1981; Ryman et al., 1984; Larsson et al., 2007).

Statistical analyses

It was not possible to genotype all fish for all marker loci, and the original data set with ca 100 fish from each sample was cropped to include individual fish that were successfully analyzed for at least three of the four genetic marker types (microsatellites incl. Cpa112, the MHC locus Clha-DAA-INTR3, allozymes and mtDNA; Table 1). In total, 743 fish were analyzed, out of which 573 were genotyped for all marker types.

Amounts of genetic variation within herring samples were estimated as allelic richness (AR) and expected heterozygosity (gene diversity, HE). F-statistics were calculated using FSTAT (Goudet, 2001). Statistical significance of FST for allozymes, MHC and microsatellites was examined using Fisher's exact test implemented in GENEPOP 3.4 (Raymond and Rousset, 1995), the χ2-test implemented in CHIFISH 1.3 (Ryman, 2006) and permutation tests using FSTAT (Goudet, 2001); the results were very similar and only P-values obtained using FSTAT (with 10 000 permutations and not assuming Hardy–Weinberg equilibrium) are presented here for brevity. Departure from Hardy–Weinberg equilibrium was tested using 10 000 permutations in FSTAT. We used GDA version 1.1 (Lewis and Zaykin, 2001) to examine linkage disequilibria. ARLEQUIN 3.1 (Excoffier et al., 2005) was used to estimate gene diversity and pairwise FST between samples for the mtDNA haplotype data; statistical significance was obtained from 10 000 permutations. The presence of outlier loci was investigated with the test of Beaumont and Nichols (1996), in which the null distribution was created using FDIST2 (Beaumont, 2002). STATISTICA 6.0 (StatSoft Inc., Uppsala, Sweden) was used to visualize genetic relationships among samples for the different marker types with multidimensional scaling using pairwise FST estimates. Concordance in FST estimates between markers was investigated using correlation of FST and Mantel tests between all sample pairs.
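As a rough illustration of two of the quantities named above, the following Python sketch computes gene diversity (HE) and a simple FST for one locus from allele counts. It uses the uncorrected estimator FST = (HT − HS)/HT with equal sample weights, which is an assumption made for brevity and not necessarily the estimator FSTAT computed for this study; the function names and allele counts are hypothetical.

```python
def allele_freqs(counts):
    """Convert allele counts for one sample into allele frequencies."""
    total = sum(counts)
    return [c / total for c in counts]


def gene_diversity(freqs):
    """Expected heterozygosity: H_E = 1 - sum(p_i^2)."""
    return 1.0 - sum(p * p for p in freqs)


def fst_uncorrected(samples):
    """Uncorrected F_ST = (H_T - H_S) / H_T over equally weighted samples."""
    freqs = [allele_freqs(s) for s in samples]
    n_alleles = len(freqs[0])
    # H_S: mean within-sample gene diversity
    h_s = sum(gene_diversity(f) for f in freqs) / len(freqs)
    # H_T: gene diversity of the mean allele frequencies
    mean_freqs = [sum(f[i] for f in freqs) / len(freqs) for i in range(n_alleles)]
    h_t = gene_diversity(mean_freqs)
    return (h_t - h_s) / h_t


# Hypothetical counts: two alleles, 100 diploid fish (200 gene copies) per sample
sample_a = [120, 80]
sample_b = [100, 100]
fst = fst_uncorrected([sample_a, sample_b])
```

Estimators used in practice additionally correct for sample size, which matters at the low levels of divergence discussed here.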

Temporal stability in differentiation for the different marker types was estimated using hierarchical analysis of molecular variance, with sampling years nested within locations. We used the locus-by-locus option in ARLEQUIN 3.1; statistical significance was obtained from 10 000 permutations. Further, we divided all fish into year classes, based on otolith ageing, and tested for genetic heterogeneity among cohorts with n≥20, within and among locations.

The statistical power for detecting genetic differentiation at various levels of FST was evaluated for the separate markers using the software POWSIM (Ryman and Palm, 2006; Ryman et al., 2006). This analysis simulates sampling from a specified number of populations that have diverged to predefined levels of divergence. We used sample sizes, number of loci and allele frequencies obtained from the eight samples in this study. Finally, we used a modified version of POWSIM to obtain sampling distributions of FST for the different markers; using the observed allele frequencies over all samples, we simulated a drift process in which two large populations diverge until an expected FST of 0.002 is reached (see Table 2). We then sampled these simulated populations at n=100 and calculated FST. This procedure was repeated 10 000 times.
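The link between the simulated drift process and the target FST can be sketched with the standard expectation for isolated populations of constant effective size Ne, FST = 1 − (1 − 1/(2Ne))^t, where t is the number of generations. The Python snippet below solves this for t. The value of Ne is a hypothetical choice for illustration and is not taken from this study.

```python
import math


def expected_fst(ne, t):
    """Expected F_ST after t generations of pure drift at effective size ne."""
    return 1.0 - (1.0 - 1.0 / (2.0 * ne)) ** t


def generations_for_fst(ne, target_fst):
    """Real-valued number of generations of drift needed to reach target_fst."""
    return math.log(1.0 - target_fst) / math.log(1.0 - 1.0 / (2.0 * ne))


ne = 1000  # hypothetical effective population size
t = generations_for_fst(ne, 0.002)  # roughly four generations for this ne
```

Because only the ratio t/Ne matters at such low divergence, many combinations of Ne and t give the same expected FST.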

Table 2 F-statistics (FIS, FST, FIT) for allozyme, microsatellite, MHC and mtDNA loci for unpooled (s=8) and pooled (s=4) temporal herring samples

In all statistical analyses that involve multiple tests, we provide both uncorrected and sequential Bonferroni-corrected significance levels for comparison.
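A minimal sketch of the sequential Bonferroni procedure (Holm's step-down method) is given below, assuming a table-wide significance level of 0.05. The P-values are made up for illustration.

```python
def sequential_bonferroni(p_values, alpha=0.05):
    """Return a list of booleans, True where a test remains significant."""
    k = len(p_values)
    order = sorted(range(k), key=lambda i: p_values[i])
    significant = [False] * k
    for rank, idx in enumerate(order):
        # The (rank + 1)-th smallest P-value is compared with alpha / (k - rank)
        if p_values[idx] <= alpha / (k - rank):
            significant[idx] = True
        else:
            break  # every larger P-value is non-significant as well
    return significant


p_values = [0.001, 0.04, 0.03, 0.012]  # hypothetical P-values
result = sequential_bonferroni(p_values)
```

In this example two of the four tests remain significant, whereas all four would be significant at the uncorrected 0.05 level.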

Results

Sampling

The hatching date determined from otolith microstructure verified that the collected samples represented the presumed seasonal spawning stocks: autumn spawners in the North Sea and spring spawners in the Skagerrak and the Baltic. Between 48 and 84 individuals from each sample were genotyped for all five genetic markers: the three putatively neutral genetic markers (8 microsatellite loci, 11 allozyme loci, 1 mtDNA locus) and the two genetic markers allegedly influenced by selection (the microsatellite Cpa112 and the MHC-embedded Clha-DAA-INTR3).

Within-population descriptive analyses

Allelic richness and expected heterozygosity, HE, for the MHC-embedded microsatellite locus ranged between 8.5–11.6 and 0.70–0.80 among samples, respectively (Supplementary Appendix 1a). These values are similar to those obtained for the other microsatellite loci, including Cpa112, in the same populations. For allozymes the average number of alleles per locus and HE were considerably lower, ranging between 2.1–2.7 and 0.12–0.14 among samples (Supplementary Appendix 1b). In total, we found 237 mtDNA haplotypes among the 660 analyzed fish. Accordingly, gene (haplotype) diversity within samples was high, ranging between 0.93 and 0.98 among samples (Supplementary Appendix 1a).

Deficiency of heterozygotes relative to Hardy–Weinberg expectations was found in one locus-sample combination for the allozymes, and in 12 out of 64 for the presumed neutral microsatellites; there was no sample- or locus-specific pattern, and only one persisted after Bonferroni correction. One sample showed heterozygote deficiency at Cpa112, whereas no deviations were found at the MHC locus (Supplementary Appendix 1a). Combining the information from all localities and/or samples shows that microsatellites displayed significantly high overall FIS values at four loci (Table 2); however, separate statistically significant locus-sample combinations appeared evenly distributed throughout the locus-sample table (Supplementary Appendix 1a). There was little indication of linkage disequilibria between loci. Out of a total of 1848 pairwise locus combinations (22 loci) within samples, 57 showed significant deviation from linkage equilibrium (ca 3% of the tests). Only one remained significant after Bonferroni correction.
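The number of linkage tests follows from simple counting, and comparing the 57 significant tests with the number expected by chance puts that result in context. The sketch below assumes a nominal significance level of 0.05 per test, which the text does not state explicitly.

```python
from math import comb

n_loci = 22
n_samples = 8

pairs_per_sample = comb(n_loci, 2)          # 231 locus pairs per sample
total_tests = pairs_per_sample * n_samples  # 1848 tests in total

observed_significant = 57
observed_fraction = observed_significant / total_tests  # about 0.031

# Assumption: alpha = 0.05 per test
expected_by_chance = 0.05 * total_tests
```

Under this assumption, about 92 tests would be expected to be significant by chance alone, so 57 significant tests give no indication of excess linkage disequilibrium.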

Outlier test

The microsatellite locus Cpa112 was identified as an outlier (Figure 2), indicating that this locus is influenced by directional selection. While within the 95% confidence limits of the simulated FST distribution, the allozyme locus SOD* had a conspicuously large FST value compared with the other allozyme loci (Table 2; Figure 2). There was no indication of selection affecting the MHC locus.

Figure 2

FST for individual microsatellite, allozyme and MHC loci plotted against heterozygosity. Lines denote the 2.5, 50 and 97.5 percentiles of simulation-based expected distributions of FST assuming an infinite allele or stepwise mutation model, respectively. The simulations were based on the combined overall FST=0.002 for all three marker types.

Statistical power and expected FST

Before comparing statistical power, we investigated the expected distribution of FST for all markers in this case under the assumption of selective neutrality. Two simulated populations differentiated at FST=0.002 were sampled repeatedly at n=100. The various markers had similar mean FST values, close to 0.002, but the medians and variances differed (Figure 3). The FST distributions were markedly skewed to the right, except for the eight microsatellites, in which the distribution was more symmetric and had a smaller variance (Figure 3). This pattern is also seen in our empirical data in which the average and variance were 0.0022 and 24.5 × 10^−6 for the 11 allozyme loci and 0.0026 and 7.4 × 10^−6 for the eight microsatellite loci (see Table 2; also see Larsson et al., 2007).

Figure 3

Distributions of FST for various marker types based on 10 000 simulated samples of 100 individuals from two populations with a true FST of 0.002. The sample parameters mean, median and variance (× 10^−6) are for microsatellites (eight loci): 0.0020, 0.0019, 1.4; allozymes: 0.0019, 0.0009, 19; mtDNA: 0.0020, 0.0014, 9.3; MHC: 0.0021, 0.0008, 25; Cpa112: 0.0020, 0.0014, 10.

Analysis of statistical power indicated that all marker sets will detect FST=0.0050 or larger with a probability close to one, for present sample sizes (Table 3). At lower levels of divergence, eight microsatellites are more powerful for detecting structuring than eleven allozymes, mtDNA, MHC and Cpa112. The individual microsatellite loci have, however, statistical power comparable to the latter markers (Table 3).

Table 3 Statistical power of different marker types for detecting various true levels of population differentiation (FST) by means of Fisher's exact test when using present sample sizes and allele frequencies obtained from the eight samples in this study

The locus Cpa112 showed strong divergence and had an FST more than ten times higher than the other loci (Table 2). This raises the question: How much smaller could sample sizes be using Cpa112 and still have the same power as several neutral loci to detect divergence? Using POWSIM, we estimated power at Cpa112 for different sample sizes, and we show that a sample size reduced to n≈20 would be sufficient to detect FST=0.025 (Table 4).
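The logic of such a power analysis can be sketched in Python: let two populations drift apart, sample n fish from each, test for a frequency difference, and record how often the test is significant. This is a simplified stand-in and not POWSIM itself. It assumes a single biallelic locus (Cpa112 is multiallelic), a chi-square test in place of Fisher's exact test, and hypothetical values for the starting allele frequency, effective size and number of generations, chosen so that the expected FST is about 0.025.

```python
import random

CHI2_CRIT_DF1 = 3.841  # 5% critical value of chi-square with 1 d.f.


def binomial(rng, n, p):
    """Number of successes in n Bernoulli(p) draws."""
    return sum(rng.random() < p for _ in range(n))


def drift(rng, p, ne, generations):
    """Wright-Fisher drift of an allele frequency over a number of generations."""
    for _ in range(generations):
        p = binomial(rng, 2 * ne, p) / (2 * ne)
    return p


def chi2_2x2(a, b, c, d):
    """Pearson chi-square for the table [[a, b], [c, d]]; 0 if a margin is empty."""
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    return 0.0 if denom == 0 else n * (a * d - b * c) ** 2 / denom


def power(n_fish, p0=0.5, ne=200, generations=10, reps=200, seed=1):
    """Fraction of replicates in which two drifted populations differ significantly."""
    rng = random.Random(seed)
    copies = 2 * n_fish  # gene copies sampled per population
    hits = 0
    for _ in range(reps):
        x1 = binomial(rng, copies, drift(rng, p0, ne, generations))
        x2 = binomial(rng, copies, drift(rng, p0, ne, generations))
        if chi2_2x2(x1, copies - x1, x2, copies - x2) > CHI2_CRIT_DF1:
            hits += 1
    return hits / reps


power_small = power(20)    # n = 20 fish per population
power_large = power(100)   # n = 100 fish per population
```

Repeating the calculation over a range of sample sizes and divergence levels would give a table with the same layout as Table 4, although the values would differ because of the simplifications listed above.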

Table 4 Statistical power of Cpa112 for detecting various true levels of population differentiation (FST) by means of Fisher's exact test when using allele frequencies obtained from the eight samples in this study and different sample sizes (n)

Population differentiation

The patterns of differentiation among the herring populations using the different marker types were explored in three different ways: using individual samples (s=8), pooling temporal samples from the same locality (s=4), and using year-class cohorts at sampling locations as the basic unit (s=10; Supplementary Appendix 3). Overall FST (s=8) was 0.0024 for microsatellites, 0.0011 for allozymes, 0.0145 for mtDNA, 0.0027 for MHC and 0.0360 for Cpa112 (Table 2). FST over all nuclear marker types (that is, excluding mtDNA) was 0.0052. Pooling the temporal samples within locations gave a similar estimate of FST for microsatellites, whereas the estimates decreased for mtDNA and allozymes, and increased for MHC and Cpa112 (Table 2).

Spatial and temporal patterns

The microsatellite markers showed significant differentiation between the three regions, the North Sea, the Skagerrak and the Baltic (Figure 4, Table 2, Supplementary Appendix 2). At both Kalix and Tjøme, samples were genetically similar between years. In contrast, the four North Sea samples clustered by year of sampling. The cohort analysis indicates that this could be a year-class effect; younger fish from Berwick and Flamborough are genetically more similar to each other than they are to older fish from the same locations (Supplementary Appendices 3, 4). The locus Cpa112 showed a similar pattern, but with a more pronounced divergence for the Baltic samples. Pairwise comparisons involving the Baltic had FST values one order of magnitude higher than comparisons outside the Baltic. Larsson et al. (2007) showed that the differentiation in Cpa112 is due to frequency differences especially for one allele, Cpa112(306). In this study, this allele had average frequencies of 38.6% at Kalix, 10.5% at Tjøme and 3.4% among the North Sea samples (see Figure 5). This pattern corroborates that Cpa112 is likely influenced by selection. The allozymes showed little divergence: 4 out of 11 loci had negative overall FST, and 16 out of 28 pairwise comparisons also had a negative FST estimate. As an exception, the herring sample collected in Tjøme in 2002 was highly divergent from other samples at the SOD* locus (Figure 4, Supplementary Appendix 2). Owing to this rare pattern, we repeated the genotyping of this specific sample, with an identical result. Notwithstanding this specific sample, and despite the many negative FST values, the North Sea samples clearly grouped together (Figure 4). Similar to the allozymes, the mtDNA data included a single highly divergent sample (Figure 4, Supplementary Appendix 2): the Kalix 2003 sample had one haplotype occurring at a frequency of 20% compared with around 2% in all other samples. In addition, mtDNA displayed a number of statistically significant pairwise differences. Finally, MHC showed a more complex pattern. Some pairwise comparisons were negative and some had relatively high point estimates of FST of around 1%, without any obvious geographical pattern (Figure 4, Supplementary Appendix 2).

Figure 4

Multidimensional scaling plots of herring samples based on pairwise FST estimated from different genetic markers.

Figure 5

Frequency of the allele 306 in the locus Cpa112 in samples of herring collected in 2002 and 2003 along the salinity gradient from the North Sea to the inner part of the Baltic Sea. Genetic data are compiled from Mariani et al. (2005), Bekkevold et al. (2005), Larsson et al. (2010) and André C (unpublished; sample collected in 2002 in Idefjord, a sheltered Skagerrak fjord with low salinity, see Figure 1). Salinities were obtained as in Bekkevold et al. (2005) or from the Swedish Meteorological and Hydrological Institute.

The hierarchical analysis of molecular variance (Table 5) indicated spatial differentiation among basins for microsatellites and Cpa112. MtDNA, on the other hand, showed a strong temporal variation, presumably masking any spatial pattern (see Table 2, s=4).

Table 5 Hierarchical analysis of genetic differences (AMOVA) among four sampling locations (see Figure 1) and between temporal samples (2002 and 2003) of Atlantic herring for different genetic markers

Marker correlations

Across samples, there was strikingly little concordance in sample pairwise FST between marker types (R=0.000–0.005; P>0.05), except between Cpa112 and microsatellites (R=0.70; P<0.001), and Cpa112 and mtDNA (R=0.44; P=0.021).
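For readers unfamiliar with the Mantel test used for these comparisons, the following Python sketch shows the idea: the correlation between the pairwise FST values of two marker types is compared with correlations obtained after randomly permuting the sample labels of one matrix. The implementation, the one-sided test and the small 4 × 4 example matrices are illustrative assumptions; the study compared all 28 sample pairs among eight samples, and the software used for the test is not stated.

```python
import random


def pearson(x, y):
    """Pearson correlation coefficient between two equally long lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def upper_triangle(m, order):
    """Upper-triangle values of symmetric matrix m with samples reordered."""
    n = len(order)
    return [m[order[i]][order[j]] for i in range(n) for j in range(i + 1, n)]


def mantel(m1, m2, n_perm=999, seed=1):
    """One-sided Mantel test; returns (r, P) from permuting sample labels of m2."""
    rng = random.Random(seed)
    identity = list(range(len(m1)))
    x = upper_triangle(m1, identity)
    r_obs = pearson(x, upper_triangle(m2, identity))
    hits = 1  # the observed arrangement counts as one permutation
    for _ in range(n_perm):
        perm = identity[:]
        rng.shuffle(perm)
        if pearson(x, upper_triangle(m2, perm)) >= r_obs:
            hits += 1
    return r_obs, hits / (n_perm + 1)


# Hypothetical pairwise F_ST matrices for four samples and two marker types
fst_marker_a = [
    [0.000, 0.002, 0.004, 0.030],
    [0.002, 0.000, 0.003, 0.028],
    [0.004, 0.003, 0.000, 0.025],
    [0.030, 0.028, 0.025, 0.000],
]
fst_marker_b = [[2 * v for v in row] for row in fst_marker_a]

r, p = mantel(fst_marker_a, fst_marker_b)
```

Permuting whole samples rather than individual matrix cells respects the non-independence of pairwise values that share a sample, which is why a Mantel test is preferred over an ordinary test of correlation here.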

Discussion

Using the same fish for all analyses, thus excluding year-class effects, we show that levels of population differentiation in herring were low for the presumed neutral genetic markers (allozymes, mtDNA and microsatellites), in agreement with earlier studies from the same region (Ryman et al., 1984; Hauser et al., 2001; Bekkevold et al., 2005; Jørgensen et al., 2005; Mariani et al., 2005; Larsson et al., 2007). For the putatively selected markers, the microsatellite locus Clha-DAA-INTR3 embedded in the MHC gene showed characteristics similar to those of the neutral microsatellite loci, and there was no indication of selection influencing this locus. Cpa112 on the other hand was highly divergent for Baltic samples, indicating strong influence by selection as reported in previous studies (Gaggiotti et al., 2009; Larsson et al., 2007, 2010). The higher level of divergence in Cpa112 makes this locus efficient at detecting even weak overall population structure in Atlantic herring.

Statistical power of neutral vs selected markers

All genetic markers, including Cpa112, displayed high statistical power for detecting low levels of genetic differentiation in simulations; the applied set of eight neutral microsatellites could even detect FST as low as 0.0010 with a power of 95% at n≈100. Statistical power is related to the distribution of expected FST under neutrality, and the smaller variance of the eight microsatellites explains their higher power (Figure 3). In turn, this is likely an effect of the larger total number of alleles (Kalinowski, 2002; Ryman et al., 2006; Larsson et al., 2007). The eight microsatellite loci had 192 alleles in total, whereas the eleven allozyme loci had 38 alleles. Interestingly, the median microsatellite locus harboring 26 alleles had a power comparable to the allozymes (Table 3).

These evaluations were made under the assumption of selective neutrality; if selection is taken into account, expectations change. For Baltic vs North Sea herring, Cpa112 displayed a 10–20 times higher overall FST compared with the eight neutral microsatellite loci (Table 2), and this higher level of differentiation increases the statistical power. Consequently, a substantial reduction in sample size is possible when using a selected locus vs neutral markers (Table 4), in agreement with predictions by Schmidt et al. (2008). However, it should be noted that these estimates relate to FST over all eight samples, and that power varied with specific comparisons, that is, North Sea vs Skagerrak comparisons were less powerful, and Baltic vs Skagerrak or North Sea comparisons were even more powerful than the estimates above (see Supplementary Appendix 2 for detailed pairwise comparisons of FST).

Several pairwise comparisons using mtDNA were statistically significant (Supplementary Appendix 2), indicating that mtDNA can be a powerful marker, even at low levels of nuclear divergence (Hoarau et al., 2004; Larsson et al., 2009).

Spatial and temporal patterns of differentiation

Neutral markers

The microsatellite results support the existence of a barrier to gene flow between Baltic and Skagerrak/North Sea populations, which has been shown earlier for herring (Bekkevold et al., 2005; Larsson et al., 2007), as well as for several other species (Johannesson and André, 2006; Limborg et al., 2009). Using the microsatellite data for the same samples collected at Berwick and Tjøme as in this study, as well as several other samples, Ruzzante et al. (2006) showed that herring collected at Rügen in the southwestern Baltic are clearly distinct from the populations in the North Sea and Skagerrak. In this study, the differentiation estimated with neutral microsatellites was stable over two subsequent years (Figure 4). The allozyme data for the 2003 herring showed a concordant spatial pattern of differentiation, as was also reported by Larsson et al. (2007). In 2002, however, the sample collected at Tjøme was highly divergent at the SOD* locus. Owing to the low overall level of divergence, a single odd genotype at one locus had a disproportionate impact on the FST estimate. A similar case is shown in the mtDNA data in which one specific haplotype had an elevated frequency in the Kalix 2003 sample (Figure 4). Owing to the low levels of differentiation typical for marine fish, a slightly different set of samples, or loci, may yield different patterns of population structure (cf. Waples, 1998; Allendorf and Seeb, 2000; Hauser and Carvalho, 2008). Such variance is particularly pronounced in genetic markers with low heterozygosities and a high expected range of FST, such as allozymes (Figure 3; see also Fauvelot et al., 2007). Therefore, careful sampling including temporal replication is warranted, especially in species that undertake ontogenetic shifts in habitat use, such as herring and cod, in which the inclusion of juvenile fish may bias patterns of spatial genetic differentiation (Nielsen et al., 2009b).

Markers influenced by selection

The MHC-embedded microsatellite locus, Clha-DAA-INTR3, showed patterns of allelic richness and divergence similar to those of the presumed neutral microsatellite loci (Table 2; Supplementary Appendix 1a), and there was no indication of either directional or balancing selection acting on Clha-DAA-INTR3 (Figure 2). The locus is situated in the third intron of the MH IIA gene, and the extent to which the variability in the microsatellite reflects exonic variability is unknown (Stet et al., 2008). Previous attempts to estimate the variability in MHC genes using embedded micro- or minisatellites in salmonids have yielded varying results (Stet et al., 2002; Grimholt et al., 2002, 2003; Hansen et al., 2007; de Eyto et al., 2007). de Eyto et al. (2007) found a one-to-one relationship between minisatellite and MH class II allelic variability; however, for MH class I, a specific microsatellite allele could be found in several different MH I alleles. Wynne et al. (2007) showed that, although alleles of a microsatellite located in the untranslated region of the salmon MH II gene were linked to several different MH II sequences, the microsatellite diversity was correlated to disease susceptibility, and thus acted as a good proxy for actual MH class II polymorphism. In a recent study, Tonteri et al. (2010) demonstrated that 18 microsatellite loci linked to immune-relevant genes show a stronger signal of selection than 76 loci without such association.

Baltic herring are highly divergent at the microsatellite locus Cpa112. Larsson et al. (2010) showed this for herring collected in the 1980s, and it is also evident for herring collected in 2002 and 2003 (Figure 4; Supplementary Appendix 2). Skagerrak and North Sea samples are also divergent at this locus, but to a substantially lesser degree. Presently, there is no information on any gene associated with Cpa112, but the lower salinity of the Baltic Sea is an obvious candidate as selective agent (Figure 5), although other environmental factors that covary with salinity may also have a role. This is also supported by preliminary data indicating a high frequency of the Cpa112(306) allele in the low-saline Idefjord in NE Skagerrak (Figure 5).

Conclusion

Neutral markers have traditionally been used to estimate population structure and also migration rates. A decade ago, Whitlock and McCauley (1999) pointed out that indirect estimates of migration rates using gene frequency data often tend to be flawed because underlying assumptions are violated. Recently, new ways of estimating migration, m, independently of FST that do not rely on equilibrium assumptions have been developed (Beerli and Felsenstein, 2001), but most investigations of population structure per se still use presumed neutral markers, most likely because it is believed that selection may bias FST. However, as long as selection pressure is temporally stable (decades–centuries), genetic markers influenced by selection, such as Cpa112, can be of use in detecting population structure on ecological time scales, where neutral loci have not yet diverged substantially. Selected loci may be particularly useful in genetic stock identification based on individual assignment and as markers for population origin in mixed-stock analysis (see Nielsen et al., 2009a, p. 3136), for example, MHC in Pacific sockeye salmon (Beacham et al., 2005), the pantophysin locus in mixed stocks of coastal and oceanic cod in Lofoten, northern Norway (Wennevik et al., 2008) and Cpa112 in mixed-stock analysis of herring in the Skagerrak (Ruzzante et al., 2006). It may be preferable to use selection-influenced genetic markers with known function (Hoffmann and Willi, 2008), or with at least some knowledge of the nature of the selection pressure, for example, hemoglobin (Andersen et al., 2009), genes involved in osmoregulation (Larsen et al., 2008) and depth adaptation (Árnason et al., 2009). With recent advances in molecular technologies, it is anticipated that such information will become increasingly available. Moreover, loci under selection may allow for an estimation of the adaptive value of subpopulations, useful for conservation priorities (Bonin et al., 2007; Gebremedhin et al., 2009).