Trends in Ecology & Evolution
Adaptation and speciation: what can Fst tell us?
Introduction
It is becoming increasingly cheap and fast to survey samples genetically from model and non-model organisms at a large number of loci across their genomes, an advance that derives largely from the activities of the biomedical community [1]. Here, I reappraise an old idea [2] for the analysis of such data. It was first proposed when multi-locus surveys were scarce and difficult to obtain, was later discredited [3,4] and abandoned, and now, fuelled by the plentiful supply of these data, occasionally peeps apologetically out of research articles, smothered in caveats [5].
The ready availability of different classes of gene frequency information has rekindled an interest in natural selection and the development of a variety of methods for use in trying to infer the presence and mode of selection. Three main approaches can be identified [6]: (i) detailed modelling of selection at individual loci or sequences; (ii) multilocus comparisons, of which the Lewontin–Krakauer Method (see Glossary [2]), discussed here, is the oldest; and (iii) comparison of patterns of substitution among synonymous and non-synonymous sites. Analyses of the advantages and drawbacks of these different approaches are detailed by Nielsen [6]. Much of the research has been driven by the biomedical community with an aim to identify and characterize biochemical function and the phenotypic effect of natural variation throughout the human genome, often based on comparative analysis [7]. Judgements about the efficacies of different methods implicitly tend to have these, ultimately medical, goals in mind.
In evolutionary biology there are, by contrast, several interesting hypotheses that can be tested by characterizing the number, position and fitness effects of genomic regions that show apparent adaptive divergence in allele frequency, without the need to delve into the physiological details. I argue that, for most organisms, the easiest way to achieve this is by using the Lewontin–Krakauer method, which appears, at least in recent versions, to be generally robust to the vagaries of demographic history. I invoke recent theoretical results that suggest why this is not so surprising and discuss possible sampling strategies that might maximize the power of the approach. I also outline areas of application, particularly the study of adaptive divergence in the face of gene flow and modes of speciation.
Section snippets
Inbreeding coefficients and the identification of loci subject to selection
The study of selection, particularly local adaptation, at the genetic level has a long history (usefully reviewed in [8]). Examples include the study of local crypsis in response to bird predation in the snail Cepaea nemoralis [9] and the peppered moth Biston betularia [10]. The genetics of adaptation was largely eclipsed during the 1980s and 1990s by an interest in the possibility of recovering the historical demography of populations through an analysis of genetic variation, in particular, of
Alternative methods
The more recent studies have not invalidated the original criticisms of the Lewontin–Krakauer test [3,4,17–19], but, instead, suggest that they are often not applicable to the real world. The criticisms can be re-expressed as violations of the separation-of-timescales approximation. Potentially problematic are high mutation rate loci, such as microsatellites, in which mutations occur in the scattering phase [32]. Another problem arises when the gene frequency in the collecting phase is not
Design of surveys
How should surveys be designed to maximize the chances of picking up loci that are subject to selection? The approximation given by Lewontin and Krakauer (Box 1) can be used to obtain some idea of the expected variability in estimates of Fst among loci. This shows that there is substantial variability and skew in estimates of Fst when biallelic markers are surveyed in only a pair of populations, and hence there is potentially little power to detect outlier loci unless selection is strong
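The effect of the number of populations on the spread of Fst among loci can be illustrated with a small simulation. The sketch below is not the calculation in Box 1 (which is not reproduced in this excerpt). It assumes that allele frequencies in each population follow the beta distribution expected under an equilibrium island model, ignores sampling error within populations, and compares the simulated variance with the commonly quoted form of the Lewontin–Krakauer approximation, Var(Fst) ≈ 2·mean(Fst)²/(n − 1) for n populations. The function names, the parameter values and the uniform range for the ancestral allele frequency are all illustrative assumptions.

```python
import random
import statistics


def simulate_fst(n_pops, true_fst, n_loci, rng):
    """Return per-locus Fst estimates for neutral biallelic loci.

    Assumption: population allele frequencies are drawn from the beta
    distribution expected at equilibrium under an island model with
    parameter true_fst. Sampling error within populations is ignored.
    """
    scale = (1.0 - true_fst) / true_fst
    estimates = []
    while len(estimates) < n_loci:
        p_anc = rng.uniform(0.2, 0.8)  # hypothetical migrant-pool frequency
        freqs = [rng.betavariate(p_anc * scale, (1.0 - p_anc) * scale)
                 for _ in range(n_pops)]
        p_bar = statistics.fmean(freqs)
        if p_bar <= 0.0 or p_bar >= 1.0:
            continue  # locus is monomorphic overall, so Fst is undefined
        var_p = statistics.variance(freqs)  # among-population variance (n - 1)
        estimates.append(var_p / (p_bar * (1.0 - p_bar)))
    return estimates


def summarize(estimates, n_pops):
    """Mean, variance and skewness of Fst among loci, plus the
    Lewontin-Krakauer-style variance 2 * mean^2 / (n_pops - 1)."""
    mean = statistics.fmean(estimates)
    variance = statistics.pvariance(estimates)
    sd = variance ** 0.5
    skew = statistics.fmean([((x - mean) / sd) ** 3 for x in estimates])
    return {
        "mean": mean,
        "variance": variance,
        "skew": skew,
        "lk_variance": 2.0 * mean ** 2 / (n_pops - 1),
    }


def compare_designs(true_fst=0.1, n_loci=4000, seed=1):
    """Print summaries for a two-population and a twenty-population survey."""
    rng = random.Random(seed)
    for n_pops in (2, 20):
        stats = summarize(simulate_fst(n_pops, true_fst, n_loci, rng), n_pops)
        print(n_pops, {key: round(value, 4) for key, value in stats.items()})


if __name__ == "__main__":
    compare_designs()
```

Under these assumptions, the two-population design should show a much larger variance and a stronger right skew among neutral loci than the twenty-population design, which is why a locus with a high Fst is harder to distinguish from the neutral background when only a pair of populations is surveyed. The agreement between the simulated variance and the approximation depends on the assumed model and should not be read as a validation of either.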
Example applications
There has been an increasing number of studies that aim to identify loci subject to selection [46,47], often using the distribution of Fst among loci. An illustrative case stems from the work of Pogson and colleagues [48]. They studied a mixture of RFLPs and allozymes in populations of Atlantic cod Gadus morhua. On the basis of tests closely related to those of Lewontin and Krakauer [2], they showed that the mean Fst in allozymes was lower than that of RFLPs, and suggested that the allozymes
Testing models of adaptation and speciation
Once interesting genomic regions have been identified through an analysis of Fst, or related method, how might the information be used? In conjunction with demographic information, such as immigration rate, it might be possible to quantify the distribution of fitness effects that are necessary to lead to the observed distribution of estimates of Fst, and thereby test evolutionary hypotheses [47]. Although predictions of phenotypic evolution are often still based on the infinitesimal model [53],
Conclusions
The abandonment of Lewontin and Krakauer's idea could be regarded as a major advance in the wider acceptance of the usefulness of the neutral theory, and the importance of demographic events in shaping gene frequency data. This can then be seen as leading directly to the significant research programme of the 1980s and 1990s, that of trying to recover population history from genetic information. Ironically, however, the reinstatement of Lewontin and Krakauer's ideas depends largely on a
Acknowledgements
I thank Jay Storz, Renaud Vitalis and three anonymous referees for their useful comments on previous versions of the article. This work was supported by an Advanced Fellowship from the Natural Environment Research Council.
Glossary
- AFLP: amplified fragment length polymorphism. A way of assaying nucleotide sites for polymorphisms, typically resulting in a dominant marker system. A relatively inexpensive way of obtaining many markers.
- Ascertainment bias: bias in demographic inferences owing to the use of (typically) low mutation rate markers, such as SNPs, that have been previously identified in earlier smaller scale studies. The SNPs so identified will form a biased subset, with alleles at intermediate frequencies (otherwise
References (70)
- et al. Drift variance of Fst and Gst statistics obtained from a finite number of isolated populations. Theor. Popul. Biol. (1977)
- Mean and variance of Fst in a finite number of incompletely isolated populations. Theor. Popul. Biol. (1977)
- The coalescent in an island model of population subdivision with variation among demes. Theor. Popul. Biol. (2001)
- et al. DNA profile match probability calculations: how to allow for population stratification, relatedness, database selection and single bands. For. Sci. Int. (1994)
- Likelihood-based inference for genetic correlation coefficients. Theor. Popul. Biol. (2003)
- Towards a molecular characterization of adaptation in local populations. Curr. Opin. Genet. Dev. (2002)
- Hitchhiking mapping – functional genomics from the population genetics perspective. Trends Genet. (2003)
- How to detect polymorphisms undergoing selection in marine fishes? A review of methods and case studies, including flatfishes. J. Sea Res. (2004)
- Chromosomal rearrangements and speciation. Trends Ecol. Evol. (2001)
- The International HapMap Project. Nature (2003)
- Distribution of gene frequency as a test of the theory of the selective neutrality of polymorphisms. Genetics
- Lewontin–Krakauer test for neutral genes. Genetics
- Remarks on the Lewontin–Krakauer test. Genetics
- Interrogating a high-density SNP map for signatures of natural selection. Genome Res.
- Statistical tests of selective neutrality in the age of genomics. Heredity
- Inferring nonneutral evolution from human–chimp–mouse orthologous gene trios. Science
- Ecological Genetics
- Selection in the polymorphic land snail Cepaea nemoralis. Heredity
- A survey of the frequencies of Biston betularia (L.) (Lep.) and its melanic forms in Great Britain. Heredity
- Molecular Markers, Natural History, and Evolution
- Population structure and human evolution. Proc. R. Soc. Lond. B Biol. Sci.
- Estimating F-statistics for the analysis of population structure. Evolution
- Interpretation of variation across marker loci as evidence of selection. Genetics
- Estimating F-statistics. Annu. Rev. Genet.
- Intra-tribal genetic differentiation among the Yanomama Indians of Southern Venezuela. Proc. Natl. Acad. Sci. U. S. A.
- Gene frequency distribution as a test of selective neutrality. Genetics
- Drift, admixture, and selection in human-evolution – a study with DNA polymorphisms. Proc. Natl. Acad. Sci. U. S. A.
- Evaluating loci for use in the genetic analysis of population structure. Proc. R. Soc. Lond. B Biol. Sci.
- Structured coalescent processes on different time scales. Genetics
- Nonequilibrium migration in human history. Genetics
- Gene genealogies in a metapopulation. Genetics
- A separation-of-timescales approach to the coalescent in a continuous population. Genetics
- Estimating gene flow in island populations. Genet. Res.
- A test for deviation from island-model population structure. Mol. Ecol.
- Identifying adaptive genetic divergence among populations from genome scans. Mol. Ecol.