Adaptation and speciation: what can Fst tell us?

https://doi.org/10.1016/j.tree.2005.05.017

A useful way of summarizing genetic variability among different populations is through estimates of the inbreeding coefficient, Fst. Several recent studies have tried to use the distribution of estimates of Fst from individual genetic loci to detect the effects of natural selection. However, the promise of this approach has yet to be fully realized owing to the pervasive dogma that this distribution is highly dependent on demographic history. Here, I review recent theoretical results that indicate that the distribution of estimates of Fst is generally expected to be robust to the vagaries of demographic history. I suggest that analyses based on it provide a useful first step for identifying candidate genes that might be under selection, and explore the ways in which this information can be used in ecological and evolutionary studies.

Introduction

It is becoming increasingly cheap and fast to survey samples genetically from model and non-model organisms for a large number of loci across their genomes, an advance that derives largely from the activities of the biomedical community [1]. Here, I reappraise an old idea [2] for the analysis of such data. It was first proposed when multi-locus surveys were scarce and difficult to obtain, was later discredited [3, 4] and abandoned, and now, fuelled by the plentiful supply of these data, occasionally peeps apologetically out of research articles, smothered in caveats [5].

The ready availability of different classes of gene frequency information has rekindled an interest in natural selection and the development of a variety of methods for inferring the presence and mode of selection. Three main approaches can be identified [6]: (i) detailed modelling of selection at individual loci or sequences; (ii) multilocus comparisons, of which the Lewontin–Krakauer method (see Glossary) [2], discussed here, is the oldest; and (iii) comparison of patterns of substitution among synonymous and non-synonymous sites. Analyses of the advantages and drawbacks of these different approaches are detailed by Nielsen [6]. Much of the research has been driven by the biomedical community with the aim of identifying and characterizing biochemical function and the phenotypic effect of natural variation throughout the human genome, often based on comparative analysis [7]. Judgements about the efficacies of different methods implicitly tend to have these, ultimately medical, goals in mind.

In evolutionary biology there are, by contrast, several interesting hypotheses that can be tested by characterizing the number, position and fitness effects of genomic regions that show apparent adaptive divergence in allele frequency, without the need to delve into the physiological details. I argue that, for most organisms, the easiest way to achieve this is by using the Lewontin–Krakauer method, which appears, at least in recent versions, to be generally robust to the vagaries of demographic history. I invoke recent theoretical results that suggest why this is not so surprising and discuss possible sampling strategies that might maximize the power of the approach. I also outline areas of application, particularly the study of adaptive divergence in the face of gene flow and modes of speciation.

Inbreeding coefficients and the identification of loci subject to selection

The study of selection, particularly local adaptation, at the genetic level has a long history (usefully reviewed in [8]). Examples include the study of local crypsis in response to bird predation in the snail Cepaea nemoralis [9] and the peppered moth Biston betularia [10]. The genetics of adaptation was largely eclipsed during the 1980s and 1990s by an interest in the possibility of recovering the historical demography of populations through an analysis of genetic variation, in particular, of

Alternative methods

The more recent studies have not invalidated the original criticisms of the Lewontin–Krakauer test [3, 4, 17, 18, 19], but, instead, suggest that they are often not applicable to the real world. The criticisms can be re-expressed as violations of the separation-of-timescales approximation. Potentially problematic are high mutation rate loci, such as microsatellites, in which mutations occur in the scattering phase [32]. Another problem arises when the gene frequency in the collecting phase is not

Design of surveys

How should surveys be designed to maximize the chances of picking up loci that are subject to selection? The approximation given by Lewontin and Krakauer (Box 1) can be used to obtain some idea of the expected variability in estimates of Fst among loci. This shows that there is substantial variability and skew in estimates of Fst when biallelic markers are surveyed in only a pair of populations, and hence there is potentially little power to detect outlier loci unless selection is strong
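As a rough illustration of why a pair of populations gives so little power, the following minimal sketch uses the commonly quoted form of the Lewontin–Krakauer approximation, in which the variance of Fst among neutral loci is about k × (mean Fst)² / (n − 1) for n populations, with k = 2. Box 1 is not reproduced here, so this form, the chi-square sampling model and all function names are assumptions made for illustration only.

```python
import random
import statistics


def lk_variance(mean_fst, n_pops, k=2.0):
    """Approximate among-locus variance of Fst: k * mean_fst**2 / (n_pops - 1)."""
    return k * mean_fst ** 2 / (n_pops - 1)


def simulate_fst(mean_fst, n_pops, n_loci, rng):
    """Draw per-locus Fst values, assuming (n - 1) * Fst / mean_fst is chi-square(n - 1).

    This is only an approximation: draws are not bounded above by 1.
    """
    df = n_pops - 1
    # A chi-square variable with df degrees of freedom is Gamma(shape=df/2, scale=2).
    return [mean_fst * rng.gammavariate(df / 2.0, 2.0) / df for _ in range(n_loci)]


def upper_quantile_ratio(mean_fst, n_pops, q=0.99, n_loci=20000, seed=1):
    """Ratio of the q-th quantile of simulated neutral Fst to the mean Fst."""
    rng = random.Random(seed)
    values = sorted(simulate_fst(mean_fst, n_pops, n_loci, rng))
    return values[int(q * (n_loci - 1))] / mean_fst


if __name__ == "__main__":
    for n in (2, 5, 20):
        print(n, lk_variance(0.1, n), round(upper_quantile_ratio(0.1, n), 2))
```

Under these assumptions, the upper tail of the neutral distribution for two populations extends to several times the mean Fst, so a locus must be very strongly differentiated before it stands out. Sampling more populations narrows the neutral distribution.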

Example applications

There has been an increasing number of studies that aim to identify loci subject to selection [46, 47], often using the distribution of Fst among loci. An illustrative case stems from the work of Pogson and colleagues [48]. They studied a mixture of RFLPs and allozymes in populations of Atlantic cod Gadus morhua. On the basis of tests closely related to those of Lewontin and Krakauer [2], they showed that the mean Fst in allozymes was lower than that of RFLPs, and suggested that the allozymes

Testing models of adaptation and speciation

Once interesting genomic regions have been identified through an analysis of Fst, or a related method, how might the information be used? In conjunction with demographic information, such as immigration rate, it might be possible to quantify the distribution of fitness effects that are necessary to lead to the observed distribution of estimates of Fst, and thereby test evolutionary hypotheses [47]. Although predictions of phenotypic evolution are often still based on the infinitesimal model [53],

Conclusions

The abandonment of Lewontin and Krakauer's idea could be regarded as a major advance in the wider acceptance of the usefulness of the neutral theory, and the importance of demographic events in shaping gene frequency data. This can then be seen as leading directly to the significant research programme of the 1980s and 1990s, that of trying to recover population history from genetic information. Ironically, however, the reinstatement of Lewontin and Krakauer's ideas depends largely on a

Acknowledgements

I thank Jay Storz, Renaud Vitalis and three anonymous referees for their useful comments on previous versions of the article. This work was supported by an Advanced Fellowship from the Natural Environment Research Council.

Glossary

AFLP:
amplified fragment length polymorphism. A way of assaying nucleotide sites for polymorphisms, typically resulting in a dominant marker system. A relatively inexpensive way of obtaining many markers.
Ascertainment bias:
bias in demographic inferences owing to the use of (typically) low mutation rate markers, such as SNPs, that have been previously identified in earlier smaller scale studies. The SNPs so identified will form a biased subset, with alleles at intermediate frequencies (otherwise

References (70)

  • R.C. Lewontin et al.

    Distribution of gene frequency as a test of the theory of the selective neutrality of polymorphisms

    Genetics

    (1973)
  • M. Nei et al.

    Lewontin–Krakauer test for neutral genes

    Genetics

    (1975)
  • A. Robertson

    Remarks on the Lewontin–Krakauer test

    Genetics

    (1975)
  • J.M. Akey

    Interrogating a high-density SNP map for signatures of natural selection

    Genome Res.

    (2002)
  • R. Nielsen

    Statistical tests of selective neutrality in the age of genomics

    Heredity

    (2001)
  • A.G. Clark

    Inferring nonneutral evolution from human–chimp–mouse orthologous gene trios

    Science

    (2003)
  • E.B. Ford

    Ecological Genetics

    (1975)
  • A.J. Cain et al.

    Selection in the polymorphic land snail Cepaea nemoralis

    Heredity

    (1950)
  • H.B.D. Kettlewell

    A survey of the frequencies of Biston betularia (L.) (Lep.) and its melanic forms in Great Britain

    Heredity

    (1958)
  • J.C. Avise

    Molecular Markers, Natural History, and Evolution

    (1994)
  • L.L. Cavalli-Sforza

    Population structure and human evolution

    Proc. R. Soc. Lond. B Biol. Sci.

    (1966)
  • B.S. Weir et al.

    Estimating F-statistics for the analysis of population structure

    Evolution

    (1984)
  • R. Vitalis

    Interpretation of variation across marker loci as evidence of selection

    Genetics

    (2001)
  • B.S. Weir et al.

    Estimating F-statistics

    Annu. Rev. Genet.

    (2002)
  • T. Arends

    Intra-tribal genetic differentiation among the Yanomama Indians of Southern Venezuela

    Proc. Natl. Acad. Sci. U. S. A.

    (1966)
  • A. Robertson

    Gene frequency distribution as a test of selective neutrality

    Genetics

    (1975)
  • A.M. Bowcock

    Drift, admixture, and selection in human-evolution – a study with DNA polymorphisms

    Proc. Natl. Acad. Sci. U. S. A.

    (1991)
  • M.A. Beaumont et al.

    Evaluating loci for use in the genetic analysis of population structure

    Proc. R. Soc. Lond. B Biol. Sci.

    (1996)
  • M. Nordborg

    Structured coalescent processes on different time scales

    Genetics

    (1997)
  • J. Wakeley

    Nonequilibrium migration in human history

    Genetics

    (1999)
  • J. Wakeley et al.

    Gene genealogies in a metapopulation

    Genetics

    (2001)
  • J.F. Wilkins

    A separation-of-timescales approach to the coalescent in a continuous population

    Genetics

    (2005)
  • B. Rannala et al.

    Estimating gene flow in island populations

    Genet. Res.

    (1996)
  • A.H. Porter

    A test for deviation from island-model population structure

    Mol. Ecol.

    (2003)
  • M.A. Beaumont et al.

    Identifying adaptive genetic divergence among populations from genome scans

    Mol. Ecol.

    (2004)