Trends in Genetics
Volume 21, Issue 1, January 2005, Pages 21-25

Genome Analysis
Microeconomic principles explain an optimal genome size in bacteria

https://doi.org/10.1016/j.tig.2004.11.014

Bacteria can clearly enhance their survival by expanding their genetic repertoire. However, the tight packing of the bacterial genome and the fact that the most evolved species do not necessarily have the biggest genomes suggest that other evolutionary factors limit genome expansion. To clarify these restrictions on size, we studied the protein families that contribute most significantly to bacterial-genome complexity. We found that all bacteria apply the same basic and ancestral ‘molecular technology’ to optimize their reproductive efficiency. The same microeconomic principles that define the optimum size of a factory can also explain the existence of a statistical optimum in bacterial genome size. This optimum is reached when the bacterial genome obtains the maximum metabolic complexity (revenue) for minimal regulatory genes (logistic cost).

Section snippets

Defining size-dependent and universal superfamilies and their distributions

We have determined the domain superfamilies in the CATH database that are most likely to be contributing to bacterial genome complexity because their occurrences are significantly correlated with genome size. The CATH database is a hierarchical classification of protein-domain structures, and there are structure-derived hidden Markov models (HMMs) of sequences for each domain superfamily [10]. Using the HMMs, we first assigned genes to homologous superfamilies in CATH. By exploiting structural

Functional characterization of each set of superfamily distributions

All superfamilies from each subdivision were selected for detailed functional analysis. The 38 linearly distributed superfamilies are primarily associated with metabolism; 87% of domains and 82% of the superfamilies are involved in general cellular metabolism. For example, of the two most frequent superfamilies in bacteria, the nucleotide triphosphate hydrolase domain provides motion and reaction energy in prokaryotic and eukaryotic organisms, whereas the NADP-binding domain performs oxidizing

Limits to bacterial genome expansion

The regulatory and metabolic subdivisions represent 90% of all size-dependent and universal domains and 91% of the superfamilies and were therefore selected for more detailed analysis and comparison of their distributions. Interestingly, fitting the distributions of these two types of superfamilies to regression lines showed that the functions cross when the genome size reaches 10 500 ORFs (Figure 2a), suggesting that when bacterial size increases above this value regulation exceeds metabolic
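The crossing of two fitted curves described above can be illustrated with a minimal sketch. It assumes a straight line through the origin for the metabolic counts and a quadratic through the origin for the regulatory counts; the snippet above only says "regression lines", so both functional forms are assumptions. The function names and the toy numbers are hypothetical and are not the paper's data.

```python
import numpy as np

def fit_through_origin(x, y, power):
    """Least-squares coefficient k for the model y = k * x**power."""
    basis = x ** power
    return float(np.sum(basis * y) / np.sum(basis * basis))

def crossing_size(sizes, metabolic, regulatory):
    """Genome size (in ORFs) at which the two fitted curves intersect.

    Assumed models (not taken from the paper):
      metabolic  ~ a * size       (linear)
      regulatory ~ b * size**2    (quadratic)
    Setting a * x = b * x**2 gives the non-zero crossing x = a / b.
    """
    a = fit_through_origin(sizes, metabolic, 1)
    b = fit_through_origin(sizes, regulatory, 2)
    return a / b

# Toy, noise-free values chosen only to exercise the code.
sizes = np.array([1000.0, 2000.0, 4000.0, 8000.0])
metabolic = 0.2 * sizes
regulatory = 2e-5 * sizes ** 2

print(crossing_size(sizes, metabolic, regulatory))
```

With real domain counts the fits would carry uncertainty, so the crossing is better read as an estimate than as a sharp threshold.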

Bacterial factory model and optimum genome size

It is interesting to draw an analogy between optimum bacterial complexity and the optimum production level, giving maximum profit, in a factory. In a factory the total profit is the difference between total revenue and total cost, and the marginal profit is the additional profit derived from the production of one additional unit. Consequently, the optimum size of a factory giving maximum total profit is reached at a production level where the marginal revenue from producing an additional unit
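The factory analogy can be made concrete with a small sketch of the standard microeconomic rule that total profit peaks where producing one more unit stops adding profit. The linear revenue and quadratic cost functions below, and the parameters `r` and `c`, are illustrative assumptions and do not come from the paper.

```python
def profit(q, r, c):
    """Total profit = total revenue (r * q) minus total cost (c * q**2)."""
    return r * q - c * q * q

def marginal_profit(q, r, c):
    """Additional profit from producing one more unit at level q."""
    return profit(q + 1, r, c) - profit(q, r, c)

def optimum_size(r, c, q_max):
    """Integer production level in [0, q_max] with the highest total profit."""
    return max(range(q_max + 1), key=lambda q: profit(q, r, c))

# Hypothetical parameters: revenue per unit and cost coefficient.
r, c = 0.2, 2e-5
best = optimum_size(r, c, 20000)
print(best, marginal_profit(best - 1, r, c), marginal_profit(best, r, c))
```

Under these assumptions the marginal profit is positive just below the optimum and negative just above it, which is the discrete counterpart of marginal revenue equalling marginal cost.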

Conclusion

As in a factory, production capacity depends, among other factors, on the technology applied. Eukaryotes apply additional technology to reduce gene expression noise, such as DNA methylation, nucleosomal chromatin or cellular compartmentalization [18]. However, in contrast to eukaryotic cells, bacteria have adopted an evolutionary strategy whereby speed in reproduction is a primary goal [1]. The size-dependent superfamilies analysed here clearly represent universal molecular technology shared by

Acknowledgements

We thank Florencio Pazos, Russell Marsden, Antonio Sillero, Chris Bennett and Alistair Coleman for discussion and helpful comments on the paper, and Ami for her help in inspiring this work. The work was supported by grants from the MRC (A.G. and C.A.O.) and the European Union (J.A.G.R.).

References (25)

  • E.V. Koonin. The structure of the protein universe and genome evolution. Nature (2002)
  • C. Chothia. Evolution of the protein repertoire. Science (2003)