More than 9,000,000 unique genes in human gut bacterial community: estimating gene numbers inside a human body

PLoS One. 2009 Jun 29;4(6):e6074. doi: 10.1371/journal.pone.0006074.

Abstract

Background: Estimating the number of genes in the human genome has long been an important problem in computational biology. With the newer view of the human as a super-organism, it is also of interest to estimate the number of genes in this human super-organism.

Principal findings: We present our estimate of the number of genes in the human gut bacterial community, the largest microbial community inside the human super-organism. We obtained 552,700 unique genes from 202 complete human gut bacterial genomes. We then built a novel gene counting model to estimate the total number of genes by combining culture-independent sequence data with those complete genomes. 16S rRNA sequences were used to construct a three-level tree, and a different counting method was introduced for each of the three levels: strain-to-species, species-to-genus, and genus-and-up. The model estimates that the total number of genes is about 9,000,000 after genes with sequence identity of 97% or higher were merged.
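The merging step described above (collapsing genes that share 97% identity or more before counting) can be illustrated with a minimal sketch. This is not the authors' model: the function names `identity` and `count_unique_genes` are hypothetical, the ungapped position-by-position comparison is a crude stand-in for a real sequence alignment, and the greedy representative-based merge is an assumed strategy chosen only for brevity.

```python
from typing import Iterable, List


def identity(a: str, b: str) -> float:
    """Fraction of matching positions, relative to the longer sequence.

    Assumption: an ungapped comparison stands in for alignment-based
    identity, which a real pipeline would compute with an aligner.
    """
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0  # two empty sequences are treated as identical
    matches = sum(x == y for x, y in zip(a, b))
    return matches / longest


def count_unique_genes(genes: Iterable[str], threshold: float = 0.97) -> int:
    """Greedily merge genes whose identity to a kept representative
    meets the threshold, then return the number of representatives."""
    representatives: List[str] = []
    for gene in genes:
        # Keep the gene only if it is not close enough to any representative.
        if not any(identity(gene, rep) >= threshold for rep in representatives):
            representatives.append(gene)
    return len(representatives)


# Toy sequences, invented for illustration only.
toy_genes = [
    "A" * 100,            # first representative
    "A" * 99 + "C",       # 99% identical to the first, so merged
    "A" * 96 + "CCCC",    # 96% identical to the first, so kept separately
    "G" * 100,            # unrelated, so kept separately
]
unique_count = count_unique_genes(toy_genes)
print(unique_count)
```

A greedy merge of this kind depends on input order, which is one reason a production analysis would typically rely on a dedicated clustering tool rather than a loop like this one.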

Conclusion: By combining the complete genomes currently available with culture-independent sequencing data, we built a model to estimate the number of genes in the human gut bacterial community. The total number of genes is estimated to be about 9 million. Although this number is large, we believe it is an underestimate. This is an initial step toward tackling the gene counting problem for the human super-organism, which will remain an open problem in the near future. The list of genomes used in this paper can be found in the supplementary table.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Computational Biology
  • Genes, Bacterial / genetics*
  • Genome, Bacterial
  • Humans
  • Intestines / microbiology*
  • Models, Genetic
  • Models, Theoretical
  • RNA, Ribosomal, 16S / genetics
  • Sequence Alignment
  • Sequence Analysis, DNA
  • Software

Substances

  • RNA, Ribosomal, 16S