Eukaryotic aldehyde dehydrogenase (ALDH) genes: human polymorphisms, and recommended nomenclature based on divergent evolution and chromosomal mapping

Pharmacogenetics. 1999 Aug;9(4):421-34.

Abstract

As currently being performed with an increasing number of superfamilies, a standardized gene nomenclature system is proposed here, based on divergent evolution, using multiple alignment analysis of all 86 eukaryotic aldehyde dehydrogenase (ALDH) amino-acid sequences known at this time. The ALDHs represent a superfamily of NAD(P)+-dependent enzymes having similar primary structures that oxidize a wide spectrum of endogenous and exogenous aliphatic and aromatic aldehydes. To date, a total of 54 animal, 15 plant, 14 yeast, and three fungal ALDH genes or cDNAs have been sequenced. These ALDHs can be divided into a total of 18 families (comprising 37 subfamilies), and all nonhuman ALDH genes are named here after the established human ALDH genes, when possible. An ALDH protein from one gene family is defined as having approximately ≤40% amino-acid identity to that from another family. Two members of the same subfamily exhibit approximately ≥60% amino-acid identity and are expected to be located at the same subchromosomal site. For naming each gene, it is proposed that the root symbol 'ALDH' denoting 'aldehyde dehydrogenase' be followed by an Arabic number representing the family and, when needed, a letter designating the subfamily and an Arabic number denoting the individual gene within the subfamily; all letters are capitalized in all mammals except mouse and fruit fly, e.g. 'human ALDH3A1 (mouse, Drosophila Aldh3a1).' It is suggested that the Human Gene Nomenclature Guidelines (http://www.gene.ucl.ac.uk/nomenclature/guidelines.html) be used for all species other than mouse and Drosophila. Following these guidelines, the gene is italicized, whereas the corresponding cDNA, mRNA, protein or enzyme activity is written with upper-case letters and without italics, e.g. 'human, mouse or Drosophila ALDH3A1 cDNA, mRNA, or activity'.
If an orthologous gene between species cannot be identified with certainty, sequential naming of these genes will be carried out in chronological order as they are reported to us. In addition, 20 human ALDH variant alleles that have been reported to date are listed herein and are recommended to be given numbers (or a number plus a capital letter) following an asterisk (e.g. 'ALDH3A2*2, ALDH2*4C'). It is anticipated that this eukaryotic ALDH gene nomenclature system will be extended to include bacterial genes within the next 2 years and that this nomenclature system will require updating on a regular basis; an ALDH Web site has been established for this purpose (http://www.uchsc.edu/sp./sp./alcdbase/aldhcov.html) and will serve as a medium for interaction amongst colleagues in this field.
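
The naming rules described in the abstract can be illustrated with a short Python sketch. It is not taken from the paper: the function names (relationship, gene_symbol, parse_symbol) and the regular expression are hypothetical, chosen only for illustration. The 40% and 60% cut-offs are the approximate values quoted above, and treating the range between them as "same family, different subfamilies" is an assumption the abstract implies but does not state.

```python
import re

# Approximate identity cut-offs quoted in the abstract (guidelines, not hard rules).
FAMILY_MAX_IDENTITY = 40.0     # about <= 40% identity: different families
SUBFAMILY_MIN_IDENTITY = 60.0  # about >= 60% identity: same subfamily

# Species whose gene symbols are written with only the first letter capitalized.
LOWERCASE_SPECIES = {"mouse", "drosophila"}

# ALDH + family number, optional subfamily letter + gene number,
# optional allele after an asterisk (a number, or a number plus a capital letter).
SYMBOL_RE = re.compile(r"^ALDH(\d+)(?:([A-Z])(\d+))?(?:\*(\d+[A-Z]?))?$")


def relationship(percent_identity):
    """Classify two ALDH proteins by percent amino-acid identity."""
    if percent_identity >= SUBFAMILY_MIN_IDENTITY:
        return "same subfamily"
    if percent_identity <= FAMILY_MAX_IDENTITY:
        return "different families"
    # Assumption: the abstract does not spell out this middle range.
    return "same family, different subfamilies"


def gene_symbol(family, subfamily=None, gene=None, species="human"):
    """Build a gene symbol such as ALDH3A1 (human) or Aldh3a1 (mouse)."""
    symbol = f"ALDH{family}"
    if subfamily is not None:
        symbol += subfamily.upper()
        if gene is not None:
            symbol += str(gene)
    if species.lower() in LOWERCASE_SPECIES:
        symbol = symbol.capitalize()
    return symbol


def parse_symbol(symbol):
    """Split a gene or allele symbol into family, subfamily, gene and allele."""
    match = SYMBOL_RE.match(symbol.upper())
    if match is None:
        raise ValueError(f"not an ALDH symbol: {symbol!r}")
    family, subfamily, gene, allele = match.groups()
    return {
        "family": int(family),
        "subfamily": subfamily,
        "gene": int(gene) if gene else None,
        "allele": allele,
    }


if __name__ == "__main__":
    print(gene_symbol(3, "A", 1))                   # human form
    print(gene_symbol(3, "A", 1, species="mouse"))  # mouse form
    print(parse_symbol("ALDH2*4C"))
    print(relationship(72.5))
```

Italicization of gene symbols, which the guidelines also require, is a typesetting matter and is deliberately left out of this sketch.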

Publication types

  • Research Support, U.S. Gov't, P.H.S.
  • Review

MeSH terms

  • Aldehyde Dehydrogenase / genetics*
  • Amino Acid Sequence
  • Animals
  • Chromosome Mapping*
  • Eukaryotic Cells / enzymology
  • Evolution, Molecular*
  • Humans
  • Mice
  • Molecular Sequence Data
  • Polymorphism, Genetic*
  • Sequence Homology, Amino Acid
  • Terminology as Topic

Substances

  • Aldehyde Dehydrogenase