Abstract
The flavin-containing monooxygenases (FMOs) are a family of five microsomal enzymes important for the oxidative metabolism of environmental toxicants, natural products, and therapeutics. With the exception of FMO5, the FMOs are encoded within a single gene cluster on human chromosome 1q23–25. As part of the human genome effort, an FMO-like gene, FMO6, was identified between FMO3 and FMO2 (GenBank accession no. AL021026). Sequence analysis of this putative FMO family member revealed nothing that would a priori argue against a functional gene and encoded protein. When FMO6 expression was examined by reverse transcriptase coupled polymerase chain reaction DNA amplification, transcripts were identified in 8 of 11 human liver samples, but 0 of 4 kidney biopsy samples. However, in all cases, the observed transcripts were shorter than predicted. Sequence analysis revealed skipping of exon 4, exons 3 and 4, and/or the use of alternative splice donor or acceptor sites in introns 3, 4, 6, and 8, resulting in nine unique transcripts. Based on an analysis of possible open reading frames, none of these transcripts would encode a functional FMO enzyme. Taking advantage of the high sequence identity between FMO3 and FMO6, it is posited that the loss of binding sites for the serine-arginine–rich splicing factor protein family within exons 3 and 4 contributes to the exon skipping events, although the most commonly observed alternative splice event results from a 21-bp insertion immediately 3′ to the predicted FMO6 intron 8 splice acceptor site, diminishing the efficiency of this site.
The flavin-containing monooxygenases (FMOs) [EC 1.14.13.8; dimethylaniline monooxygenase (N-oxide forming)] are a family of microsomal enzymes important for the oxidative metabolism of a wide variety of natural and synthetic compounds. Substrates share the common property of containing soft nucleophilic nitrogen, sulfur, selenium, or phosphorus centers and include dietary components, pesticides, and other synthetic environmental chemicals, therapeutics, and plant alkaloids (for review, see Rettie and Fisher, 1999). Five mammalian FMO enzymes have been identified (FMO1–5), each exhibiting a distinct but broad substrate specificity that overlaps with other FMO family members and with the cytochromes P450. The FMOs also exhibit temporal-, tissue-, and species-specific expression patterns that could contribute to organ- and life stage-selective susceptibility to toxicants and therapeutics (Lomri et al., 1992; Shehin-Johnson et al., 1995; Yeung et al., 2000; Koukouritaki et al., 2002). In addition, the expression of one or more FMO enzymes is observed at relatively high levels in many tissues. For example, FMO3 expression in adult liver has been reported to be 60 ± 43 pmol/mg of microsomal protein (Overby et al., 1997), whereas FMO1 expression in adult kidney and in fetal liver has been reported to be 47 ± 9 pmol/mg of microsomal protein (Yeung et al., 2000) and 8 ± 5 pmol/mg of microsomal protein (Koukouritaki et al., 2002), respectively. These expression levels are comparable with the most abundant adult liver cytochromes P450, CYP3A4/5, CYP2C, and CYP1A2, whose specific contents were reported to be 96 ± 51, 60 ± 27, and 42 ± 23 pmol/mg of microsomal protein, respectively (Shimada et al., 1994).
With the exception of FMO5, whose gene is located at 1q21.1, the FMOs are encoded by a single gene cluster on chromosome 1q23–25 (Shephard et al., 1993; McCombie et al., 1996). As part of the human genome project, a sixth FMO-like gene was identified, localized between FMO3 and FMO2 (GenBank accession no. AL021026). A simple sequence analysis of this putative sixth member of the FMO family, FMO6, revealed nothing that would a priori argue against a functional gene and active protein. Comparison with other FMO genes suggested eight exons encoding protein information and, presumably, an upstream exon 1 encoding 5′-untranslated information. All FMO6 exons are defined by the expected, conserved splice donor and acceptor sites (Aebi et al., 1986; Zhang, 1998), sequences immediately 5′ to the presumptive ATG start codon exhibit good agreement with an optimal Kozak sequence (Kozak, 1997), and the open reading frame of the mature message would encode a 539-amino acid protein containing the conserved GXGXXG motifs that predict the βαβ-folds of the FAD- and NADPH-binding domains (Cashman, 1995; Dolphin et al., 1997). Despite these observations, there has been no direct evidence for FMO6 expression.
The original objective of the current study was to isolate the full-length human FMO6 transcript, examine its expression pattern, and begin exploring FMO6-dependent catalytic activity. However, evidence is presented that FMO6 represents a nonfunctional gene because of multiple, alternative pre-mRNA processing events leading to altered reading frames and putative truncated proteins that would be devoid of monooxygenase activity.
Experimental Procedures
Materials.
The SuperScript First-Strand Synthesis System, Concert Rapid Gel Extraction System, TRIzol reagent, and Taq polymerase were obtained from Invitrogen (Carlsbad, CA). The ABI Prism Big Dye Terminator Cycle Sequencing Kit was purchased from Applied Biosystems (Foster City, CA). Oligonucleotide primers were custom synthesized by Integrated DNA Technologies, Inc. (Coralville, IA). Single human fetal and adult hepatic total RNA samples were obtained from Stratagene (La Jolla, CA). All other reagents were purchased from commercial sources at the purest grade available.
Tissue and RNA Isolation.
After informed consent, liver and kidney tissue not needed for pathological assessment was obtained from patients undergoing needle biopsy for liver disease or nephrectomy for renal cancer. Tissue selected for analysis was deemed normal. No identifiers were collected with the tissue. RNA was isolated with TRIzol reagent following the procedure of Chomczynski and Sacchi (1987). This protocol was approved by the Froedtert Memorial Hospital and Medical College of Wisconsin Institutional Review Boards.
FMO6 Transcript Amplification.
Using 2 μg of total RNA as template in a total reaction volume of 20 μl containing 20 mM Tris-HCl, pH 8.4, 50 mM KCl, and 5 mM MgCl2, first strand synthesis was accomplished using 2.5 μg/ml random hexamer primers, 0.5 μM concentrations of each deoxynucleotide triphosphate, and 50 U of Moloney murine leukemia virus RNase H(−) reverse transcriptase. The polymerase chain reaction (PCR) with nested FMO6-specific primers was employed to amplify the full-length FMO6 transcript. In a final volume of 50 μl containing 20 mM Tris-HCl, pH 8.4, 50 mM KCl, and 1.5 mM MgCl2, the primary amplification reaction contained 2 μl of the first strand reaction, 1.25 U of Taq polymerase, 0.5 μM concentrations of each primer (see Table 1), and 0.2 mM concentrations of each deoxynucleotide triphosphate. Conditions were an initial denaturation step at 94°C for 1 min, then 30 cycles of denaturation at 92°C for 30 s, annealing at 50°C for 30 s, and elongation at 75°C for 2 min. The reaction was terminated after a final elongation step at 75°C for 5 min. The primary reaction was diluted 1:25 and a 2-μl aliquot was used as a template for the secondary, nested amplification reaction. Conditions were identical to the primary reaction, except that a 55°C annealing temperature was used along with the nested primers (see Table 1). As a control for the quality of the liver and/or isolated RNA, reverse transcriptase-coupled PCR (RT-PCR) reactions also were used to detect the presence of the ubiquitous β-microglobulin transcript using the primers listed in Table 1 and conditions identical to those for the secondary FMO6 RT-PCR amplification.
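As a bookkeeping aid, the cycling conditions stated above can be written out as an explicit program. The following Python sketch is illustrative only: the `Step` class and the function names are hypothetical, the temperatures and hold times are copied from the text as given, and instrument ramp times are ignored.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Step:
    name: str
    temp_c: int    # block temperature in degrees Celsius, as stated in the text
    seconds: int   # programmed hold time


def amplification_program(annealing_c: int = 50, cycles: int = 30) -> List[Step]:
    """Expand the stated cycling conditions into an ordered list of steps."""
    steps = [Step("initial denaturation", 94, 60)]
    for _ in range(cycles):
        steps.append(Step("denaturation", 92, 30))
        steps.append(Step("annealing", annealing_c, 30))
        steps.append(Step("elongation", 75, 120))
    steps.append(Step("final elongation", 75, 300))
    return steps


def hold_time_seconds(steps: List[Step]) -> int:
    """Sum of programmed hold times (ramp times are not included)."""
    return sum(step.seconds for step in steps)


primary = amplification_program(annealing_c=50)   # primary reaction
nested = amplification_program(annealing_c=55)    # secondary, nested reaction
```

Under these assumptions, each reaction has 60 + 30 × 180 + 300 = 5760 s (96 min) of programmed hold time, and the nested program differs from the primary one only in its annealing temperature.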
PCR products were fractionated by agarose gel electrophoresis and visualized by ethidium bromide staining. Bands were excised individually and extracted from the gel with a Concert Rapid Gel Extraction Kit according to the manufacturer's directions. DNA sequencing reactions were performed with the Big Dye Terminator Cycle Sequencing Kit according to the manufacturer's directions using one of five sequencing primers or the secondary amplification primers (see Table 1). Sequencing reactions were analyzed on an ABI 310 sequencer (Applied Biosystems, Foster City, CA).
Data Analysis.
The Align Plus software program (version 4.1; Scientific and Educational Software, Durham, NC) was used to compare the observed versus expected FMO6 RNA sequences as well as to perform an exhaustive, pair-wise global alignment of the FMO1–6 amino acid sequences and progressive assembly using Neighbor-Joining phylogeny. Splice sites were analyzed and scored for their relative fit to a consensus sequence using the Splice Site Prediction Program authored by Dr. Martin G. Reese (University of California, Berkeley, CA) and available at the Berkeley Drosophila Genome Project internet site (http://www.fruitfly.org/seq_tools/splice.html) (Reese et al., 1997). This program scores 5′ splice donor sites and 3′ splice acceptor sites between 0 and 1.0, with scores approaching 1.0 matching more closely to the consensus sequence.
Results and Discussion
Comparison of nonorthologous FMO1–5 across multiple mammalian species or within a given species reveals 48 to 57% sequence identity. In contrast, comparison between orthologous forms across multiple species reveals 82 to 86% sequence identity. This analysis suggests the divergence of these enzymes from a common ancestral form before speciation and a high degree of conservation since that early event. The putative FMO6 protein fits this pattern, except for its identity with FMO3. In the latter case, the predicted human FMO6 amino acid sequence shares 70% identity with human FMO3 and 71 and 73% with mouse and rabbit FMO3, respectively. These relationships within the human FMO gene family are depicted in Fig. 1, which shows a global pair-wise alignment and progressive assembly of family members. The unusually high degree of identity with FMO3 and the failure to identify similar sequences within the mouse or rat genomes suggest that a much later evolutionary event led to FMO6 and that the putative FMO6 enzyme might be unique to the human.
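For readers unfamiliar with how a percent identity figure is derived from a pair-wise alignment, a minimal Python sketch follows. It assumes one common convention (matches divided by the number of aligned columns in which neither sequence has a gap); the exact definition used by Align Plus is not stated here and may differ. The function name and the toy sequences are hypothetical and are not FMO sequences.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over aligned columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == gap or b == gap:
            continue  # gapped columns are excluded under this convention
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped aligned columns to compare")
    return 100.0 * matches / compared


# Toy example: 4 matches over 5 ungapped columns.
toy_identity = percent_identity("MAKK-V", "MAKRAV")
```

Other conventions (for example, dividing by the length of the shorter sequence or by the full alignment length) give somewhat different values, so identities are comparable only when computed the same way.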
To develop the necessary tools for elucidating the FMO6 expression pattern and also to explore the significance of possible FMO6 catalytic activity, initial RT-PCR reactions were performed using both human fetal and adult liver total RNA as a template with primers that would amplify the predicted, complete FMO6 coding information. No detectable product was observed using fetal liver RNA as a template. In contrast, a single distinct band was observed using adult liver RNA (Fig. 2, lane 2). However, the observed amplicon was considerably smaller than the predicted 1629-bp product, suggestive of alternative processing.
To determine whether or not a truncated FMO6 transcript was unique to this single RNA sample, liver biopsy tissue was obtained from nine patients and examined for FMO6 expression using an identical RT-PCR assay. Although the β-microglobulin transcript control was detectable in 10 of the 11 liver samples, FMO6 transcripts were detectable in only eight. In four of these, a single amplicon smaller than the predicted 1629-bp product was observed. In the remaining four samples, two amplicons were observed, both smaller than the predicted product (e.g., Fig. 2, lane 3). FMO6 mRNA expression also was examined in four kidney tissue samples but was undetectable (data not shown).
Sequence analysis of the excised amplicons was used to define possible alternative processing events leading to the truncated FMO6 transcripts (summarized in Fig. 3). Utilization of alternative intron 3 splice donor and intron 4 splice acceptor sites combined with exon 4 skipping (Fig. 3A) was observed. In addition, the use of an alternative intron 3 splice acceptor site (Fig. 3B), an alternative intron 6 splice donor site (Fig. 3C), and two different intron 8 splice acceptor sites (Fig. 3, D1 and D2) were noted. Finally, skipping of exon 4 (Fig. 3E) or exons 3 and 4 (Fig. 3F) also was observed. Alone or in combination, these alternative processing events resulted in nine distinct FMO6 transcripts that are summarized in Table 2. The possible open reading frames within each transcript and expected protein products also are listed (Table 2). Although the 119-, 134-, and 148-amino acid products from frame 1 would contain the N-terminal GXGXXG FAD-binding motif, none would contain the second GXGXXG NADPH-binding motif. Neither the predicted 101-amino acid product (which initiates well downstream within frame 1) nor the predicted 198- and 319-amino acid products (encoded within frame 3) would contain any of the conserved FMO motifs. In contrast, the 418-amino acid product derived from transcript 7 is encoded within the same frame as the predicted, full-length 539-amino acid protein and would contain both the conserved FAD- and NADPH-binding domains. However, previous studies demonstrated that in the case of FMO2, truncation of the protein by as little as 64 amino acids results in an inactive protein (Krueger et al., 2001). Thus, it is unlikely that the 418-amino acid product, or any of the potential FMO6-encoded proteins, would exhibit monooxygenase activity.
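The kind of reading-frame analysis described above can be sketched in a few lines of Python. This is a simplified illustration, not the procedure used to build Table 2: it considers only the three forward frames, takes the first ATG after each stop codon as the start, and requires an in-frame stop codon. The function name and the toy sequence are hypothetical.

```python
from typing import List, Tuple

STOP_CODONS = {"TAA", "TAG", "TGA"}


def open_reading_frames(seq: str, min_codons: int = 1) -> List[Tuple[int, int, int]]:
    """Return (frame, start_index, n_amino_acids) for ATG-initiated ORFs.

    frame is 1, 2, or 3; start_index is the 0-based position of the ATG;
    n_amino_acids counts codons from the ATG up to, but excluding, the stop.
    """
    seq = seq.upper().replace("U", "T")
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                n_aa = (i - start) // 3
                if n_aa >= min_codons:
                    orfs.append((frame + 1, start, n_aa))
                start = None  # look for the next ATG in this frame
    return orfs


# Toy sequence with one short ORF in frame 1 and one in frame 2.
toy_orfs = open_reading_frames("ATGAAATAGCATGCCCTGA")
```

A splicing event that removes or adds a number of nucleotides not divisible by three shifts the downstream sequence into a different frame, which is why the alternative transcripts yield truncated products in frames 1 and 3.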
The high degree of identity between FMO3 and FMO6 (Fig. 1) was exploited to gain insight into the possible mechanisms underlying these alternative processing events. The skipping of FMO6 exon 4 or exons 3 and 4 (Fig. 3, E and F) occurred in 7 of the 13 detected transcripts (Table 2). Recent studies have identified exon splicing enhancers and their cognate serine-arginine–rich (SR) proteins as key regulators controlling splicing efficacy, particularly in the absence of strong splice signals (Cooper and Mattox, 1997). Although consensus sequences for exon splicing enhancers have not been fully elucidated, it is possible that one or more of the sequence differences between FMO3 and FMO6 exons 3 and 4 has resulted in decreased or lost binding of one or more SR proteins, facilitating these exon-skipping events. Liu et al. (1998) reported on consensus sequences for SF2/ASF, SRp40, and SRp50; more recently, Tian and Kole (2001) reported a consensus site for SRp30. Although there is no direct evidence concerning which SR protein, if any, might be involved in regulating FMO3 or FMO6 transcript processing, a comparison of FMO6 exon 3 to the homologous FMO3 sequence did reveal the loss of an SF2/ASF consensus sequence, conservation of a single SRp40 site, and unique FMO6 SRp40 and SRp50 sites. Such a comparison for exon 4 is more convincing because FMO3 exon 4 exhibits five matches to SRp40 and one match to SRp50 consensus sequences. Only one of the SRp40 sites is conserved in FMO6 exon 4, whereas two unique SRp40 sites are present. Also consistent with this theory, the predicted FMO6 intron 3 splice donor and acceptor sites do not match the consensus sequences as well as those for FMO3 (Fig. 4).
The remaining observed processing events consisted of alternative splice donor or acceptor site selection. The predicted and alternative FMO6 splice donor and acceptor sites were aligned and compared with the homologous sites within FMO3 (Fig. 4, A through D, correspond to the variants depicted in Fig. 3, A through D). Splice sites also were analyzed using the Splice Site Predictor Program (see Experimental Procedures) to quantify how well each conforms to a consensus sequence.
The alternative FMO6 intron 3 splice donor site (Fig. 4A) is the most unexpected because it fails to conform to the U2-dependent consensus sequence (Aebi et al., 1986; Zhang, 1998). Contrary to expectation, this site does not contain a T at position +2, purines at positions +3 and +4, or a G at position +5. The only substantial difference between the alternative FMO6 intron 3 site and the homologous FMO3 sequence favoring a splice site is the substitution of a G for an A at position +1. The alternative intron 4 splice-acceptor site (Fig. 4A) also is difficult to reconcile. Both the predicted and alternative FMO6 sites exhibit the expected consensus A and G at positions −2 and −1, respectively, but these same nucleotides also are present in the FMO3 sequence. The consensus splice acceptor site exhibits a preference for pyrimidines at positions −3 and −5 through −15, the polypyrimidine tract. Yet, the alternative FMO6 intron 4 splice-acceptor site has pyrimidines in only 6 of 12 of these positions, whereas the predicted site has pyrimidines in 9 of 12 positions. The observed FMO3 intron 4 splice-acceptor site exhibits pyrimidines in 10 of these positions, whereas the sequence corresponding to the alternative FMO6 intron 4 splice acceptor site exhibits pyrimidines at 7 of the 12 positions. The unexpected nature of these two alternative splice sites relative to the predicted sites was confirmed by analysis with the Splice Site Predictor Program. The predicted FMO6 intron 3 splice-donor and intron 4 splice-acceptor sites yielded scores of 0.80 and 0.74, respectively, whereas the alternative sites were not recognized. In contrast, the two corresponding FMO3 intron 3 and 4 sites gave scores of 1.00 and 0.95, respectively. Thus, the alternative FMO6 intron 3 splice-donor and intron 4 splice-acceptor sites would not have been predicted based on sequence analysis or by comparison with the homologous FMO3 sequences. Consistent with these observations, this processing event was rare, occurring in only 1 of 13 transcripts (Table 2).
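The positional counts quoted above (pyrimidines at positions −3 and −5 through −15, and the AG dinucleotide at positions −2 and −1) can be tallied mechanically. The Python sketch below is a simple illustration of that tally and is not a substitute for the neural-network scoring performed by the Splice Site Predictor Program. The function name and the toy sequences are hypothetical; the input is assumed to be the 3′ end of an intron, with its last nucleotide at position −1.

```python
from typing import Tuple

PYRIMIDINES = {"C", "T", "U"}
# Position -3 plus positions -15 through -5: 12 positions in total.
TRACT_POSITIONS = [-3] + list(range(-15, -4))


def acceptor_summary(intron_3prime_end: str) -> Tuple[bool, int]:
    """Return (has AG at -2/-1, pyrimidine count over the 12 tract positions)."""
    s = intron_3prime_end.upper()
    if len(s) < 15:
        raise ValueError("need at least the last 15 nucleotides of the intron")
    has_ag = s[-2:] == "AG"
    # Negative indexing maps directly onto splice-site numbering:
    # s[-1] is position -1, s[-15] is position -15.
    n_pyrimidines = sum(1 for pos in TRACT_POSITIONS if s[pos] in PYRIMIDINES)
    return has_ag, n_pyrimidines


# Toy sequences (not FMO sequences): a strong tract and a weaker one.
strong = acceptor_summary("TTTCTCTTTCCTCAG")
weak = acceptor_summary("AATTAATTAATTCAG")
```

In this toy example the first sequence has pyrimidines at all 12 tract positions and the second at 6 of 12, while both retain the AG dinucleotide; the same tally applied to the sequences in Fig. 4 would reproduce counts of the kind reported in the text.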
Relative to the homologous FMO3 sequence, the alternative FMO6 intron 3 splice acceptor site exhibits a substitution of the preferred A for a C at the critical −2 position (Fig. 4B). Despite this, the predicted FMO6 intron 3 splice acceptor site scored 0.79, whereas the alternative site yielded a score of only 0.05. In comparison, the FMO3 intron 3 splice-acceptor site yielded a score of 0.96. Again, the failure of the alternative FMO6 intron 3 site to score well is consistent with its presence in only 3 of 13 transcripts (Table 2).
The predicted and alternative FMO6 intron 6 splice-donor sites (Fig. 4C) both conform to the consensus. Compared with the homologous FMO3 sequence, the alternative site exhibits a substitution of the preferred T for a C at position +2. The predicted and alternative FMO6 sites yielded similar scores, 0.64 and 0.59, respectively. These data would suggest that both sites might be used with equal efficiency, yet the use of this alternative site was observed in only 2 of 13 transcripts.
In comparing the expected FMO6 intron 8 splice acceptor site to that observed for FMO3, a pyrimidine-rich 21-bp insertion immediately flanking the 3′-side of the splice site is evident. This insertion explains the difference in predicted protein sizes between FMO6 and FMO3 (i.e., 539 versus 532 amino acids, respectively). The insertion also results in a decreased Splice Site Predictor score, from 0.98 (FMO3) to 0.01 (FMO6), most probably because of the substitution of purines for pyrimidines at positions −7, −8, −10, and −11 and the substitution of a T for a G at position +1. The D1 FMO6 intron 8 splice acceptor site (Figs. 3, D1, and 4, D1) also had a low score (0.27), although in comparison with the homologous FMO3 sequence, there has been a substitution of a G for an A at the critical −1 position. The use of this alternative processing site also was observed in only 2 of 13 transcripts (Table 2). In contrast, the D2 FMO6 intron 8 splice-acceptor site (Figs. 3, D2, and 4, D2) matches the consensus well with a score of 0.96. Interestingly, the homologous FMO3 sequence also yielded a score of 0.97, consistent with this site being a likely alternative in the event of mutations lowering the efficiency of the observed upstream site. Thus, the relatively frequent use of the alternative FMO6 intron 8 D2 acceptor site (8 of 13 transcripts, Table 2) and the resulting loss of critical FMO coding information is consistent with the loss in efficiency of the expected splice acceptor site caused by the 21-bp insertion and the resulting use of the nearest downstream site exhibiting a close match to the consensus splice acceptor site.
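The size difference attributed to the insertion follows from simple codon arithmetic, shown here as a short Python check. The function name is hypothetical; the only inputs are the figures stated in the text (a 21-bp insertion and a 532-amino acid FMO3 protein), and the calculation assumes the inserted sequence contains no in-frame stop codon.

```python
def residues_added_by_insertion(insert_bp: int) -> int:
    """Amino acids added by an in-frame insertion of insert_bp nucleotides."""
    if insert_bp % 3 != 0:
        raise ValueError("insertion is not a multiple of 3 and would shift the frame")
    return insert_bp // 3


FMO3_LENGTH_AA = 532                      # stated in the text
added = residues_added_by_insertion(21)   # 21 bp = 7 codons
predicted_fmo6_length = FMO3_LENGTH_AA + added
```

Because 21 is divisible by three, the insertion preserves the reading frame and adds seven residues, giving the predicted 539-amino acid FMO6 product.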
In summary, the high degree of sequence identity between human FMO6 and FMO3, especially relative to the identity relationships observed among other FMO family members, and the apparent absence of orthologous sequences in other species, is consistent with an evolutionary event leading to FMO6 occurring much later than that leading to the other five members of this gene family. However, sequence drift, particularly in exon 4 and the 21-bp insertion in exon 9, has resulted in decreased splicing efficiency and, consequently, the use of alternative processing yielding FMO6 transcripts incapable of encoding an active monooxygenase. Thus, unless functional protein products corresponding to those predicted from the truncated transcripts can be identified or a rare individual who might express a full-length transcript is identified, human FMO6 should be considered a pseudogene. Although sequence comparisons to the most closely related FMO family member, FMO3, provide considerable insight into the mechanisms contributing to the alternative processing of FMO6 pre-mRNA, many questions remain that simply reflect our incomplete understanding of eukaryotic splicing events (Hastings and Krainer, 2001).
Footnotes
- Received May 13, 2002.
- Accepted May 17, 2002.
- This study was supported by National Institutes of Health grant CA53106 (to R.N.H.).
Abbreviations
- FMO: flavin-containing monooxygenase
- PCR: polymerase chain reaction
- RT-PCR: reverse transcriptase coupled polymerase chain reaction
- SR proteins: serine-arginine–rich splicing factor protein family
- bp: base pair(s)
- The American Society for Pharmacology and Experimental Therapeutics