Abstract
Both UDP-glucuronosyltransferase 2B4 (UGT2B4) and UGT2B7 are expressed mainly in the human liver and have several overlapping substrates; e.g., catechol estrogens, bile acids, codeine, and carvedilol. To identify novel single nucleotide polymorphisms (SNPs) and haplotypes in a Japanese population, the enhancer/promoter regions, all the exons, and the surrounding intronic regions of UGT2B4 and UGT2B7 were sequenced from 136 Japanese individuals. We found 16 and 21 polymorphisms, including 10 and 4 novel ones in UGT2B4 and UGT2B7, respectively. The novel nonsynonymous SNPs were 1364A>G (K455R) and 1531T>C (C511R) in UGT2B4 and 1192G>A (D398N) in UGT2B7. From linkage disequilibrium analysis, several SNPs in UGT2B7 were found to be highly linked with each other. No close linkage between the SNPs in UGT2B4 and UGT2B7 was observed, indicating that each gene is located within an independent haplotype block. Thus, haplotype analysis was separately performed for the two genes. In UGT2B4, we unambiguously determined 8 haplotypes and inferred an additional 12 haplotypes using an expectation-maximization-based program. In UGT2B7, five haplotypes were unambiguously assigned and an additional eight haplotypes were inferred. The haplotype structure of UGT2B7 was more diverse than that of UGT2B4 in terms of the number of frequent SNPs. In addition, ethnic differences in the UGT2B4*2 and UGT2B7*2 haplotypes between the Japanese and the Caucasian and/or African populations were found. Our findings provide fundamental and useful information for genotyping UGT2B4 and UGT2B7 in the Japanese, and probably other populations.
The glucuronidation reaction catalyzed by UDP-glucuronosyl transferases (UGTs) is responsible for clearance of endogenous substrates including bilirubin, bile acids, steroid hormones and thyroid hormones, and xenobiotics, such as clinical drugs and environmental pollutants (Tukey and Strassburg, 2000). Based on homology, the UGTs are classified into two major families, UGT1 and UGT2. The UGT2 family is further divided into two subfamilies, UGT2A and UGT2B (Mackenzie et al., 1997). To date, seven active UGT2B enzymes (UGT2B4, UGT2B7, UGT2B10, UGT2B11, UGT2B15, UGT2B17, and UGT2B28) have been found in humans (Jackson et al., 1987; Ritter et al., 1990; Jin et al., 1993; Beaulieu et al., 1996, 1998; Lévesque et al., 1999, 2001; Turgeon et al., 2000). In addition, numerous homologous pseudogenes have also been discovered, which are clustered with the UGT2B gene region on chromosome 4 (4q13) (Monaghan et al., 1994; Turgeon et al., 2000).
UGT2B4 and UGT2B7 are highly homologous (85.6%) and expressed mainly in the liver. They glucuronidate catechol estrogens, bile acids, codeine, and 3′-azido-3′-dideoxythymidine with overlapping substrate specificities (Pillot et al., 1993; Lévesque et al., 1999; Turgeon et al., 2001; Court et al., 2003; Mackenzie et al., 2003). Our previous study also suggested that both UGT2B4 and UGT2B7 were involved in the glucuronidation of the β-adrenoceptor antagonist, carvedilol (Ohno et al., 2004).
About 7-fold interindividual differences were reported in hepatic mRNA expression levels of both UGT2B4 and UGT2B7 (Congiu et al., 2002). Furthermore, the morphine 3-O-glucuronidation activity, which is mainly mediated by UGT2B7, varied about 3-fold in the liver microsomes from 20 individuals (Fisher et al., 2000). These differences could be caused in part by polymorphisms in these genes. A common polymorphism in the UGT2B7 gene (H268Y, UGT2B7*2) has been found in Caucasians and Asians (Bhasker et al., 2000). More recently, Hirota et al. (2003) found a novel single nucleotide polymorphism (SNP) in UGT2B7, 211G>T (A71S), by polymerase chain reaction (PCR)-single-strand conformation polymorphism analysis and subsequent sequencing of genomic DNA from 46 Japanese individuals. As for UGT2B4, a T-to-A transversion at nucleotide 1374 has been found in Caucasians and Africans, but not in Asians, which leads to an amino acid change at codon 458 from aspartic acid to glutamic acid (UGT2B4*2) (Lévesque et al., 1999; Lampe et al., 2000; Riedy et al., 2000). However, there has been no report on comprehensive sequencing or haplotype analysis of the UGT2B4 and UGT2B7 genes in a Japanese population.
To identify novel SNPs and to reveal haplotype structures in the Japanese, the known enhancer/promoter regions, all the exons, and the surrounding intronic regions of UGT2B4 and UGT2B7 were sequenced from 136 Japanese individuals. The enhancer/promoter regions surveyed were -1400 to -1110 upstream of the translational initiation codon, which included the peroxisome proliferator-activated receptor responsive element in UGT2B4, and 360 base pairs upstream of the UGT2B7 initiation codon (Carrier et al., 2000; Ishii et al., 2000; Barbier et al., 2003a,b). We found 16 and 21 genetic polymorphisms in UGT2B4 and UGT2B7, respectively, performed linkage disequilibrium (LD) analysis, and estimated their respective haplotypes.
Materials and Methods
Patients. The 136 Japanese subjects were arrhythmic patients who were administered beta-blockers. Genomic DNA was extracted directly from blood leukocytes. The ethics committees of the National Cardiovascular Center and the National Institute of Health Sciences approved this study. Written informed consent was obtained from all patients.
PCR Conditions for DNA Sequencing. First, the entire UGT2B4 (except for the enhancer regions amplified with the UGT2B4proF1-R1 primers) and UGT2B7 genes were amplified from genomic DNA (200 ng) using 2.5 units of Z-Taq (Takara Bio Inc., Shiga, Japan) with a 0.2 μM concentration of the first amplification primers (“First Amplification” in Table 1). The PCR was performed as follows: 30 cycles of 98°C for 5 s, 55°C for 5 s, and 72°C for 190 s. Then, each region/exon was amplified by Ex-Taq (0.625 unit) (Takara Bio Inc.) using the first PCR products as templates with the second amplification primers (0.2 μM) that were designed in the introns (“Second Amplification” in Table 1). The second round of PCR was 94°C for 5 min, followed by 30 cycles of 94°C for 30 s, 55°C for 1 min, and 72°C for 2 min, and then a final extension for 5 min at 72°C. These PCR products were then purified using a PCR Product PreSequencing Kit (USB, Cleveland, OH) and directly sequenced using an ABI Big Dye Terminator Cycle Sequencing Kit (Applied Biosystems, Foster City, CA) and the primers listed in Table 1 (“Sequencing”). The excess dye was removed by a DyeEx96 kit (QIAGEN, Hilden, Germany), and the eluates were analyzed on an ABI Prism 3700 DNA Analyzer (Applied Biosystems). All the SNPs were confirmed by repeating the PCR from genomic DNA and sequencing these newly generated PCR products.
LD and Haplotype Analysis. LD analysis was carried out using the SNPAlyze software (version 3.1) (Dynacom Co. Ltd., Yokohama, Japan), and a pairwise two-dimensional map between SNPs was obtained for the chi square and rho square values. Some of the haplotypes were unambiguous, with homozygous SNPs at all sites or a heterozygous SNP at only one site. Separately, the diplotype configurations (combinations of haplotypes) were inferred by an expectation-maximization-based program, LDSUPPORT, which determines the posterior probability distribution of the diplotype configuration for each subject based on the estimated haplotype frequencies (Kitamura et al., 2002). The diplotype configurations of the subjects had a probability (certainty) over 0.93 for 129 subjects in UGT2B4 and over 0.99 for all 136 subjects in UGT2B7. The haplotypes inferred in only 1 of the 272 total chromosomes are described as the haplotype name with a question mark, since the predictability for these rare haplotypes is known to be low in some cases.
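The phase-ambiguity rule and the expectation-maximization step described above can be illustrated with a minimal Python sketch. It is not the LDSUPPORT implementation; the function names, the 0/1/2 genotype coding, and the toy genotypes are assumptions made only for illustration, and the sketch assumes random mating so that a diplotype's prior probability is the product of its haplotype frequencies.

```python
from collections import defaultdict
from itertools import product


def compatible_pairs(genotype):
    """Return every haplotype pair consistent with an unphased genotype.

    genotype: tuple of variant-allele counts per site (0, 1, or 2).
    Haplotypes are tuples of 0 (reference allele) or 1 (variant allele).
    """
    het_sites = [i for i, g in enumerate(genotype) if g == 1]
    base = [1 if g == 2 else 0 for g in genotype]
    pairs = set()
    for bits in product((0, 1), repeat=len(het_sites)):
        h1, h2 = list(base), list(base)
        for site, bit in zip(het_sites, bits):
            h1[site], h2[site] = bit, 1 - bit
        pairs.add(tuple(sorted((tuple(h1), tuple(h2)))))
    return sorted(pairs)


def is_unambiguous(genotype):
    """True when at most one site is heterozygous, so the phase is known."""
    return len(compatible_pairs(genotype)) == 1


def diplotype_posterior(genotype, freq):
    """Posterior probability of each compatible diplotype given frequencies."""
    pairs = compatible_pairs(genotype)
    # A pair of two different haplotypes can arise in two parental orders.
    weights = [freq.get(a, 0.0) * freq.get(b, 0.0) * (1 if a == b else 2)
               for a, b in pairs]
    total = sum(weights)
    return {pair: w / total for pair, w in zip(pairs, weights)}


def em_haplotype_frequencies(genotypes, n_iter=200):
    """Estimate haplotype frequencies from unphased genotypes by EM."""
    haplotypes = sorted({h for g in genotypes
                         for pair in compatible_pairs(g) for h in pair})
    freq = {h: 1.0 / len(haplotypes) for h in haplotypes}
    for _ in range(n_iter):
        counts = defaultdict(float)
        for g in genotypes:  # E-step: expected haplotype counts per subject
            for (a, b), prob in diplotype_posterior(g, freq).items():
                counts[a] += prob
                counts[b] += prob
        # M-step: divide by the number of chromosomes (two per subject)
        freq = {h: counts[h] / (2 * len(genotypes)) for h in haplotypes}
    return freq


# Hypothetical two-site genotypes for ten subjects (not data from this study).
toy_genotypes = [(0, 0)] * 6 + [(2, 2)] * 3 + [(1, 1)]
toy_freq = em_haplotype_frequencies(toy_genotypes)
toy_posterior = diplotype_posterior((1, 1), toy_freq)
```

In this toy data set only the doubly heterozygous subject is ambiguous, and the maximum value in `toy_posterior` plays the role of the per-subject probability (certainty) quoted in the text.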
Results
UGT2B4 and UGT2B7 Polymorphisms Detected in a Japanese Population. First, the enhancer/promoter regions, all exons, and the surrounding intronic regions of UGT2B4 and UGT2B7 were sequenced from 136 Japanese subjects. For the reference sequences, NT_077444.2 and NT_030640.1 (GenBank accession numbers) were utilized for UGT2B4 and UGT2B7, respectively.
In UGT2B4, 16 polymorphisms, including 10 novel ones (two nonsynonymous SNPs, one synonymous SNP, four intronic SNPs, one insertion, and two deletions in the introns) were detected (see Table 2). All the allele frequencies were in Hardy-Weinberg equilibrium. No SNP was found within the known peroxisome proliferator-activated receptor-α and farnesoid X receptor-binding DR-1 site (peroxisome proliferator-activated receptor responsive element). Two novel transitions found in exon 6, A-to-G at position 1364 and T-to-C at position 1531, were nonsynonymous with amino acid changes, K455R and C511R, respectively (Fig. 1, A-D). The known nonsynonymous SNP, 1374T>A (D458E, UGT2B4*2), was also found in one subject as heterozygous. The frequency of *2 was low compared with those of Caucasians and Africans (Lampe et al., 2000; Riedy et al., 2000).
As for UGT2B7, 21 polymorphisms were detected in this study. Among them, four polymorphisms were novel (Table 3): 1192G>A (D398N) in exon 5 (Fig. 1, E and F), 915G>A (V305V) in exon 3, IVS4 + 154_155insA, and IVS4 + 185C>A in intron 4. All the allele frequencies were in Hardy-Weinberg equilibrium. Also, the known nonsynonymous SNPs, 802C>T (H268Y, UGT2B7*2) and 211G>T (A71S), with frequencies of 0.254 and 0.173, respectively, were detected. These frequencies were similar to those seen in a previous report for a Japanese population (Hirota et al., 2003) but were different from Caucasian frequencies (Bhasker et al., 2000; Holthe et al., 2003). Also, the SNPs -327G>A, -161C>T, -125T>C, 372A>G (R124R), 1059G>C (L353L), and 1062C>T (Y354Y), which have been reported by Hirota et al. (2003), were found.
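The Hardy-Weinberg check mentioned for both genes can be sketched as a one-degree-of-freedom chi-square goodness-of-fit test on genotype counts. The source does not state which test was used or give genotype counts, so the function name and the example counts below are hypothetical; the counts were chosen only so that the variant allele frequency in 136 subjects (69/272) is close to the 0.254 reported for 802C>T.

```python
def hwe_chi_square(n_ref_hom, n_het, n_var_hom):
    """Chi-square statistic for deviation from Hardy-Weinberg proportions."""
    n = n_ref_hom + n_het + n_var_hom
    p = (2 * n_ref_hom + n_het) / (2 * n)  # reference allele frequency
    q = 1.0 - p                            # variant allele frequency
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_ref_hom, n_het, n_var_hom)
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)


# Critical value of the chi-square distribution, 1 degree of freedom, 5% level.
CRITICAL_5PCT_DF1 = 3.841

# Hypothetical genotype counts (assumption, not reported in the study).
stat = hwe_chi_square(75, 53, 8)
in_equilibrium = stat < CRITICAL_5PCT_DF1
```

For rare variants with very small expected counts, an exact test would usually be preferred over this approximation.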
LD Analysis. Using the SNPs detected, LD analysis was performed and the pairwise values for rho square and chi square were obtained. Since the data for chi square and rho square were almost equivalent, the data for rho square are depicted in Fig. 2. In UGT2B4, a perfect linkage was seen among IVS4 + 109_114delATAAAA, IVS5-52C>T, 1374T>A (D458E), and 1375C>A (ρ2 = 1.00). Close associations were found between -1255A>C and -162T>G (ρ2 = 0.87) and between IVS4 + 61T>C and IVS4 + 161_162insTGATAA (ρ2 = 0.80). A weak association between IVS3-13_6delT and IVS5-83G>C (ρ2 = 0.49) was also found. The other associations were much lower (below 0.2 as rho square values).
In contrast to UGT2B4, strong LDs were observed in multiple SNPs within UGT2B7. The associations among -327G>A, -161C>T, 801T>A, 802C>T (H268Y), IVS2 + 115A>G, and IVS2 + 148A>G, and among IVS3-116A>G, 1059G>C, IVS4 + 64T>A, IVS4 + 154_155insA, IVS4-154G>C, and IVS4-129T>C were prominent, and both of the combinations showed perfect linkages (ρ2 = 1.00). Furthermore, these two groups of linkages were also strongly associated with each other at ρ2 values over 0.94, and IVS3-114G>A was often associated with these 12 variations (ρ2 = 0.78 or higher). Strong LDs were also seen between 735A>G and 1062C>T (ρ2 = 0.95), and between 211G>T (A71S) and 372A>G (ρ2 = 0.84).
UGT2B4 and UGT2B7 are separated by approximately 360 kilobases on chromosome 4. Strong linkages were not found between the SNPs of UGT2B4 and those of UGT2B7 (Fig. 2). Thus, our results suggest that the two genes are not within the same haplotype block.
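A pairwise LD value of the kind reported above can be computed from two-locus haplotype counts as follows. This sketch assumes that rho square denotes the usual squared correlation coefficient between alleles at two sites (often written r²), which the source does not state explicitly; under that assumption the chi-square value for the 2 × 2 haplotype table equals the number of chromosomes times r², which would be consistent with the two measures giving nearly equivalent pictures. The function name and the counts are illustrative.

```python
def ld_r_squared(n_AB, n_Ab, n_aB, n_ab):
    """Return (r_squared, chi_square) from two-locus haplotype counts.

    n_AB: chromosomes carrying the variant allele at both sites, and so on.
    """
    n = n_AB + n_Ab + n_aB + n_ab
    p_AB = n_AB / n
    p_A = (n_AB + n_Ab) / n   # variant allele frequency at the first site
    p_B = (n_AB + n_aB) / n   # variant allele frequency at the second site
    d = p_AB - p_A * p_B      # linkage disequilibrium coefficient D
    denom = p_A * (1 - p_A) * p_B * (1 - p_B)
    if denom == 0:
        raise ValueError("both sites must be polymorphic")
    r2 = d * d / denom
    return r2, n * r2


# Illustrative counts on 272 chromosomes: the two variants always co-occur.
r2_perfect, chi2_perfect = ld_r_squared(69, 0, 0, 203)
# Illustrative counts with no association between the two sites.
r2_none, chi2_none = ld_r_squared(25, 25, 25, 25)
```

The first call corresponds to a perfect linkage (value 1.00), the second to independent sites (value 0).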
Haplotype Analysis. Since no strong linkage was observed between UGT2B4 and UGT2B7, haplotype analysis was performed separately.
As for UGT2B4, a group of haplotypes without amino acid changes was defined as *1, and the group bearing the nonsynonymous D458E (*2 allele) was named the *2 haplotypes. Eight haplotypes were first unambiguously assigned by the presence of homozygous SNPs at all sites (*1a, *1b, and *1c) or a heterozygous SNP at only one site [*1d, *1f, *1h, *1j, and *455R (the haplotype bearing 455R was tentatively named *455R)] (Table 2). Separately, we estimated the diplotype configuration (a combination of haplotypes) for each subject by LDSUPPORT software. The diplotype configurations of all the subjects had a probability (certainty) greater than 0.93, except for seven subjects with the maximum probability of 0.51 or higher. In Table 2, the diplotypes for these seven subjects were also described as diplotypes with a question mark. The additionally inferred haplotypes were 12 haplotypes [*1e, *1g, *1i, *1k-*1q, *2a, and *511R (the haplotype bearing 511R was tentatively named *511R)] (Table 2). The most frequent haplotype was *1a (frequency: 0.441), followed by *1b (0.290), *1c (0.081), *1d (0.037), *1e (0.029), and *1f (0.022). The frequencies of the other haplotypes were less than 0.02. Thus, the UGT2B4*1 haplotypes mainly consist of *1a and *1b (total frequency, 0.731).
Regarding UGT2B7, the haplotypes without amino acid changes were defined as *1 haplotypes, and the haplotypes bearing H268Y (*2 allele) were named *2. Five haplotypes were first unambiguously assigned by homozygous SNPs at all sites [*1a, *1b, *2a, and *71S (the haplotype bearing 71S was tentatively named *71S)] or a heterozygous SNP at only one site (*2b) (Table 3). Estimation by LDSUPPORT software inferred all the diplotype configurations with a probability (certainty) greater than 0.99. Eight additionally inferred haplotypes were *1c to *1g, *2c, *2d, and *398N (the haplotype bearing 398N was tentatively named *398N) (Table 3). The most frequent haplotype was *1a (frequency, 0.386), followed by *2a (0.210) and *71S (0.173). The frequencies of the other haplotypes were less than 0.1. Our *2a, *2b, and *2d haplotypes correspond to the Norwegian haplotype A; *1a, *1b, and *1e to haplotype B; and *1c to haplotype C (Holthe et al., 2003). These data demonstrated that the haplotype structure of UGT2B7 is more diverse than that of UGT2B4 in terms of the number of frequent SNPs, although their substrate specificities and gene structures are similar.
Discussion
In this study, the genomic DNA from 136 Japanese subjects was sequenced, and 16 and 21 polymorphisms in UGT2B4 and UGT2B7, respectively, were found.
As for UGT2B4, UGT2B4*2 (1374T>A, D458E) is thought to be fairly common in Caucasian and African populations, with frequencies of approximately 0.2 and 0.15, respectively (Lampe et al., 2000; Riedy et al., 2000). This SNP was detected in one subject as a heterozygote in this study, but this SNP is still rare in the Japanese (allele frequency, 0.004). We found two novel nonsynonymous SNPs in UGT2B4, 1364A>G (K455R) and 1531T>C (C511R). Their positions are located in the latter half (UDP-glucuronic acid binding) domain and cytosolic domain, respectively. The cysteine at position 511 in UGT2B4 is highly conserved in human UGTs. It has been suggested that cytosolic cysteine residues (507, 511, and 514 in rat UGT1A6) are important for UGT1A6 enzymatic activity (Ikushiro et al., 2002). Furthermore, the mutant truncated at amino acid 512 of UGT2B1 (C513 corresponds to C511 in UGT2B4) was reduced to approximately 40% of the activity of the enzyme truncated at the 514-residue (Meech et al., 1996). Thus, the C511R substitution might alter enzymatic activity, although the functional significance remains to be determined.
The UGT2B4 haplotype structure is relatively simple. UGT2B4*2, *455R, and *511R were each found in only one subject. Most haplotypes were rare except for the two major haplotypes, *1a and *1b. Furthermore, we found no polymorphisms in the reported UGT2B4 enhancer region.
As for UGT2B7, UGT2B7*2 (802C>T, H268Y) was shown to be perfectly linked with -327G>A, -161C>T, 801T>A, IVS2 + 115A>G, and IVS2 + 148A>G, and closely linked with IVS3-116A>G, 1059G>C, IVS4 + 64T>A, IVS4 + 154_155insA, IVS4-154G>C, and IVS4-129T>C. Furthermore, on the basis of previous studies (Hirota et al., 2003; Holthe et al., 2003), -1302G>A, -1295C>T, -1111C>T, and -899A>G may also be associated with this haplotype group (*2). Holthe et al. (2003) identified three haplotypes using SNPs detected in Norwegians: haplotype A (*2a, *2b, and *2d in this study), haplotype B (*1a, *1b, and *1e in this study), and haplotype C (*1c in this study). Their frequencies were 0.56, 0.33, and 0.11 for haplotypes A, B, and C, respectively. These frequencies were different from those in the Japanese determined in this study: 0.25, 0.47, and 0.07 for haplotype A, B, and C, respectively. Thus, the *1 and *2 haplotype distributions of UGT2B7 are suggested to be different between Caucasians and Asians (P < 0.01 by the χ2 test), and the frequency of the *2 haplotypes in the Japanese was much lower than that in Norwegians (Holthe et al., 2003). Although no remarkable functional difference was observed between the *1 and *2 haplotypes in several reports (Bhasker et al., 2000; Holthe et al., 2002, 2003; Court et al., 2003), it was recently reported that UGT2B7*2 showed a significantly higher morphine-6-O-glucuronide/morphine ratio than that with UGT2B7*1 (Sawyer et al., 2003). Thus, it is possible that the difference in the UGT2B7*2 frequencies might lead to ethnic differences in morphine metabolism and disposition.
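The population comparison above can be sketched as a chi-square test on a contingency table of haplotype counts. The source does not say which haplotype categories entered the authors' test, and the number of Norwegian chromosomes is not given here, so the value of 200 below is a placeholder assumption and the resulting statistic is illustrative only. Japanese counts are approximated as the reported frequencies multiplied by 272 chromosomes.

```python
def chi_square_contingency(table):
    """Return (chi_square, degrees_of_freedom) for a table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / grand_total
            stat += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(col_totals) - 1)
    return stat, df


# Haplotype groups A, B, C. Japanese: 0.25, 0.47, 0.07 of 272 chromosomes.
japanese = [68, 128, 19]
# Norwegian: 0.56, 0.33, 0.11 of a HYPOTHETICAL 200 chromosomes (placeholder).
norwegian = [112, 66, 22]

stat, df = chi_square_contingency([japanese, norwegian])
# Critical value of the chi-square distribution, 2 degrees of freedom, 1% level.
CRITICAL_1PCT_DF2 = 9.210
differs_at_1pct = stat > CRITICAL_1PCT_DF2
```

Because the statistic scales with sample size, the conclusion from this sketch should be rechecked with the actual Norwegian chromosome count from Holthe et al. (2003).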
SNP 211G>T (A71S) in UGT2B7 was recently reported in the Japanese at a frequency of 0.185 (Hirota et al., 2003), which was similar to our data (0.173). Codon 71 is located within the N-terminal (substrate binding) domain, and A71S causes a change from a lipophilic side chain to a hydrophilic one. This SNP has not been reported in other ethnic groups and is always associated with 372A>G in the Japanese (this haplotype was named *71S in this study). SNP 372A>G alone was found at a frequency of 0.03 without association with 211G>T (*71S) in Norwegians (Holthe et al., 2003).
One novel nonsynonymous SNP, 1192G>A (D398N), was detected in this study. D398 is located in the latter half (UDP-glucuronic acid binding) domain. This acidic amino acid is highly conserved in mammalian UGTs. In UGT1A6, D394 (corresponding to D398 in UGT2B7) and D397 (corresponding to D401 in UGT2B7) are the most probable sites for interactions with a uridinyl moiety (Radominska-Pandya et al., 1999). Thus, alteration from an acidic amino acid (D) to a neutral amino acid (N) might influence the binding of UDP-glucuronic acid. In fact, we have preliminary findings that the variant enzyme with the UGT2B7*398N (but not *71S) haplotype has reduced glucuronidation activity compared with the wild-type enzyme (with the *1a haplotype) toward 7-hydroxy-4-trifluoromethylcoumarin (50 μM) in vitro (Jinno et al., unpublished data).
A SNP in the UGT2B7 promoter region, -125T>C, which is located in the canonical binding site for the octamer transcription factor-1 (Carrier et al., 2000), was shown to be the binding site for nuclear proteins by the DNase I footprint assay (Ishii et al., 2000). Because only the *1b haplotype has this -125T>C SNP, it would be interesting to determine whether the expression of UGT2B7 was different between the subjects with *1b and the other *1 haplotypes.
Finally, the 20 and 13 haplotypes in UGT2B4 and UGT2B7, respectively, estimated in this study provide fundamental information for genotyping UGT2B4 and UGT2B7 in the Japanese and would be useful for studies on the association between the haplotypes and pharmacokinetic or clinical parameters.
Acknowledgments
We thank Chie Knudsen for secretarial assistance.
Footnotes
- This study was supported in part by the Program for Promotion of Fundamental Studies in Health Sciences (MPJ-3 and -6) of the Organization for Pharmaceutical Safety and Research of Japan.
- ABBREVIATIONS: UGT, UDP-glucuronosyltransferase; SNP, single nucleotide polymorphism; LD, linkage disequilibrium; PCR, polymerase chain reaction.
- Received January 12, 2004.
- Accepted May 18, 2004.
- The American Society for Pharmacology and Experimental Therapeutics