Abstract
Biliary excretion of bile salts and other bile constituents from hepatocytes is mediated by the apical (canalicular) transporters P-glycoprotein 3 (MDR3, ABCB4) and the bile salt export pump (ABCB11). Mutations in ABCB4 and ABCB11 contribute to cholestatic diseases [e.g., progressive familial intrahepatic cholestasis 2 (PFIC2), PFIC3, and intrahepatic cholestasis of pregnancy], and our objective was to establish genetic variability and haplotype structures of ABCB4 and ABCB11 in healthy populations of different ethnic backgrounds. All coding exons, 5 of 6 noncoding exons, 50 to 300 base pairs of the flanking intronic regions, and 2.5 to 2.8 kilobase pairs of the promoter regions of ABCB4 and ABCB11 were sequenced in 159 and 196 DNA samples of Caucasian, African-American, Japanese, and Korean origin. In total, 76 and 86 polymorphisms were identified in ABCB4 and ABCB11, respectively; among them, 14 and 28 exonic polymorphisms, and 8 and 10 protein-altering variants, of which 4 were predicted to have functional consequences. Both genes showed substantial ethnic differences with respect to allele number, frequency of common and population-specific sites, and patterns of linkage disequilibrium. Population genetic analysis suggested some selective pressure against changes in the protein, supporting the important endogenous role of these transporters. Haplotype variability was greater in ABCB11 than in ABCB4. An ABCB11 promoter haplotype was associated with significant decrease of activity compared with wild type. Our results contribute to a better understanding of the molecular basis and of ethnic differences in drug response, and provide a valuable tool for future research on the heredity of cholestatic liver injury.
ATP-binding cassette (ABC) transporters mediate energy-dependent transport of exogenous and endogenous organic compounds across membranes with distinct substrate specificity, tissue distribution, and intracellular localization (Borst and Elferink, 2002). The clinical importance of ABC transport proteins and the consequences of genetic polymorphisms are being increasingly appreciated. For instance, the P-glycoprotein or multidrug resistance protein 1 (MDR1), which was initially shown to confer resistance to cancer chemotherapy, has a considerable impact upon disposition and therapeutic response of many drugs. Single nucleotide polymorphisms (SNPs) in the MDR1 (ABCB1) gene have been reported to modulate expression levels, protein activity, and bioavailability of substrate drugs (Kerb, 2006). Like MDR1, MDR3 and the bile salt export pump (BSEP) are members of the ABCB gene family. They are localized in the canalicular membrane of hepatocytes, where they form the secretory biliary unit of the liver. BSEP is the predominant hepatocellular efflux system for the excretion of conjugated bile salts, whereas MDR3 acts as a flippase and translocates phosphatidylcholine across the canalicular membrane (Borst and Elferink, 2002; Kullak-Ublick et al., 2004). A lack of this major phospholipid in bile leads to formation of toxic monomeric bile salts in the bile ducts.
MDR3 and BSEP are potentially important targets for drug-induced liver injury. MDR3 P-glycoprotein is able to transport a number of MDR1 P-glycoprotein substrates (e.g., digoxin, paclitaxel, and vinblastine), although the contribution of MDR3 is probably clinically less important. Verapamil, cyclosporine, and vinblastine are able to inhibit MDR3, explaining why these drugs could adversely affect canalicular phosphatidylcholine secretion (Smith et al., 2000). Inhibition of BSEP by several compounds including estrogen, rifampicin, cyclosporine, troglitazone, and bosentan has been implicated in drug-induced cholestasis (DIC) (Stieger et al., 2000; Funk et al., 2001; Borst and Elferink, 2002; Kostrubsky et al., 2003; Kullak-Ublick et al., 2004). Moreover, defects in the genes coding for MDR3 and BSEP, ABCB4 and ABCB11, can cause hereditary disorders of the liver such as progressive familial intrahepatic cholestasis subtypes 2 (PFIC2; Byler syndrome) and 3 (PFIC3) or intrahepatic cholestasis of pregnancy (ICP), as reviewed recently (Pauli-Magnus and Meier, 2003). Currently, little is known about common ABCB4 and ABCB11 polymorphisms and their frequencies in healthy subjects of different ethnic populations. Genetic variants in ABCB4 and ABCB11 have been identified mostly in small samples of patients suffering from PFIC2, PFIC3, or ICP, and in healthy control samples (Pauli-Magnus and Meier, 2003; Pauli-Magnus et al., 2004a,b). Saito et al. (2002) investigated 48 Japanese individuals and identified 50 and 98 genetic variations in ABCB4 and ABCB11, respectively. Among these were eight exonic polymorphisms, including only one protein-altering mutation, which was found in ABCB11. To the best of our knowledge, no systematic screens for polymorphisms have been carried out for ABCB4 in Koreans and ABCB11 in African Americans.
We therefore screened different ethnic populations for genetic variations in ABCB4 and ABCB11, and we report the haplotype structure, a pairwise linkage disequilibrium map of the polymorphisms, and population-genetic parameters for both genes. Furthermore, we investigated promoter activity of reporter gene constructs comprising polymorphisms in the 5′ flanking regions of ABCB4 and ABCB11.
Materials and Methods
DNA Samples. Blood samples for DNA extraction were withdrawn from 149 Caucasian, 47 African-American, 48 Japanese, and 48 Korean healthy volunteers. A total of 64 and 101 Caucasians, 47 and 48 Japanese, as well as 48 Koreans and 47 African Americans were sequenced for ABCB4 and ABCB11, respectively. All subjects were healthy, based on medical history, physical examination, and routine laboratory tests. Anonymous blood samples were obtained for the purpose of pharmacogenetic testing after approval from the responsible local ethics committees and written informed consent. Caucasians were of German origin and selected in Berlin for participation in phase 1 clinical trials. African-American DNA samples were purchased from Genomics Collaboratives Inc. (Cambridge, MA). Japanese samples were purchased from Shin Nippon Biomedical Laboratories Ltd. (Kagoshima, Japan). Korean samples were purchased from the Department of Clinical Pharmacology, Pusan Paik Hospital, Inje University College of Medicine, Korea. The study was performed in accordance with ethical principles that have their origin in the Declaration of Helsinki.
DNA Sequencing. For ABCB4, approximately 17 kb were screened, including noncoding exons –3, –2, –1, 1, and coding exons 2 to 28, including 50 to 300 bp of adjacent intronic sequences, as well as 2610 bp of the promoter region upstream from the translation start in exon 2. In addition, 2841 bp of the 5′ region upstream of the first untranslated exon –4 were screened. Approximately 13 kb of the ABCB11 gene were screened, including noncoding exon 1 and 2498 bp of the upstream promoter region, and coding exons 2 to 28 and 50 to 300 bp of intronic sequence around these exons.
Genomic and cDNA sequences for primer design were derived from GenBank accession numbers AC005068.2 for promoter region, noncoding exons –3, –2, –1, and 1, and coding exons 2 and 3; AC006154.1 for exons 4 to 12; AC0005045.2 for exons 13 to 28 and NM_000443.2 (cDNA) for ABCB4; AC008177.3 for promoter and exon 1 to 21; AC069165.2 for exon 22 to 28; and NM_003742.2 (cDNA) for ABCB11. Exon-intron boundaries of the ABCB4 and ABCB11 gene were defined by comparing genomic sequences with the cDNA sequences. PCRs for generating ABCB4 fragments were generally performed in a reaction volume of 50 μl with 20 to 30 ng of genomic DNA, 10× PCR buffer (Qiagen, Hilden, Germany), 200 μM deoxynucleoside-5′-triphosphates, 20 pmol of each primer, and 1 unit of Taq polymerase (Qiagen). For amplification of exon 1, 4 μl of DMSO (8%) were added, 1 μl of 5× Q-solution (Qiagen) were added for exons 2 and 3, and 2 μl of 25 mM MgCl2 in addition to MgCl2 in PCR buffer were added for exons 22 and 27. PCRs for generating ABCB11 fragments were performed in a reaction volume of 50 μl with 20 to 30 ng of genomic DNA, 10× PCR buffer (Qiagen), 200 μM deoxynucleoside-5′-triphosphates, 10 pmol (for promoter fragments Prom9, Prom8, Prom7, Prom6, Prom5, Prom4, Prom2 and exons 5, 8, 10, 12, 13, 14, 16, 17, 20, 21, 22, 23, 24, and 26) and 20 pmol (for promoter fragment Prom3 and exons 1, 2, 3, 4, 6, 7, 9, 11, 15, 18, 19, 25, 27, and 28) of each primer, 5 μl of 10 mM MgCl2 in addition to the 15 mM MgCl2 in PCR buffer, and 1 unit of Taq polymerase (Qiagen) or 1 unit of HotStarTaq polymerase (Qiagen) for promoter fragments Prom9 and exons 12 and 22. PCR fragments were generated in a GeneAmp PCR System 9700 (ABI, Weiterstadt, Germany) with an initial denaturation step of 2 min at 94°C, followed by 34 cycles of denaturation at 94°C for 45 s, annealing for 45 s at 62°C, and extending for 1 min at 72°C. 
The initial denaturation temperature for ABCB4 fragments Prom2 and exons 1, 2, and 3 was 96°C for 3 min, followed by 34 cycles of denaturation at 96°C for 45 s, annealing for 45 s at 62°C, and extending for 1 min at 72°C. The initial denaturation temperature for ABCB4 fragments Prom9 and exons 12 and 22 was 95°C for 15 min, followed by 34 cycles of denaturation at 94°C for 45 s, annealing for 45 s at 62°C, and extending for 1 min at 72°C. Oligonucleotide primer sequences are given in Tables 1 and 2. Subsequently, purified amplicons were directly sequenced for genetic polymorphisms on PE ABI 3700 DNA Analyzers by using BigDye Terminator cycle sequencing reactions (ABI). Sequences were analyzed and polymorphisms identified using the PHRED/PHRAP/CONSED/POLYPHRED software package (University of Washington, Seattle, WA). The sequences were inspected for deviations from ABCB4 (NM_000443.2) and ABCB11 (NM_003742.2) sequences, which were defined as reference. Singletons were confirmed by generating a second independent PCR fragment and direct sequencing from both ends.
Statistics, Population Genetics, and Structural and Haplotype Analysis. Computation of nucleotide diversity (π), neutral parameter (θ), and Tajima's D (Tajima, 1989) was carried out using Arlequin 2.0 (Schneider et al., 2000). Functional tolerability of amino acid exchanges was calculated using SIFT (sorting intolerant from tolerant) (blocks.fhcrc.org/sift/SIFT.html), PolyPhen (Polymorphism Phenotyping) (www.bork.embl-heidelberg.de/PolyPhen), the BLOSUM62 amino acid substitution matrix (www.embl-heidelberg.de/~seqanal/courses/predoc97/blosum62.cmp), and Grantham values (Grantham, 1974). Amino acid residues were classified as being evolutionarily conserved (EC) or unconserved (EU) based on protein sequence alignments with mammalian orthologs by using SIFT and ClustalW (www.ebi.ac.uk/clustalw/). Three mammalian protein sequences (the human sequence and at least two from mouse, rat, and/or rabbit) were used for the alignments. An amino acid residue was classified as evolutionarily conserved if it was present in all members of a set of mammalian orthologs; all others were classified as evolutionarily unconserved. Transmembrane domain and loop regions were assigned based on topology data from the SwissProt database (www.ebi.ac.uk/swissprot).
Allele frequencies for the ABCB4 and ABCB11 polymorphisms were compared pairwise between population groups and analyzed for deviation from the Hardy-Weinberg equilibrium using Fisher's exact test. Linkage disequilibrium (LD) between pairs of polymorphisms was quantified by the D′ and r² statistics.
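The two LD statistics named above can be illustrated with a minimal sketch for a pair of biallelic sites. The function name `ld_stats` and its arguments are hypothetical and not part of the study's software; the formulas are the standard textbook definitions of D, D′, and r², computed here from haplotype and allele frequencies.

```python
def ld_stats(p_ab, p_a, p_b):
    """Return (D, D', r^2) for two biallelic sites.

    p_ab : frequency of the haplotype carrying allele A (site 1) and allele B (site 2)
    p_a  : frequency of allele A at site 1
    p_b  : frequency of allele B at site 2
    (Hypothetical helper for illustration only.)
    """
    d = p_ab - p_a * p_b
    # D' rescales D by the largest magnitude it could take given the allele frequencies.
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = d / d_max if d_max > 0 else 0.0
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    r_squared = d * d / denom if denom > 0 else 0.0
    return d, d_prime, r_squared


# A rare allele (10%) that always occurs on the background of a common allele (50%)
d, d_prime, r_squared = ld_stats(p_ab=0.1, p_a=0.1, p_b=0.5)
print(round(d_prime, 3), round(r_squared, 3))
```

In this constructed example D′ reaches its maximum while r² stays low, because r² also depends on how well the allele frequencies at the two sites match.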
Rare mutations, which were only found once in the entire sample set, were excluded from haplotype analysis. Haplotypes and their frequencies were statistically calculated using PHASE 1.0 (Stephens et al., 2001) by running 10 iterations with different seeds and default parameters. A haplotype was regarded as certain if it was inferred identical in at least seven runs. PHASE can cope with missing genotype data (ABCB4 7.8%, ABCB11 7.8%). LD analyses were performed with the inferred haplotypes using DnaSP4.0 (Rozas et al., 2003). The HAPLOVIEW software package, using the confidence intervals algorithm, was used to determine LD block structures (Barrett et al., 2005).
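The seven-of-ten agreement rule described above can be sketched as follows. This is an illustrative assumption about how such a rule might be applied to the output of repeated runs; the function name `certain_haplotype_pair` and the input layout (one inferred haplotype pair per run for a given individual) are hypothetical and do not reproduce the PHASE output format.

```python
from collections import Counter


def certain_haplotype_pair(run_calls, min_agree=7):
    """Return the haplotype pair inferred in at least `min_agree` runs, else None.

    run_calls : list of (haplotype_1, haplotype_2) tuples, one per run,
                for a single individual (hypothetical input layout).
    """
    # The order of the two haplotypes within a pair carries no meaning,
    # so each pair is sorted before counting.
    counts = Counter(tuple(sorted(pair)) for pair in run_calls)
    pair, n_runs = counts.most_common(1)[0]
    return pair if n_runs >= min_agree else None


runs = [("ABCB4_12", "ABCB4_20")] * 8 + [("ABCB4_12", "ABCB4_26")] * 2
print(certain_haplotype_pair(runs))
```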
Promoter activities were statistically evaluated using SPSS 10.0.8 for Macintosh (SPSS Inc., Chicago, IL). The general linear model (GLM) univariate procedure was applied to compare luciferase activities of the mutant promoter constructs with the wild type. The Bonferroni post hoc range test was used to adjust for multiple comparisons. The GLM repeated measures analysis was used to compare activities between promoter constructs measured at different chenodeoxycholic acid (CDCA) concentrations.
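The principle behind a Bonferroni adjustment can be shown with a minimal sketch. It illustrates only the basic correction (each p value multiplied by the number of comparisons and capped at 1); it is not a reproduction of the SPSS post hoc procedure, and the function name and the example p values are hypothetical.

```python
def bonferroni(p_values):
    """Return Bonferroni-adjusted p values (hypothetical helper)."""
    m = len(p_values)
    # Multiply each p value by the number of comparisons, capped at 1.0.
    return [min(1.0, p * m) for p in p_values]


# Invented p values for three comparisons against a common control
print(bonferroni([0.01, 0.04, 0.5]))
```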
Construction of ABCB4 and ABCB11 Luciferase Reporter Gene Plasmids. In addition to the core promoter upstream of the translation start in exon 2, an alternative promoter upstream of exon –4 was tested using luciferase reporter gene assays.
DNA from three different subjects was used to generate constructs ABCB4-B-wt (PromB, core promoter) and ABCB4-A-wt (PromA) comprising the reference sequence, and ABCB4-A-Typ2A (PromA) containing a combination of seven variants (IDs: 1, 3, 4, 5, 6, 9, and 10; Table 3). For ABCB11, 8 DNA fragments of 2.6 kb of the 5′ flanking region of exon 1 comprising 19 polymorphisms from different subjects were generated to obtain 14 different ABCB11 promoter constructs (see Fig. 5). Sequence integrity was confirmed by sequencing.
ABCB4 constructs ABCB4-A-wt (2.2 kb), ABCB4-A-Typ2A (2.2 kb), and ABCB4-B-wt (2.8 kb); and ABCB11 constructs C_1 to C_14 had MluI and BclI sites introduced for subsequent cloning into the luciferase reporter gene plasmid pGL3-basic (Promega Catalys AG, Mannheim, Germany).
Luciferase Reporter Gene Assay. HepG2 and human hepatoma (Huh7) cell lines were purchased from the American Type Culture Collection (Manassas, VA) and maintained in RPMI 1640 (Sigma-Aldrich Chemie GmbH, Taufkirchen, Germany) supplemented with 10% fetal calf serum (Invitrogen, Karlsruhe, Germany), 100 U/ml penicillin, and 100 μg/ml streptomycin (Invitrogen). For transactivation assays, cells were grown for 3 days in medium containing 10% charcoal-stripped bovine calf serum and then selected at 90 to 95% density in 24-well plates. For transient transfection, 1.5 μl of Lipofectamine 2000 reagent (Invitrogen) and 500 to 700 ng of plasmid DNA were used per well. Plasmid DNA comprised 450 ng of ABCB4 construct and 50 ng of pSV-β-galactosidase plasmid or 450 ng of ABCB11 construct, 200 ng of farnesoid X receptor, and 50 ng of pSV-β-galactosidase plasmid. For ABCB11, the cells were treated with up to 200 μM CDCA. Cells were lysed with passive lysis buffer (Promega Catalys AG) 24 h after transfection. Luciferase activity was quantified with the luciferase assay system (Promega Catalys AG) using the Lumat LB 9507-2 luminometer (Berthold, Bad Wildbad, Germany). β-Galactosidase activity was quantified with a high-sensitivity assay (Stratagene, La Jolla, CA) in a UVMax kinetic microplate reader (Molecular Devices, Sunnyvale, CA) at 595 nm. The pGL3-basic plasmid served as control in each separate experiment.
Results
Overall, we discovered 76 and 86 polymorphic sites in approximately 17 and 13 kb of the ABCB4 and ABCB11 genes, respectively. The average number of polymorphic sites per kilobase of DNA for ABCB4 (n = 318) and ABCB11 (n = 392) was 4.5 and 6.3 in total, 3.6 (14 in 3.8 kb; 2.1 for missense mutations) and 6.8 (27 in 3.9 kb; 2.5 for missense mutations) in the coding regions, 3.0 (22 in 7.4 kb) and 4.3 (27 in 6.3 kb) in the intronic region, and 7.2 (40 in 5.5 kb) and 10.8 (27 in 2.5 kb) in the 5′ UTR. The number of polymorphic sites in the coding region of other ABC transporters was 3.3 (15 in 4.6 kb, n = 206) and 5.2 (20 in 3.8 kb, n = 494) in MRP3 and MDR1, respectively (Kroetz et al., 2003; Lang et al., 2004). Eight of the 14 (ABCB4) and 10 of the 27 (ABCB11) variants within the coding regions were nonsynonymous mutations (altered the protein sequence), and 9 were not reported in the dbSNP database or the literature previously. As expected, none of the disease-causing PFIC1-3 mutations were detected in our sample set. Genetic variants including their localization and allele frequencies are listed in Tables 3 and 4.
Polymorphisms in the Exons of the ABCB4 and ABCB11 Genes. ABCB4. Of 19 genetic variants in coding and noncoding exons, 8 were missense mutations, 6 were silent mutations, and 5 mutations were located in the untranslated exons –3, –2, –1, and 1. These included three Caucasian variants in exon 6 (c.523A>G; allele frequency 3.2%), exon 16 (c.1954A>G; 7.3%), and exon 26 (c.3296A>G; 1.8%), resulting in amino acid substitutions p.T175A, p.R652G, and p.E1099G, respectively. Five rare missense mutations occurred once as single heterozygotes (singletons) in exon 4 (c.261T>C and c.283C>T), exon 10 (c.1099A>G), exon 12 (c.1349A>G), and exon 15 (c.1769G>A), resulting in the amino acid changes p.D87E, p.P95S, p.I367V, p.E450G, and p.R590Q, respectively. All amino acid-changing variants of ABCB4 were predicted to be located in the extracellular and cytoplasmic region (Fig. 1A). Six SNPs (5 Caucasian, 1 Korean) were detected in the untranslated exons 1, –1, –2, and –3 of ABCB4, a region that possibly could influence mRNA stability.
ABCB11. Twenty-eight genetic variants were detected in 27 coding exons and 1 noncoding exon; among these were 10 missense mutations, 17 silent mutations, and 1 mutation in the untranslated exon 1. Two Caucasian-specific variants in exon 13 (c.1331T>C; 59.4%) and exon 17 (c.2029A>G; 4.2%) coded for amino acid substitutions p.V444A and p.M677V; one variant, detected in exon 16 (c.1846C>G; 2.2%) in the African-American population sample, resulted in protein sequence alteration p.R616G; and the Japanese-specific exon 21 variant c.2594C>T (2.4%) resulted in amino acid substitution p.A865V (Table 4). Six singletons were detected in exon 8 (c.616A>G → p.I206V), exon 9 (c.851T>C → p.V284A and c.896G>A → p.R299K), exon 16 (c.1855A>G → p.T619A), exon 18 (c.2093G>A → p.R698H), and exon 23 (c.2873G>A → p.R958Q). All amino acid polymorphisms of ABCB11 were predicted to be located in the extracellular region (Fig. 1B).
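The correspondence between cDNA positions and amino acid positions in the variant names above can be checked with simple arithmetic. The sketch below assumes that coding positions are numbered from the A of the ATG start codon (c.1), as in standard coding-DNA nomenclature; the function names are hypothetical.

```python
def codon_number(cdna_pos):
    """Amino acid (codon) number for a coding-DNA position, with c.1 = A of ATG."""
    return (cdna_pos - 1) // 3 + 1


def position_in_codon(cdna_pos):
    """Position of the nucleotide within its codon (1, 2, or 3)."""
    return (cdna_pos - 1) % 3 + 1


# Variants taken from the text: cDNA position and reported amino acid position
for gene, pos, residue in [("ABCB4", 523, 175), ("ABCB4", 1954, 652),
                           ("ABCB4", 3296, 1099), ("ABCB11", 1331, 444),
                           ("ABCB11", 1846, 616), ("ABCB11", 2594, 865)]:
    print(gene, f"c.{pos}", "codon", codon_number(pos),
          "matches" if codon_number(pos) == residue else "differs")
```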
Genetic Variations in the 5′ Flanking Regions and Introns. In ABCB4, 40 genetic variants were identified in the functional promoter region 2.6 kb and 2.8 kb upstream of exon 2 (21 variants) and exon –4 (19 variants), respectively. In ABCB11, 27 genetic variants were detected in the 5′ flanking (promoter) region 2.5 kb upstream of exon 1. Furthermore, 22 (ABCB4) and 30 (ABCB11) genetic variants were detected in adjacent intronic regions.
Prediction of Functional Consequences. Potential consequences of nonsynonymous variants in ABCB4 and ABCB11 were predicted by means of five different computational methods to prioritize further investigations. 1) The type of amino acid change was categorized based upon nonsynonymous Grantham values, which provide a measure of chemical similarity. Grantham scores (possible range from 5 to 215) of ≤50 were classified as nonsynonymous conservative, 51 to 100 as moderately conservative, 101 to 150 as moderately radical, and ≥151 as radical (Stephens et al., 2001). 2) SIFT is an algorithm for predicting functional consequences of amino acid substitutions that assigns scores considering alignments of orthologous sequences. SIFT scores range from 0 to 1: scores <0.05 indicate variant sites in codons for evolutionarily conserved amino acids that are predicted to be deleterious, whereas those ≥0.05 are more likely to be tolerated. 3) PolyPhen uses empirically derived rules and computes the absolute value of the difference between profile scores for both variants to predict the likelihood of a nonsynonymous SNP affecting protein function or structure. Large differences (>1.5) indicate that the substitution is rarely or never observed in the protein family and therefore more likely to affect the protein. PSIC scores below 0.5 denote benign variants, whereas PSIC scores between 1.5 and 2 are possibly damaging and PSIC scores above 2 are probably damaging. 4) The amino acid substitution matrix BLOSUM62 applies the same criteria as Cargill et al. (1999) and was used to predict how evolutionarily favorable a nonsynonymous SNP is. Scores range from –4 to +3, and substitutions with scores <0 or ≥0 are evolutionarily less or more favorable, respectively. 5) We classified amino acid changes as EC or EU based on sequence alignments with at least two mammalian orthologs (rat, mouse, and/or rabbit). Laebman et al. (2003) and Shu et al. (2003) showed that substitutions at EC positions are more deleterious than those at EU positions.
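The threshold rules for the Grantham, SIFT, and BLOSUM62 scores described above can be written out as a minimal sketch. The function names are hypothetical, and scores falling exactly on a class boundary are assigned to the lower Grantham class here, which is an assumption. PSIC categories are omitted because the stated cutoffs do not cover the whole score range.

```python
def grantham_class(score):
    """Classify an amino acid exchange by its Grantham score (cutoffs from the text)."""
    if score <= 50:
        return "conservative"
    if score <= 100:
        return "moderately conservative"
    if score <= 150:
        return "moderately radical"
    return "radical"


def sift_call(score):
    """SIFT scores below 0.05 are predicted deleterious, all others tolerated."""
    return "deleterious" if score < 0.05 else "tolerated"


def blosum62_call(score):
    """Negative BLOSUM62 scores mark evolutionarily less favorable exchanges."""
    return "less favorable" if score < 0 else "more favorable"


# Example using the Grantham score (125) and SIFT value (0.01)
# reported in the Results for ABCB11 p.R616G
print(grantham_class(125), "/", sift_call(0.01))
```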
The results of these computational analyses are summarized in Table 5. However, prediction results are not totally consistent between the different methods. Two ABCB4 and two ABCB11 variants were predominantly predicted to have functional consequences. In particular, ABCB11 p.R616G is suspected of changing the physicochemical properties of the resulting protein (highest Grantham score of 125), is probably deleterious according to the low SIFT value (0.01), is supposed to affect protein function because of its high PSIC score, and is evolutionarily less favorable considering the negative BLOSUM62 value, although sequence alignments indicated that p.R616G is located in a probably less deleterious, evolutionarily unconserved region. Similarly, ABCB4 p.E1099G and ABCB11 p.T619A have more radical Grantham scores, are less tolerated substitutions (SIFT) with possibly damaging effect (PSIC), and are evolutionarily less acceptable but located in an evolutionarily unconserved region. In contrast, ABCB4 p.R652G is a more common variant, which changes an evolutionarily conserved amino acid to an evolutionarily less acceptable one with probably altered physicochemical properties. On the other hand, SIFT and PSIC scores of this variant do not confirm functional consequences of ABCB4 p.R652G.
Furthermore, reference and variant genomic sequences of ABCB4 and ABCB11 were used to predict potential splice site variants (www.fruitfly.org/seq_tools/splice.html). None of the detected intronic mutations were identified to alter consensus sequences of existing splice sites.
Ethnic Specificity of SNPs. ABCB4 was investigated in Caucasian, Japanese, and Korean populations, whereas ABCB11 was investigated in Caucasians, Japanese, and African Americans. Twenty-nine SNPs occurred in all three populations in ABCB4 and 15 SNPs occurred in all three populations in ABCB11. Most ABCB4 and ABCB11 variants were population-specific, and either their occurrence or their population frequency varied among different ethnic groups. Not surprisingly, the largest genetic diversity was detected in samples of African-American origin (ABCB11). They had the highest number and proportion (30/54) of population-specific alleles, compared with 11/37 in the Caucasian sample and 15/32 in the Japanese, and compared with 14/47, 11/47, and 13/47 population-specific ABCB4 SNPs in Caucasian, Japanese, and Koreans, respectively.
Few population-specific SNPs were observed at higher frequencies. In ABCB4, only the Caucasian population had 1/14 population-specific variants at a frequency ≥5%. In contrast, in ABCB11, the African-American population sample had 6/30 at frequencies ≥5%, whereas Caucasians had no population-specific alleles at ≥5%. In addition, 43% of the African-American-specific mutations (ABCB11), 72% of Caucasian-specific, and 82% of Asian-specific variants (ABCB11 and ABCB4) were singletons.
Rare variants are more likely to be recently derived than are the common variants and are, therefore, more likely to be population-specific. Hence, they are sensitive indicators for the relationships among populations. As expected, our Japanese sample shared more rare variants (<5%) with Korean samples than with Caucasian or African-American samples. Most ABCB4 and ABCB11 variants that were found in all populations had allele frequencies ≥5%.
ABCB4 p.R652G was the only protein-altering variant with high allele frequency in all groups (7.2% in Caucasian, 1.4% in Japanese, and 2.3% in Korean). ABCB4 p.T175A (3.2%) and p.E1099G (1.8%) were only present in the Caucasian sample. The other five missense mutations were singletons. The most common ABCB11 protein-altering polymorphism was p.V444A, which was frequently observed in all groups. ABCB11 p.M677V was present in both the Caucasian (4.2%) and the African-American (14%) population samples, and p.A865V was found only in Japanese samples. The other six nonsynonymous mutations were singletons.
Population Genetic Analysis. Nucleotide diversity provides a measure of genetic variation that is normalized by the sample size. We estimated two measures of nucleotide diversity, the average heterozygosity per site (π), and the population mutation parameter (θ). In addition, Tajima's D was calculated to detect deviations from the neutral mutation model (Tajima, 1989). These parameters were estimated for all variable sites with less than 15% of missing data for various gene regions (coding region, noncoding region, exon-intron boundaries, and 5′-UTR) as well as for various sites within the coding region (synonymous and nonsynonymous sites), and separately for each population (Table 6). It should be noted that only 36 of 54 segregating sites from African Americans were included in the calculations, and the estimates may therefore not reflect the actual population values. The diversity estimates were similar across ethnic groups and within the range previously reported for other genes in various ethnic groups (Cargill et al., 1999; Halushka et al., 1999; Glatt et al., 2001; Laebman et al., 2002). The estimates of θ (×10⁻⁴) ranged from 4.39 to 4.98 (ABCB4) and 4.53 to 5.52 (ABCB11), and those of π (×10⁻⁴) ranged from 2.79 to 4.31 (ABCB4) and 3.63 to 5.28 (ABCB11) for the entire sequenced region (Table 6). Furthermore, θ values were higher compared with the corresponding π values in five of six population groups, and Tajima's D values were positive only for ABCB11 in the Japanese (0.36) and negative in all other population samples and for ABCB4. However, neither estimate was statistically significant. The highest θ and π values were estimated in the 5′-UTR in both genes in all population samples.
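The relationship between π, θ, and the sign of Tajima's D can be illustrated with a minimal sketch. It is not the Arlequin implementation used in the study; it follows the published formulas of Tajima (1989) with θ estimated from the number of segregating sites, and it assumes complete data coded as strings of "0" (reference allele) and "1" (variant allele), one string per chromosome. The function name and the toy data are hypothetical.

```python
from math import sqrt


def tajima_d(haplotypes):
    """Return (k, theta_w, D) for a list of equal-length '0'/'1' strings.

    k       : mean number of pairwise differences (pi summed over sites)
    theta_w : S / a1, the estimate based on the number of segregating sites S
    D       : Tajima's D (nan if there are no segregating sites)
    """
    n = len(haplotypes)
    if n < 4:
        raise ValueError("the variance term is zero for fewer than 4 chromosomes")
    n_pairs = n * (n - 1) / 2
    k, s = 0.0, 0
    for site in zip(*haplotypes):
        variant = site.count("1")
        if 0 < variant < n:
            s += 1
            # Number of chromosome pairs that differ at this site, divided by all pairs
            k += variant * (n - variant) / n_pairs
    if s == 0:
        return 0.0, 0.0, float("nan")
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n * n + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)
    theta_w = s / a1
    d = (k - theta_w) / sqrt(e1 * s + e2 * s * (s - 1))
    return k, theta_w, d


# Toy data: every variant is a singleton, so k < theta_w and D is negative
k, theta_w, d = tajima_d(["100", "010", "001", "000"])
print(round(k, 3), round(theta_w, 3), d < 0)
```

Dividing k and theta_w by the number of sequenced base pairs gives per-site values comparable in form to the π and θ estimates quoted above. An excess of rare variants, as in the toy data, makes θ exceed π and pushes D below zero.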
Linkage Disequilibrium of Genetic Variants in the ABCB11 and ABCB4 Genes. Linkage disequilibrium (LD) was evaluated for the Caucasian, African-American, Japanese, and Korean sample populations. Significant disequilibrium between site pairs as calculated by Fisher's exact test and blocks of LD calculated using the confidence-bound method (Gabriel et al., 2002) are shown in Fig. 2, A and B. The D′ values between all site pairs are generally much higher than the r² values, with a large proportion of D′ values equal to 1.0 or –1.0 (maximum disequilibrium). Several groups of SNPs in ABCB4 and ABCB11 were in tight LD (most D′ values are 0.7–1.0 and –1.0) with each other, and there is evidence for single-block structures. In general, the largest LD blocks and the highest number of alleles within the individual LD blocks were found in ABCB4 compared with ABCB11.
In ABCB4, three groups of genetic variants appeared to be in strong LD. The largest group comprised 19 SNPs (IDs 1, 3, 4, 5, 6, 9, 10, 14, 16, 17, 19, 24, 26, 28, 30, 33, and 37) in all population samples and, additionally, ID 41 in Japanese and Korean, or ID 42 in the Caucasian population sample. Linkage disequilibrium between these SNPs occurred across an 80-kb distance spanning almost the entire ABCB4 gene. Other groups of variable sites in strong LD were observed in Caucasians between IDs 2, 6, 11, 12, 20, 21, 31, 32, 35, 36, 38, and 39 and between the eight variable sites 20, 21, 31, 32, 35, 36, 38, and 39 in Japanese and Koreans. The highest number of alleles (n = 11–14) was found in LD block 1 of all populations, consisting mainly of promoter SNPs with IDs 1 to 23. In Caucasians, LD blocks 3 and 4 were the largest blocks containing exon/intron alleles.
In ABCB11, the bulk of significant linkage disequilibrium and the largest LD blocks occurred in Caucasians in three groups of segregating sites, among the nine SNPs (IDs 1, 2, 3, 4, 7, 8, 39, 40, 41), the seven SNPs (IDs 29, 30, 31, 39, 40, 41, 43), and the four SNPs (IDs 44, 47, 51, 53). In African Americans, significant LD was observed between the nine variable sites 6, 7, 10, 18, 20, 23, 39, 40, and 41 and 13, 15, 20, 25, 26, 31, 33, 35, and 38, as well as between the four variable sites 45, 46, 48, and 50 and three variable sites 44, 51, and 53. In Japanese, significant LD was detected among the six segregating sites 18, 20, 23, 39, 40, and 41, the five variable sites 1, 2, 3, 4, and 8, the four variable sites 32, 39, 40, and 41, and the three variable sites 44, 51, and 53.
Haplotype Structure. The combination of alleles present at each segregating site in each individual was computed by assigning a specific pair of haplotypes to each individual, as well as a score reflecting the confidence in that assignment. Haplotypes were statistically inferred by PHASE separately for each population group. Resulting haplotypes are arranged according to sequence similarity in Tables 7 and 8. The ABCB11 and ABCB4 reference sequences are composites assembled from different sources as given under Materials and Methods, and as expected, they were not encountered in our study population.
ABCB4 Haplotype Structure. Altogether, 72 haplotypes, 29 in Caucasians, 22 in Japanese, and 21 in Koreans, were identified with high confidence. Among them were eight common haplotypes: ABCB4_11, ABCB4_12, ABCB4_16, ABCB4_20, ABCB4_23, ABCB4_26, ABCB4_40, and ABCB4_45, accounting for 72% of the 318 chromosomes. Both number and frequency distribution of haplotypes were subject to great interethnic variability. Twenty haplotypes (28%) were specifically detected only in the Caucasian population sample, compared with 10 (14%) Japanese-specific and 11 (15%) Korean-specific haplotypes. The ethnic distribution of the eight most common ABCB4 haplotypes is illustrated in Fig. 3A. ABCB4_12 and ABCB4_20 were the major alleles in all populations and detected at similar frequencies (Caucasian, 32.8% and 18.0%; Japanese, 36.2% and 11.7%; and Korean, 25.0% and 18.8%), whereas ABCB4_45 was a dominant haplotype in both Asian groups. ABCB4_40 and ABCB4_45 were specific to Japanese and Korean individuals, and ABCB4_16 and ABCB4_23 were specific to the Caucasian population.
The reference sequence (GenBank accession number NM_000443) was not found in the entire sample set. However, haplotype ABCB4_15, which was found in 8% of Caucasians, is identical to the reference sequence except that it contains the random mutation variant ID8. The two major allelic variants ABCB4_12 and ABCB4_20 carried five SNPs including one promoter, three intronic, and one synonymous. Of the eight common haplotypes, three (ABCB4_11, ABCB4_12, and ABCB4_16) contained one coding SNP, three (ABCB4_23, ABCB4_26, and ABCB4_45) had two coding SNPs, and none included a nonsynonymous variant.
ABCB11 Haplotype Structure. Altogether, the 53 variant sites segregated as 38 (CA), 41 (AA), and 30 (JA) distinct haplotypes, which were identified with high confidence. Only 38 haplotypes were present in three or more chromosomes. The percentage of chromosomes in the entire population that could be assigned to one of these 38 common haplotypes was 64.1%, and it differed among groups (85.6%, 43.6%, and 64.6% for Caucasian, African-American, and Japanese individuals, respectively). The 10 most common haplotypes, ABCB11_3, ABCB11_4, ABCB11_5, ABCB11_6, ABCB11_11, ABCB11_26, ABCB11_28, ABCB11_37, ABCB11_38, and ABCB11_65, accounted for 57.9%, 23.4%, and 45.8% of the 202 Caucasian, 94 African-American, and 96 Japanese chromosomes, respectively (Fig. 3B).
ABCB11_3, ABCB11_6, and ABCB11_28 were the major alleles in Caucasians, whereas ABCB11_3, ABCB11_26, and ABCB11_37 were dominant in the Japanese, and ABCB11_11 and ABCB11_65 were predominant in the African-American population. The cDNA sequence with the GenBank accession number NM_003742 was defined as reference. With respect to the reference sequence, ABCB11_3, ABCB11_4, ABCB11_5, ABCB11_6, ABCB11_11, ABCB11_37, and ABCB11_38 contain one nonsynonymous cSNP (c.1331T>C) in exon 13 (p.V444A) and three to four linked intronic SNPs in introns 9, 13, 14, and 20 as a common denominator. In addition, ABCB11_4, ABCB11_6, and ABCB11_38 contain two additional intronic and synonymous SNPs in intron 28 and exon 24 (p.A804A). Furthermore, a number of population-specific haplotypes were detected (25 in Caucasians, 33 in African Americans, and 22 in Japanese). Most of these unique haplotypes differed only in their promoter or intronic sequence from the common alleles. Since the synonymous polymorphism in exon 6 (p.I134I) is specific for the African-American population, all haplotypes containing this SNP are specific for this ethnic group (12 haplotypes). Similarly, all haplotypes containing the Japanese-specific variant in exon 4 (p.D36D) and exon 9 (p.Y269Y) are specific for the Japanese population (12 haplotypes).
ABCB4 and ABCB11 DNA Promoter Activity. Reporter gene activity was measured for all three constructs in HepG2 and Huh7 cells after transfection of the luciferase gene under the control of the 5′ flanking region of ABCB4. Neither the ABCB4-A-wt nor the ABCB4-A-Typ2A constructs showed promoter activity in either cell line, as compared with the strong induction of luciferase activity that was conferred by construct ABCB4-B-wt (Fig. 4).
The luciferase activity of 13 ABCB11 promoter constructs comprising mutations at 19 segregating sites was compared with wild type in HepG2 cells (Fig. 5). In two independent experiments, promoter constructs C_2 and C_3 showed approximately 50% lower activity than wild type (p < 0.0001 and p = 0.043 for C_2 and p < 0.0001 and p = 0.045 for C_3; GLM univariate procedure with Bonferroni post hoc test). C_2 and C_3 differ from wild type at four positions and, additionally, C_3 differs from C_2 and wild type at position g.-15150. Since the latter variant was present in a number of promoter constructs without showing any effect, only C_2 was investigated further. To confirm results, luciferase activity of C_2 was investigated at various CDCA concentrations (Fig. 6). At all CDCA concentrations, C_2 demonstrated a statistically significantly lower activity compared with the wild-type construct (p = 0.0072; GLM repeated measures analysis).
Discussion
We evaluated genetic variability, linkage disequilibrium, and haplotype profiles of hepatic efflux transporter genes ABCB4 and ABCB11 in 292 individuals of different ethnic populations.
Genetic variation in ABCB4 and ABCB11 has recently been investigated with respect to their potential pathogenetic role in primary biliary cirrhosis and primary sclerosing cholangitis. In these case-control studies, our healthy Caucasian population sample served as control group (Pauli-Magnus et al., 2004a,b). In the study at hand, however, ABCB4 and ABCB11 genotyping data, including those obtained from Caucasians, have been computed independently. Furthermore, ABCB4 haplotype analysis considered 13 Prom variants. Our findings with respect to haplotype structures and linkage disequilibrium correspond essentially with those reported by Pauli-Magnus et al. (2004a,b). Not surprisingly, this also applies to ABCB4. Most of the ABCB4 haplotypes reported by Pauli-Magnus et al. (2004a,b) were inferred in our analysis as well. PromA variants clustered in one block resulted in 18 additional haplotypes (MDR3_1, 2, 3, 23, 24, 25, 26, 27, 28, 29, 43, 45, 47, 48, 49, 67, 68, and 69).
A major goal was to establish potential hereditary markers as a prerequisite for assessing interindividual genetic variability of cholestatic liver injury such as DIC or ICP. More than 30 mutations in ABCB4 and ABCB11 have been associated with lack or low level of protein expression, or cholestatic liver injury such as PFIC2, PFIC3, DIC, or ICP (Pauli-Magnus and Meier, 2003), although the causative role of most of the mutations for etiology remains to be established. Not surprisingly, most of these rare variants were not detected in our samples. Only two variants from our study had been related to cholestatic disease earlier: ABCB4 c.523A>G (p.T175A), the second most prevalent nonsynonymous change in Caucasians (Rosmorduc et al., 2001), and the most prevalent nonsynonymous variant, ABCB4 c.1954G>A (p.R652G), which was present in all of our population samples (Jacquemin et al., 2001). In addition to these known genetic variants, 13 ABCB4 and 23 ABCB11 coding variants, among them 3 and 6 missense mutations, have been described for the first time. The functional relevance of these novel variants is unknown. Although their absence in various cholestatic diseases and their presence in healthy volunteers may cast doubt on a significant contribution to risk for more common cholestatic diseases such as primary biliary cirrhosis and primary sclerosing cholangitis, a role of these variants for DIC or ICP cannot be excluded. A carrier of a mutation may maintain sufficient bile salt excretion under normal conditions but, when the mutation is combined with another mutation or in circumstances such as systemic infection, pregnancy, or drug intake, may develop clinical symptoms. For instance, the MDR3 variant p.R652G was present in PFIC3 patients and healthy subjects and hence cannot, alone, be sufficient to cause cholestasis. It was suggested that p.R652G had no or mild consequences under normal conditions, but resulted in cholestasis during pregnancy (Jacquemin et al., 2001).
This view is further supported by the detection of compound heterozygosity. A splicing mutation (+3)A>C (intron 4) combined with a frameshift mutation in exon 22 (p.K930X) resulted in a PFIC2 phenotype, and two nonsynonymous variants in exon 9 (p.E297G) and in exon 12 (p.R432T) were encountered in a patient exhibiting a BRIC (benign recurrent intrahepatic cholestasis) phenotype. Single occurrence of these mutations did not confer any disease risk (Noe et al., 2005). Furthermore, a contribution of less common mutations, which have been detected only as heterozygotes, to familial cholestatic disease cannot be excluded. Familial diseases such as PFIC follow a recessive trait caused by rare mutants; homozygous carriers of the mutation are affected, whereas heterozygous carriers appear phenotypically normal. Family studies are required to identify rare mutations that are causative for hereditary recessive disorders, whereas carefully designed, sufficiently powered case-control studies with well defined cholestatic phenotypes (e.g., ICP or DIC patients) and matched healthy controls could establish whether the detected ABCB4 and ABCB11 genetic variants are of any clinical significance.
Our study provides the common genetic variability in various ethnic populations to facilitate such case-control studies in the future. In the absence of time-consuming and labor-intensive in vitro assays for prediction of a possible functional role of nonsynonymous variants, computational tools can serve as a guide to prioritize those genetic variants that are most likely functionally effective for further analysis. We combined several computational tools that categorize the type of amino acid change based on physicochemical considerations and grade of evolutionary sequence conservation (Grantham, SIFT, PolyPhen, BLOSUM62, EC/EU). The validity of this approach has been shown recently for the organic cation transporter OCT1 (Shu et al., 2003). Furthermore, disease-causing mutations were more prevalent at evolutionarily conserved sites (Miller and Kumar, 2001). Although experimentally not validated, we conclude that the amino acid changes p.R590Q, p.E1099G (ABCB4), p.R616G, and p.T619A (ABCB11) are the strongest candidates to alter protein function and subsequently biliary excretion (Table 5).
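As an illustration of the physicochemical part of such a prioritization, the following minimal sketch computes the Grantham distance for the four candidate amino acid changes named above. It is not the pipeline used in this study: the side-chain property values and weighting constants are those published by Grantham (1974), listed here only for the six residues involved, and the names `PROPERTIES`, `CANDIDATES`, and `grantham_distance` are hypothetical helpers introduced for this example.

```python
import math

# Side-chain composition (c), polarity (p), and molecular volume (v) as
# tabulated by Grantham (1974). Only the residues needed below are included;
# this is an illustrative subset, not the full 20-residue table.
PROPERTIES = {
    "R": (0.65, 10.5, 124.0),
    "Q": (0.89, 10.5, 85.0),
    "E": (0.92, 12.3, 83.0),
    "G": (0.74, 9.0, 3.0),
    "T": (0.71, 8.6, 61.0),
    "A": (0.00, 8.1, 31.0),
}

# Weighting constants and scale factor from Grantham (1974).
ALPHA, BETA, GAMMA = 1.833, 0.1018, 0.000399
SCALE = 50.723


def grantham_distance(ref: str, alt: str) -> float:
    """Return the Grantham distance between two amino acids (one-letter codes)."""
    c1, p1, v1 = PROPERTIES[ref]
    c2, p2, v2 = PROPERTIES[alt]
    return SCALE * math.sqrt(
        ALPHA * (c1 - c2) ** 2 + BETA * (p1 - p2) ** 2 + GAMMA * (v1 - v2) ** 2
    )


# Candidate changes named in the text (reference residue, variant residue).
CANDIDATES = {
    "p.R590Q": ("R", "Q"),
    "p.E1099G": ("E", "G"),
    "p.R616G": ("R", "G"),
    "p.T619A": ("T", "A"),
}

scores = {name: round(grantham_distance(*pair)) for name, pair in CANDIDATES.items()}

if __name__ == "__main__":
    for name, score in scores.items():
        print(f"{name}: Grantham distance {score}")
```

A larger distance indicates a physicochemically more dissimilar substitution. Distance alone does not account for evolutionary conservation, which is why it would be combined with conservation-based tools such as those listed above.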
Population genetic analysis resulted in negative Tajima's D values, suggesting that ABCB4 and ABCB11 are under some selective pressure against changes in the protein. Compared with the closely related xenobiotic transporter ABCB1 (Kroetz et al., 2003), ABCB4 and ABCB11 are genetically less diverse and display only few protein-altering changes and particularly few at higher frequency. Possibly, ABCB4 and ABCB11 are functionally more constrained, reflecting their important endogenous role. Unlike ABCB1 with its primary role in xenobiotics detoxification, ABCB4 and ABCB11, although involved in disposition of xenobiotics to some extent, primarily maintain homeostasis of endogenous cholephilic compounds. Moreover, protein-altering variants occurred at lower frequency than synonymous changes. Only 1 of 8 ABCB4 and 2 of 10 ABCB11 amino acid changes occurred at ≥5%, compared with half the synonymous sites (3 of 6 sites in ABCB4 and 9 of 18 sites in ABCB11). This finding is consistent with the hypothesis that deleterious mutations, which are more likely to be at nonsynonymous sites, are kept at low frequency and spread less easily to multiple populations (Fay et al., 2001).
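To make the sign of Tajima's D concrete, the sketch below implements the standard statistic (Tajima, 1989) for a set of biallelic haplotypes coded as strings of "0" and "1". It is a minimal illustration under stated assumptions, not the software used in this study; the function name `tajimas_d` and both toy data sets are invented for the example and do not represent ABCB4 or ABCB11 data.

```python
import math


def tajimas_d(haplotypes: list[str]) -> float:
    """Tajima's D for equal-length biallelic haplotypes coded as '0'/'1' strings."""
    n = len(haplotypes)
    if n < 4:
        raise ValueError("at least four haplotypes are needed")
    length = len(haplotypes[0])
    if any(len(h) != length for h in haplotypes):
        raise ValueError("haplotypes must have equal length")

    # Count of the '1' allele at each site; keep only segregating sites.
    counts = [sum(h[i] == "1" for h in haplotypes) for i in range(length)]
    segregating = [k for k in counts if 0 < k < n]
    s = len(segregating)
    if s == 0:
        raise ValueError("no segregating sites")

    # Mean number of pairwise differences (pi), summed over sites.
    pi = sum(2 * k * (n - k) for k in segregating) / (n * (n - 1))

    # Constants from Tajima (1989).
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i**2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n * n + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1**2
    e1 = c1 / a1
    e2 = c2 / (a1**2 + a2)

    # Difference between pi and Watterson's estimator, scaled by its standard deviation.
    return (pi - s / a1) / math.sqrt(e1 * s + e2 * s * (s - 1))


# Toy data (invented): four singleton variants among ten chromosomes.
rare_variants = ["1000", "0100", "0010", "0001"] + ["0000"] * 6
# Toy data (invented): four variants, each carried by half of the chromosomes.
common_variants = ["1111"] * 5 + ["0000"] * 5

d_rare = tajimas_d(rare_variants)
d_common = tajimas_d(common_variants)

if __name__ == "__main__":
    print(f"excess of rare variants:         D = {d_rare:.2f}")
    print(f"intermediate-frequency variants: D = {d_common:.2f}")
```

An excess of low-frequency variants gives a negative D, whereas variants at intermediate frequency give a positive D. A negative value is therefore compatible with purifying selection, although demographic events such as population expansion can produce the same pattern.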
Haplotype-based approaches taking into consideration the combination of SNPs, which ultimately represent the functional unit of the gene, are particularly useful to correlate cellular and clinical phenotypes with a specific gene. They have been proven very useful in small population samples, even when correlation analysis with SNPs fails (Drysdale et al., 2000). Moreover, attempts to draw associations between phenotypes and genetic variations are more likely to succeed when the SNPs used in such studies have been confirmed to be in linkage disequilibrium by methods such as haplotyping. Thus, understanding the haplotype structure and to what extent diversity and distribution of haplotypes vary across populations is important for designing and interpreting genotype-phenotype association studies. Our haplotypes were statistically inferred without experimental validation of the molecular linkage of SNPs. A few SNPs that do not fit the model could be overlooked, and analyzing a large genomic region is likely to have a certain error rate deteriorating the precision of haplotype prediction. This is a potential source for errors, although it is unlikely to affect the result of association analyses, which are based on more common haplotypes. In contrast to other ABC transporters such as ABCB1 (Kroetz et al., 2003), ABCB4 and ABCB11 had no predominant haplotype. Instead, there were multiple haplotypes, most of which were observed in several populations, that accounted for a large fraction of genomic variability. The two most common ABCB4 haplotypes were observed in 25 to 36% and 12 to 19% of all populations, whereas the four most common ABCB11 haplotypes occurred in approximately 15% of distinct populations (Fig. 3, A and B). There was distinct interethnic variability in the total number and the pattern of population-specific haplotypes, and haplotypes that were observed in all populations showed ethnically distinct population frequencies. 
Haplotype-based association studies to predict MDR3 or BSEP phenotypes need to carefully consider this ethnic population substructure. Smit et al. (1995) characterized 3 kb of the 5′-flanking region of ABCB4 upstream of the translation start in exon 2 and located the core promoter. Genetic variants of this core promoter have been tested for differential activity in luciferase assays (Pauli-Magnus et al., 2004). In addition, an aberrant 6-kb ABCB4 cDNA (GenBank accession number Z35284) containing four untranslated exons (–1, –2, –3, and –4) indicated the presence of an alternative transcription start upstream of exon –4 (Smit et al., 1995). In our luciferase reporter gene assays, plasmid constructs containing the alternative promoter of ABCB4 (PromA) showed no significant activity in the liver cell lines HepG2 and Huh7. Although we could not detect promoter activity, the function of this putative regulatory element remains unclear. It may be active only in the context of the entire ABCB4 gene, in conjunction with the core promoter in a tissue-specific manner, or only under certain physiological conditions. Of the 13 ABCB11 promoter haplotypes comprising 19 variants, two (C_2 and C_3) altered promoter activity compared with wild type. At least one of variants 13, 15, or 83 is responsible for the significant decrease of promoter activity. A very interesting candidate is the short allele of a variable-length [T]n polymorphism (9–13 T-repeats, IDs 83–86, respectively), which is common in all populations (Tables 5, 7, and 8). Promoter nucleotide repeats have been previously reported to be required for the binding of transcription factors, with the length of the T-stretch influencing transcription (Beutler et al., 1998).
In conclusion, our findings provide a comprehensive overview of genetic variability, haplotype structure, ethnic diversity, and linkage disequilibrium of two important hepatic transporters. Based on computational predictions, it seems likely that some genetic variants may contribute to interindividual variability in MDR3 and BSEP function. In contrast to xenobiotic transporters such as MDR1, which control the access of substrates to pharmacological sanctuaries, changes in MDR3 and BSEP transport function may have their greatest impact on occurrence, pattern, and prognosis of cholestatic liver injury. Furthermore, they may modulate the individual sensitivity to drug-mediated inhibition of MDR3 and BSEP transport and, consequently, the individual sensitivity to DIC. Further in vitro approaches and clinical studies need to clarify the potential functional and clinical consequences of the variants identified in this study and whether these contribute to cholestatic disease or to altered sensitivity for drug-mediated inhibition of MDR3 or BSEP transport.
Footnotes
- Article, publication date, and citation information can be found at http://dmd.aspetjournals.org.
- Supported by Grants 01GG9846 and 01GG9848 from the German Federal Ministry for Education and Science (Bundesministerium für Bildung und Forschung).
- doi:10.1124/dmd.105.008854.
- ABBREVIATIONS: ABC, ATP-binding cassette; MDR, multidrug resistance; BSEP, bile salt export pump; DIC, drug-induced cholestasis; PFIC, progressive familial intrahepatic cholestasis; ICP, intrahepatic cholestasis of pregnancy; SNP, single nucleotide polymorphism; LD, linkage disequilibrium; PCR, polymerase chain reaction; kb, kilobase(s); bp, base pair(s); EC, evolutionarily conserved; EU, evolutionarily unconserved; CDCA, chenodeoxycholic acid; UTR, untranslated region; SIFT, sorting intolerant from tolerant; PolyPhen, Polymorphism Phenotyping; wt, wild type; AA, African American; CA, Caucasian; JA, Japanese; KO, Korean; PSIC, position-specific independent count.
- Received December 7, 2005.
- Accepted June 7, 2006.
- The American Society for Pharmacology and Experimental Therapeutics