Abstract
The interaction of competitive type inhibitors with the active site of cytochrome P450 (CYP) 2C9 has been predicted using three- and four-dimensional quantitative structure activity relationship (3D-/4D-QSAR) models constructed using previously unreported and literature-derived data. 3D-QSAR pharmacophore models of the common structural features of CYP2C9 inhibitors were built using the program Catalyst and compared with 3D- and 4D-QSAR partial least-squares models, which use molecular surface-weighted holistic invariant molecular descriptors of the size and shape of inhibitors. The Catalyst models generated from multiple conformers of competitive inhibitors of CYP2C9 activities contained at least one hydrophobic and two hydrogen bond acceptor/donor regions. Catalyst model 1 was constructed with Ki(apparent) values for inhibitors of tolbutamide and diclofenac 4′-hydroxylation (n = 9). Catalyst model 2 was generated from literature Ki(apparent) values for (S)-warfarin 7-hydroxylation (n = 29), and Catalyst model 3 from literature IC50 values for tolbutamide 4-hydroxylation (n = 13). These three models illustrated correlation values of observed and predicted inhibition for CYP2C9 of r = 0.91, 0.89, and 0.71, respectively. Catalyst pharmacophores generated withKi(apparent) values were validated by predicting the Ki(apparent) value of a test set of CYP2C9 inhibitors also derived from the literature (n = 14). Twelve of fourteen of theseKi(apparent) values were predicted to be within 1 log residual of the observed value using Catalyst model 1, whereas Catalyst model 2 predicted 10 of 14Ki(apparent) values. The corresponding partial least-squares molecular surface-weighted holistic invariant molecular 3D- and 4D-QSAR models for all CYP2C9 data sets yielded predictable models as assessed using cross-validation. These 3D- and 4D-QSAR models of CYP inhibition will aid in future prediction of drug-drug interactions.
The cytochrome P450s (CYPs)2 have been described as enzymes with broad substrate specificity responsible for the metabolism of many drugs. Understanding the nature of the substrate specificity of each CYP requires knowledge of the interaction of drugs with each CYP active site. However, due to the absence of a crystallized human CYP, attempts to describe human CYP active sites have been restricted to homology modeling with the crystal structures of bacterial CYPs, pharmacophore, and three-dimensional-quantitative structure activity relationship (3D-QSAR) studies (Smith et al., 1997a,b). In particular, 3D-QSAR studies have been very useful in rationalizing active sites for many data sets of ligands or substrates for receptors and enzymes (Kubinyi, 1997), suggesting their use for modeling CYPs as demonstrated recently (Ekins et al., 1999a,b,c,d).
CYP2C9 is one of the most important CYPs involved in drug metabolism in humans (Miners and Birkett, 1998) and is involved in the metabolism of many commonly used polar acidic drugs (Hall et al., 1994). CYP2C9 is also competitively inhibited by nonsteroidal anti-inflammatory drugs (Leeman et al., 1992; Newlands et al., 1992) as well as other classes of drugs (Schmider et al., 1997). Therefore, drug interactions with CYP2C9 are important in vivo and have been reviewed (Miners and Birkett, 1998). The utility of predicting the ability of a compound to inhibit CYP2C9 early in drug development would therefore greatly increase the efficiency of this process. There has been a steady progression in understanding the active site of CYP2C9 using a number of approaches. These studies have involved using overlapped substrates to define a binding template (Jones et al., 1993, 1996a). Others have used tienilic acid derivatives (Mancy et al., 1995), phenytoin analogs, and bis-triazole antifungals (Morsman et al., 1995) to build structure activity relationships to aid in understanding the substrate and inhibitor specificity of CYP2C9. In addition, site-directed mutagenesis has suggested the importance of the I-helix residues Ser286 and Asn289 for conferring specificity for the substrates diclofenac and ibuprofen (Klose et al., 1998). NMR and molecular modeling have also been combined to assist in defining the positioning of substrates in the CYP2C9 active site (Poli-Scaife et al., 1997). This work suggested two structural determinants for substrate recognition, namely, an anionic site and a hydrophobic site (Poli-Scaife et al., 1997). Recently a CYP2C9 3D-QSAR comparative molecular field analysis (CoMFA) model was reported that described compounds that inhibited (S)-warfarin 7-hydroxylation and enabled leave one out (LOO) predictions of Ki (apparent) to be performed (Jones et al., 1996b). This model indicated the presence of at least one site of electrostatic interaction between the enzyme and substrate. This first CYP2C9 CoMFA model enabled the refinement and construction of a homology model of the CYP2C9 active site (Jones et al., 1996b). Very recently, site-directed mutagenesis studies have been combined with the homology model to suggest the role of the amino acid residues F114 and V113 in the hydrophobic binding of warfarin (Haining et al., 1999). However, the molecules used in the CoMFA study were structurally similar and were also aligned manually. As with most 3D-QSARs using CoMFA, the structural similarity and manual alignment of the training set limit the usefulness of this CYP2C9 3D-QSAR. The LOO model in this case was also restricted to accurate predictions for structurally similar molecules only, which is obviously a disadvantage if the model were intended to predict Kivalues for molecules from other more diverse structural classes. Therefore, for a predictive 3D-QSAR, structural diversity is required but this comes at the considerable cost of difficulty in alignment of common features of molecules.
Catalyst is a commercially available 3D-QSAR technique that generates a representative set of conformers of molecules in a training set that accounts for the maximum occupation of conformational space of chemical functionalities (Quintana et al., 1995). Catalyst, unlike CoMFA, does not require manual alignment of molecules. Instead, the program generates a model from the chemical features of the appropriate conformers representing features involved in binding interactions with the enzyme after correlating measured and estimated biological activity. Previously, Catalyst was used to build substrate pharmacophores for CYP2B6 (Ekins et al., 1999c) and CYP3A4 (Ekins et al., 1999d), as well as inhibitor pharmacophores for CYP2D6 (Ekins et al., 1999a) and CYP3A4 (Ekins et al., 1999b). In this study, different QSAR modeling approaches, Catalyst and partial least-squares (PLS) molecular surface-weighted holistic invariant molecular (MS-WHIM) 3D- and 4D-QSARs (Klein and Hopfinger, 1998) are compared for CYP2C9 using the in vitro Ki(apparent) and IC50 values for inhibitors.
Experimental Procedures
Materials.
Tolbutamide, diclofenac, chlorpropamide, triethylamine, and NADPH were obtained from Sigma Chemical Co. (St. Louis, MO). Chloroform and acetonitrile were obtained from Burdick & Jackson Laboratories Inc. (Muskegan, MI). Meclofenamate was purchased from Cayman Chemical (Ann Arbor, MI). 4-Hydroxy tolbutamide was purchased from Research Biochemicals International (Natick, MA). 4′-Hydroxy diclofenac was obtained from Gentest Corp. (Woburn, MA) or Ultrafine Chemicals, (Manchester, UK).
Liver Specimens.
Human liver specimens were obtained from the liver transplant unit at the Medical College of Wisconsin (Milwaukee) and the Pathology Department of the Indiana School of Medicine (Indianapolis), under protocols approved by the appropriate committee for the conduct of human research. Microsomes were prepared from these specimens using differential centrifugation (van der Hoeven and Coon, 1974).
Microsomal Incubations.
All incubations containing human liver microsomes were carried out with 1 mM NADPH in 100 mM sodium phosphate pH 7.4 and underwent a 3-min preincubation at 37°C. Inhibitors were used at four concentrations whereas the following substrates were used at five different concentrations.
Tolbutamide 4-Hydroxylation.
Using a previously published method (Miners et al., 1988), 25–300 μM tolbutamide was incubated with or without differing concentrations of inhibitor for 20 min and the reaction was terminated by the addition of 20 μl of 35% phosphoric acid. The internal standard chlorpropamide (0.01 mg/ml of stock) was added, and the incubation mixture was extracted with 1 ml of chloroform. The vials were then mixed in a shaker for 10 min and centrifuged. The aqueous layer was discarded while the chloroform layer was transferred to clean tubes and evaporated under nitrogen before reconstitution in mobile phase and injection of 50 μl onto the HPLC.
The isocratic HPLC system used a mobile phase consisting of 50 mM potassium phosphate (pH 3)/acetonitrile (73:27) and used a 5-μm SB-CN column, 4.6 × 150 mm. The flow rate was 0.9 ml/min and the UV detector was set at a wavelength of 230 nm.
Diclofenac 4′-Hydroxylation.
Diclofenac (2.5–100 μM) was incubated with or without differing concentrations of inhibitor for 15 min and terminated by the addition of 200 μl of acetonitrile and 10 μl of internal standard, meclofenamate (0.05 mg/ml of stock). The tubes were centrifuged and 50 μl of the supernatant was then injected on the HPLC.
The HPLC system used a mobile phase consisting of 50 mM sodium phosphate buffer, pH 7.4, with 0.003% triethylamine (v/v) and used a betabasic C18 column 5 μm (Keystone Scientific), 4.6 × 50 mm. The flow rate was 1 ml/min and the UV detector was set at a wavelength of 282 nm.
Kinetic Analysis.
The Ki values for inhibition of CYP2C9 were determined using nonlinear regression analysis as described in detail elsewhere (Ring et al., 1995).
Molecular Modeling.
The computational molecular modeling studies were carried out using Silicon Graphics Indigo and Onyx workstations.
Modeling with Catalyst.
Briefly, three models were constructed using Catalyst version 2.3 or 3.1 (Molecular Simulations, San Diego, CA). Catalyst model 1 was constructed with Ki(apparent) values reported in Table 1, model 2 withKi(apparent) values reported by Jones et al. (1996), and model 3 with IC50 values reported by Morsman et al. (1995). The 3D structures of CYP2C9 inhibitors were built interactively as previously described for other CYPs (Ekins et al., 1999a,b,c,d). The number of conformers generated using the ‘best’ functionality for each inhibitor was limited to a maximum number of 255, with an energy range of 20 kcal/mol. Ten hypotheses were generated using these conformers for each of the molecules, and theKi(apparent) or IC50values after selection of the following features for the inhibitors: hydrogen bond donor, hydrogen bond acceptor, hydrophobic, and negative ionizable. After assessing all 10 hypotheses generated, the lowest energy cost hypothesis was considered the best, as this possessed features representative of all the hypotheses and had the lowest total cost. This procedure was followed for all training sets.
The goodness of the structure activity correlation between the estimated and observed activity values was estimated by means of an r value. Statistical significance of the retrieved hypothesis was verified by permuting (randomizing) the response variable, i.e., the activities of the training set compounds were mixed a number of times (so that each value was no longer assigned to the original molecule) and the Catalyst hypothesis generation procedure was repeated. The total energy cost of the generated pharmacophores can be calculated from the deviation between the estimated activity and the observed activity, combined with the complexity of the hypothesis (i.e., the number of pharmacophore features). A null hypothesis can also be calculated that presumes that there is no relationship in the data and the experimental activities are normally distributed about their mean. The greater the difference between the energy cost of the generated hypothesis and the energy cost of the null hypothesis, the less likely it is that the hypothesis reflects a chance correlation.
Validation of Catalyst CYP2C9 Pharmacophores Using a Test Set of Ki(apparent) Values.
The test set of Ki(apparent) values for 14 molecules was not included in the initial training sets for either Catalyst model 1 or 2 that were generated withKi values. These test set molecules were fit by fast-fit and best-fit methods to the Catalyst hypothesis for each Ki(apparent) model to predict aKi(apparent) value as previously described for other CYPs (Ekins et al., 1999a,b,c,). Fast fit refers to the method of finding the optimum fit of the substrate to the hypothesis among all conformers of the molecule without performing an energy minimization on the conformers of the molecule. The best-fit procedure starts with fast fit and allows individual conformers to flex over a given energy threshold of 20 kcal/mol. This allows examination of more conformational space and minimizes the distance between the center of the hypothesis features and their mapping to atoms on the molecule. The result of best fit is a conformer whose energy is generally higher than the starting one. The conformer selected after this process was the one giving the prediction for best fit (Catalyst tutorials release 3.0; MSI, San Diego, CA.). Both of these predictions were compared by means of a residual, which was calculated from the difference (in log units) between predicted and observed Ki(apparent)values. A predicted Ki(apparent) value within one log unit of the observedKi(apparent) value was considered to be a valid prediction of fit.
3D- and 4D-QSAR Modeling PLS MS-WHIM.
Each molecule was coded into a SMILES (simplified molecular input line entry system) string format (Weininger, 1988). Atomic 3D coordinates and Gasteiger Huckel charges were generated by CONCORD 3.2.1 (CONCORD user's manual; Tripos Inc., St. Louis, MO). MS-WHIM descriptors were computed using the program EL3DMD (Bravi et al., 1997; Bravi and Wikel, 2000a,b) as previously described for other CYPs (Ekins et al., 1999a,b,c).
MS-WHIM descriptors are a set of statistical parameters that contain information about the structure of the molecules in terms of size, shape, symmetry, and distribution of molecular surface point coordinates after “weighted” centering and principal component analysis (PCA). The following weights were applied: 1) unweighted value, 2) positive electrostatic potential, 3) negative electrostatic potential, 4) hydrogen bonding acceptor capacity, 5) donor capacity, and 6) hydrophobicity, which yield a total of 72 descriptors (17 for each weight; see Bravi et al., 1997 for details on MS-WHIM calculation).
To generate 4D-QSAR this whole procedure was repeated after first using the CATCONF program within Catalyst to produce multiple conformers of the inhibitors. For each molecule and for each descriptor the mean, the highest and lowest values, the range, and the S.D. over the conformations were computed. This resulted in a maximum of 504 descriptors (85 for each weight).
PLS (Wold et al., 1993) was applied to correlate MS-WHIM descriptors with the observed Ki(apparent) values. Because variance associated with different descriptors can be very different, MS-WHIM was autoscaled so as to assign unit variance to each descriptor. Activity data were transformed as log(1/Ki) or log(1/IC50). The optimum number of components in each PLS model generated was determined through two cross-validation procedures: 1) LOO and 2) five random groups repeated up to 100 times (5RG×100) (Baroni et al., 1993). Within the latter protocol, training set molecules are randomly assigned to five groups. Because results are strongly influenced by initial random assignment, the entire cross-validation procedure is repeated many times to achieve a meaningful statistical result. The predictive power of the PLS model was evaluated by means of q2 and S.D. of error of prediction (Baroni et al., 1993) computed as follows:
Results
Catalyst CYP2C9 Pharmacophore Models.
Catalyst uses a collection of molecules with CYP2C9 inhibitory activity over a number of orders of magnitude to enable construction of hypotheses to explain the variability of inhibition with respect to the position of features present in the inhibitors. TheKi values reported in Table 1 for inhibitors of tolbutamide 4-hydroxylation and diclofenac 4′-hydroxylation vary over 27-fold. However, the structures in this data set are diverse (Fig. 1). The lowest energy pharmacophore constructed by Catalyst and referred to as model 1 produced four features suggested to be necessary for inhibition of CYP2C9, namely, two hydrophobes, one hydrogen bond donor, and one hydrogen bond acceptor (Fig. 2, Table2). The Catalyst model 1 demonstrated a good correlation of observed versus estimatedKi values (r = 0.91, Table3). This initial CYP2C9 hypothesis possessed a total energy cost of 42.1 (arbitrary units) whereas the energy cost of the null hypothesis was 33.9 (Table 3).
Two previously published in vitro CYP2C9 inhibition data sets were used in this study as a data source for development of 3D-QSAR models (Morsman et al., 1995; Jones et al., 1996b). TheKi values for CYP2C9 using the 7-hydroxylation of (S)-warfarin as the CYP2C9 catalytic activity cover 2 orders of magnitude but the molecules are structurally similar (Jones et al., 1996b) (Table 3). The lowest energy pharmacophore from this data set referred to as Catalyst model 2 produced four features necessary for inhibition of CYP2C9, namely, one hydrophobe, two hydrogen bond donors, and one hydrogen bond acceptor (Fig. 3, Table4). The Catalyst model 2 demonstrated a good correlation of observed versus estimatedKi value (r = 0.89, Table2). Catalyst model 2 possessed a total energy cost of 118.2 whereas the energy cost of the null hypothesis was 130.5 (Table 3). The IC50 values for CYP2C9 inhibition using the 4-hydroxylation of tolbutamide as the substrate probe, covered 1 order of magnitude and consisted of two structural classes, phenytoin and bis-triazole analogs (Morsman et al., 1995) (Table 3). The lowest energy pharmacophore for this data set referred to as Catalyst model 3, produced three features necessary for inhibition of CYP2C9, namely, one hydrophobe and two hydrogen bond acceptors (Fig.4, Table5). Catalyst model 3 demonstrated a reasonable correlation of observed versus estimated IC50 (r = 0.71, Table 3) and a higher total cost (59.9) than the null hypothesis (47.9).
An accepted method to validate the Catalyst pharmacophores (Ekins et al., 1999a,b,c,d) involves permuting the activities with their corresponding structures. This procedure should produce no significant model (decreased r value), altered pharmacophore features compared with the original hypothesis, and an energy cost value close to or greater than the null hypothesis. Following this procedure twice, Catalyst model 1 demonstrated total energy costs of 35.9 and 42.8 and the pharmacophore fit had decreased, although the same features were selected (Table 6). After permuting (as described above) Catalyst model 2 there was no difference in the total energy cost (130.3) when compared with the null hypothesis (130.5), the pharmacophore features changed, and the fit decreased. In contrast, following permuting Catalyst model 3, it demonstrated a higher hypothesis total cost (55) than the null hypothesis (47.9), and the number of pharmacophore features decreased along with the correlation (Table 6).
Catalyst CYP2C9 Pharmacophore Validation Using a Test Set ofKi(apparent) Values.
After constructing the Catalyst 3D-QSAR models with a training set of Ki(apparent) values, models 1 and 2 were used to predict Ki(apparent) values of a test set of molecules (Table 7). Predicted Ki(apparent) data were generated from both models for this test set of R-7-fluorowarfarin,S-7-fluorowarfarin, Rac-6-fluorowarfarin, Rac-6,7,8-trifluorowarfarin, S-warfarin alcohol, ipriflavone, 7-hydroxyisoflavone, ticlopidine, lansoprazole, fluvoxamine, paroxetine, sertraline, omeprazole, desmethylcitalopram, and desmethylsertraline. These predictedKi(apparent) data were then compared withKi(apparent) values reported in the literature, which had been generated in human liver microsomes. Twelve of fourteen molecules were predicted within the 1 log unit residual cutoff, using the fast fit function by Catalyst model 1, whereas 10 of 14 were predicted within the 1 log unit residual cutoff using the best fit function. For catalyst model 2, 10 of 14 molecules were predicted within the 1 log unit residual cutoff using the fast fit function, whereas 7 of 14 molecules were predicted within the 1 log unit residual cutoff using the best fit method.
PLS MS-WHIM 3D- and 4D-QSAR Models of CYP2C9 Data.
The PLS MS-WHIM method initially used the alignment of single molecular conformations to produce a 3D-QSAR. However, by using multiple conformers and, therefore, multiple alignments of molecules, this technique can be used as a 4D-QSAR. The PLS MS-WHIM model for single conformers, 3D-QSAR, of inhibitors of tolbutamide 4-hydroxylation and diclofenac 4′-hydroxylation (Table 1) generated a LOO q2 value of 0.40. This value improved on inclusion of multiple conformers, 4D-QSAR, of inhibitors to give a LOO q2 value of 0.54 (Table8). A significant 3D-QSAR model could not be generated for the CYP2C9 Ki model derived from inhibition of (S)-warfarin 7-hydroxylation (Jones et al., 1996b) as the q2 value was less than 0.3. However, in this case 4D-QSAR was able to generate a q2 value of 0.64 with five components consisting of the molecular surface properties of negative electrostatic potential, hydrogen bond acceptor, hydrogen bond donor, and hydrophobicity (Table 8). Using the more credible cross-validation protocol of 5RG×100 (described earlier) also resulted in a significant 4D-QSAR model for the Jones et al. data set as assessed by the 5RG q2 value of 0.55. In contrast, a 3D-QSAR was generated for CYP2C9 using the IC50 data derived from the inhibition of 4-hydroxylation of tolbutamide (Morsman et al., 1995) with a LOO q2 value of 0.5 with the molecular surface weights of positive electrostatic potential and hydrogen bond acceptor. The q2 value was not improved by using 4D-QSAR of the inhibitors in this case, as no significant model could be generated. The S.D. estimate predictions are directly related to the uncertainty of predictions and provide an estimate of confidence limits in the data. These values were comparable in the cases where both 3D- and 4D-QSARs could be generated.
Random permuting of the response variable a number of times (so that the inhibition values no longer corresponded to the same molecules) was adopted to test the validity of the obtained models. This procedure should not result in a predictive model. In each case, a q2 of zero or lower was obtained, indicating that permuting the response variable does not result in a predictive PLS MS-WHIM model. This approach confirms that the cross-validated PLS MS-WHIM models obtained for CYP2C9 with a q2 > 0.3 are statistically valid.
Discussion
The investigation of substrate and inhibitor requirements for CYP2C9 has started to receive increased attention. Initial work suggested a schematic model for CYP2C9 binding that contained hydrophobic and hydrogen bonding regions (Guengerich et al., 1986;Distelrath and Guengerich, 1987). The structural requirements of CYP2C9 were initially speculated on by others (Smith and Jones, 1992) and were followed by a preliminary active site model of CYP2C9 constructed with seven substrates (Jones et al., 1993) [later expanded to eight substrates and one inhibitor (Jones et al., 1996a)]. This preliminary model indicated the distance and angle between the site of metabolism and the site of the hydrogen bond donor was 6.7 Å and 133°, respectively. Another group, modeling 12 CYP2C9 substrates, suggested the site of hydroxylation is 7.8 Å away from an anionic site, which is in turn 4 Å from a cationic site on the protein, separated by an angle of 82.3° (Mancy et al., 1995). Both of these active site models have six compounds in common and used low-energy conformers for model building derived from different software, which might account for some of the dissimilarities in the final models (Mancy et al., 1995; Jones et al., 1996a). A more recent CYP2C9 active site model using 10 molecules indicated a distance of 6.5 Å from the site of metabolism to the hydrogen bond donor (Lewis et al., 1998). A study using bis-triazole and phenytoin analogs as inhibitors of CYP2C9 has also shown the importance of hydrogen bonding for potency (Morsman et al., 1995). There have been no reports of models being constructed with this data, neither have there been any details published on distances and angles between chemical features published for CYP2C9 inhibitors. This study demonstrates the use of CYP2C9 3D- and 4D-QSAR inhibitor models. The CYP2C9 3D-QSAR models were used to satisfactorily predict the Ki(apparent) values of molecules not present in the training set of the models.
The data set in Table 1 derived from nine inhibitors of tolbutamide and diclofenac hydroxylations produced Catalyst model 1 with an r value of 0.91 with four pharmacophore features (Table 2). The small difference of 11.8 arbitrary units between the hypothesis and null cost changed in one of two cases of permuting. The correlation decreased but the pharmacophore features maintained were similar to the initial hypothesis. This suggests that Catalyst model 1 was suboptimal (Table6). The Catalyst model 2 constructed from 29 structurally similar CYP2C9 inhibitors of (S)-warfarin 7-hydroxylation (Jones et al., 1996b) produced a model with an r value of 0.89 and four pharmacophore features (Table 3). Also, in agreement with a previous report, there was at least one hydrogen bond donor and acceptor pharmacophore feature in the Catalyst model of this data set (Table 3) (Jones et al., 1996b). The hypothesis total cost was less than the null cost value indicative of a significant model. On permutation, Catalyst model 2 illustrated almost identical total and null energy costs, decreased correlations, and altered pharmacophore features (Table 6), suggesting that the initial Catalyst model was valid. Further literature data sets of CYP2C9 inhibitors have also been published documenting IC50 values for inhibition of tienilic acid hydroxylation (Mancy et al., 1995) and tolbutamide hydroxylation (Morsman et al., 1995). The Morsman et al. data set contained two classes of structural analogs, and it was decided to model it using the same criteria as applied to the twoKi(apparent) data sets. Catalyst model 3 constructed from this data with phenytoin analogs and bis-triazole antifungals (Morsman et al., 1995) produced a good r value of 0.71 for three pharmacophore features even though the hypothesis total cost was higher than the null cost (59.9 and 47.9, respectively; Table 3). After permuting, the r value decreased to less than 0.3, the pharmacophore features changed, but the total cost value did not change significantly (Table 6), suggesting that the original model was valid. In summary, permuting resulted in a decrease in the r value for all of the models, suggesting that this may be a useful test for model validity. In addition, all three Catalyst CYP2C9 inhibitor pharmacophores possessed at least one hydrophobic region and a hydrogen bond acceptor, suggesting that these are common features of all CYP2C9 inhibitors.
Previous studies have shown that even with small hypothesis-null hypothesis energy cost differences, valuable models of CYPs can be constructed (Ekins et al., a,b,c,d). This is demonstrated by the prediction of Ki(apparent) values for a test set of compounds not included in the training set for model building. In this study using the fast fit method to Catalyst model 1, predictions of Ki (apparent) were within 1 log unit residual for 12 of 14 inhibitors. Using the best fit method, the number of molecules predicted within this cutoff decreased to 10 of 14. Catalyst model 2 (data of Jones et al., 1996b) produced 10 of 14 and 7 of 14 predictions within the 1 log unit cutoff when the fast fit and best fit methods, respectively, were used. Even though the training set used to build Catalyst model 1 was relatively small (n = 9), it appeared to have produced more realistic predictions than the Catalyst model 2 built with the much larger data set (n = 29) of Jones et al. This is likely due to the structural diversity of the training data set for model 1 and the varied structural nature of the test set. Although indicative of important features necessary to fit in the active site, Catalyst model 2 would be expected to be better at predicting theKi value of molecules similar in structure to that of its training set. However, this may not be strictly true as fluorowarfarin analogs were, on the whole, poorly predicted (Table 7). To assess the suitability of the test set, we compared theKi(apparent) range (expressed as the log of the micromolar Ki(apparent) value) for the test and training sets. The test setKi(apparent) range (<0–1.82) was not concentrated around the mean Ki(apparent)value (1.24). Similarly, the Ki(apparent)range for Catalyst model 1 (0.54–1.97) was not concentrated around the mean Ki(apparent) value (1.18). In this case, the mean Ki(apparent) values of both test and training sets were similar. The range of the Catalyst model 2Ki(apparent) values (−1–1.84) was larger than the test set and Catalyst model 1. The meanKi(apparent) value in this case (0.82) was smaller than the mean Ki(apparent) values for Catalyst model 1 and the test set. These observations suggest that the test set used for both models was suitable for interpolating predictions from both training sets. The assessment ofKi prediction appears to be a useful approach for determining the quality of these CYP2C9 Catalyst models and may ultimately be more valuable as a determinant of Catalyst 3D-QSAR model quality than the cost difference between the hypothesis and null hypothesis.
Using the Ki(apparent) value data in Table1, PLS MS-WHIM models were also constructed (Table 8). Due to the nature of this 3D- and 4D-QSAR approach, there is no graphical output and the models are, therefore, solely assessed by means of the LOO q2 value. The 3D-QSAR derived using PLS MS-WHIM produced a LOO q2 value of 0.4 for eight of nine inhibitors with the weights negative potential and hydrogen bond donor. One molecule (LY333531) was excluded, as either its molecular size or flexibility exceeded the limit imposed by our PLS MS-WHIM method. In contrast, the 4D-QSAR generated with this same software illustrated an improvement in the q2value to 0.54 (Table 8). The q2 values represent predictive models using the criteria of Cramer et al. (1993), who have suggested that a q2 value greater than 0.3 indicates that the model is meaningful (predictive) and a q2 of 1.0 is optimal. The CoMFA study of Jones et al. (1996b) used PLS to analyze the steric and electrostatic features of 27 of 29 ligands along with the inverse logarithm of the binding affinity, which resulted in a q2 value of 0.7. In this study, PLS MS-WHIM was used to modelKi data for all 29 ligands reported in the Jones et al. CoMFA study. A significant 3D-QSAR was not produced for this data set (q2 < 0.3), but a significant 4D-QSAR was generated for the following weights: negative potential, hydrogen bond acceptor, hydrogen bond donor, and hydrophobicity. This yielded a 5RG q2 value of 0.64 (Table 8), similar to that published for 3D-QSAR of Jones et al. (1999b). The same PLS MS-WHIM approach was used for the phenytoin analogs and bis-triazole antifungal CYP2C9 inhibitors reported by Morsman et al. This model generated a LOO q2 value of 0.5 for 3D-QSAR and a 5RG q2 value of 0.45 for 4D-QSAR using the weights positive potential and hydrogen bond acceptor (Table8). The data from 4D-QSAR would suggest that using multiple conformations for these three data sets (Fig. 1) produce valid models as assessed by q2 validation with values >0.45 (Table 8). In all three cases, there is also some degree of agreement between the PLS MS-WHIM feature weights and the features identified by Catalyst. In the future, it will be interesting to evaluate the predictive nature of these PLS MS-WHIM models using the test set detailed in this study. A comparison of both 3D- and 4D-QSAR versions of PLS MS-WHIM would also be ideal to indicate how maximizing conformational space affects the predictions. Also, it may be useful to obtain a test set of IC50 values to enable quantitation of the predictive ability of the IC50 model described in this study.
The 3D arrangement of Catalyst pharmacophore features proved interesting to evaluate (Tables 2, 4, and 5). Qualitatively, the general shape of all three Catalyst CYP2C9 pharmacophores were similar with the distances between a hydrogen bond acceptor and a second hydrogen bond acceptor/donor being 3.4 to 5.7 Å (Figs. 2-4). Secondly, a hydrophobic feature was positioned 3 to 5.8 Å from a hydrogen bond acceptor. All of these pharmacophores fit within the substrate template distances between residue contacts and sites of metabolism suggested for CYP2C9 modeled after alignment with the bacterial CYPBM-3 (Lewis et al., 1998). Among all three CYP2C9 substrate models previously reported (Jones et al., 1993, 1996a; Lewis et al., 1998) there is little difference in the diversity of molecules used. This is unlike the present study in which three models were constructed from unique data sets composed of a range of structurally diverse CYP2C9 inhibitors. These multiple data sets were important as they helped confirm multiple interaction determinants as necessary for inhibitors (hydrophobic and hydrogen bond acceptor/donor features) within the CYP2C9 active site. This is in agreement with what is known for substrates of CYP2C9 (Guengerich et al., 1986; Distelrath and Guengerich, 1987; Poli-Scaife et al., 1997; Haining et al., 1999). Interestingly, these multiple substrate and inhibitor binding determinants may explain the nonMichaelis-Menten kinetics displayed for desmethylnaproxen (Tracy et al., 1997) and N-hydroxydapsone formation (Korzekwa et al., 1998). The ability of these CYP2C9 substrates to illustrate both hetero- and homotropic activation in vitro (Korzekwa et al., 1998) is similar to other CYPs (Ekins et al., 1998; Korzekwa et al., 1998). It is possible that the generation of a pharmacophore for substrates that display this non-Michaelis-Menten behavior alone may be necessary for the determination of the specific structural features that elicit these kinetics, as previously demonstrated for CYP3A4 autoactivators (Ekins et al., 1999d).
The 3D- and 4D-QSAR modeling approaches taken in this study represent an indirect examination of the CYP2C9 active site. From these models we have been able to predict the interaction of some compounds with this CYP. Additional studies will undoubtedly refine these CYP2C9 models and may account for the compounds poorly predicted. This will require an iterative approach of adding further data to the model and evaluation with another set of test molecules. In the absence of a crystal structure for CYP2C9, researchers are limited to computational models of substrates and inhibitors or active sites homology-modeled on bacterial or other mammalian CYPs. Although the multiple 3D- and 4D-QSAR approaches taken in this study were not direct indicators of the active site, these results demonstrated that statistically predictive models of inhibitors of this CYP could be generated. In addition, these models could provide further information regarding the active site by the construction of a receptor surface model. These procedures would potentially confirm the few structural modeling studies conducted with CYP2C9 as well as providing more accurate drug interaction predictions with this CYP active site. The value of in silico approaches for predicting competitive type drug-drug interactions will become apparent after initial evaluation alongsideKi(apparent) values obtained with widely used in vitro systems.
Acknowledgments
We thank Dr. David Cummins for his help in the implementation of the F-test in the PLS algorithm, Robert Coner for assistance with permuting, and Dr. Patrick J. Murphy for his initial encouragement in pursuing this direction.
Footnotes
-
Send reprint requests to: Sean Ekins, Ph.D., Department of Drug Disposition, Lilly Research Laboratories, Eli Lilly and Co., Lilly Corporate Center, Drop Code 0730, Indianapolis, IN 46285. E-mail:ekins_sean{at}lilly.com
-
↵1 Present address: Glaxo Wellcome Research and Development, Medicines Research Center, Gunnels Wood Road, Stevenage, Hertfordshire, SG1 2NY, UK.
- Abbreviations used are::
- CYP
- cytochrome P450
- CoMFA
- comparative molecular field analysis
- 3D-QSAR
- 3 dimensional-quantitative structure activity relationship
- 4D-QSAR
- 4 dimensional-quantitative structure activity relationship
- LOO
- leave one out
- MS-WHIM
- molecular surface-weighted holistic invariant molecular
- PLS
- partial least squares
- 5RG×100
- five random groups repeated up to 100 times
- Received January 31, 2000.
- Accepted May 9, 2000.
- The American Society for Pharmacology and Experimental Therapeutics