0090-9556/04/3211-1218-1229$20.00
DMD 32:1218-1229, 2004


STRUCTURAL ANALYSIS OF CYP2C9 AND CYP2C5 AND AN EVALUATION OF COMMONLY USED MOLECULAR MODELING TECHNIQUES

Lovisa Afzelius, Florian Raubacher, Anders Karlén, Flemming Steen Jørgensen, Tommy B. Andersson, Collen M. Masimirembwa, and Ismael Zamora

Department of Drug Metabolism and Pharmacokinetics and Bioanalytical Chemistry (L.A., T.B.A., C.M.M.) and Department of Medicinal Chemistry (F.R.), AstraZeneca R&D, Mölndal, Sweden; Department of Organic Pharmaceutical Chemistry, Uppsala University, Uppsala, Sweden (L.A., A.K.); Lead Molecular Design, Barcelona, Spain (I.Z.); Institut Municipal d'Investigació Mèdica, Barcelona, Spain (I.Z.); and Department of Medicinal Chemistry, the Danish University of Pharmaceutical Science, Copenhagen, Denmark (L.A., F.S.J.)

(Received February 26, 2004; accepted July 12, 2004)


    Abstract
This work had two separate aims: to evaluate different modeling techniques and to make a detailed structural characterization of CYP2C9. To achieve these goals, the consensus principal component analysis (CPCA) technique and distance measurements were used to explore available crystal structures, newly built homology models, and repeated molecular dynamics simulations. The CPCA was based on molecular interaction fields focused on the active site regions of the proteins and included detailed amino acid analysis. The comparison of the CYP2C9 and CYP2C5 crystal structures revealed differences in the flexible regions such as the B-C and F-G loops and the N and C termini. Cross homology models of CYP2C9 and CYP2C5, using their respective crystal structures as templates, indicated that such models were more similar to their templates than to their target proteins. Inclusion of multiple templates slightly improved the similarity to the crystal target in some cases and could be recommended even though it requires a careful manual alignment process. The application of molecular dynamics simulations to highly flexible proteins such as cytochromes P450 is also explored, and the information is extracted by the CPCA. Advantages and drawbacks are presented for the different modeling techniques. Despite the varying modeling success, the models give insight and understanding by the mutual forming and discarding of hypotheses. This is a dynamic process since the crystal structures are improving with time and, therefore, the answers to the models are also changing accordingly.


The cytochrome P450 (P450) superfamily plays a fundamental role in the metabolism of xenobiotics and endogenous compounds. Extensive research toward understanding the mechanism of P450-catalyzed reactions and of P450 substrate or inhibitor interactions has been done. The complexity of the reaction cycle and the lack of crystal structure information on membrane-bound P450 limited progress. Crystal structure data on mammalian cytochrome P450s started to emerge a few years ago. Before that, homology models were based on structure coordinates of bacterial isoforms (Sevrioukova et al., 1999). The bacterial P450s have low sequence similarity to mammalian P450s. The great structural conservation within the P450 superfamily (Graham-Lorence and Peterson, 1996) was, however, exploited toward the generation of many homology models (Lewis et al., 1998; Payne et al., 1999; Dai et al., 2000).

In January 2000 the structure of the first mammalian cytochrome P450, a construct of the rabbit isoform CYP2C5, 2C5/3LVdH, was deposited in the Brookhaven Protein Data Bank (Brookhaven PDB) as 1dt6 (Williams et al., 2000). In addition to the deletion of the membrane-bound N terminus, mutations based on the corresponding amino acids in the rabbit CYP2C3 sequence increased solubility and facilitated the crystallization of the protein. This was a landmark in homology modeling for the human isoforms due to the high sequence similarity of CYP2C5 with the human CYP2Cs. Numerous homology models are now based on this structure and have been used to evaluate substrate specificity and interactions, site of metabolism, and pharmacophore models (Lewis, 2000; Afzelius et al., 2001; Ridderstrom et al., 2001; de Groot et al., 2002; Lewis, 2002).

The next mammalian P450 structure published was from the same 2C5/3LVdH construct but with a dimethyl derivative of sulfaphenazole cocrystallized (1n6b) (Wester et al., 2003a). This was done at an improved resolution, from 3.0 Å to 2.3 Å, and the previously unresolved F-G loop was now captured and revealed two short helices denoted F' and G' within the loop. The B-C region was also better resolved and uncovered a B'-helix that is also present within many bacterial species (Peterson and Graham, 1998). Using modeling techniques, it was shown that two substrates could fill the electron density in the active site in two different binding modes. The construct was later also solved in complex with diclofenac (1nr6) (Wester et al., 2003b). In this complex, the site of metabolism of diclofenac was found at a reasonable distance (4.4 Å) from the iron to facilitate hydroxylation. At the same time, the first human cytochrome P450s were crystallized: a mutated form of CYP2C9 in the substrate-free form (1og2) and in complex with the anticoagulant warfarin (1og5) (Williams et al., 2003). In the latter structure an unexpected binding mode was seen where the site of oxidation was positioned 10 Å from the heme. Other crystal structures that have been solved include the substrate-free form of 2B4 (1PO5), which was resolved to 1.6 Å. The structure is captured as a reversible homodimer in an open conformer where His 226 forms an intermolecular coordinate bond to the heme iron. An ~15-Å-wide cleft leads down to the heme. It will be of greatest interest to study changes upon substrate binding for 2B4 and the plausible closure of the active site around the substrate to elucidate the conformational freedom available to these proteins. Structures that have been solved also include the human 2C8 to 2.7 Å (1PQ2) and the recent release (June 15th, 2004) of the wild-type human 2C9 cocrystallized with flurbiprofen to 2.0 Å (1R90), in which no mutations except those in the terminal ends have been made.

The crystal structures provide a significant amount of data that can be explored by computational methods to improve our understanding of P450 enzymes. The information can be used to validate homology models and predictors of inhibition/metabolic stability/activation, but also to build new hypotheses.

Nevertheless, a tool for analysis is required to put all this information in concrete form. In this work we present the novel application of consensus principal component analysis (CPCA) to explore modeling success, which is often evaluated based on docking and stereochemical parameters (Szklarz and Halpert, 1997; Kirton et al., 2002). The work also includes, to our knowledge, the first use of CPCA as a tool for evaluating molecular dynamics simulations. Selectivity analysis between isoforms is also performed successfully with this methodology, which has been reported previously (Kastenholz et al., 2000; Ridderstrom et al., 2001). The CPCA was used for a comparative analysis of available crystal structures to homology models based on single or multiple templates and snapshots from molecular dynamics simulations for CYP2C9 and CYP2C5, respectively. The analysis is restricted to the active site of the proteins, which is described by molecular interaction fields (MIFs) calculated by the program GRID (Goodford, 1985). Multivariate data analysis is applied to these descriptors to identify selective regions of the MIFs. The selective MIFs highlight regions and types of interactions in the binding site that reflect differences between the structures.

This analysis gave an increased general understanding of the structural characteristics of each isoform; selective regions were identified and changes induced by binding were traced. The study also validated the performance of computational techniques such as homology modeling and molecular dynamics simulations.


    Materials and Methods
Equipment and Software. Molecular interaction fields were calculated in an Irix environment on a Silicon Graphics O2 workstation (Silicon Graphics Inc., Mountain View, CA) and in a Linux environment on a 32MB personal computer. The software utilized in the computational analysis was GRID v21 (Molecular Discovery Ltd., http://www.moldiscovery.com), GOLPE (Baroni et al., 1993), MetaSite (Zamora et al., 2003), MODELLER v6.1 (http://salilab.org/modeller/modeller.html), SYBYL 6.5.3 (Tripos Associates Inc., St Louis, MO), PROCHECK v.3.4.3 (Laskowski et al., 1993), ClustalX (ftp://ftp-igbmc.u-strasbg.fr/pub/ClustalX/clustalx1.83.alpha.tar.gz), and AMBER 7 (Case et al., 2002).

Overview. The analysis was performed in different steps. In the first step, all available crystal structures of CYP2C9 and CYP2C5 were compared to show whether high sequence similarity proteins share the same geometry and interaction features in the binding site and to recognize selective regions that correspond to isoform-specific reactions.

Second, the interaction maps of the active site of homology models of CYP2C9 and CYP2C5, based on different templates, were compared with the crystal structures. This enables the evaluation of techniques available for work on proteins with a high degree of similarity.

In the third step, snapshots from molecular dynamics simulations of CYP2C9 and CYP2C5 crystals in explicit water were analyzed to determine whether they could capture changes upon substrate binding and to determine which parts of the cavity were more flexible and could participate in substrate recognition and access. Finally, all structures, crystals, homology models, and molecular dynamics conformations for both CYP2C5 and CYP2C9 were compared in a CPCA to see whether they overlap in chemical space and how they intracorrelate dependent on the modeling technique used.

Protein Homology Modeling. Comparative modeling techniques were used to prepare homology models for CYP2C9 and CYP2C5. This process requires one or several homologous crystal structures referred to as template structures. The amino acid sequence for the desired protein is referred to as the target. The crystal template structures were selected from a PSI-BLAST search against the Brookhaven PDB to identify suitable template structures for comparative modeling (Kirton et al., 2002). The following templates (Table 1) were downloaded from the Brookhaven PDB (http://www.rcsb.org/pdb/): P450BM-3 (1bu7; Sevrioukova et al., 1999), P450terp (1cpt; Hasemann et al., 1994), P450eryF (1eup; Cupp-Vickery et al., 2000), P450cam (1phc; Poulos et al., 1986), CYP2C9 (1og2; Williams et al., 2003), CYP2C9 with warfarin cocrystallized (1og5; Williams et al., 2003), CYP2C5/3LVdH (1dt6; Williams et al., 2000), CYP2C5/3LVdH with 4-methyl-N-methyl-N-(2-phenyl-2H-pyrazol-3-yl)benzenesulfonamide (1n6b; Wester et al., 2003a), and CYP2C5/3LVdH with diclofenac (1nr6; Wester et al., 2003b). The target and template sequences were downloaded from the SWISS-PROT data bank (http://us.expasy.org/sprot/).


TABLE 1 Crystal structures used as templates in homology models

 

A total of 30 homology models, 15 for CYP2C9 and 15 for CYP2C5, were built using single or multiple templates (Table 2). The exact same procedures were used to generate models for CYP2C9 and CYP2C5 independently, the only difference being that the crystal structure of CYP2C9 was the template for the CYP2C5 models and the crystal structure of CYP2C5 was the template for the CYP2C9 models. First, models were built based only on the mammalian CYP2C templates (CYP2C5 or CYP2C9). The two-dimensional alignment was made in ClustalX and, due to the great similarity, the deviations were small. Second, the crystal structures of BM3 were aligned to either of the human templates based on the three-dimensional structure in the MALIGN module in MODELLER with different gap penalties (0–2, 2–4, and 4–6). These alignments were considered as a profile in ClustalX and the target sequence was aligned toward it. Finally, the same procedure was repeated for the template CYP2C and multiple bacterial templates (P450BM-3, P450terp, P450eryF, and P450cam).


TABLE 2 Homology models built for CYP2C9 and CYP2C5

 

When this preliminary alignment was done in MALIGN and ClustalX, a detailed manual alignment process continued. The secondary structures of the sequences were predicted using PSIPRED (http://bioinf.cs.ucl.ac.uk/psipred/). In the alignment, gaps within predicted helices and sheets were removed to preserve the secondary structure since it is known to be well conserved throughout the superfamily of P450s (Graham-Lorence and Peterson, 1996). Thereafter, MODELLER was set to generate five models for each of the three alignments, giving 15 different models each for CYP2C9 and CYP2C5 (Table 2). To ensure quality, in terms of geometry, the stereochemical parameters were checked in Ramachandran plots calculated in PROCHECK. The backbone RMSD toward template and target was calculated in SYBYL using the "Align structure using homology" mode (Tables 1 and 2).
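The backbone RMSD mentioned above can be illustrated with a minimal sketch. It assumes the two structures have already been superimposed and that their backbone atoms are paired one to one; the function name and the coordinates are hypothetical and are not taken from SYBYL or from the structures in this study.

```python
import math

def backbone_rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two paired, pre-superimposed
    lists of (x, y, z) backbone coordinates, in the same units as the input."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and equally long")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))

# Invented coordinates (Å) for three paired backbone atoms.
model = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (3.0, 0.0, 0.0)]
crystal = [(0.0, 0.0, 1.0), (1.5, 0.0, 1.0), (3.0, 0.0, 1.0)]
rmsd = backbone_rmsd(model, crystal)
print(f"backbone RMSD = {rmsd:.2f} Å")
```

Computing the same quantity once against the template and once against the target gives the two values that are compared for each model.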

Molecular Dynamics Simulations. Setup and equilibration of CYP2C9 and CYP2C5 in solution. The molecular dynamics simulations were made using the SANDER module in AMBER 7 (Case et al., 2002). The force field used was AMBER99 (Cornell et al., 1995). The force field parameters for the heme group were adopted from the parameterization available from the parameter database of the AMBER force field (Giammona, 1984). Using the crystal structures of CYP2C9 (1og2) and CYP2C5 (1dt6), hydrogen atoms were added and crystallographically ambiguous side chain orientations were optimized using Reduce (Word et al., 1999). The hydrogen atoms were minimized for 500 optimization steps in vacuum, keeping the heavy atoms of the protein fixed. For an initial relaxation of the protein, this structure was minimized for 150 steps using a position restraint on the backbone. Two different starting structures were used in the molecular dynamics simulations, the first with the oxygen bound to the iron (the oxyferryl species) and the second without an oxygen bound to the iron (Ortiz de Montellano and De Voss, 2002).

To further relax the structure in an aqueous solution, the protein was immersed in a cubic box of TIP3 water. Electroneutrality was achieved by adding sodium ions. The water molecules were minimized and then equilibrated for 25 ps with a molecular dynamics simulation at constant volume and a temperature ramp from 5 K to 300 K, keeping the protein rigid. The position restraint on the protein was gradually removed in subsequent minimization steps. Next, the entire system (protein, water molecules, and counter ions) was heated from 5 K to 300 K over 20 ps and equilibrated for 200 ps in an additional constant pressure-constant temperature molecular dynamics simulation. The system was then run for a subsequent trajectory of 800 ps used for structural analysis.

A time step of 2 fs was used for the Particle–Mesh–Ewald molecular dynamics simulation with a nonbonded list update every 10 steps. All bonds involving hydrogens were constrained with the SHAKE (Ryckaert et al., 1977) algorithm. A cut-off of 9 Å was applied. The temperature/pressure was maintained by the Berendsen weak-coupling scheme (Berendsen et al., 1984).
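As a rough aid to reading the protocol, the following sketch converts the stated stage lengths into integration steps. It assumes, which the text does not state explicitly, that the 2-fs time step applies to every stage; the stage names are labels chosen here for illustration.

```python
TIME_STEP_FS = 2                 # stated time step
NONBONDED_UPDATE_INTERVAL = 10   # stated nonbonded list update frequency (steps)

# Stage lengths in ps as given in the protocol; the names are illustrative.
stages_ps = {
    "water_equilibration": 25,
    "heating": 20,
    "equilibration": 200,
    "production": 800,
}

def steps_for(duration_ps, time_step_fs=TIME_STEP_FS):
    """Number of integration steps needed to cover duration_ps."""
    return duration_ps * 1000 // time_step_fs

steps = {name: steps_for(ps) for name, ps in stages_ps.items()}
list_updates = steps["production"] // NONBONDED_UPDATE_INTERVAL

for name, n in steps.items():
    print(f"{name}: {n} steps")
print(f"nonbonded list updates during production: {list_updates}")
```

Under that assumption the 800-ps analysis trajectory corresponds to 400,000 steps.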

The changes in the intracorrelation of secondary structure elements during the two runs of molecular dynamics simulations for CYP2C9 were examined by measuring distances between α-carbons in the backbone for certain selected amino acids (see Table 4). This was done to obtain indications of possible substrate access channels. For Lys72, the terminal nitrogen position was also monitored, since this flexible side chain had been suggested to gate an access channel selective for anionic compounds (Williams et al., 2003).
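The distance monitoring described above can be sketched as follows. This is a minimal illustration, not the procedure used in the study: each snapshot is represented as a hypothetical dictionary from a residue label to its α-carbon coordinates, the residue pair is chosen only as an example, and all coordinates are invented.

```python
import math

def distance(p, q):
    """Euclidean distance between two (x, y, z) points."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def track_distance(snapshots, residue_a, residue_b):
    """Distance between the alpha-carbons of two residues in each snapshot."""
    return [distance(s[residue_a], s[residue_b]) for s in snapshots]

# Invented alpha-carbon coordinates (Å) for two snapshots.
snapshots = [
    {"Lys72": (0.0, 0.0, 0.0), "Pro211": (3.0, 4.0, 0.0)},
    {"Lys72": (0.0, 0.0, 0.0), "Pro211": (6.0, 8.0, 0.0)},
]
distances = track_distance(snapshots, "Lys72", "Pro211")
print(distances)
```

A widening distance between two structural elements over the trajectory would be one indication of a channel opening between them.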


TABLE 4 Distances from the MD snapshots for CYP2C9 compared to crystal data

 

Molecular Interaction Field Calculations using GRID. All protein structures (see Tables 1 and 2) were 3D aligned to the CYP2C9 crystal structure (1og2) in SYBYL based on the smallest RMSD to the backbone atoms. The proteins were imported into the GRID interface called Greater. In GRID, the carboxy terminus of a protein and the carboxy groups of Asp and Glu are treated by default as anionic. Similarly, the N-terminal nitrogen and Arg and Lys side chains are treated as cationic, and the overall net charge of the protein is then calculated by summation. This is an arbitrary calculation that gives an overall net charge. If the charge differs between the proteins that are studied, the electrostatic effect can dominate the predicted interactions of charged probes. Therefore, the proteins were made neutral by positioning movable Na+ or Cl- ions at energy-minimum positions calculated by GRID. Since these ions are movable they will not compete with a probe (see the GRID manual http://www.moldiscovery.com/docs/grid21/index.html). Secondly, a box was defined to include the heme, the active site, and access channels. The exact same box size, 34 × 30 × 40 Å, was used for all calculations to enable the subsequent comparison in the statistical software GOLPE. The MOVE directive, which alters flexibility of the target, was set to 0 (rigid mode) and the LIST directive, which defines the file output format, to -2. The default values were used for the other parameters. The following probes were included: OH2, DRY, C3, N1, N1+, O-, and O (see Table 3).
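The charge summation described above can be sketched in a few lines. This is only an illustration of the counting rule stated in the text (Asp, Glu, and the C terminus anionic; Arg, Lys, and the N terminus cationic); it ignores the heme and any other cofactor, is not GRID's own code, and uses an invented sequence.

```python
ANIONIC = {"D", "E"}    # Asp, Glu side chains
CATIONIC = {"R", "K"}   # Arg, Lys side chains

def net_charge(sequence):
    """Net charge of a one-letter sequence under the simple counting rule."""
    positive = sum(1 for aa in sequence if aa in CATIONIC) + 1  # + N terminus
    negative = sum(1 for aa in sequence if aa in ANIONIC) + 1   # + C terminus
    return positive - negative

def counter_ions(charge):
    """Ion type and count needed to neutralize the given net charge."""
    if charge > 0:
        return ("Cl-", charge)
    if charge < 0:
        return ("Na+", -charge)
    return (None, 0)

toy_sequence = "MKDEEAR"   # invented example, not a P450 sequence
charge = net_charge(toy_sequence)
print(charge, counter_ions(charge))
```

Neutralizing every structure in this way puts the charged probes on a comparable footing across proteins.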


TABLE 3 Probes used in the CPCA

 

Active Site Cut-Out File Generated in MetaSite. Many interaction points that are calculated in GRID within the defined box are not accessible to the substrate. Therefore, the nonaccessible points were deleted in the subsequent CPCA to remove noise from the analysis. Cut-out files were prepared for each of the crystal structures and then merged into a single file. This was done in MetaSite, which is a program used to predict the site of metabolism for the most common cytochromes (CYP3A4, CYP2C9, CYP2D6, CYP2C19, and CYP1A2) (Zamora et al., 2003). The program automatically detects the protein cavity starting from the protein reactive center, the oxygen bound to the iron, as implemented in MetaSite (Fig. 1).



FIG. 1. The CYP2C9 crystal structure, in magenta, is visualized together with the box defining the outer borders for GRID calculations. a, raw molecular interaction fields (cyan) calculated in GRID and imported to GOLPE. b, MetaSite definition of the active-site regions that are in direct access to the heme. c, molecular interaction fields after pretreatment in GOLPE. The coordinates defined in MetaSite (b) were used to select only the points from the molecular interaction fields that correspond to the active site to focus the subsequent analysis.

 

Consensus Principal Component Analysis (CPCA). The molecular interaction fields generated in GRID were imported into GOLPE. Pretreatment was applied to focus on the areas of interest and increase the signal-to-noise ratio: 1) grid points within a radius of 4 Å from the active site cut-out file were selected using the cut-out tool (Fig. 1), 2) all positive energies, corresponding to close contacts between the probe and the protein, were excluded, together with all variables showing a variance of less than 0.01 standard deviation, and 3) block unscaled weights were applied. This is a scaling factor that depends on the variance for each block. All blocks are given the same variance to normalize the interaction energies between the probes. The data set was then suitable for the subsequent multivariate analysis, the CPCA. The CPCA is an algorithm developed (Kastenholz et al., 2000) from the principal component analysis (PCA).
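Pretreatment steps 2 and 3 can be sketched as below. This is an illustrative approximation and not GOLPE's implementation: it assumes that excluding positive energies means setting them to zero, that the variance filter is a standard deviation threshold of 0.01, and that block scaling gives each probe block a total variance of 1. The function names and the data are hypothetical.

```python
from statistics import pstdev, pvariance

def pretreat_block(rows, min_sd=0.01):
    """rows: one list of grid-point energies per protein (one probe block).
    Zero the positive energies, then drop near-constant grid points."""
    clipped = [[min(e, 0.0) for e in row] for row in rows]
    kept = [col for col in zip(*clipped) if pstdev(col) >= min_sd]
    if not kept:
        return [[] for _ in rows]
    return [list(row) for row in zip(*kept)]

def block_scale(rows):
    """Scale a block so that its total (population) variance equals 1."""
    total_var = sum(pvariance(col) for col in zip(*rows))
    if total_var == 0:
        return rows
    factor = total_var ** -0.5
    return [[e * factor for e in row] for row in rows]

# Invented energies (kcal/mol) for two proteins at three grid points.
block = [[-1.0, 0.5, -0.2],
         [-5.0, 0.8, -0.2]]
filtered = pretreat_block(block)
scaled = block_scale(filtered)
print(filtered, scaled)
```

In this toy block only the first grid point survives: the second holds only repulsive energies and the third does not vary between the proteins.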

A principal component analysis identifies "underlying" data structure or variables that best summarize the information of the original descriptors by explaining the variance in the data. PCA is a technique that reduces the dimensionality of a data matrix to a smaller number of underlying variables called principal components or latent variables, which are linear combinations of the original variables. To a greater or lesser extent, all descriptors contribute to the component extraction. The first principal component is a line through the data space that explains the data with the least squares residual. This type of fitting ensures that the first component explains the maximum variance. The second component is orthogonal to the first, is derived from the residuals obtained from the first component, and explains as much as possible of the variance that the first component left unexplained. The first two components generate a plane to which all objects can be projected. This plane is called the score plot and describes the position of the observations based on the latent variables. Observations that are close in the score plot have a comparable variance and are similar. In addition, the methodology provides the loadings plot, in which the importance of each of the original descriptors in defining the latent variable is described. The score plots (observations; e.g., the proteins) and loading plots (descriptors; e.g., molecular interaction energies for each grid point) are related. Variables that are positively correlated to an observation are positioned at the same place in the loadings plot as the observation in the score plot. Since we are dealing with only negative energies here, where more negative interactions are more favorable, the interpretation is reversed, so that superimposed scores and loadings are negatively correlated and an observation in the score plot is positively correlated to the loadings at the same coordinates, but with opposed signs. 
In the PCA it can be difficult to distinguish the relative influence of different probes if more than two objects are studied. In these cases, the analysis benefits from a CPCA, to enable the recognition of single amino acids responsible for binding by studying each probe both individually and under the influence of all probes.
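The extraction of a first principal component can be illustrated with a small, dependency-free sketch. It uses a simple power-type iteration on mean-centered data, which is one common way to obtain scores and loadings and is not claimed to be the algorithm implemented in GOLPE; the data matrix is invented.

```python
import math

def center(rows):
    """Subtract the column mean from every variable."""
    n = len(rows)
    means = [sum(col) / n for col in zip(*rows)]
    return [[v - m for v, m in zip(row, means)] for row in rows]

def first_component(rows, iterations=200):
    """Scores (one per observation) and unit-length loadings (one per
    variable) of the first principal component."""
    x = center(rows)
    n_vars = len(x[0])
    loadings = [1.0 / math.sqrt(n_vars)] * n_vars
    for _ in range(iterations):
        scores = [sum(v * p for v, p in zip(row, loadings)) for row in x]
        new = [sum(t * row[j] for t, row in zip(scores, x)) for j in range(n_vars)]
        norm = math.sqrt(sum(v * v for v in new))
        if norm == 0.0:
            break
        loadings = [v / norm for v in new]
    scores = [sum(v * p for v, p in zip(row, loadings)) for row in x]
    return scores, loadings

# Invented data: three observations, two perfectly correlated variables.
data = [[1.0, 2.0], [2.0, 4.0], [3.0, 6.0]]
scores, loadings = first_component(data)
explained = sum(t * t for t in scores) / sum(v * v for row in center(data) for v in row)
print(scores, loadings, explained)
```

The scores place the observations along the component (the score plot), while the loadings show how much each original variable contributes to it (the loadings plot).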

In the CPCA (Kastenholz et al., 2000), the data that originate from several probes are intrinsically organized in blocks in the descriptor matrix; i.e., the results for each probe generate a block in the matrix. The CPCA can be considered as a PCA on two levels. First, the principal components are derived for the entire descriptor matrix at a superlevel which is identical to the usual PCA. Second, the principal components are derived on a block level (= probe). The second derivation of principal components is not exactly a standard PCA procedure since the principal components are rotated to reproduce the scores from the superlevel PCA. The minimization criteria are not just to get the lowest residual values, but also to reproduce the values obtained in the PCA level. Consequently, the first principal component in the block level does not need to explain the maximum variance as in a normal PCA; instead, it reflects the importance of a specific probe in the superlevel analysis.
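The two-level structure can be written compactly. The following is one common formulation of consensus PCA from the chemometrics literature, given here as a sketch under assumed notation; it is not necessarily the exact procedure implemented in GOLPE. The symbols are assumptions introduced for illustration: $X_b$ is the pretreated block for probe $b$ ($b = 1, \dots, B$), $t_T$ the superlevel score vector, $p_b$ and $t_b$ the block loadings and block scores, and $w_T$ the superweights.

```latex
% For one component, iterate until t_T converges:
\begin{align}
  p_b &= \frac{X_b^{\mathsf T} t_T}{t_T^{\mathsf T} t_T},
  \qquad p_b \leftarrow \frac{p_b}{\lVert p_b \rVert}
  && \text{(block loadings from the super scores)} \\
  t_b &= X_b\, p_b
  && \text{(block scores)} \\
  T &= [\, t_1, \dots, t_B \,],
  \qquad w_T = \frac{T^{\mathsf T} t_T}{t_T^{\mathsf T} t_T},
  \qquad w_T \leftarrow \frac{w_T}{\lVert w_T \rVert}
  && \text{(superweights)} \\
  t_T &= T\, w_T
  && \text{(updated super scores)}
\end{align}
% Deflation of every block with the converged super scores:
\begin{equation}
  X_b \leftarrow X_b - t_T\, \frac{t_T^{\mathsf T} X_b}{t_T^{\mathsf T} t_T}
\end{equation}
```

In this notation each element of $w_T$ measures how strongly one probe block contributes to a superlevel component, which is the quantity a superweight plot displays.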

All models that are discussed were calculated using the seven probes OH2, DRY, C3, N1, N1+, O-, and O. The different steps in the analysis are visualized in Fig. 2. The connection between the first and second level, i.e., the influence of each block in the description of the score plot for the entire set, is analyzed by the CPCA superweight plot (Fig. 2a). From this level, the relative importance of each probe is distinguished, as the plot highlights the influence of the different blocks on the principal components. For each probe the results are visualized in a score plot (Fig. 2b) in which the objects are positioned based on their similarity and the variance is explained. The score plot is a summary of the relationship among the observations. The loading plots (Fig. 2, c and d) show the importance of each variable to the patterns seen in the score plot. The loading and score plots are related as in the PCA case. The encircled loadings are responsible for the selectivity of the 2C9 family (negative energies of interactions are favorable and, therefore, the positions of the loadings are superimposed with opposed signs). These corresponding loadings, the molecular interaction field descriptors, can then be visualized together with the protein to distinguish regions of importance for selectivity (Fig. 2, c, d, and e).



FIG. 2. CPCA on all crystal structures of CYP2C5 and CYP2C9 based on molecular interaction fields calculated in GRID. This series of pictures explains the route for the interpretation of the CPCA results: a, superweights plot describing the influence of the different probes in the analysis (PC1 and PC2 = principal component 1 and 2, respectively); b, PCA score plot that shows the intracorrelation between observations; c, PCA loadings plot that is complementary to the score plot with the selective loadings for CYP2C9 versus CYP2C5 marked in red. The loadings plot explains which loadings are responsible for the results in the score plot. Since the variables are negatively correlated (negative energies of interaction are favorable), the loadings that are responsible are at the exact inverse position in the loadings plot compared with the score plot; d, the selective loadings from c visualized in space; e, interpretation of molecular interaction fields corresponding to the selective loadings in c within the active site of the protein to suggest areas of importance for selectivity.

 


    Results
In this work the massive amount of data, experimental and calculated, made it necessary to base the analysis on a multivariate analysis technique such as the CPCA. This type of analysis enables the reduction of the data dimensionality and the recognition of patterns in data. Each pattern can then be correlated to the actual discriminative structural characteristics. The protein structures analyzed can thereby be grouped together with other similar structures. If there is no matching pattern for a certain structure, it is nontypical compared with the rest of the data set. In some of the cases described below, the structural characteristics are explored on an amino acid level. In other cases, general relationships between structures are more informative.

Crystal Structures of CYP2C9 versus CYP2C5. First, the differences between the crystal structures of CYP2C9 and CYP2C5 are highlighted since they are also highly influential in the later analysis of the homology models. The comparison between the crystal structures with and without the bound substrate has been extensively described in their original publications (Wester et al., 2003a,b; Williams et al., 2003) and will therefore not be described in depth here. Visual inspection and distance measurements between the crystal structures of CYP2C9 and CYP2C5 (without substrates bound) (Table 4; Fig. 3) reveal that the greatest changes occur in the merging region of the B-C loop, the F-G loop, and the N terminus. This is also reflected in the B-factors of the structures (Fig. 4a). In CYP2C9, the C-terminal loop (Val473, Asn474, Gly475, and Phe476) and the N-terminal loop (Ile47, Lys48) are closer to each other, making the distance between the C-terminal and the F-G loops (Ile207, Leu208, Ser209, Ser210, Pro211, and Trp212) larger than in the CYP2C5 structure. In the CYP2C5 structure, the C terminus (Val470, Asn471, Gly472, and Phe473) and the F-G loops (Leu207, Leu208, Gly209, Thr210, Pro211, and Trp212) are closer to each other and the active site is smaller, since they are positioned nearer to the heme than in the CYP2C9 structure. The F-G loop and its F' and G' helices seem to form a lid to the active site. In the CYP2C5 structure, the lid restricts the active site, which also gives rise to a displacement of the B-C loop. However, the low resolution of the B-C loop for the CYP2C5 crystal structure and the possible influence of mutations in the F-G loop of the CYP2C9 structure have to be kept in mind. In comparison, the B-C loop in the CYP2C9 structure is well determined and more ordered with the presence of a B'-helix.



FIG. 3. Visualization of the active-site regions of the crystal structure of CYP2C9 (1og5) in magenta versus the crystal structure of CYP2C5 (1dt6) in cyan. This representation has been used consistently throughout the paper.

 


FIG. 4. a, CYP2C9 crystal structure colored by the B-factors. b, averaged B-factors from the molecular dynamics simulations for CYP2C9. Secondary structures shown are the regions around the F-G loop and the B-C loop. Lowest to highest B-factors are in the order green, yellow, red, purple, and blue.

 

The results of a CPCA, including all crystal structures, with and without substrates bound, available from the PDB for CYP2C9 and CYP2C5 are shown in Fig. 2. The superweights plot (Fig. 2a) describes the influence of each probe in the analysis. In this plot, all probes except the hydrophobic DRY probe (denoted 2) have a similar contribution in the first component. The DRY probe, however, is of similar importance in the second component. The PCA score plot (Fig. 2b) shows that the first component (x-axis) discriminates between 2C9 (1og5 and 1og2) and 2C5 (1dt6, 1nr6, and 1n6b) since they have opposing coordinates for the first principal component. The second component discriminates between substrate-free 2C5 (1dt6) and substrate-bound 2C5 (1nr6, 1n6b). The most important discriminant loadings for CYP2C9 (first component, marked in red) described by the H2O probe are shown in Fig. 2c, and their structural interpretation is visualized as raw grid points (Fig. 2d) and as molecular interaction fields together with the structure (Fig. 2e). The loadings are used to narrow down single amino acids that could be responsible for the selectivity between the isoforms. This process can give reasonable suggestions for site-directed mutagenesis to convert substrate specificity or can be used for ligand design to achieve selectivity.

The water probe (Fig. 2e) shows selectivity for CYP2C9 in a corner of the active site where Asn107, Gly109, and the backbone of Arg108, all in the B-C loop, could interact, possibly via a water molecule, with Asp293 and Asn289 in the I-helix. Asp293 has previously been suggested from mutagenesis studies to have a structural role in substrate recognition and catalytic activity. This is further supported by the conservation of the Asp residue over other P450 families (Flanagan et al., 2003). Arg108, which has also been suggested to be of functional importance from mutagenesis data (Ridderstrom et al., 2000), is pointing away from the active site. This might be explained by the conformation of the B-C loop in the crystal structure, since this region is highly flexible and a charge-charge interaction would be favorable. This interaction is not possible in the CYP2C5 structure since the B-C loop is oriented differently (Fig. 3). In the region of the active site where the F-G loop (Ile208, Ser209, Ser210, and Gln214) approaches the region of the C terminus and the backbone carbonyl of Asn474, another favorable hydrogen bonding possibility is found. The DRY probe has three distinct regions in the active site that are also known to be important from mutagenesis studies (Melet et al., 2003): one close to Val113 and Phe114, a second one in the proximity of Phe476 and Phe100, and a third one near Leu102 and Ala103. The C3 probe is important in describing steric changes, and in this case, the most important loadings identify a pocket near the heme close to Thr301, Thr304, Ala477, and Leu361. The N1 probe is a mimic of an amide nitrogen and has a possibility to donate one hydrogen in a favorable direction to any available lone pair of electrons. 
The carbonyl of the peptide backbone between Ser478 and Val479 is highly involved, but the hydroxyl group of Thr301 and the peptide bond carbonyl between Leu361 and Leu362 are also directionally well positioned to interact with a hydrogen bond-donating group. Another selective patch is due to the carbonyl side chain of Gln214, and the backbone carbonyl groups between Leu208-Ser209 and Ser210-Pro211 and on the opposite side of the pocket between Asn474 and Gly475. Asn217 is another amino acid of importance in selectivity. The last selective patch seen for this probe is due to Asp293 and the backbone carbonyls between Arg108 and Gly109. The N1+ probe is the cation of an sp3-hybridized amine and should therefore emphasize interactions with negatively charged amino acids, e.g., the carboxylic acids of aspartic and glutamic acid. The selectivity is mainly confined to Asp293, but with minor contributions from Asp360 and Asp49. The latter two amino acids are positioned on the protein surface and are therefore of little or no importance for the active site binding. The last two probes included in the analysis describe a phenolate and a carbonyl oxygen, respectively, and both have the possibility to accept two hydrogen bonds. The phenolate is negatively charged and will therefore interact with positively charged amino acids such as lysine and arginine. The loadings analysis showed that exactly the same positions were selective for the O- probe and the O probe and, thus, only the interactions for the O probe are further described. The carbonyl probe finds several selective regions, mainly close to the heme and Thr301. This highly conserved amino acid is of great importance to the proton transfer path (Schlichting et al., 2000) in the catalysis of substrates.

Homology Models versus Crystal Structures. In Fig. 5, a and b, the PCA score plots for homology models versus their targets are visualized for CYP2C9 and CYP2C5 separately. In Fig. 5c, the information included in Fig. 5, a and b, is combined in a single analysis. It is obvious from this simple statistical analysis of the active site that the models are close mimics of their templates. That is, the CYP2C9 homology models closely resemble the CYP2C5 crystal structure from which they were built and not the target, the CYP2C9 crystal structure. Models built from multiple templates are in one case (CYP2C5_3D:B_1-5; Fig. 5b) more similar to their target than to their templates. The idea of retaining conserved regions from multiple templates can be rewarding if the alignment is successful, as in the case mentioned. This can be a difficult task in cases where the sequence identity is low (i.e., lower than 30%). Then, the alignment has to be performed by initially introducing a low gap penalty to increase the weight of the low-identity structure. In the next step, gaps must be removed to retain the secondary structure. In this procedure, it can be difficult to determine, for example, which is the first and last amino acid in a secondary element, since secondary structure predictors give slightly different results.
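The 30% identity threshold mentioned above refers to a quantity that can be read directly from a pairwise alignment. The following minimal sketch shows one common way to compute it; the function name is hypothetical, and the choice of denominator (aligned, non-gap columns only) is an assumption, since several conventions exist and the one used in this work is not stated.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over columns where neither sequence has a gap.

    aligned_a, aligned_b: equal-length strings from a pairwise alignment.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue          # skip gapped columns
        compared += 1
        if a == b:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0
```

Because gapped columns are excluded, lowering the gap penalty can change the reported identity, which is one reason the manual removal of gaps described above needs care.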



FIG. 5. a, PCA score plot of CYP2C9 homology models versus CYP2C9 crystal structures. The crystal structures are separated from all homology models, since the models are more similar to each other than to the crystal structures that they should mimic. b, PCA score plot of CYP2C5 homology models versus CYP2C5 crystal structures. In this case the crystal structures and the homology models based on multiple templates are selective in the first component compared with homology models based on fewer templates. c, PCA score plot of all homology models and crystal structures of CYP2C9 and CYP2C5. The results show that the homology models are closer to their templates than to the targets that they are supposed to describe, except for the CYP2C5 multiple-template case.

 

Even though spatial restraints force the geometry of the homology models toward the template, the influence of the different amino acid sequences makes the homology model resemble the target. The influence of different amino acid sequences on a ligand can be explored in different ways and is a measure of the quality of the model. An excellent test is to dock compounds that are substrates of both 2C9 and 2C5 but with different sites of metabolism and see how well this outcome is predicted. However, such specific information is difficult to obtain. Also, due to the low specificity and selectivity among isoforms, commercial docking algorithms most often give rise to multiple binding modes. Scoring functions are seldom able to pick the preferred orientation, in which the site of metabolism points toward the heme at a reasonable distance for a reaction to occur. In this work, we have explored the active sites of crystal structures versus homology models with different probes and then focused on regions that have major differences defined by a CPCA differential plot, which signals regions of lower model quality. Docking solutions of ligands that are predicted in these regions are therefore questionable. The major differences captured in the first component, which discriminates between the crystal structures and all homology models (Fig. 5a), are consistent with the differences seen between 2C5 and 2C9 as described above. The second component is discriminative between the homology models of CYP2C9_3D:A and CYP2C9_3D:B (see Table 2). The CPCA superweights plot shows that all probes are equally important except the DRY probe, which is less relevant. The most important discriminative features for the water probe are visualized in Fig. 6. The most interesting finding concerns Arg108, which is predicted to point into the active site in the CYP2C9_3D:B model and out of the active site in the CYP2C9_3D:A model. 
The two crystal structures available for human 2C9 (1og2 and 1R9O) show the same contradiction. In 1og2 the side chain of Arg108 is pointing out of the binding site cavity, whereas in 1R9O, this side chain is pointing into the active site and is highly involved in substrate binding, which has previously been suggested based on experimental results (Ridderstrom et al., 2001). It remains to be determined whether this is a reflection of protein flexibility or whether it is due to artifacts in structure determination. Notable also is that major deviations in the backbone, as seen in the F-G loop, do not necessarily relate to major differences in environment. Other regions that give rise to differences are the enclosing amino acids Gln214 and Asn474.
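Whether a side chain such as that of Arg108 points into or out of the active site can be expressed numerically. The sketch below is a hypothetical geometric heuristic, not a procedure used in this work: it compares the direction from the α-carbon to the side-chain tip with the direction from the α-carbon to a reference point such as the heme iron. The function name, the choice of atoms, and the 90° cutoff are all assumptions.

```python
import numpy as np

def points_toward(ca, tip, target, cutoff_deg=90.0):
    """Return (is_pointing_toward, angle_in_degrees).

    ca, tip, target: 3-element coordinate arrays for the alpha-carbon,
    the side-chain tip atom, and the reference point (e.g., heme iron).
    """
    side = np.asarray(tip, dtype=float) - np.asarray(ca, dtype=float)
    to_target = np.asarray(target, dtype=float) - np.asarray(ca, dtype=float)
    cos_angle = side @ to_target / (np.linalg.norm(side) * np.linalg.norm(to_target))
    # Clip to guard against rounding slightly outside [-1, 1].
    angle = np.degrees(np.arccos(np.clip(cos_angle, -1.0, 1.0)))
    return bool(angle < cutoff_deg), float(angle)
```

Applied to the same residue in two structures, an angle well below the cutoff in one and well above it in the other would correspond to the opposite orientations described above.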



FIG. 6. Selectivity between the homology models CYP2C9_3D:A_2 (gray) and CYP2C9_3D:B_3 (cyan) for the water probe. Regions of selectivity are marked by the amino acids responsible for the interactions. The Arg108 is highly interesting since it points in opposite directions in the discussed homology models. These two modes are consistent with the contradictory findings in the two crystal structures of human 2C9 available to date.

 

Molecular Dynamics Simulations. CYP2C9. Throughout the simulations, 200 snapshots were collected and the GRID interaction fields were calculated for the active site using exactly the same settings as in the case of the homology models and the crystal structures. A CPCA was then made of all snapshots. The snapshots were distributed in a horseshoe shape around the center of the first two components in the CPCA plot. To explore the changes that occur in the equilibration phase, the initial crystal structures were plotted in the space of the CYP2C9 dynamics simulations. The crystals were not well explained by the first component and only slightly better explained in the second component. This was found to be due to the unequal distribution of variance in the data, since it is dominated by the contribution of the 200 snapshots from the molecular dynamics. Therefore, the four most diverse snapshots from the molecular dynamics simulations were selected from the first two components of the CPCA. The snapshots are arranged in a consecutive order; i.e., the conformers move according to their time series numbers from one end of the horseshoe to the other. Conformers 1, 50, 150, and 200 were therefore selected for further analysis, together with the crystal structures and the AMBER-minimized crystal structures. This analysis showed that the minimization in AMBER did not induce fundamental changes as compared with the unsolvated crystal structures, and the structures were placed in the same quadrant as the initial crystal structures (1og2 and 1og5). The exception was the finally minimized structure in AMBER, which approached the first conformations of the simulations. Because the simulations should explore the conformational freedom of the protein, the hypothesis was that the crystal structure with and without bound substrate would be found within the conformational space covered by the molecular dynamics simulations. 
Although the starting structures were slightly different (see Materials and Methods), the two molecular dynamics simulations were expected to generate a similar plot in which at least part of the structures would mix in the CPCA space. The snapshots from the second molecular dynamics run were analyzed in exactly the same way as for the first run. The shape of the distribution was very similar to the first run: a horseshoe shape. On the basis of the distribution in the CPCA score plot, the same four conformations, 1, 50, 150, and 200, were chosen as representatives for the entire run. In the next step, the selected conformers from both simulations were analyzed together with the starting structures; the results are shown in Fig. 7a. The first component discriminated between the first MD run and the second MD run together with the starting structures. The second component discriminated between the second MD run and the starting structure. Each of the runs and the starting structures were distributed in one quadrant each and did not overlap at any simulation time.
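The observation that the crystal structures were poorly explained when plotted in the space of the 200 snapshots can be illustrated with an ordinary PCA projection. The sketch below is a simplified stand-in for the CPCA step and rests on assumptions: the function names are hypothetical, and each structure is represented as a plain descriptor vector rather than as GRID interaction fields.

```python
import numpy as np

def fit_pca(X, n_components=2):
    """Fit PCA on the rows of X; return (mean, loadings).

    loadings has shape (n_components, n_variables) with orthonormal rows.
    """
    mean = X.mean(axis=0)
    _, _, Vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, Vt[:n_components]

def project(X_new, mean, loadings):
    """Scores of new objects in a previously fitted component space."""
    return (X_new - mean) @ loadings.T

def explained_fraction(x, mean, loadings):
    """Fraction of one object's squared deviation from the mean
    that is captured by the fitted components."""
    d = x - mean
    total = d @ d
    if total == 0:
        return 1.0
    p = d @ loadings.T
    return float(p @ p / total)
```

An external structure whose deviation from the snapshot mean lies mostly outside the fitted components would receive a low `explained_fraction`, which is the situation described above when the model is dominated by the variance among the snapshots.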



FIG. 7. a, CPCA score plot of the four representative conformers (conf 1, 50, 150, and 200) of each of the molecular dynamics simulations (MD1 and MD2) of CYP2C9, together with original crystal structures and those minimized in explicit water. The finally minimized complex (crystal, explicit waters, and counter ions) in AMBER for the first run is denoted starting structure MD 1. b, the relation between structures in a and the homology models of CYP2C9.

 

To gain a structural understanding of the movements during the molecular dynamics simulations, a number of distances between the secondary elements were measured for each of the snapshots for the two separate runs and compared with the crystal structure (Table 4). The overall RMSD and the average range for all distances measured correspond well between the two runs, but the individual distances differ significantly. The most apparent change is that the nitrogen of the Lys72 in the first run moves over an 8-Å range compared with its original position, whereas it moves only 2.3 Å in the second run (see Discussion). With regard to the other movements, the same regions are flexible through both runs, but the internal correlations of movements differ. The main conclusion made from these MD runs is that the protein is highly flexible. The parts of the protein that have high B-factors in the crystal structure also show great flexibility in the dynamics (Fig. 4b). As a next step, the homology models were added to the analysis to find out how these efforts correlate to the molecular dynamics simulations. The homology models did not occupy the same regions in the CPCA score plot as the molecular dynamics simulations or the crystal structures. Inclusion of the homology models shows that they are even more diverse than the crystal structures compared with the molecular dynamics simulations (Fig. 7b). This is due to the great sequence similarity to the template, i.e., CYP2C5, which was shown in the previous analysis of the homology models versus crystal structures. In the final step of the analysis, CYP2C5 MD simulations, homology models, and crystal structures were added to the CPCA of CYP2C9 MD simulations, homology models, and crystal structures. The first two components describe 12% and 10% of the variance, respectively, and the distribution is visualized in Fig. 8. 
The homology models of CYP2C9 and CYP2C5 and the crystal structures of CYP2C5 and CYP2C9 are distinguished from the CYP2C5 and CYP2C9 molecular dynamics in the first component, which means that the isoforms mix in the first component. In the second component, the CYP2C5 MD run, the homology models of CYP2C9 and the crystal structure of CYP2C5 are separated from the crystal structures of CYP2C9 and the homology models of CYP2C5. From this plot it becomes obvious how similar the homology models are to their template molecules.



FIG. 8. CPCA score plot for the simultaneous analysis of the crystal structures, the homology models, and both molecular dynamics simulations for CYP2C9 and CYP2C5.

 


    Discussion
This work had two separate aims: to evaluate different modeling techniques and to make a detailed structural characterization of CYP2C9. To achieve these goals, the CPCA technique and distance measurements were used to explore the available crystal structures, the newly built homology models, and repeated molecular dynamics simulations.

It must be kept in mind throughout the analysis that the crystal structures of CYP2C9 are solved from a construct including seven amino acid substitutions based on CYP2C sequences to improve properties for crystallization (Williams et al., 2003). All of them are located in the F-G loop, which is found throughout the analysis to be of great importance for substrate recognition, binding, active-site volume, and probably membrane association. Only a wild-type CYP2C9 crystal structure will show the actual influence of these mutations. A wild-type crystal structure of CYP2C9 (1R9O) was released on June 15, 2004, but seven amino acids, Gly214-Ser220, of the F-G loop were not located in the experiment. Due to the recent release, the structure was not included in the calculations but was considered in the interpretation. Apart from that, the protein has been released from the membrane by cutting the N terminus, and this could also induce structural changes. The CYP2C5 structure also lacks the N-terminal part, and five amino acid positions have been changed for the corresponding residues in CYP2C3 (Williams et al., 2000). In the first crystal structure, without substrate bound, the F-G loop was not resolved and the loop coordinates were described on the basis of calculations. During the process of structural determination, errors can be introduced at several stages. Apart from pure experimental measurement errors, loops can be trapped in unphysiological conformations and errors can be introduced in the process of modeling atoms into the electron density. A crystallographic model should therefore be evaluated with a degree of uncertainty.

The CPCA is based on possibilities for interacting with different probes, and it is very useful to elucidate structure characteristics with regard to steric and electrostatic properties. It is thus highly dependent on how the possible sites of interactions, e.g., amino acid side chains, are oriented. In some cases these results will be biased by how the side chains were built into the electron density and the template for the refinement process. A difference between a carbon and an oxygen is not distinguishable in the electron density at this resolution due to the similar number of electrons.

The homology models are of good quality from a stereochemical point of view, where the percentage within the most favored regions in a Ramachandran plot is higher in the models than in the templates. Nevertheless, this work clearly shows that a model typically resembles the underlying template. The homology modeling algorithms are per se trained to base the models on similarities. In one case, the introduction of multiple templates improved the results significantly, and this approach is therefore recommended, although it requires careful manual adjustment of the alignment. Initially, lower gap penalties have to be introduced in the alignment to facilitate an influence from templates of lower similarity, but then gaps corrupting evolutionarily conserved regions must be removed manually. If several templates of high amino acid identity are present, great improvements can be expected, which can be the case in the near future when several, although far from all, isoforms have been crystallized. CYP2C8 or CYP2C19 will be interesting targets since they represent two templates of similar high sequence identity, like CYP2C9 and CYP2C5. The resulting data can then be assessed in a manner similar to that done here.

The reliability of homology models must be evaluated on the basis of the question asked. Substrates with a low selectivity and moderate affinity could probably be predicted successfully, whereas rational design of high-affinity and high-selectivity compounds is likely to have a much lower reliability since it depends on strongly corresponding complementarities. Despite the drawbacks presented here, the benefits of homology models are indisputable. The models, which are being revised over time as new experimental data emerge, give insight and understanding by the mutual forming and discarding of hypotheses as the models are refined. Apart from that, due to the limitations of the process of structural determination, even the most sophisticated crystal diffraction data will be afflicted with an experimental error, which makes the final result a model itself.

As a next step in the analysis, the molecular dynamics information was evaluated. The molecular dynamics simulations cover a different CPCA space from the crystal structures with and without substrate bound, independent of the different starting structures. Consequently, the simulations cannot be used to predict changes associated with substrate binding. This is not unexpected, since the driving force of an approaching substrate is not present in the simulations.

Nevertheless, the data can be used to produce mechanistically possible suggestions of where flexibility occurs and how it affects the surroundings. This was examined by measuring distances between backbone α-carbons of secondary structure elements to identify openings and closures that could correspond to access channels. These results show that there is great conformational rearrangement in the protein. The overall RMSD and range of movement are similar for both MD runs, but the most flexible parts differ between the runs. During the first run, major conformational freedom was seen in the opening between the B-C loop and the β1-1 sheet. This entrance is guarded by a lysine, Lys72, where the terminal nitrogen moves up to 8 Å during the simulations. It has been suggested that substrate recognition and access occur at the opening between the B-C loop and the β1-1 sheet to the active site and that Lys72 gates it (de Groot et al., 2002). This amino acid has previously been suggested as a possible discriminating switch for acidic versus basic ligands to explain the substrate selectivity between CYP2C9 and CYP2C19 that is dependent on these characteristics (Williams et al., 2003). This positively charged amino acid is exchanged for a negatively charged glutamate in CYP2C19. Despite the high sequence identity between CYP2C9 and CYP2C19 (~88%), a great deal of substrate specificity is seen (Jung et al., 1998). However, recent mutagenesis data show that Lys72 has little or no effect on the interaction with the polar compounds ibuprofen and diclofenac. These findings rule out the critical role of this amino acid in determining substrate specificity (Davies et al., 2004). In the second MD run, this region shows only minor flexibility, which emphasizes the difficulty of making structural assumptions on the basis of molecular dynamics in the case of the highly flexible CYP2C enzymes.
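The distance measurements described above reduce to a simple per-snapshot calculation. The following minimal sketch assumes the snapshots are already available as a coordinate array of shape (snapshots, atoms, 3) in ångströms; the function names and the array layout are hypothetical and are not taken from the AMBER workflow used here.

```python
import numpy as np

def distance_series(traj, i, j):
    """Distance between atoms i and j in every snapshot.

    traj: array of shape (n_snapshots, n_atoms, 3).
    Returns an array of n_snapshots distances in the input units.
    """
    return np.linalg.norm(traj[:, i] - traj[:, j], axis=1)

def distance_range(traj, i, j):
    """Range (max minus min) of the i-j distance over the run."""
    d = distance_series(traj, i, j)
    return float(d.max() - d.min())
```

Comparing `distance_range` for the same atom pair in two independent runs would show the kind of discrepancy reported for the Lys72 region, where one run samples a wide opening and the other does not.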

In both runs, conformational flexibility is seen between the B-C loop and the F-G loop, which opens a channel toward the N terminus, which also moves out, whereas the C terminus remains in place. In CYP2C5, an access channel is described between the B'- and the C-helices and helices G and I. This channel is closed upon substrate binding by hydrogen bonding of Lys241 with the backbone carbonyl of Val106 (Wester et al., 2003b). This opening is larger in the CYP2C9 crystal structure and increases during the simulation. A second solvent channel was seen in the CYP2C5 structure between the F- and I-helices. The distance between the conserved amino acids Glu300 (Glu297 in CYP2C5) of the I-helix and Glu206 of the F-helix was therefore measured, and the backbone moved considerably during both runs. The influence of the mutations in the F-G loop must be taken into consideration when these results are interpreted, especially since the positively charged Lys206 was mutated to the negatively charged amino acid Glu206. The association to the membrane is also likely to affect the conformational freedom in this region. The F-G loop may function as a lid to the active site and the hydrophobic outer part is likely to be attached to the membrane. Recognition and access of substrates or solvent seem possible in the region between the G'- and B'-helices, between the B-C loop and the β1-1 sheet, and between the F- and I-helices. The CYP2C5 structure has a smaller active site, where the F-G loop is positioned closer to the heme (Table 4), and it seems reasonable to believe that this is also a possible conformation for CYP2C9. The wider active site described by the CYP2C9 crystal structure could reflect the conformation in which this crystal has been captured. A half-opened structure could then also rationalize the position of warfarin as being a transition state. 
It seems reasonable, therefore, to believe that CYP2C9 and CYP2C5 are more similar than they seem from experiments and that differences could be a result of the experimental uncertainties introduced by low resolution and mutations.

It is our opinion that the results of molecular dynamics simulations have to be evaluated with care. At the moment, it is still difficult to simulate or validate a biological process such as substrate access and recognition. Simulations might give a picture of what is mechanistically possible for the protein, but the actual course of events also strongly depends on external factors such as the influence of an approaching substrate and the presence of the membrane.

Nevertheless, modeling attempts such as homology modeling and molecular dynamics simulations create hypotheses that can later be used to design experiments and analyze emerging experimental data.


    Footnotes
 
ABBREVIATIONS: P450, cytochrome P450; PDB, Protein Data Bank; CPCA, consensus principal component analysis; MIF, molecular interaction field; RMSD, root mean square deviation; 3D, three-dimensional; PCA, principal component analysis; MD, molecular dynamics.

Address correspondence to: Lovisa Afzelius, AstraZeneca R&D, Mölndal, S-431 83 Mölndal, Sweden. E-mail: Lovisa.afzelius{at}astrazeneca.com


    References


Afzelius L, Zamora I, Ridderstrom M, Andersson TB, Karlen A, and Masimirembwa CM (2001) Competitive CYP2C9 inhibitors: enzyme inhibition studies, protein homology modeling and three-dimensional quantitative structure-activity relationship analysis. Mol Pharmacol 59: 909-919.

Baroni M, Cruciani G, Costantino G, Riganelli D, Valigi R, and Clementi S (1993) Generating optimal linear PLS estimations (GOLPE): an advanced chemometric tool for handling 3D-QSAR problems. Quant Struct-Act Relat 12: 9-20.

Berendsen HJ, Postma JPM, van Gunsteren WF, DiNola A, and Haak JR (1984) Molecular dynamics with coupling to an external bath. J Chem Phys 81: 3684-3690.

Case DA, Pearlman DA, Caldwell JW, Cheatham TE III, Wang J, Ross WS, Simmerling CL, Darden TA, Merz KM, Stanton RV, et al. (2002) AMBER 7, University of California, San Francisco.

Cornell WD, Cieplak P, Bayly CI, Gould IR, Merz KM Jr, Ferguson DM, Spellmeyer DC, Fox T, Caldwell JW, and Kollman PA (1995) A second generation force field for the simulation of proteins, nucleic acids and organic molecules. J Am Chem Soc 117: 5179-5197.

Cupp-Vickery J, Anderson R, and Hatziris Z (2000) Crystal structures of ligand complexes of P450eryF exhibiting homotropic cooperativity. Proc Natl Acad Sci USA 97: 3050-3055.

Dai R, Pincus MR, and Friedman FK (2000) Molecular modeling of mammalian cytochrome P450. Cell Mol Life Sci 57: 487-499.

Davies C, Witham K, Scott JR, Pearson A, DeVoss JJ, Graham SE, and Gillam EMJ (2004) Assessment of arginine 97 and lysine 72 as determinants of substrate specificity in cytochrome P450 2C9 (CYP2C9). Drug Metab Dispos 32: 431-436.

de Groot MJ, Alex AA, and Jones BC (2002) Development of a combined protein and pharmacophore model for cytochrome P450 2C9. J Med Chem 45: 1983-1993.

Flanagan JU, McLaughlin LA, Paine MJ, Sutcliffe MJ, Roberts GC, and Wolf CR (2003) Role of conserved Asp293 of cytochrome P450 2C9 in substrate recognition and catalytic activity. Biochem J 370: 921-926.

Giammona DA (1984), Ph.D. thesis, University of California.

Goodford PJ (1985) A computational procedure for determining energetically favorable binding sites on biologically important macromolecules. J Med Chem 28: 849-857.

Graham-Lorence S and Peterson JA (1996) P450s: structural similarities and functional differences. FASEB J 10: 206-214.

Hasemann CA, Ravichandran KG, Peterson JA, and Deisenhofer J (1994) Crystal structure and refinement of cytochrome P450terp at 2.3 Å resolution. J Mol Biol 236: 1169-1185.

Jung F, Griffin KJ, Song W, Richardson TH, Yang M, and Johnson EF (1998) Identification of amino acid substitutions that confer a high affinity for sulfaphenazole binding and a high catalytic efficiency for warfarin metabolism to P450 2C19. Biochemistry 37: 16270-16279.

Kastenholz MA, Pastor M, Cruciani G, Haaksma EE, and Fox T (2000) GRID/CPCA: a new computational tool to design selective ligands. J Med Chem 43: 3033-3044.

Kirton SB, Baxter CA, and Sutcliffe MJ (2002) Comparative modelling of cytochromes P450. Adv Drug Delivery Rev 54: 385-406.

Laskowski RA, MacArthur MW, Moss DS, and Thornton JM (1993) PROCHECK: a program to check the stereochemical quality of protein structures. J Appl Crystallogr 26: 283-291.

Lewis DF (2000) Modeling human cytochromes P450 for evaluating drug metabolism: an update. Drug Metab Drug Interact 16: 307-324.

Lewis DF (2002) Homology modelling of human CYP2 family enzymes based on the CYP2C5 crystal structure. Xenobiotica 32: 305-323.

Lewis DF, Dickins M, Weaver RJ, Eddershaw PJ, Goldfarb PS, and Tarbit MH (1998) Molecular modelling of human CYP2C subfamily enzymes CYP2C9 and CYP2C19: rationalization of substrate specificity and site-directed mutagenesis experiments in the CYP2C subfamily. Xenobiotica 28: 235-268.

Melet A, Assrir N, Jean P, Pilar Lopez-Garcia M, Marques-Soares C, Jaouen M, Dansette PM, Sari MA, and Mansuy D (2003) Substrate selectivity of human cytochrome P450 2C9: importance of residues 476, 365 and 114 in recognition of diclofenac and sulfaphenazole and in mechanism-based inactivation by tienilic acid. Arch Biochem Biophys 409: 80-91.

Ortiz de Montellano PR and De Voss JJ (2002) Oxidizing species in the mechanism of cytochrome P450. Nat Prod Rep 19: 477-493.

Payne VA, Chang Y-T, and Loew GH (1999) Homology modeling and substrate binding study of human CYP2C9 enzyme. Proteins Struct Funct Genet 37: 176-190.

Peterson JA and Graham SE (1998) A close family resemblance: the importance of structure in understanding cytochromes P450. Structure 6: 1079-1085.

Poulos TL, Finzel BC, and Howard AJ (1986) Crystal structure of substrate-free Pseudomonas putida cytochrome P-450. Biochemistry 25: 5314-5322.

Ridderstrom M, Masimirembwa C, Trump-Kallmeyer S, Ahlefelt M, Otter C, and Andersson TB (2000) Arginines 97 and 108 in CYP2C9 are important determinants of the catalytic function. Biochem Biophys Res Commun 270: 983-987.

Ridderstrom M, Zamora I, Fjellstrom O, and Andersson TB (2001) Analysis of selective regions in the active sites of human cytochromes P450, 2C8, 2C9, 2C18 and 2C19 homology models using GRID/CPCA. J Med Chem 44: 4072-4081.

Ryckaert J-P, Ciccotti G, and Berendsen HJC (1977) Numerical integration of the cartesian equations of motion of a system with constraints: molecular dynamics of n-alkanes. J Comput Phys 23: 327-341.

Schlichting I, Berendzen J, Chu K, Stock AM, Maves SA, Benson DE, Sweet RM, Ringe D, Petsko GA, and Sligar SG (2000) The catalytic pathway of cytochrome p450cam at atomic resolution. Science (Wash DC) 287: 1615-1622.

Sevrioukova IF, Li H, Zhang H, Peterson JA, and Poulos TL (1999) Structure of a cytochrome P450-redox partner electron-transfer complex. Proc Natl Acad Sci USA 96: 1863-1868.

Szklarz GD and Halpert JR (1997) Use of homology modeling in conjunction with site-directed mutagenesis for analysis of structure-function relationships of mammalian cytochromes P450. Life Sci 61: 2507-2520.

Wester MR, Johnson EF, Marques-Soares C, Dansette PM, Mansuy D, and Stout CD (2003a) Structure of a substrate complex of mammalian cytochrome P450 2C5 at 2.3 Å resolution: evidence for multiple substrate binding modes. Biochemistry 42: 6370-6379.

Wester MR, Johnson EF, Marques-Soares C, Dijols S, Dansette PM, Mansuy D, and Stout CD (2003b) Structure of mammalian cytochrome P450 2C5 complexed with diclofenac at 2.1 Å resolution: evidence for an induced fit model of substrate binding. Biochemistry 42: 9335-9345.

Williams PA, Cosme J, Sridhar V, Johnson EF, and McRee DE (2000) Mammalian microsomal cytochrome P450 monooxygenase: structural adaptations for membrane binding and functional diversity. Mol Cell 5: 121-131.

Williams PA, Cosme J, Ward A, Angove HC, Matak Vinkovic D, and Jhoti H (2003) Crystal structure of human cytochrome P450 2C9 with bound warfarin. Nature (Lond) 424: 464-468.

Word J, Lovell SC, Richardson JS, and Richardson DC (1999) Asparagine and glutamine: using hydrogen atom contacts in the choice of side-chain amide orientation. J Mol Biol 285: 1733-1747.

Zam