Abstract
We present a model for volume of distribution at steady state (VDss) prediction, via fraction unbound in tissues, from the Øie–Tozer equation as an extension of our and other authors’ previous work. It is based on easily determined or computed physicochemical descriptors such as logD7.4 and fi (7.4) (cationic fraction ionized at pH 7.4) in addition to fraction unbound in plasma (fup). We had collected, as part of other work, an extensive dataset of VDss and fup values and used the descriptors above, gathered from the literature, for a preliminary assessment of the robustness of the method applied to 191 different compounds belonging to different charge classes and scaffolds. After this step, we addressed the use of easily computed physicochemical descriptors and experimentally derived fup on the same data set and compared the results between the two approaches and against the Øie–Tozer equation using in vivo data. This approach positions itself between fully computational models and scaling methods based on in vivo animal models or in vitro Kp (tissue:plasma) data utilizing model tissues. We consider it a useful and orthogonal complement to the two very diverse approaches mentioned, yet one requiring minimal in vitro experimental work. It offers a relatively inexpensive, rapid, intuitive, and simple way to predict VDss in humans at a relatively early stage of drug discovery.
SIGNIFICANCE STATEMENT This method allows the prediction of volume of distribution at steady state for small molecules in humans without the use of animal PK data because it utilizes only in vitro data. It is therefore amenable to use at early stages, simple, intuitive, animal-sparing, and quite accurate, and it may serve scaling efforts well. Furthermore, utilizing the same dataset, we show that the performance of a model using computed pKa and logD7.4, still using experimental fraction unbound in plasma, compares well with the model using experimentally derived values.
Introduction
Volume of distribution, in its various forms [e.g., volume of distribution at steady state (VDss), volume of distribution of beta phase (VDβ), volume of distribution of central compartment (VDc)], does not provide any insight into the mechanism of distribution but only a descriptive index of the propensity of a compound to partition away from the plasma compartment. It is an important determinant, together with clearance, of mean residence time or half-life, the latter using the volume of distribution of beta phase VDβ as the volume term. There is no good or bad volume of distribution, and its value may range from 0.04 l/kg (plasma volume) to several hundreds of l/kg. The total body water volume is generally taken to be 0.6 to 0.7 l/kg, and it may be considered as an upper physiologic limit, thus offering a threshold value for the definition of moderate or high volume of distribution. Lombardo et al. (2009) discussed these aspects in some detail and pointed out that there may be some dominant interactions that, in general, are governed to a large extent by physicochemical properties (Smith et al., 2015). This may explain the success in prediction upon assumption of a largely passive diffusion nature, despite hundreds or possibly thousands of specific and nonspecific drug:tissue interactions. One recognized phenomenon, which may contribute to the very large volumes of distribution generally observed for basic compounds (Lombardo et al., 2018), is lysosomal trapping, described by Daniel and Wójcikowski (1997) and mentioned as a possible contributor, for example, by Lombardo et al. (2002, 2004) and Sui et al. (2009).
The recent publication of a large dataset of human pharmacokinetics (PK) data (Lombardo et al., 2018) provided some impetus to revisit the prediction of VDss, using the Øie–Tozer equation (Øie and Tozer, 1979), to extract and then predict the fraction unbound in tissues (fut). The latter parameter, and the equation on which it is based, has been shown to be predictable with good results from relatively inexpensive measurements and/or computed descriptors (Lombardo et al., 2002, 2004; Sui et al., 2009). These approaches also offer access to VDss and its application to the prediction of human PK, which is discussed in more detail elsewhere (Lombardo et al., 2009, 2013).
There are several other methods to predict VDss. They range from scaling of animal VDss data (Obach et al., 1997; Ward and Smith, 2004; Fagerholm, 2007; Jones et al., 2011; Lombardo et al., 2013; Petersson et al., 2019), to the use of selected animal tissue as surrogate for human VDss predictions (Björkman, 2002), and to the use of physiologically-based pharmacokinetic (PBPK) or mechanistic approaches (Chan et al., 2018; Shimizu et al., 2019). Some authors have reported use of chromatographic indices determination from immobilized artificial membranes (Sui et al., 2009 for the prediction of fut). Other authors have coupled those chromatographic indices to binding affinity from immobilized human serum albumin on a chromatographic column for the direct prediction of VDss (Hollósy et al., 2006). In addition, the direct calculation of VDss, using computed descriptors from molecular structure without the use of any experimental parameter, has been reported, among others, by Ghafourian et al. (2006), Gleeson et al. (2006), Lombardo et al. (2006), Berellini et al. (2009), Gombar and Hall (2013), and Lombardo and Jing (2016).
In regard to the application of fut and the descriptors used to predict it, and differing from other authors (Hollósy et al., 2006; Sui et al., 2009), we prefer the use of a well-known (and easily computed) lipophilicity parameter, such as the logarithm of the distribution coefficient at pH 7.4, referred to as logD7.4 in the rest of this work. This physicochemical parameter is much more ingrained in the use and understanding of DMPK and Medicinal Chemistry scientists than chromatographic indices, which do not exactly reproduce logD7.4, although they may correlate with it and ultimately with the target property. We also used the experimental plasma-free fraction from several experimental methods, as opposed to either calculated values or values obtained through chromatographic affinity determination. We believe that even though determination of fraction unbound in plasma (fup) for highly bound compounds may suffer from uncertainty, it is highly preferable to logK values based on chromatographic affinity on albumin only. Along similar lines, whereas computational approaches (Gleeson, 2007) offer access to data from structure only, they do not seem to measure up to the accurate prediction level needed, especially in the case of low fup (high binding). Furthermore, in more recent years, methods and tools for higher throughput fup determination, such as the rapid equilibrium dialysis method (Waters et al., 2008), have become available in 96-well plates, are amenable to automation, and have become a mainstay in the pharmaceutical industry.
Materials and Methods
Human VDss and fup Dataset.
All VDss and fup data were taken from the recent trend analysis reported by Lombardo et al. (2018), referring to human intravenous data (VDss) with accompanying data for fup. In all cases, the most recently reported VDss and fup data were used for the modeling effort, as some changes had been made, in successive publications, on some of the data reported by Lombardo et al. (2004) for the 120 cationic and neutral compounds used in that work.
The steps toward data collections were extensively detailed in the cited paper (Lombardo et al., 2018) as well as in previous work (Obach et al., 2008). Briefly, those data were assembled from original papers, and a complete list of data, references, and comments can be found in the Supplemental Material for the respective publication, with the latter including all data from the former work. Some data were found in the literature directly as VDss, some VDss values were calculated using reported micro- or macroconstants, and some others after digitization of concentration versus time plots via noncompartmental analysis. The plasma protein–binding data reported in the cited work were taken from original references as well, and they do overall refer to multiple methods of determination, spanning across orders of magnitude. The full set of data, including logD7.4, fi (7.4), and pKa data, with full references, is provided as Supplemental Material (Supplemental Material 1: human VDss and fup dataset).
logD7.4 and pKa Data.
The experimental logD7.4 as well as pKa data for the calculation of the fraction ionized at pH 7.4 (fi (7.4)) were taken from the literature, for the initial set of 199 compounds. Overall, they were taken from different authors using different methods, but for basic and neutral compounds all logD7.4 values were taken from the work of Lombardo et al. (2004), and we refer to the pKa references reported therein. In that work, all logD7.4 data referred to their published ElogD7.4 method (Lombardo et al., 2001), and that offers a measure of consistency. The data for the other compounds (acidic and zwitterionic) were taken from literature, and they are all provided in the Supplemental Material together with the appropriate references (Supplemental Material 1: human VDss and fup dataset). We were able to initially gather experimental logD7.4 and pKa data for 199 compounds, adding 79 acidic and zwitterionic compounds to the 120 basic and neutral compounds taken from literature (Lombardo et al., 2004). Eight of these compounds were excluded because of a calculated negative fut value, which cannot be transformed into a logarithmic value. We kept all other compounds in the preliminary model with 191 compounds and then also built models in turn, excluding the following: 1) the upper outliers (13 compounds with fut > 1) on a dataset of 178 remaining compounds, and 2) the 15 compounds with fup < 0.01, on a dataset of 176 compounds. For all compounds, the total anionic and cationic fractions were calculated using the sum of the contributions of each ionized species, treated independently. One quaternary ammonium compound (cephaloridine) was treated as a cation utilizing a high pKa to ensure the generation of a highly positive fi (7.4).
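The calculation of the ionized fractions described above can be sketched as follows. This is a minimal illustration, assuming the standard Henderson–Hasselbalch relationship for each ionizable site and a simple sum over sites treated independently; the function names are hypothetical, and the exact procedure used for the dataset may differ in detail.

```python
def cationic_fraction(base_pkas, ph=7.4):
    """Sum of per-site protonated (cationic) fractions for basic groups.

    Each site is treated independently (assumption), so the total for a
    compound with several basic sites can exceed 1.
    """
    return sum(1.0 / (1.0 + 10.0 ** (ph - pka)) for pka in base_pkas)


def anionic_fraction(acid_pkas, ph=7.4):
    """Sum of per-site deprotonated (anionic) fractions for acidic groups."""
    return sum(1.0 / (1.0 + 10.0 ** (pka - ph)) for pka in acid_pkas)


if __name__ == "__main__":
    # Illustrative pKa values only, not taken from the dataset.
    print(round(cationic_fraction([9.4]), 4))   # base two units above pH 7.4
    print(round(anionic_fraction([4.4]), 4))    # acid three units below pH 7.4
    print(cationic_fraction([]))                # neutral compound: no basic site
```

A permanently charged group, such as a quaternary ammonium, can be handled in this scheme by assigning it a very high pKa so that its cationic fraction is essentially 1.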
Computed logD7.4 as well as pKa data were calculated using MoKa (v. 3.2.1; Molecular Discovery) to explore its use as in Lombardo et al. (2002), but limited to the present data set of 191 compounds to have a direct comparison with the same data.
Calculation of fut and VDss from Human Data.
The calculation of fut was performed from human VDss and fup data, using a rearranged version of the Øie–Tozer equation (Øie and Tozer, 1979) and solving for fut. The classic equation was then used to recalculate VDss from the predicted fut values. The two equations are shown below in the order described.

$$f_{ut} = \frac{V_R \, f_{up}}{VD_{ss} - V_P\,(1 + R_{E/I}) - f_{up}\,V_P\left(\frac{V_E}{V_P} - R_{E/I}\right)}$$

$$VD_{ss} = V_P\,(1 + R_{E/I}) + f_{up}\,V_P\left(\frac{V_E}{V_P} - R_{E/I}\right) + V_R\,\frac{f_{up}}{f_{ut}}$$

In these equations, fup and fut have the usual meaning of fraction unbound in plasma and fraction unbound in tissues, respectively. The term RE/I refers to the ratio of extravascular to intravascular proteins, but it accounts for albumin only, and it takes a value of 1.4. VP, VE, and VR take the values, respectively, of 0.0436, 0.151, and 0.380 l/kg, and they are defined, respectively, as the plasma volume, the extracellular fluid volume, and the physical volume in which the drug distributes minus the extracellular space (VR, remainder volume).
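As an illustration of the two calculations, the short sketch below codes the Øie–Tozer relationship in its commonly written form, with the human constants quoted in the text. The function and variable names are ours and purely illustrative.

```python
# Human constants quoted in the text (volumes in l/kg).
V_P = 0.0436   # plasma volume
V_E = 0.151    # extracellular fluid volume
V_R = 0.380    # remainder volume
R_EI = 1.4     # extravascular/intravascular protein ratio (albumin only)


def vdss_from_fut(fup, fut):
    """Classic form: VDss from fraction unbound in plasma and in tissues."""
    return V_P * (1 + R_EI) + fup * V_P * (V_E / V_P - R_EI) + V_R * fup / fut


def fut_from_vdss(vdss, fup):
    """Rearranged form: back-calculate fut from observed VDss and fup.

    The result can be negative or above 1 for some (VDss, fup) pairs.
    """
    denominator = vdss - V_P * (1 + R_EI) - fup * V_P * (V_E / V_P - R_EI)
    return V_R * fup / denominator


if __name__ == "__main__":
    # Illustrative inputs only, not taken from the dataset.
    vdss = vdss_from_fut(fup=0.1, fut=0.05)
    print(round(vdss, 3), round(fut_from_vdss(vdss, 0.1), 3))
```

The two functions are exact inverses of each other for a fixed fup, which is a convenient check when the constants are changed, for example for another species.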
Generation and Assessment of Predictive Performance of the Models.
We have used, as in past work from our and other authors’ modeling efforts, several statistics based on geometric mean fold error on both fut and VDss, utilizing training and test sets. As in previous work by us and other authors, we used a rugged leave-class-out (LCO) approach and the percentage below 2- and 3-fold error of predicted versus observed values. Training and test set data are reported. All models were built using the multiple linear regression and other statistical, filter, reader, and writer nodes as available in Knime (v.3.4.2; Knime, Konstanz, Germany).
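The fold-error statistics mentioned above can be sketched as follows, assuming the commonly used definition of the geometric mean fold error as 10 raised to the mean absolute log10 ratio of predicted to observed values. The function names are illustrative, and the strict "below" threshold is our reading of the text.

```python
import math


def fold_error(observed, predicted):
    """Fold error of one prediction; always >= 1, symmetric for over/underprediction."""
    return 10 ** abs(math.log10(predicted / observed))


def gmfe(observed, predicted):
    """Geometric mean fold error over paired observed/predicted values."""
    logs = [abs(math.log10(p / o)) for o, p in zip(observed, predicted)]
    return 10 ** (sum(logs) / len(logs))


def pct_within(observed, predicted, fold):
    """Percentage of predictions with fold error strictly below `fold`."""
    hits = sum(1 for o, p in zip(observed, predicted) if fold_error(o, p) < fold)
    return 100.0 * hits / len(observed)


if __name__ == "__main__":
    # Toy numbers for illustration only.
    obs = [1.0, 1.0, 1.0, 1.0]
    pred = [1.5, 2.5, 4.0, 0.9]
    print(gmfe(obs, pred), pct_within(obs, pred, 2), pct_within(obs, pred, 3))
```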
We reported, as in the past (Lombardo et al., 2004; Lombardo and Jing, 2016), and as adopted by other authors (Sui et al., 2009), the performance of the LCO approach in which each model is built without a class of close analogs (e.g., nonsteroidal anti-inflammatory drugs or benzodiazepines), and then the model is tested against the prediction of that class.
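The splitting logic of the LCO approach can be sketched as below. This is only an outline of the data partitioning, assuming each compound carries an optional class label and that only classes with at least 10 members are tested; the function name and the label encoding (None for unclassified compounds) are hypothetical. The regression itself was run in Knime and is not reproduced here.

```python
from collections import Counter


def leave_class_out_splits(class_labels, min_class_size=10):
    """Yield (class, train_indices, test_indices) for each sufficiently large class.

    All members of the held-out class form the test set; every other
    compound, classified or not, stays in the training set.
    """
    counts = Counter(label for label in class_labels if label is not None)
    eligible = sorted(c for c, n in counts.items() if n >= min_class_size)
    for cls in eligible:
        test_idx = [i for i, label in enumerate(class_labels) if label == cls]
        train_idx = [i for i, label in enumerate(class_labels) if label != cls]
        yield cls, train_idx, test_idx


if __name__ == "__main__":
    # Toy labels for illustration only.
    labels = ["steroid"] * 10 + ["other"] * 3 + [None] * 5
    for cls, train, test in leave_class_out_splits(labels):
        print(cls, len(train), len(test))
```

For each split, a model would be fitted on the training indices and its fold errors computed on the held-out class only.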
Results
Characteristics of the Pharmacokinetic and Physicochemical Values.
The data for an overall set of 199 compounds, with VDss and fup available, and for which we were able to find logD7.4 and pKa literature values, were all taken from the work of Lombardo et al. (2018), which covers a very broad property and structural space. The description of criteria for data collection is briefly offered in Materials and Methods and, more extensively, in the work by Obach et al. (2008) and Lombardo et al. (2018). The compounds in the present data set range from a VDss of 0.04 (suprofen) to a VDss of 60 (amiodarone) l/kg, and from a fup of 0.0002 (amiodarone) to a fup of 0.97 (gabapentin). We acknowledge the heterogeneity of the data sources and the possible errors present in them, especially for fup, for which different techniques have been reported in the literature, whereas, for example, all neutral and basic compounds had logD7.4 values derived from one source, as in Lombardo et al. (2004). Structural–therapeutic classes were also identified for further analysis as reported in previous work, and no class was considered unless it comprised at least 10 analogs.
Model Building.
Models 1 and 2 (Model 1 shown in eq. 1 below) were generated using the available experimental data on 199 compounds, the first including all compounds except the eight with fut < 0 (dataset of 191) and the second excluding, in addition, the 13 compounds with fut > 1 (dataset of 178). They were built as preliminary models based on experimental logD7.4 as well as fup and cationic fi (7.4) from experimental pKa, to assess their predictive performance using several statistics reported in Tables 1 and 2.

$$\log f_{ut} = a - 0.249\,\log D_{7.4} - 0.999\,f_{i(7.4)} + 0.735\,\log f_{up} \qquad (1)$$

In eq. 1, a is the regression intercept, and the three coefficients are those of Model 1.
The sign of the parameters, as observed in previous work (Lombardo et al., 2004; Sui et al., 2009), is intuitively what should be expected because a very high fraction ionized (for a base) should be detrimental to free fraction in tissues, the electrostatic interactions with membrane phospholipids being a very significant determinant of a compound behavior. Lipophilicity should show the same sign, although the value of the coefficient, not being scaled and ranging at least an order of magnitude (about 13 logD7.4 units), shows a lower value. Plasma protein binding, conversely, would limit access to tissues and membranes (although there is albumin in addition to many other proteins in tissues), and its coefficient is indeed positive. Tables 1 and 2 show the coefficients and relevant statistics confirming the extremely high relevance of all three parameters. They also show that there was not much difference whether the models were built with or without the inclusion of fut values above 1. Table 3 shows the performance of the models in the prediction of fut and back calculation of VDss from predicted values, and it reports the statistics on Model 3, which was built using the 176 compounds with fup values at or above 0.01.
We did explore the use of fraction ionized for anionic groups (whether anions or zwitterions; data not shown) as a separate term, but we did not find it to be significant in the initial models, with experimental lipophilicity and pKa data. These results suggest that the anionic charge fraction is not a needed descriptor, at least for our data set, and we did not pursue its application any further. Sui et al. (2009), in contrast, included both charges (only the cationic charge for zwitterions) in the single charge descriptor they used.
The coefficient reported by Sui et al. (2009) for the logarithm of the capacity factor from immobilized artificial membrane columns is closer to our logD7.4 coefficient (−0.3199 vs. −0.249, respectively, Model 1) than are the other two coefficients, which were reported to be smaller (taken as absolute values) than ours, with 0.4699 versus 0.735 for logfup and −0.4069 versus −0.999 for fi (7.4), respectively. The intercept (error) is not significant in our Model 1 (P > |t| = 0.333), whereas it was significant in the model we explored using both charge types as independent descriptors.
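To make the role of the three descriptors concrete, the sketch below applies the Model 1 coefficients quoted in this paragraph to a single compound. It is an illustration only: the intercept is not quoted in the text, so it is left as a required argument (a hypothetical placeholder that must be taken from the model statistics), and the function names are ours.

```python
import math

# Model 1 coefficients as quoted in the text.
COEF_LOGD = -0.249     # logD7.4
COEF_FI = -0.999       # cationic fraction ionized at pH 7.4
COEF_LOGFUP = 0.735    # log10 of fraction unbound in plasma


def predict_log_fut(logd74, fi74, fup, intercept):
    """Linear model for log10(fut); `intercept` must be supplied by the user."""
    return (intercept
            + COEF_LOGD * logd74
            + COEF_FI * fi74
            + COEF_LOGFUP * math.log10(fup))


def predict_fut(logd74, fi74, fup, intercept):
    """Back-transform to fut; the result is always positive by construction."""
    return 10 ** predict_log_fut(logd74, fi74, fup, intercept)


if __name__ == "__main__":
    # Illustrative descriptor values and a placeholder intercept of 0.0.
    print(round(predict_log_fut(2.0, 1.0, 0.1, intercept=0.0), 3))
```

Because the model is fitted on log fut, the predicted fut can never be negative, whereas values above 1 remain possible; the predicted fut would then be passed to the classic Øie–Tozer equation to obtain VDss.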
As a test, we calculated the fut and VDss for the eight compounds we excluded because of a negative fut value, using Model 1. The statistics are in Table 4, and we note in this work that they are all anionic compounds with very small VDss values (in some cases confined to blood or plasma, with VDss of 0.08 and 0.04 l/kg, respectively), and the overall geometric mean fold error (GMFE) on the test set is 3.04. This is due to significant outliers (e.g., glyburide with a VDss fold error (FE) value of 6.26) that weigh heavily in a small test set. The entire set of compounds with observed and predicted values is reported (Supplemental Table 1). The overall outcome of this test, however, was unsatisfactory.
As a second step, we calculated the predicted fut and VDss values for the 13 compounds with fut > 1 utilizing Model 2, which was built with their exclusion. The full results are shown in the Supplemental Material (Supplemental Table 2), and the GMFE for VDss on the test set was 1.95, whereas the bias (observed-predicted) was found to be −0.23, as reported in Table 4. We note that all compounds in this set have experimental VDss values <0.5, and that the prediction does a reasonably good job in keeping the GMFE of VDss prediction just below 2-fold, but with a much larger GMFE for fut at 5.6. The two largest values of fold error were found for cephradine to be 3.64 and enalaprilat 3.13 (Supplemental Table 2).
We attempted to remove all compounds with a fup < 0.01, based on recent guidance from the Food and Drug Administration for in vitro drug-drug interaction (DDI) studies (https://www.fda.gov/regulatory-information/search-fda-guidance-documents/vitro-metabolism-and-transporter-mediated-drug-drug-interaction-studies-guidance-industry) out of concern about the accuracy of such measurements. Also, we considered the simulations reported by Waters and Lombardo (2010) regarding the sensitivity of fut on fup and RE/I, especially when looking at compounds with fup < 0.1. Lombardo et al. (2002) had also explored, on a small test set of 14 proprietary compounds, the exclusion of compounds with fup < 0.02, and reported significantly improved mean FE values on the prediction, although their usable set was reduced, in some instances, to six compounds, when compounds with such low fup were excluded. When we performed a similar test, excluding the very highly bound compounds from the 191 compounds dataset used for Model 1, and recast the model, now termed Model 3, the latter yielded a reasonably good result. The GMFE for fut and VDss on the training set for the 176 compounds model were 2.10 and 1.73, respectively, as shown in Table 3. These values are almost identical to the values for Model 1. The test set (Supplemental Table 3), represented by the 15 compounds with fup < 0.01, yielded a GMFE of 2.20 for VDss, as shown in Table 4, which is a bit higher than the GMFE for the prediction of compounds with fut > 1 (Table 4).
We also performed what we consider a very rugged test, the LCO, which we and other authors have used in several examples of predictive work (Lombardo et al., 2004; Sui et al., 2009; Lombardo and Jing, 2016). In this approach, all members of a class of analogs (at least 10 for each class) are removed, and the model is built without them. Then each model is used to predict the class of analogs not used in deriving it. The results are shown in Table 5, and the overall GMFE was a very good 1.69, with 68% and 89% of compounds predicted within 2- and 3-fold, respectively.
In addition, we performed a test utilizing 22 of the 60 compounds, which overlapped with the set used by Lombardo et al. (2013) in their scaling work utilizing the Øie–Tozer method based on all three species, as in model V7 in that work. Two compounds were then excluded in keeping with their approach of using only compounds with in vivo 0 < fut < 1, and we recalculated the GMFE for those 20 compounds, from the available Supplemental Material using model V7, obtaining a value of 1.44. The GMFE calculated from Model 1–predicted VDss (training set) yielded a value of 1.36. The full set of data is reported in the Supplemental Material (Supplemental Table 4). Most recently, Petersson et al. (2019) have revisited and discussed the use of fut (from the average of three species) as a predictor of VDss in humans with and without the elimination of aberrant fut values. In analogy with our conclusions, they recommended it as the most accurate method, at least at later stages, when data in rat, dog, and monkey are available.
Armed with these results, generated using experimentally determined logD7.4, pKa, and fup values, we set out to explore the use of computed logD7.4 and pKa values, using MoKa, for the same 191 compounds we used to develop Model 1 (eq. 1). Model 1c was built, and its statistics (on the 191 compounds of the training set) are reported in Table 6. We note that the coefficients of the equation are very similar to the ones in Model 1 (eq. 1) and that the observed GMFE values for fut (2.36) and VDss (1.86) for the same training set of 191 compounds are only slightly higher than the values reported in Table 3 for Model 1. In addition, the model shows a greatly increased accuracy with respect to the data reported by Lombardo et al. (2002), which were based on a significantly smaller dataset (64 compounds) comprising only basic and neutral compounds. Both outcomes, however, were obtained after recalculation of training set values, and all 64 compounds were included in the 191-compound set.
An LCO approach utilizing computed logD7.4 and pKa values was also tested, and the results are in Table 7. Overall, the performance is like the one observed for Model 1 (Table 5), even though there are some noticeable differences between models. For example, β-lactams perform better with the former (all in vitro data), and benzodiazepines perform better with the latter model (computed logD7.4 and pKa). Similarly, the use of computed descriptors (Supplemental Table 4) did not seem to worsen the performance, and the overall GMFE for the 20 compounds mentioned earlier (used for the in vivo methods comparison) is 1.44, which is essentially the same as the value obtained from Model 1 and identical to the recalculated value using in vivo literature data (Lombardo et al., 2013, model V7 in that work). It is recognized, however, that the test set is relatively small.
We also examined the performance of Model 1 across the ranges of predicted fut values and the four charge classes, and the results for the latter are shown in Table 8. We note that the performance (recalculated values from Model 1) is not highly variable by the charge class, and indeed anions, the class with generally low VDss, and zwitterions are predicted very well. Thus, the homogeneity of prediction is generally preserved across charge classes. These observations are generally confirmed graphically by the plots in Fig. 1 (compounds shaded by fut ranges) and Fig. 2. In Fig. 1, we show the observed versus predicted VDss value, and we note that there is some variation (generally underprediction) at higher rather than lower VDss (and predicted fut) values. In Fig. 2, we show the same compounds colored by their charge class, and the red dashed vertical line is set at 0.7 l/kg or total body water, on the x-axis (predicted VDss values). There are 65 compounds with predicted VDss < 0.7 l/kg, and the GMFE is 1.59 with 75% and 91% of compounds below 2- and 3-fold error, respectively. The blue line, instead, identifies an (arbitrary) upper limit of 2.8 l/kg approximately equal to four times the total body water, with 52 compounds above that threshold. In this range, the GMFE is 1.96, and the corresponding fold value thresholds are 58% and 81%.
Discussion
We start our discussion from the exploration of the anionic fraction, and we note that Sui et al. (2009) did not report the use of a specific anionic fraction term. They used only one fi (7.4) term in their equations, treating the zwitterionic compounds (six in the training and one in the test set) as cations (with a positive sign of the values), and the anions as such with a negative sign for the latter fi (7.4) values. They also used a chromatographic index and a smaller data set (121 compounds), with a somewhat lower range of VDss, which may have influenced the significance of the charge state, and the overall magnitude of coefficients. Our coefficients for logfup and fi (7.4) are in fact significantly different from theirs (see Results). We did try, as mentioned in Results, the incorporation of a separate term for anionic charge fraction, but we did not find it necessary. In addition, the coefficients of Model 1 are very close to the coefficients reported by Lombardo et al. (2004) for 120 neutral and basic compounds only (set entirely contained within the 191 compounds used). In that work, the authors reported values of −0.2294 (ElogD7.4), 0.8885 for logfup, and −0.9311 for fi (7.4). This observation suggests that the fraction ionized for anionic groups may not be strongly correlated with fut, even after the inclusion of a sizable number of anionic and zwitterionic molecules. It is possible that the fup and logD7.4 terms for anionic compounds, considering their higher propensity toward protein binding (largely but not exclusively to albumin), may be able to explain the smaller variance in fut (and VDss) for these compounds. At any rate, as we did not find the anionic fi (7.4) to be necessary, and, at least within the domain of physicochemical properties, range of VDss values, and structural features expressed by our dataset, there would be no need to determine it experimentally for acidic compounds.
As described in Results, we tried to predict the VDss of the eight compounds with a negative fut value that we had set aside from the overall set of 191 compounds. The results, with statistics shown in Table 4 and the full set in Supplemental Table 1, yielded a relatively poor performance, with only one compound (naproxen) predicted at an FE of 2 and all others above, for an overall GMFE of 3.04. This set is, of course, a very harsh test, as may be expected, and the model cannot effectively compensate for the negative values obtained through the rearranged Øie–Tozer equation, its basic assumption being passive diffusion. A poor performance, in our experience, is sometimes observed when data from animal studies with a back-calculated fut < 0 and fut > 1 are used, as species to species differences seem to matter significantly. That is the basis of the selection of the 38 compounds by Lombardo et al. (2013), all having 0 < fut < 1. The prediction using the present model(s) will always generate positive fut values and thus will offer no potential warning, as may be the case with methods using animal data. Conversely, very recent results such as those reported by Petersson et al. (2019) seem to indicate that, even with fut < 0 and fut > 1 values, a good overall prediction can be generated. Single compounds may have to be examined, though, via the generation of more data at later stages. This will involve the use of animal data and much more detailed studies (e.g., transporters), which are much more expensive and involved and are reserved for late(r)-stage candidates.
We then turned our attention to the calculation of the fut values for the compounds having fut > 1. Such values, fut being a fraction, are also considered an aberrant product of the rearranged Øie–Tozer equation. It may be reasonable to expect in general a predicted fut (much) smaller than a value calculated from the Øie–Tozer equation, being that these compounds were excluded a priori. Nevertheless, we obtained some predicted fut values >1, and, in most cases, they yielded reasonably close and acceptable predictions of VDss. This may lend support to the fact that fut > 1 values may still generate good VDss predictions. That is, a seemingly aberrant predicted fut, which may caution against its use, may in fact be usable in the prediction. We caution, though, that it is difficult to judge the validity of such results, in the absence of corroborating Supplemental Material. At any rate, the predictive GMFE from Table 4 was 1.95, whereas the results for the full dataset are also reported (Supplemental Table 2).
The argument based on transporters, as a possible explanation for either type of aberrant results, i.e., fut < 0 and fut > 1, offered by Waters and Lombardo (2010), may be of difficult application for prospective predictions. This may be the case even if transporters data and/or observation from in vivo PK in animals were available for the compounds being examined. Grover and Benet (2009) showed that transporters could be important, especially at the organ level and primarily in liver and kidney, and they can influence VDss, but their effect is generally limited to 2-fold and varies greatly from species to species. Furthermore, the impact of transporters, as it may be intuitively understood, is different depending on whether they are efflux or uptake ones and depending on the type of volume of distribution considered. Smith et al. (2015) more recently reiterated the fact that transporters do not seem to be major determinants of volume of distribution, even though there are notable exceptions. The latter authors note that charge (first and foremost) and then lipophilicity are the primary determinants of volume of distribution.
Thus, it may be more likely that the empirical nature of the Øie–Tozer equation, coupled with the choice of fixed (RE/I) or species-dependent terms, plus the uncertainties associated with the determination of fup especially when very low, are the causes of 1 < fut and fut < 0 values. Waters and Lombardo (2010) showed the impact of the RE/I term, by simulating fut back-calculation response when varying its value, for compounds with relatively high and relatively low fup. In the context of the present work, the test set is too small to allow a definitive decision, as only a few (4/13) predicted fut > 1 values were observed (Supplemental Table 2). We add that we generally have found the use of animal fut < 0 and fut > 1 values detrimental to a good performance toward predicting human VDss.
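The dependence of aberrant back-calculated fut values on RE/I and on low VDss can be illustrated with a short sketch, written under the assumption of the commonly stated form of the rearranged Øie–Tozer equation and the human constants given in Materials and Methods. The function names and the example inputs are hypothetical and are not taken from the dataset.

```python
V_P, V_E, V_R = 0.0436, 0.151, 0.380   # l/kg, values quoted in the text


def back_calculated_fut(vdss, fup, r_ei=1.4):
    """Rearranged Øie–Tozer equation with an adjustable RE/I term."""
    denominator = vdss - V_P * (1 + r_ei) - fup * (V_E - r_ei * V_P)
    return V_R * fup / denominator


def vdss_floor(fup, r_ei=1.4):
    """VDss below which the back-calculated fut turns negative."""
    return V_P * (1 + r_ei) + fup * (V_E - r_ei * V_P)


def classify(vdss, fup, r_ei=1.4):
    """Label the back-calculated fut as aberrant or within the 0-1 range."""
    if vdss <= vdss_floor(fup, r_ei):
        return "fut < 0"
    return "fut > 1" if back_calculated_fut(vdss, fup, r_ei) > 1 else "0 < fut <= 1"


if __name__ == "__main__":
    print(round(vdss_floor(0.0), 5))        # floor for a fully bound compound
    print(classify(0.08, 0.01))             # low VDss, highly bound
    print(classify(0.08, 0.01, r_ei=0.5))   # same compound, smaller RE/I
    print(classify(0.3, 0.9))               # low VDss, largely unbound
```

With RE/I fixed at 1.4, any compound with a VDss below roughly 0.105 l/kg necessarily gives a negative fut regardless of fup, whereas a smaller RE/I can bring the same compound back into the 0 to 1 range. This is consistent with the empirical, rather than transporter-based, explanation discussed above.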
As a third approach, we examined the prediction of compounds with fup < 0.01 in part based on the findings of Waters and Lombardo (2010) on the sensitivity toward RE/I of back-calculated fut for those compounds, and in part based on Food and Drug Administration guidelines for in vitro DDI studies (https://www.fda.gov/regulatory-information/search-fda-guidance-documents/vitro-metabolism-and-transporter-mediated-drug-drug-interaction-studies-guidance-industry). Model 3 did show similar performance on recalculated values for the training set as Model 1 did. Its predictive GMFE based on the excluded (test) compounds only yielded a value of 2.20, whereas, for example, the prediction of compounds with fut > 1 yielded a GMFE of 1.95 (Table 4, full set in Supplemental Table 3). Also, the prediction yielded a very respectable percentage of <2-fold value of 73%, but identical to the percentage of <3-fold. That is, no compound was predicted between 2- and 3-fold, and, outside the narrower limit, a larger error was observed. This may suggest caution in predicting VDss values for compounds with fup values (known a priori as required by the model) below 0.01.
The next step was the prediction of the compound classes using the LCO approach, as described in Results. In general, the results show a very good performance across many classes, with the tricyclic antidepressants being the only class with GMFE > 2, which may be due to the general difficulty of predicting very high volumes of distribution. Excellent results were obtained for steroids, adrenergics, and fluoroquinolones. The nonsteroidal anti-inflammatory drugs, which generally have relatively low volumes, are overall well predicted (GMFE 1.88), but with an inferior performance for the 2-fold range and a respectable 83% within 3-fold. We also tested the model against some of the in vivo data reported by Lombardo et al. (2013), utilizing the 20 overlapping compounds with 0 < fut < 1. The animals are naive to each of the compounds administered, but the statistical and scaling methods are training set dependent. Thus, the approach is a fair comparison with other scaling methods. We obtained a GMFE of 1.36 versus a GMFE of 1.44 for the in vivo prediction using three animal species. We point out, at any rate, that cost and ethical considerations, as well as time and amount of available material, weigh heavily in these comparisons, and they are clearly in favor of in silico and in vitro methods. Furthermore, even outside the use of computed descriptors, we note that when scaling of human PK prediction is needed (generally at later discovery stages), all experimental data needed should have been long generated for those and even earlier analogs. This approach positions itself as an orthogonal and inexpensive one between in silico methods (based on structure only) and methods such as Øie–Tozer and PBPK utilizing extended in vivo data.
Along the same lines discussed above and illustrated in Results, we performed an LCO test using computed logD7.4 and fi (7.4) and systematic removal of all analogs of the class to be predicted. The results (Table 7) are comparable to those obtained with the model based on experimental values (Table 5), and whereas there is some decrease in accuracy for β-lactams and tricyclic antidepressants, there is an improvement for benzodiazepines. Furthermore, we performed a test using the same 20 compounds described above with in vivo data in three species. We found, albeit within the limit of the small set and considering that these values are recalculated from Model 1c, that the performance of the computed-descriptor model (GMFE 1.44) is on par with that of the in vivo prediction (GMFE 1.44) and of the in vitro Model 1 (GMFE 1.36). The data for the full set are available in Supplemental Table 4.
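The leave-class-out procedure, in which every analog of a class is removed before the model is refitted and the class is then predicted, can be sketched as follows. This is only an outline under stated assumptions: the `fit` and `predict` callables, the record layout, and the toy mean-value model are hypothetical stand-ins and do not reproduce the regression used in this work.

```python
from collections import defaultdict

def leave_class_out(records, fit, predict):
    """For each class, fit on all other classes and predict the held-out class.

    records: list of (class_label, features, target) tuples.
    fit(train_records) -> model; predict(model, features) -> prediction.
    Returns {class_label: [(target, prediction), ...]}.
    """
    results = defaultdict(list)
    classes = {label for label, _, _ in records}
    for held_out in sorted(classes):
        # Remove every analog of the held-out class before fitting.
        train = [r for r in records if r[0] != held_out]
        model = fit(train)
        for label, features, target in records:
            if label == held_out:
                results[held_out].append((target, predict(model, features)))
    return dict(results)

# Toy stand-in model (hypothetical): predicts the mean target of the training set.
def fit_mean(train):
    return sum(target for _, _, target in train) / len(train)

def predict_mean(model, features):
    return model

# Toy records with made-up targets; features are unused by the stand-in model.
toy_records = [
    ("steroids", None, 1.0),
    ("steroids", None, 3.0),
    ("fluoroquinolones", None, 2.0),
    ("benzodiazepines", None, 4.0),
]
lco_results = leave_class_out(toy_records, fit_mean, predict_mean)
```

The point of the design is that no member of the held-out class contributes to the fit, so each class-level statistic reflects extrapolation to an unseen scaffold rather than interpolation among analogs.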
Clearly, the accuracy of the computational method used for logD7.4 and pKa calculation is of paramount importance, as its performance may vary from class to class and across a range of structures for which the method may or may not have been well parameterized. It is advisable to test the prediction with a few probe compounds having the same scaffold as the compounds of interest. Several available computational methods, whether commercial or in-house, are amenable to training with proprietary compound data, and that is an improvement that should be taken advantage of in the prediction of pKa and logD7.4 for any application. At any rate, good-quality fup values will still be needed if the Øie–Tozer approach is used.
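The dependence on fup can be made concrete with a short sketch of the Øie–Tozer equation in its usual form. The human physiological constants below are values commonly quoted with this equation and are an assumption here, since they are not stated in this section; the function names are likewise illustrative.

```python
# Human constants commonly used with the Øie–Tozer equation
# (assumed for this sketch; not stated in this section).
VP = 0.0436   # plasma volume, l/kg
VE = 0.151    # extracellular fluid volume, l/kg
VR = 0.380    # remainder (tissue) volume, l/kg
RE_I = 1.4    # extravascular/intravascular ratio of binding proteins

def vdss_oie_tozer(fup, fut):
    """VDss (l/kg) from fraction unbound in plasma and in tissues."""
    return VP * (1 + RE_I) + fup * VP * (VE / VP - RE_I) + VR * fup / fut

def fut_from_vdss(vdss, fup):
    """Back-calculate fut by rearranging the same equation."""
    denom = vdss - VP * (1 + RE_I) - fup * VP * (VE / VP - RE_I)
    return VR * fup / denom

# Round trip with arbitrary illustrative inputs.
example_vdss = vdss_oie_tozer(fup=0.1, fut=0.05)
example_fut = fut_from_vdss(example_vdss, fup=0.1)
```

With fup = fut = 1 the expression collapses to VP + VE + VR (about 0.57 l/kg with these constants), and at fixed fut the last term scales linearly with fup, which is why an error in fup propagates directly into the predicted VDss.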
Lastly, we discuss observations on the range of applicability of the method toward the prediction of VDss, as shown by Figs. 1 and 2 in addition to Table 8. Figure 1 shows the observed versus predicted VDss correlation across the entire range, and it is apparent that values become increasingly underpredicted as the (predicted) VDss increases. This calls for caution in applying the model at very low fut values and generally very high VDss values, and future developments may require additional and/or quadratic terms. The fut value ranges are identified by the shadings. Generally, compounds with very large VDss, such as tebufelone (12 l/kg), maprotiline (45 l/kg), and amiodarone (60 l/kg), are significantly underpredicted, as exemplified by some of the reported tabular data. Figure 1 and Table 8, in contrast, show the performance of Model 1 when compounds are segregated (but not excluded in casting the model) according to charge class. We note that anions, which generally have lower VDss values, as well as zwitterions, are predicted quite well, with a low maximum FE value and a very high number of compounds within 2- and 3-fold, thus reinforcing the observed high number of good predictions in the lower range of the plot in Fig. 1. The same data (charge classes) are presented in Fig. 2, where compounds are now colored according to their charge class.
To define the domain of extrapolative failures when using PK scaling methods, Jolivette and Ward (2005) took the approach of calculating descriptors such as the numbers of hydrogen-bond donors and acceptors as well as logP. They divided VDss values into three bins, using 0.7 and 3.5 l/kg as thresholds, based on the animal data, and differentiated the results among rat, dog, and monkey. Similarly, we looked at the values predicted by Model 1 for the 191-compound set and identified similar thresholds (0.7 and 2.8 l/kg) to assess bias and accuracy of prediction based on range of values. In doing so, we overlaid two vertical lines onto the initial plot of Fig. 2 (colored by charge class) using those thresholds. The numerical data are shown in Table 9. In general, the performance decreased by all indicators (GMFE, percentage within fold error, and bias) as VDss increased, but it remained reasonably good. This is a different way to show that larger volumes are indeed more difficult to predict (as in Fig. 1), and that anions, which generally reside within low(er) volume ranges, seem more easily predicted than cations. Lastly, we generated a three-dimensional plot using logD7.4, fup, and FE (compounds colored by the latter quantity) to identify the numerical values that might yield less accurate predictions. This is presented in Fig. 3. We show that the combination of very low fup and high logD7.4 tends to increase FE. That is, a highly bound and lipophilic compound will likely not be predicted accurately, whereas neither of the two experimental values, by itself, necessarily yields a high FE.
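The range-based analysis can be sketched in a few lines of Python. The sketch assumes that compounds are binned on their predicted VDss using the 0.7 and 2.8 l/kg thresholds, and that bias is expressed as the geometric mean of the predicted/observed ratio; both choices, the bin names, and the toy values are assumptions made for illustration and are not taken from Table 9.

```python
import math

def vdss_bin(value, low=0.7, high=2.8):
    """Assign a VDss value (l/kg) to one of three bins."""
    if value < low:
        return "low"
    if value <= high:
        return "moderate"
    return "high"

def summarize_by_bin(pairs):
    """pairs: iterable of (observed, predicted) VDss values in l/kg."""
    logs = {}
    for obs, pred in pairs:
        logs.setdefault(vdss_bin(pred), []).append(math.log10(pred / obs))
    summary = {}
    for name, values in logs.items():
        n = len(values)
        summary[name] = {
            "n": n,
            "gmfe": 10 ** (sum(abs(v) for v in values) / n),
            # Geometric mean ratio; < 1 means underprediction on average.
            "bias": 10 ** (sum(values) / n),
        }
    return summary

# Toy (observed, predicted) pairs, hypothetical values only.
toy_pairs = [(1.0, 0.5), (0.25, 0.5), (2.0, 1.0), (40.0, 10.0)]
bin_summary = summarize_by_bin(toy_pairs)
```

Reporting GMFE and a signed bias side by side separates scatter from systematic error: in the toy "low" bin the two 2-fold errors cancel to a bias of 1, whereas the "high" bin shows both a large GMFE and a bias well below 1, the signature of underprediction at large volumes.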
The main aim of this work was to explore the predictive power and the limitations of an in vitro method to predict VDss, which would yield good performance with easily determined experimental parameters, and which could position itself between much costlier and resource-demanding in vivo methods and fully in silico (structure only) methods. We note that for many analogs, let alone compounds approaching clinical candidate status, logD7.4, pKa, and fup data should be available, and thus this prediction does not require variables other than those that should be routinely measured. We also note that, whereas fup is not a parameter that should be optimized, its determination is amenable to a 96-well plate format, and it is relatively routinely performed to explain pharmacokinetic-pharmacodynamic (PK–PD) relationships as well as for applications such as unbound concentration ratios in brain and plasma. We find this approach, using easily determined physicochemical descriptors (or computed ones, with a control on accuracy), to be quite accurate and generally on par with other methods, orthogonal to several of them, easy to use, relatively inexpensive, intuitive, and, very importantly, animal sparing, and we believe it should find application in human PK prediction at early stages of discovery.
Authorship Contributions
Participated in research design: Berellini, Lombardo.
Performed data analysis: Berellini, Lombardo.
Wrote or contributed to the writing of the manuscript: Berellini, Lombardo.
Footnotes
- Received August 6, 2019.
- Accepted September 27, 2019.
This article has supplemental material available at dmd.aspetjournals.org.
Abbreviations
- FE: fold error
- fup: fraction unbound in plasma
- fut: fraction unbound in tissues
- GMFE: geometric mean FE
- LCO: leave-class-out
- VDss: volume of distribution at steady state
Copyright © 2019 by The American Society for Pharmacology and Experimental Therapeutics