Abstract
The utilization of in vitro data to predict drug pharmacokinetics (PK) in vivo has been a consistent practice in early drug discovery for decades. However, its success is hampered by mispredictions attributed to uncharacterized biological phenomena or experimental artifacts. Drug clearance (CL) predicted from experimental data (i.e., intrinsic clearance: CLint; fraction unbound in plasma: fu,p) using the well-stirred model (WSM) is often systematically underpredicted. The objective of this study was to evaluate the use of empirical scalars in the WSM to correct for CL mispredictions. Drugs (N = 28) were used to generate numerical scalars on CLint (α) and fu,p (β) to minimize the absolute average fold error (AAFE) for CL predictions. These scalars were validated using an additional dataset (N = 28 drugs) and applied to a nonredundant AstraZeneca (AZ) dataset available in the literature (N = 117 drugs) for a total of 173 compounds. CL predictions using the WSM were improved for most compounds using an α value of 3.66 (∼64% < 2-fold) compared with no scaling (∼46% < 2-fold). Similarly, using a β value of 0.55 or a combination of α and β scalars (values of 1.74 and 0.66, respectively) resulted in a similar improvement in predictions (∼64% < 2-fold and ∼65% < 2-fold, respectively). For highly bound compounds (fu,p ≤ 0.01), AAFE was substantially reduced across all scaling methods. Using the β scalar alone or a combination of α and β appeared optimal and produced larger magnitude corrections for highly bound compounds. Some drugs are still disproportionately mispredicted; however, the improvements in prediction error and the simplicity of applying these scalars suggest their utility for early-stage CL predictions.
SIGNIFICANCE STATEMENT In early drug discovery, prediction of human clearance using in vitro experimental data plays an essential role in triaging compounds prior to in vivo studies. These predictions have been systematically underestimated. Here we introduce empirical scalars calibrated on the extent of plasma protein binding that appear to improve clearance predictions across multiple datasets. This approach can be used in early phases of drug discovery prior to the availability of preclinical data for early quantitative predictions of human clearance.
Introduction
The utilization of in vitro data to predict in vivo pharmacokinetic parameters is widely adopted, and many groups have evaluated and developed approaches to predict in vivo clearance. These approaches use in vitro data such as intrinsic clearance (CLint), free fraction of drug in plasma (fu,p), and nonspecific binding in the in vitro matrices (fu,inc), as well as measures of drug properties such as lipophilicity (logD,pH=7.4) and ionization class (anionic, basic, neutral, or zwitterion) (Hallifax et al., 2010). Attempts to use these data to improve in vitro-in vivo extrapolation (IVIVE) through different scaling approaches have yielded varying degrees of success (Grime and Riley, 2006; Berezhkovskiy, 2011; Ring et al., 2011; Hallifax and Houston, 2012). Recently, researchers have expanded on these approaches to assess prediction accuracy and have evaluated the advantages and disadvantages among the various methods (Lombardo et al., 2014, 2018; Benet and Sodhi, 2020; Umehara et al., 2020; Poulin and Haddad, 2021).
Among the physiologically relevant liver models for CL predictions, the well-stirred model (WSM) is broadly adopted (Pang and Rowland, 1977a,b). The prerequisite of the WSM is that the rate of enzyme-mediated elimination is slower than both the rate of distribution and the kinetics of plasma protein binding; the model also assumes that the unbound drug concentrations in plasma and in the hepatic aqueous environment reach instantaneous equilibrium and that elimination follows linear kinetics.
In its simplest form, the WSM can be expressed as:

$$CL_{hep,b} = \frac{Q_{hep} \cdot CL_{int}}{Q_{hep} + CL_{int}}$$

in which CLhep,b is the hepatic blood CL, Qhep is the hepatic blood flow, and CLint is the intrinsic clearance of the blood perfused liver.
When assuming that the concentration in blood cells is comparable to plasma, CLhep,b equates to hepatic plasma clearance (CLhep). Furthermore, as only the free drug is assumed to be metabolized, the equation can be rearranged as:

$$CL_{hep} = \frac{Q_{hep} \cdot f_{u,p} \cdot CL_{int,u}}{Q_{hep} + f_{u,p} \cdot CL_{int,u}}$$

in which CLhep is the hepatic plasma clearance, fu,p is the fraction unbound in plasma, and CLint,u is the intrinsic clearance of the free drug.
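As a minimal sketch of the plasma form of the WSM described above, the function below computes CLhep from CLint,u and fu,p. The function name, the example compounds, and the default hepatic blood flow of 20.7 ml/min/kg (the liver blood flow value quoted later in the Methods) are assumptions made for illustration, not the authors' implementation.

```python
def well_stirred_clearance(cl_int_u, fu_p, q_hep=20.7):
    """Hepatic plasma clearance (ml/min/kg) from the well-stirred model.

    CLhep = Qhep * fu,p * CLint,u / (Qhep + fu,p * CLint,u)
    """
    unbound_term = fu_p * cl_int_u
    return q_hep * unbound_term / (q_hep + unbound_term)


# Hypothetical compounds, chosen only to show the two limiting regimes.
low = well_stirred_clearance(cl_int_u=10.0, fu_p=0.1)      # fu,p * CLint,u << Qhep
high = well_stirred_clearance(cl_int_u=10000.0, fu_p=0.5)  # fu,p * CLint,u >> Qhep

print(f"low-extraction compound:  {low:.3f} ml/min/kg")
print(f"high-extraction compound: {high:.3f} ml/min/kg")
```

For the low-extraction compound the prediction stays close to fu,p·CLint,u, whereas for the high-extraction compound it approaches, but never exceeds, the hepatic blood flow.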
“Bottom-up” approaches to CLhep prediction aim to describe CLint,u, starting from in vitro measurements in systems such as liver microsomes or hepatocytes and correcting for unbound fraction. Although this approach is useful, its quantitative accuracy is still somewhat limited and the underlying mechanisms responsible for the systematically observed underpredictions are not fully understood. Although there is a lack of consensus as to the identity and magnitude of each reason leading to poor CL predictions, the tendency to often underpredict CLint could be rationalized as a loss of activity in the hepatocyte cultures compared with in vivo systems or by imprecise estimation of the appropriate physiologic scalars (Bowman and Benet, 2019). These possibilities suggest that a constant CLint multiplier could be derived to improve in vitro-in vivo correlations. This practical approach was shown to considerably improve IVIVE by AstraZeneca (Williamson et al., 2020). Internally at Genentech, we have observed a similar degree of improvement for clinical and preclinical predictions using the same approach; disappointingly, we have observed that in vitro estimates of in vivo CLint were still suboptimal for highly bound compounds (also witnessed in the AstraZeneca dataset).
Many recent reports have emphasized a potential role of plasma proteins in hepatic drug uptake, producing an abundance of in vivo and in vitro data supporting the notion that hepatic uptake and ultimately CLhep increase with increasing binding to plasma proteins, specifically albumin (Poulin and Haddad, 2015; Bowman et al., 2019; Chang et al., 2019; Kim et al., 2019; Francis et al., 2021). In particular, Francis et al. (2021) used data collected from 26 compounds in rat and human hepatocytes to derive an empirical equation to account for the impact of plasma protein on CLint,u,in vitro in an effort to improve IVIVE-based CL predictions.
Although the mechanism of such phenomena is still unclear, this could potentially arise from a transporter-albumin interaction or could be unrelated to transporter activity. Because the chemical space of hepatic uptake transporter substrates overlaps with that of albumin substrates (i.e., lipophilic anionic compounds), finding chemical probes to investigate each hypothesis has proven challenging. The fact that plasma protein-mediated hepatic uptake is unaccounted for in current models of hepatic clearance may support the derivation of a CLint scalar based on fu,p magnitude.
In this study, a trend analysis approach was used to derive two scalars, “α” for CLint and “β” for fu,p, from a training set of compounds with human intravenous PK data. Here we report the impact of including different scalars in the WSM and subsequent improvements in the prediction of human CLhep for 173 compounds.
Materials and Methods
Human CL and Physiochemical Properties Dataset
The internal Genentech (GNE) dataset was generated by collecting plasma protein binding (fu,p), fraction unbound in microsomes (fu,mic), and in vitro intrinsic clearance (CLint) parameters for 56 drugs. pKa and ionization class were calculated using the software MoKa (Molecular Discovery: https://www.moldiscovery.com/software/moka/). The base pKa model was augmented by including Roche and GNE compounds in the training set. Intravenous CL data were collected primarily from Lombardo et al. (2018) but also from the DrugBank database (https://www.drugbank.ca/). The definitions for primary route of elimination and experimental logD,pH=7.4 were also retrieved from various sources: Lombardo et al. (2014), DrugBank, and Benet et al. (2011). When experimental logD,pH=7.4 was not available, a GNE internal machine learning model was used for prediction. For compounds for which the route of elimination in human was found to be metabolic, further investigation concerning major metabolizing enzymes was performed using DrugBank; when the information was not available in DrugBank, individual searches were performed using alternative resources: Kyoto Encyclopedia of Genes and Genomes (KEGG: https://www.genome.jp/kegg/), US Food and Drug Administration (FDA: https://www.fda.gov/), and literature references (Laurenzana and Owens, 1997; Shet et al., 1997; Moridani et al., 2001; Li et al., 2002; McDonald and Rettie, 2007; Olkkola and Ahonen, 2008; He et al., 2009; Jornil et al., 2010; Yu et al., 2014; Salerno et al., 2017; MacLauchlin et al., 2018; Kogame et al., 2019). Human CL endpoints were capped to a liver blood flow value of 20.7 ml/min/kg; when mixed mechanisms of elimination were reported, the total CL was normalized according to the fraction of hepatic metabolism (moxifloxacin and zidovudine).
In addition, a total of 78 compounds for which the major route of elimination in humans is mediated by cytochrome P450 (CYP450; N = 51), phase II metabolism (N = 9), renal elimination (N = 15), or biliary elimination (N = 3) were included in a separate dataset to explore the relationship between elimination route and physiochemical properties. For six nonmetabolically eliminated compounds for which experimental fu,mic was lacking, values were predicted using GNE’s in-house machine learning model. Four of the 60 metabolically eliminated compounds are not in the final dataset due to significant extrahepatic elimination (zidovudine, moxifloxacin), nonlinear pharmacokinetics (tacrine), and significant disconnects between the plasma protein binding values reported in the literature and those measured in house (Gammans et al., 1986; Madden et al., 1995; Moise et al., 2000).
External Validation Set
The data presented in the recent publication from AstraZeneca (AZ) scientists were used as an external validation set (Williamson et al., 2020). Compounds were included if the human CL value could be retrieved from Lombardo et al. (2018). Compounds that were already present in the GNE dataset were excluded to avoid duplication. It is important to highlight that in the AZ dataset, CLint,u of the free drug in hepatocytes is provided as a single value and not as a combination of apparent CLint and fu,mic, whereas in the GNE dataset, all of the fu,mic values for metabolically eliminated compounds are experimentally determined and presented separately.
Materials
The 56 compounds in the GNE dataset were obtained from commercial sources (Sigma-Aldrich, Cayman Chemical, Toronto Research Chemicals, etc.) via the internal compound management bank. Compounds were prepared as 10 mM or 1 mM stock concentrations in dimethyl sulfoxide (DMSO). Pooled male and female human (N = 10) hepatocytes were purchased from BioIVT (Westbury, NY); high-performance liquid chromatography (HPLC)-grade water was from J.T. Baker (Center Valley, PA); HPLC-grade acetonitrile was from EMD Millipore (Billerica, MA). Dulbecco’s Modified Eagle Medium (DMEM) and rapid equilibrium dialysis (RED) devices with inserts were obtained from Thermo Fisher Scientific Inc. (Rockford, IL). Pooled (N ≥ 10) male and female human liver microsomes were purchased from Corning (Woburn, MA). The Allegra X-12R centrifuge used in these studies was purchased from Beckman Coulter (Brea, CA). All other chemicals and reagents were of analytical grade and were obtained from Sigma-Aldrich (St. Louis, MO) unless otherwise specified.
Hepatocyte Stability
Metabolic hepatocyte stability was determined in house using cryopreserved primary hepatocytes. Drug and hepatocyte dilutions were performed in DMEM with 1 μM drug concentration (final DMSO concentration 0.1% v/v) and 0.5 million cells/ml at 37°C with 5% CO2. A master reaction plate containing drug and hepatocyte reaction mixture was prepared and aliquots were removed and quenched with acetonitrile and internal standard (IS) at 0, 1, 2, and 3 hours. The samples were centrifuged at 3700 rpm for 15 minutes, and the supernatant was diluted equally with water before liquid chromatography–tandem mass spectrometry (LC-MS/MS) analysis. If an irregular time course was observed using the above protocol (e.g., increase in drug concentration or variability over time), a follow-up experiment was performed using a different quench/crash protocol where individual, total reactions were crashed for each time point (crash-in) (Winiwarter et al., 2019) using quadruplicate reaction plates containing the drug and hepatocyte mixture. Out of the 18 compounds for which both protocols were available, only two produced qualitatively different results (>3-fold change). These discrepancies were attributed to the high degree of nonspecific binding to the incubation plate, which resulted in high variability or an increase in analyte concentrations over time and which was diminished upon using the crash-in method. Acetonitrile with IS was added to the separate reaction mixtures at 0, 1, 2, and 3 hours. The samples were centrifuged at 3700 rpm for 15 minutes, and the supernatant was diluted equally with water before LC-MS/MS analysis.
Microsomal Incubational Binding
Liver microsome (LM) binding experiments were performed in triplicate using a RED device. LM stock solutions were diluted to 0.5 mg protein/ml with phosphate-buffered saline (PBS: 133 mM potassium phosphate, 150 mM sodium chloride). Drugs were diluted to 1 μM in the LM solution. Subsequently, 500 μl of PBS was added to the receiver wells of the RED device and 300 μl of the drug-LM mixtures was added to the donor wells of the RED device. The RED device was sealed using a gas-permeable membrane and then placed in a shaking incubator (450 rpm; VWR Symphony) at 37°C with 5% CO2. After 6 hours, aliquots from the receiver and donor wells of the RED device were added to acetonitrile with IS and matrix equalized. The samples were centrifuged at 3700 rpm for 15 minutes, and the supernatant was diluted equally with water before LC-MS/MS analysis.
Plasma Protein Binding
Plasma protein binding experiments were performed in triplicate using a single-use RED device. Plasma was adjusted to pH 7.4 with 0.5 M phosphoric acid. Drugs were diluted to 5 μM in plasma. Five hundred microliters of PBS was added to the receiver wells of the RED device, and 300 μl of the drug-plasma mixtures was added to the donor wells of the RED device. The RED device was sealed using a gas-permeable membrane and then placed in a shaking incubator (450 rpm; VWR Symphony) at 37°C with 5% CO2. After 6 or 24 hours, buffer and plasma aliquots from the receiver and donor wells of the RED device were matrix equalized with an equal volume of plasma or buffer, and ice-cold acetonitrile was added with IS. The samples were centrifuged at 3700 rpm for 15 minutes, and the supernatant was diluted equally with water before LC-MS/MS analysis.
LC-MS/MS Analysis
LC-MS/MS analysis was performed using a 5500+ QTRAP mass spectrometer coupled with a TurboIonSpray ESI ion source (AB SCIEX, Redwood City, CA) and Agilent 1200 series LC (Santa Clara, CA). Chromatographic separation of all analytes was achieved using a Kinetex C18 column (50 × 2.0 mm, 80 Å, 4 μm particle size) (Torrance, CA) along with mobile phase A consisting of 0.1% formic acid in HPLC-grade water and mobile phase B consisting of 0.1% formic acid in acetonitrile. A generic LC gradient was used for all analytes where the flow rate was set to 0.5 ml/min, the run time to 3.5 minutes, and the LC gradient as follows: 2% B for the first 1.5 minutes, ramped up to 40% B from 1.5 to 2.0 minutes, remained constant at 98% B from 2.0 to 3.0 minutes, and then decreased to 2% B from 3.0 to 3.5 minutes.
Data Analysis
For the hepatocyte stability experiments, half-life (t1/2) and CLint were determined for each compound after calculating the slope of the natural log plot of percent parent remaining over time, using the mass spectrometer (MS) peak area ratios normalized to the initial time point peak area (t = 0 min). t1/2 was calculated using the following equation:

$$t_{1/2} = \frac{-\ln 2}{\text{slope}}$$
CLint was subsequently calculated using the scaling factors for hepatocytes to whole body intrinsic clearance (ml/min/kg):

$$CL_{int} = \frac{\ln 2}{t_{1/2}} \cdot \frac{V}{P} \cdot \frac{\text{million hepatocytes}}{\text{gram liver}} \cdot \frac{\text{gram liver}}{\text{kg body weight}}$$

V/P is defined as the incubation volume divided by the number of hepatocytes used in the incubation (μl/10^6 cells), million hepatocytes/gram liver is 135 × 10^6 cells/g liver, and gram liver/kg body weight is 25.7 g liver/kg body weight based on previously published literature data. For the AZ dataset, these values were originally reported as μl/min/10^6 cells and were scaled using the same physiologic scalars used internally for scaling human hepatocyte stability studies to calculate whole body intrinsic clearance (ml/min/kg). CLint,u values were calculated using the reported unbound fraction in the incubation (fu,inc) for each compound in the AZ dataset (Williamson et al., 2020).
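The scaling from an in vitro half-life to whole body intrinsic clearance can be sketched as follows. This is an illustrative reconstruction under stated assumptions: the function and constant names are hypothetical, the cell density default of 0.5 million cells/ml is taken from the hepatocyte stability protocol, and the unbound correction is assumed to be a simple division by fu,inc.

```python
import math

HEPATOCELLULARITY = 135.0  # million hepatocytes per g liver (value from the text)
LIVER_WEIGHT = 25.7        # g liver per kg body weight (value from the text)


def hepatocyte_clint(t_half_min, cell_density_million_per_ml=0.5, fu_inc=1.0):
    """Scale an in vitro half-life (min) to intrinsic clearance in ml/min/kg.

    With fu_inc = 1 the result is the apparent CLint; passing a measured
    fu,inc returns the unbound value (assumed here to be CLint / fu,inc).
    """
    k = math.log(2) / t_half_min                     # depletion rate constant, 1/min
    v_over_p = 1000.0 / cell_density_million_per_ml  # V/P in ul per million cells
    clint_in_vitro = k * v_over_p                    # ul/min/million cells
    clint_ul = clint_in_vitro * HEPATOCELLULARITY * LIVER_WEIGHT  # ul/min/kg
    return clint_ul / 1000.0 / fu_inc                # ul -> ml, then unbound correction


# Hypothetical compound with a 60 minute half-life in the incubation.
apparent = hepatocyte_clint(60.0)
unbound = hepatocyte_clint(60.0, fu_inc=0.5)
print(f"CLint = {apparent:.2f} ml/min/kg, CLint,u = {unbound:.2f} ml/min/kg")
```

The explicit division by 1000 converts μl to ml so that the result is expressed in the ml/min/kg units used throughout the analysis.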
Fraction unbound in plasma (fu,p) and microsomes (fu,mic) were calculated using peak area ratios (PAR) between the analyte and the IS from the receiver and the donor chambers using the following equation:

$$f_{u} = \frac{\text{PAR}_{receiver}}{\text{PAR}_{donor}}$$
Derivation of Scalars for CLint and fu,p
The GNE dataset was divided into a training set (N = 28) and a validation set (N = 28) for the scalar definition. The subset sampling was performed by alternating compound assignment based on a decreasing value for their fu,p. An AZ dataset was used as an additional validation set (N = 117 compounds).
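One way to read the alternating assignment described above is sketched below: compounds are ranked by decreasing fu,p and dealt out in turn to the two subsets, so that both span a similar range of binding. The compound names and values are hypothetical, and which subset receives the first compound is an assumption.

```python
def alternating_split(compounds):
    """Split (name, fu_p) pairs into two subsets by alternating assignment.

    Compounds are sorted by decreasing fu,p; even ranks go to the training
    set and odd ranks to the validation set (the order is an assumption).
    """
    ranked = sorted(compounds, key=lambda item: item[1], reverse=True)
    return ranked[0::2], ranked[1::2]


# Hypothetical compounds: (name, fu,p)
toy = [("cmpd_A", 0.002), ("cmpd_B", 0.45), ("cmpd_C", 0.08),
       ("cmpd_D", 0.9), ("cmpd_E", 0.01), ("cmpd_F", 0.2)]

training, validation = alternating_split(toy)
print("training:  ", [name for name, _ in training])
print("validation:", [name for name, _ in validation])
```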
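Three approaches to offset derivation were attempted, starting from the following formula:

$$CL_{hep} = \frac{Q_{hep} \cdot \alpha \cdot CL_{int,u} \cdot f_{u,p}^{\beta}}{Q_{hep} + \alpha \cdot CL_{int,u} \cdot f_{u,p}^{\beta}}$$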
Method 1 (the CLint offset) is used to derive the scalar α while setting the scalar β equal to 1, method 2 is used to derive the scalar β while setting the scalar α equal to 1, and lastly method 3 is used to derive both the α and β scalars simultaneously. All three methods assume that the main reasons for in vitro to in vivo disconnects are consistent across hepatic-metabolized drugs. In addition, the application of an exponential β scalar on fu,p could also be mathematically rearranged to yield a fu,p-dependent scalar on the product of CLint,u·fu,p [i.e., fu,p^(β−1)·(CLint,u·fu,p)], analogous to the approach defined by Francis et al. (2021) in which CLint,u is scaled by a factor of plasma protein binding. A simulation between these two approaches is presented in Supplemental Fig. 3.
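The rearrangement mentioned above can be written out explicitly. The short derivation below is a sketch that uses only the symbols already defined in the text (α, β, CLint,u, fu,p) and shows how each of the three methods modifies the unscaled product CLint,u·fu,p.

```latex
% Scaled unbound term and its rearrangement
\alpha \cdot CL_{int,u} \cdot f_{u,p}^{\beta}
  = \alpha \cdot f_{u,p}^{\beta - 1} \cdot \left( CL_{int,u} \cdot f_{u,p} \right)

% Method 1 (\beta = 1): constant multiplier
\alpha \cdot \left( CL_{int,u} \cdot f_{u,p} \right)

% Method 2 (\alpha = 1): binding-dependent multiplier
f_{u,p}^{\beta - 1} \cdot \left( CL_{int,u} \cdot f_{u,p} \right)

% Method 3: both multipliers combined
\alpha \cdot f_{u,p}^{\beta - 1} \cdot \left( CL_{int,u} \cdot f_{u,p} \right)
```

Because β − 1 is negative whenever β < 1 and fu,p is at most 1, the factor fu,p^(β−1) is at least 1 and grows as fu,p decreases, which is why the β scalar produces larger corrections for highly bound compounds.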
These assumptions ignore the potential activity of uptake transporters for practical reasons, not scientific ones. The impact of transporters on newer iterations of the WSM are discussed elsewhere (Endres et al., 2009; Pang et al., 2019). Blood to plasma partitioning (BPP) is also an important parameter within the WSM equation. In particular, BPP can greatly vary the predicted hepatic CL for high CLint compounds; for compounds that exhibit low CLint,u and/or fu,p, the hepatic CL predicted by the WSM can be approximated by CLint,u·fu,p, and is minimally affected by the BPP value. Although it is acknowledged that in some cases the availability of BPP data could significantly improve predictions, the improved accuracy observed for high CL compounds suggests that availability of measured BPP data would not change the general conclusions from this study. Therefore, for the purpose of this analysis, BPP was assumed to be 1.
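The limited influence of BPP on low-clearance compounds can be illustrated with the commonly used form of the WSM that includes blood to plasma partitioning. This form is not given in the text and is assumed here for illustration; it reduces to the plasma equation used in this study when BPP = 1.

```latex
% WSM for plasma clearance with blood-to-plasma partitioning (assumed standard form)
CL_{hep} = \frac{Q_{hep} \cdot f_{u,p} \cdot CL_{int,u}}
                {Q_{hep} + \dfrac{f_{u,p} \cdot CL_{int,u}}{BPP}}

% Low-clearance limit: f_{u,p} \cdot CL_{int,u} \ll Q_{hep} \cdot BPP
CL_{hep} \approx f_{u,p} \cdot CL_{int,u}

% High-clearance limit: f_{u,p} \cdot CL_{int,u} \gg Q_{hep} \cdot BPP
CL_{hep} \approx Q_{hep} \cdot BPP
```

As a hypothetical example, with Qhep = 20.7 ml/min/kg and fu,p·CLint,u = 1 ml/min/kg, the predicted CLhep is about 0.92, 0.95, and 0.98 ml/min/kg for BPP values of 0.55, 1, and 2, respectively, whereas in the high-clearance limit the prediction scales directly with BPP.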
The scalars were derived using a grid search across parameter values (α ranging from 1 to 5 with a step of 0.01, β ranging from 0 to 1 with a step of 0.01). The optimal values were selected based on the minimization of average fold error (AFE) and absolute average fold error (AAFE) in the prediction of human clearance CLpred; AAFE was defined as follows:

$$AAFE = 10^{\frac{1}{N} \sum_{i=1}^{N} \left| \log_{10} \left( \frac{CL_{pred,i}}{CL_{obs,i}} \right) \right|}$$
The grid search was performed using RStudio, and the datasets are provided in the Supplemental Material. The values were derived based on the GNE training set of 28 compounds, whereas the whole GNE dataset (N = 56) and the AZ (N = 117) dataset combined (N = 173) were used to assess the generalizability of the approach.
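The grid search can be sketched in a few lines. The example below is written in Python rather than R and is not the authors' script: the function names are hypothetical, the data are synthetic (observed values are generated from known scalars so that the search can be checked), and AAFE alone is used as the selection criterion because the text does not specify how AFE and AAFE were combined.

```python
import math

Q_HEP = 20.7  # ml/min/kg, liver blood flow value quoted in the Methods


def scaled_wsm(cl_int_u, fu_p, alpha=1.0, beta=1.0):
    """WSM with a coefficient scalar (alpha) and an exponential fu,p scalar (beta)."""
    term = alpha * cl_int_u * fu_p ** beta
    return Q_HEP * term / (Q_HEP + term)


def aafe(predicted, observed):
    """Absolute average fold error: 10 ** mean(|log10(pred / obs)|)."""
    logs = [abs(math.log10(p / o)) for p, o in zip(predicted, observed)]
    return 10 ** (sum(logs) / len(logs))


def grid_search(compounds, observed):
    """Return (alpha, beta, AAFE) minimizing AAFE over the grid from the text."""
    best = (None, None, float("inf"))
    for i in range(100, 501):      # alpha = 1.00 ... 5.00, step 0.01
        alpha = i / 100
        for j in range(0, 101):    # beta = 0.00 ... 1.00, step 0.01
            beta = j / 100
            preds = [scaled_wsm(c, f, alpha, beta) for c, f in compounds]
            score = aafe(preds, observed)
            if score < best[2]:
                best = (alpha, beta, score)
    return best


# Synthetic (CLint,u, fu,p) pairs; "observed" CL is generated with known scalars.
compounds = [(50.0, 0.2), (400.0, 0.01), (5.0, 0.5), (2000.0, 0.002)]
observed = [scaled_wsm(c, f, alpha=2.0, beta=0.5) for c, f in compounds]

alpha, beta, score = grid_search(compounds, observed)
print(f"alpha = {alpha}, beta = {beta}, AAFE = {score:.3f}")
```

Restricting the β loop to the single value 1 reproduces the setup of method 1, restricting the α loop to 1 reproduces method 2, and the full grid corresponds to method 3.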
Due to the higher number of non–highly bound compounds (fu,p > 0.01) and due to the inherent differences between the two approaches being defined by variable binding characteristics, the analysis of data was stratified for high (fu,p ≤ 0.01) versus low/moderate binders (fu,p > 0.01); after the validation in the external GNE set, the datasets were combined to increase the representation of highly bound compounds. The three methods were compared with the WSM with no offset scalars (α and β = 1).
Results
Properties of the Dataset
The physiochemical properties of the GNE dataset (N = 56) and nonredundant compounds (N = 117) in the AZ dataset (Williamson et al., 2020) were investigated. The full compound list and parameters used for CL predictions are available in Supplemental Tables 1 and 2. Figures 1 and 2 display the ion class, extent of plasma protein binding, and major route of elimination for the GNE and AZ datasets, respectively. The distribution of compounds across physiochemical space was relatively consistent across the two datasets. In terms of fu,p, the GNE dataset had 11 compounds that were highly bound (fu,p ≤ 0.01) and the AZ dataset had 15 highly bound compounds (fu,p ≤ 0.01). All of the compounds in the GNE dataset were cleared primarily via phase I and II metabolic elimination (87.5% and 12.5%, respectively). Compounds in the AZ dataset were also primarily cleared via phase I (∼79.5%) and phase II (∼7.7%) metabolism; in addition, 15 compounds were cleared via alternative or unknown routes of elimination (∼12.8%).
A summary of the reported observed human plasma clearances (CLobs) is depicted in Figure 3, with medians of 4.7 ml/min/kg and 6.1 ml/min/kg for the GNE and AZ datasets, respectively. A statistical comparison between all individually reported mean observed CL values for both datasets (GNE and AZ) was performed to assess whether they were statistically different and whether there was bias for higher or lower reported CL values from each dataset. There were no statistically significant differences between the mean observed CL values for compounds used in both datasets (data not shown; two-way unpaired t test and Mann-Whitney test, P > 0.05). Supplemental Figure 1 shows the CLint,u corrected for microsomal binding versus the logD,pH=7.4 values for each compound stratified by mechanism of elimination. Incubational binding (fu,inc) was assumed to be comparable between human LM and human hepatocytes based on previously published data (Chen et al., 2017; Winiwarter et al., 2019). Although ionization class tended to be evenly distributed across the respective range of CLint,u values, it was apparent that for compounds primarily undergoing CYP450-mediated elimination, there were increases in elimination rate (CLint,u) with an increase in lipophilicity (logD,pH=7.4) (Benet et al., 2011; Varma et al., 2015). In the −0.5–1 logD,pH=7.4 range, a CLint,u greater than 3 ml/min/kg predicted compounds eliminated in the liver with no exception (all via hepatic metabolism, with the exception of one compound via biliary elimination), whereas in the same logD,pH=7.4 range, it was visually apparent that renal elimination was as likely as metabolic elimination for compounds with in vitro metabolic turnover of under 3 ml/min/kg based on this dataset (three compounds renally eliminated vs. five metabolically eliminated). 
Values of logD,pH=7.4 < −0.5 identified renally eliminated compounds in all cases except for one biliary-eliminated outlier; similarly, values of logD,pH=7.4 > 1 identified all metabolically eliminated compounds in all cases except for one biliary-eliminated outlier.
CL Predictions Using WSM Scalars
Using independent datasets from two different companies within the pharmaceutical industry, different empirical scalars were derived to correct for mispredicted compounds using the WSM for compounds with a range of physiochemical properties. These scalars were derived using a grid search in RStudio aimed at minimizing AFE and AAFE for a set of compounds randomly selected from the GNE dataset (GNE training, N = 28). The remaining GNE compounds in the dataset (GNE validation, N = 28) were used to validate each modeling method for our internally collected data, and each modeling scheme was subsequently applied to all of the compounds for each method (GNE + AZ datasets, N = 173). Prior to fitting these scalars, each dataset was used to predict total clearance without the use of any empirical scalars by using the well-stirred model (α = 1, β = 1). Figure 4 depicts the AFE for each compound for the GNE training, GNE validation, and combined datasets with no scalars applied. The majority of compounds were underpredicted when comparing CLpred to CLobs for each of the three datasets. Table 1 provides a summary of the fraction of compounds that were within/beyond 2-fold AAFE and coefficients of determination (R^2) for each dataset and modeling method, and these are further stratified by degree of plasma protein binding in Table 2. Additionally, the correlation plots of CLpred versus CLobs of the GNE + AZ datasets for each of the scaling methods are provided in Supplemental Fig. 2. The percentages of compounds with CLpred values within 2-fold of CLobs for the GNE training, GNE validation, and GNE + AZ datasets were 32.1%, 53.6%, and 45.7%, respectively. Subsequently, using the GNE training dataset and each of the modeling methods, α and β scalar values were individually and simultaneously fitted using the approaches outlined in the methods. Figure 5 shows the different datasets after fitting the offset α scalar (method 1) and fixing β to 1. 
After performing the analysis, an offset α value of 3.66 was obtained to minimize the AAFE for the GNE training dataset; this was consistent with previous reports (Williamson et al., 2020). As expected, the CLpred values were increased compared with the model predictions using no scalars due to the product of α·CLint,u·fu,p, where α > 1, which improved the prediction for a majority of the compounds that were originally underpredicted. The percentages of compounds with CLpred values within 2-fold of CLobs for the GNE training, GNE validation, and GNE + AZ datasets were substantially increased compared with the model prediction with no scalars (60.7%, 71.4%, and 63.6%, respectively) (Table 1). Using method 2, where α was fixed to 1 and β was fitted using the GNE training dataset, a β value of 0.55 was determined. Figure 6 demonstrates changes in prediction accuracy for the GNE and AZ datasets. Similar to the α scalar, as expected the CLpred values were increased compared with the model predictions using no scalars due to the product of CLint,u·fu,p^β, where fu,p < 1, which improved the prediction for a majority of the compounds that were originally underpredicted. The percentages of compounds with CLpred values within 2-fold of CLobs for the GNE training, GNE validation, and GNE + AZ datasets were also increased compared with the model prediction with no scalars (60.7%, 75.0%, and 64.7%, respectively) (Table 1). Comparing method 1 to method 2, it is apparent that the CLpred values for the highly bound compounds were improved for all three datasets when using the exponential β scalar in contrast to the coefficient α scalar. Additionally, the β scalar derived using this fitting approach can also be mathematically rearranged to yield a plasma protein binding-dependent CLint,u scaling factor equivalent to fu,p^(β−1)·(CLint,u·fu,p). 
Where β = 0.55 using method 2, this scaling factor simplifies to fu,p^(−0.45), which produced changes in CLpred within 2-fold of the scaling factor derived by Francis et al. (2021) in the range of fu,p 0–0.0001 (Supplemental Fig. 3).
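To make the binding dependence of the method 2 correction concrete, the sketch below evaluates the factor fu,p^(β−1) with β = 0.55 at a few hypothetical fu,p values and compares it with the constant method 1 factor α = 3.66. The function name and the chosen fu,p values are illustrative assumptions.

```python
ALPHA = 3.66  # method 1 scalar reported in the text
BETA = 0.55   # method 2 scalar reported in the text


def beta_scaling_factor(fu_p, beta=BETA):
    """Multiplier that the beta scalar applies to CLint,u * fu,p."""
    return fu_p ** (beta - 1.0)


# Hypothetical fu,p values spanning low to high plasma protein binding.
for fu_p in (1.0, 0.1, 0.01, 0.001):
    print(f"fu,p = {fu_p:<6} factor = {beta_scaling_factor(fu_p):.2f}")

# fu,p at which the beta-derived factor equals the constant alpha factor:
# fu,p ** (BETA - 1) = ALPHA  ->  fu,p = ALPHA ** (1 / (BETA - 1))
crossover = ALPHA ** (1.0 / (BETA - 1.0))
print(f"factors are equal at fu,p = {crossover:.3f}")
```

Below the crossover value the β-derived factor exceeds the constant α factor, which is consistent with the larger corrections reported for highly bound compounds under method 2.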
For method 3, when α and β are fit simultaneously to correct for prediction error, the overall predictions do not substantially improve (Fig. 7) compared with methods 1 and 2. Using this simultaneous fitting method, an α value of 1.74 and a β value of 0.66 were obtained in the GNE training dataset. The percentages of compounds with CLpred values within 2-fold of CLobs for the GNE training, GNE validation, and GNE + AZ datasets were 60.7%, 75.0%, and 64.2%, respectively. Figure 8 summarizes the AAFE values across all of the scaling methods implemented in this analysis.
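The percentages quoted in this section count compounds whose predicted CL lies within 2-fold of the observed value. A minimal sketch of that metric is shown below; the function name and the toy values are hypothetical, and the strict inequality follows the "< 2-fold" wording used in the abstract.

```python
def percent_within_fold(predicted, observed, fold=2.0):
    """Percentage of compounds whose fold error (in either direction) is below `fold`."""
    hits = sum(
        1 for p, o in zip(predicted, observed)
        if max(p / o, o / p) < fold
    )
    return 100.0 * hits / len(predicted)


# Hypothetical CL values in ml/min/kg: fold errors of 1.5, 3.0, 1.11, and 2.5.
cl_pred = [1.0, 3.0, 10.0, 0.4]
cl_obs = [1.5, 1.0, 9.0, 1.0]

within = percent_within_fold(cl_pred, cl_obs)
print(f"{within:.1f}% of compounds within 2-fold")
```

Taking the larger of pred/obs and obs/pred treats over- and underprediction symmetrically, which matches the use of an absolute fold error.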
Discussion
The utilization of mechanistic or empirical scalars for the prediction of in vivo hepatic clearance from in vitro data is well established and is still being refined, with a variety of methods aimed at improving predictions using the WSM (Grime and Riley, 2006; Berezhkovskiy, 2011; Ring et al., 2011; Hallifax and Houston, 2012). The “holy grail” of the field’s collective efforts has been to gain a mechanistic understanding of the observed IVIVE disconnects (typically an underestimation of in vivo clearance) and a prospective approach for predicting both preclinical and human clearance with accuracy and precision. With an eye on this goal, yet with pragmatism in mind to drive drug discovery, we present an empirical approach deriving two scalars for the WSM. The first scalar, “α,” is a coefficient CLint scalar, which may be rationalized as accounting for loss of activity in the in vitro system relative to in vivo and/or as accounting for miscalculation of the physiologic scalar values used to translate in vitro half-life (t1/2) to apparent CLint. The second scalar, “β,” for fu,p may be rationalized as accounting for hepatic uptake mediated by plasma proteins not present in vitro and/or inaccuracies in the in vitro determination of free fraction.
Recent research suggests that larger in vitro-in vivo disconnects may be observed for compounds that are highly bound to albumin (Poulin and Haddad, 2015; Bowman et al., 2019; Kim et al., 2019; Francis et al., 2021). Attempts have been made to mechanistically account for these disconnects to improve clearance prediction accuracy. For example, the ability of plasma protein, particularly albumin, to enhance the uptake of highly bound compounds or organic anion transporting polypeptide (OATP) transporter substrates has been debated among researchers (often termed albumin-facilitated uptake) (Bowman et al., 2019, 2020, 2021; Chang et al., 2019; Kim et al., 2019; Liang et al., 2020; Bi et al., 2021). Da-Silva et al. (2018) evaluated the impact of scaling three different in vitro hepatocyte-based assays (i.e., suspension, plated, and micropatterned coculture), correcting for unbound fraction as well as adjusted fu,p (fu,p,adjusted) to account for the presence of albumin-facilitated uptake phenomenon. In this case, the fu,p,adjusted equation aims to correct the experimentally determined fu,p values for drugs to account for the uptake of drug in cells for both bound and unbound forms based on albumin-facilitated uptake and pH gradient effect (Poulin and Haddad, 2015). Using this method, the uptake rate as a function of plasma protein binding was able to dramatically improve prediction accuracy compared with conventional fu,p-corrected predictions (AAFE of 1.4 vs. 7.4) for a small number of compounds. These corrections increase the fu,p,adjusted values in comparison with original fu,p values, which are not linearly proportional to binding isotherms. The empirical scalars derived in this study are meant to address systematic underpredictions typically observed when applying the WSM. 
Because these scalars do not directly account for transporter activity, additional corrections might be needed for compounds for which hepatic uptake is promoted or impaired due to transporter activity. However, in recognizing the dominance of fu,p on CL prediction accuracy, in this current work we take a simplified approach and incorporate an exponential β scalar meant to adjust CLint,u values in a mechanism-agnostic fashion and independent of ionization state (differentiating from the reported fu,p,adjusted scaling approach). In addition to the advancement in CL IVIVE discussed, Supplemental Fig. 1 highlights how in vitro data (CLint,u and measured lipophilicity logD,pH=7.4) can also be used to diagnose the route of elimination with almost no exception outside of the −0.5–1 interval, in agreement with previous findings (Benet et al., 2011).
It is evident that all of the methods explored in this study significantly improve prediction accuracy compared with using the WSM without any scalars. The differentiation between the three methods is harder to gauge due to the relatively low number of highly bound compounds for which human intravenous clearance data are available. Although the incorporation of scalars dependent on the magnitude of plasma protein binding appears promising, the available dataset is not sufficient to draw definitive conclusions about the superiority of one method over the others.
To highlight one outlier, mifepristone (fu,p = 0.0032) was predicted within 3-fold of CLobs when no scalars were applied (AFE: ∼2.3); when methods 1–3 were applied, mifepristone CL was either substantially overpredicted (method 1, using the α scalar, AFE: ∼7.4; method 2 AFE: ∼18.2) or underpredicted (method 3 AFE: ∼0.06). Mifepristone is metabolized by CYP3A and has been shown to be a mechanism-based inhibitor of CYP3A4 but not CYP3A5 (Khan et al., 2002). In addition, its PK in humans is dependent on binding to α1-acid glycoprotein (AAG), which limits its tissue availability (Heikinheimo et al., 2003). This observation suggests that additional attention may be required when interpreting in vitro/in vivo data for potent inhibitors of CYP3A and that caution is needed when applying a β scalar to compounds for which albumin is not the only major binding protein in plasma.
Another highly bound compound, edaravone (fu,p = 0.001), was poorly predicted in all cases (no scalar AFE: ∼0.004; method 1 AFE: ∼0.016; method 2 AFE: ∼0.097; method 3 AFE: ∼12.6). Despite our best efforts to include only compounds that can be adequately modeled with in vitro hepatocyte data, this compound is known to be primarily eliminated via phase II metabolic pathways, including multiple renal/hepatic UDP-glucuronosyltransferases (UGTs) and, to a lesser extent, cytosolic sulfotransferases (SULTs) (FDA label: https://www.accessdata.fda.gov/drugsatfda_docs/label/2017/209176lbl.pdf). These complexities may explain the disconnects observed for edaravone as well as for other compounds in the dataset, and they can be difficult to anticipate during early phases of drug discovery.
In this work, all drugs were treated in a similar fashion independent of physicochemical properties. This is advantageous in early drug discovery, where several hundred compounds may be screened per therapeutic target with only limited data available. An apparent drawback of applying this approach retrospectively is that the specific phenomena responsible for prediction inaccuracies cannot be identified, making it nonmechanistic in nature. In cases where these scalars are applied, one could imagine that the α scalar on CLint,u corrects for loss of hepatocyte activity over time, especially for low extraction drugs. In addition, because many highly bound drugs are anionic in nature and are OATP substrates, the β scalar is amenable to accounting for otherwise unidentified phenomena (potentially including albumin-facilitated uptake). Although the training set of compounds used in this study (GNE training, N = 28) had much lower accuracy in terms of CL predictions than the GNE validation and GNE + AZ datasets, the utility of these scaling approaches for improving CL predictions is evident from the relative magnitude of improvement obtained for each individual dataset with each method. The application of the scalars identified in this study to a dataset of 348 compounds in preclinical species (mouse, rat, dog, and cynomolgus monkey) (data not shown) appears to support the findings observed in the human dataset: AAFE decreased from 4.3 (no scalars) to 2.4 (method 1), 2.1 (method 2), and 2.0 (method 3). Analogous to the human dataset, the gap in accuracy between method 1 and the other two methods corresponds to a marked increase in average error observed when predicting the 29 highly bound compounds available in this set: 4.6 (method 1), 2.1 (method 2), and 2.8 (method 3).
Because the scalars used in the preclinical species are derived from human data, these preliminary results suggest that they address phenomena that are species independent in nature, a conclusion also supported by previous findings using rat and human hepatocytes (Francis et al., 2021). Although more work will be necessary to mechanistically understand why these scalars appear to improve accuracy in CL predictions across different drug classes, these observations have diagnostic value and may assist future investigations.
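The accuracy metrics quoted throughout this discussion can be illustrated with a short Python sketch. It assumes the commonly used definitions AFE = 10^(mean log10(CLpred/CLobs)) and AAFE = 10^(mean |log10(CLpred/CLobs)|), which are not restated in this section; the function name `fold_error_metrics` and the input values are hypothetical and do not come from the study datasets.

```python
import math


def fold_error_metrics(predicted, observed):
    """Return (AFE, AAFE, fraction of predictions within 2-fold).

    Assumes paired, strictly positive clearance values in the same units.
    """
    if len(predicted) != len(observed) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    log_ratios = [math.log10(p / o) for p, o in zip(predicted, observed)]
    n = len(log_ratios)
    # AFE keeps the sign of each log ratio, so it reflects bias
    # (values below 1 indicate underprediction on average).
    afe = 10 ** (sum(log_ratios) / n)
    # AAFE uses absolute log ratios, so over- and underpredictions
    # cannot cancel each other out.
    aafe = 10 ** (sum(abs(x) for x in log_ratios) / n)
    within_2_fold = sum(1 for x in log_ratios if abs(x) < math.log10(2)) / n
    return afe, aafe, within_2_fold


# Hypothetical predicted and observed clearances (mL/min/kg).
cl_pred = [1.2, 0.4, 9.5, 3.0]
cl_obs = [2.0, 2.5, 8.0, 3.3]
afe, aafe, within = fold_error_metrics(cl_pred, cl_obs)
print(f"AFE = {afe:.2f}, AAFE = {aafe:.2f}, within 2-fold = {within:.0%}")
```

For a single compound the AFE reduces to CLpred/CLobs, which is how the per-compound values quoted above for mifepristone and edaravone can be read under these assumed definitions.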
Acknowledgments
The authors would like to thank Simon Wong for reviewing and providing feedback on the manuscript.
Authorship Contributions
Participated in research design: Jones, Chang, Broccatelli.
Conducted experiments: Leung, Brown, Liu.
Contributed new reagents or analytic tools: Jones, Chang, Broccatelli.
Performed data analysis: Jones, Leung, Chang, Brown, Liu, Yan, Kenny, Broccatelli.
Wrote or contributed to the writing of the manuscript: Jones, Leung, Chang, Yan, Kenny, Broccatelli.
Footnotes
- Received November 16, 2021.
- Accepted May 12, 2022.
All of the authors were employees at Genentech at the time that they contributed to the manuscript. This work received no external funding.
The authors declare that they have no conflicts of interest with the contents of this article.
This article has supplemental material available at dmd.aspetjournals.org.
Abbreviations
- AAFE: absolute average fold error
- AFE: average fold error
- AZ: AstraZeneca
- BPP: blood to plasma partitioning
- CL: clearance
- CLhep: hepatic plasma clearance
- CLint: intrinsic clearance
- CLobs: observed human plasma clearance
- CLpred: predicted human plasma clearance
- CYP450: cytochrome P450
- FDA: US Food and Drug Administration
- fu,inc: fraction unbound in incubation
- fu,mic: fraction unbound in microsomes
- fu,p: fraction unbound in plasma
- fu,p,adjusted: adjusted fraction unbound in plasma
- GNE: Genentech
- HPLC: high-performance liquid chromatography
- IS: internal standard
- IVIVE: in vitro-in vivo extrapolation
- LC-MS/MS: liquid chromatography–tandem mass spectrometry
- LM: liver microsome
- logD,pH=7.4: lipophilicity
- PK: pharmacokinetics
- R²: coefficient of determination
- RED: rapid equilibrium dialysis
- t1/2: half-life
- WSM: well-stirred model
- Copyright © 2022 by The American Society for Pharmacology and Experimental Therapeutics