Abstract
We report a trend analysis of human intravenous pharmacokinetic data on a data set of 1352 drugs. The aim in building this data set and its detailed analysis was to provide, as in the previous case published in 2008, an extended, robust, and accurate resource that could be applied by drug metabolism, clinical pharmacology, and medicinal chemistry scientists to a variety of scaling approaches. All in vivo data were obtained or derived from original references, either through the literature or regulatory agency reports, exclusively from studies utilizing intravenous administration. Plasma protein binding data were collected from other available sources to supplement these pharmacokinetic data. These parameters were analyzed concurrently with a range of physicochemical properties, and resultant trends and patterns within the data are presented. In addition, the date of first disclosure of each molecule was reported and the potential “temporal” impact on data trends was analyzed. The findings reported here are consistent with earlier described trends between pharmacokinetic behavior and physicochemical properties. Furthermore, the availability of a large data set of pharmacokinetic data in humans will be important to further pursue analyses of physicochemical properties, trends, and modeling efforts and should propel our deeper understanding (especially in terms of clearance) of the absorption, distribution, metabolism, and excretion behavior of drug compounds.
Introduction
The last 20 years or so have seen the flourishing of approaches for predicting the human pharmacokinetics of new compounds, and very recent work, in the form of perspective articles on in silico absorption, distribution, metabolism, and excretion (Lombardo et al., 2017) and several modeling endeavors (Berellini et al., 2012; Gombar and Hall, 2013; Lombardo and Jing, 2016), highlights that observation. Many reports, of which we cite a few examples, have focused on scaling techniques (i.e., approaches and techniques that use animal pharmacokinetic data) (Caldwell et al., 2004; Ward and Smith, 2004a,b; Jolivette and Ward, 2005; Evans et al., 2006; Mahmood et al., 2006; Martinez et al., 2006; Tang and Mayersohn, 2006; Fagerholm, 2007; McGinnity et al., 2007; Lombardo et al., 2013a,b) as well as in vitro data (Obach et al., 1997; Lombardo et al., 2002, 2004; Nestorov et al., 2002; Riley et al., 2005; Grime and Riley, 2006). The growing availability and acceptance of computational chemistry methodologies has generated many examples of prediction of human pharmacokinetics and/or general absorption, distribution, metabolism, excretion, and toxicity properties (Veber et al., 2002; Cruciani et al., 2005; Ghafourian et al., 2006; Gleeson et al., 2006; Lombardo et al., 2006, 2017; Gleeson, 2007; Gunturi and Narayanan, 2007; Norinder and Bergström, 2007; Berellini et al., 2009, 2012; Lombardo and Jing, 2016). An essential pillar of these methods, beyond a careful choice of the nature and type of descriptors (e.g., two- or three-dimensional, fragmental or continuous) and the theory behind statistical approaches, is the availability of curated data and relatively large databases that have been carefully assembled. The difficulties in building a human pharmacokinetics database are manifold.
At the outset there is the difficulty of finding a vast array of intravenous data within any single company, and each publicly available study (each yielding a data point) is diverse and separate in time as well as in the experimental approach(es) taken by different investigators. The reader will easily recognize, among the variables, the number and types of study subjects (e.g., healthy vs. diseased, sex, age, etc.), routes of administration (all intravenous, in this case) and doses, sample collection times, methods of sample analysis, and types of pharmacokinetic parameters reported. This database spans many decades, and the improvements in technology and sensitivity of analytical methods, as well as our understanding of pharmacokinetics, clinical trials design, and other trends in related sciences, have all influenced and shaped the data available. We discuss the data assembly process and the pharmacokinetic parameters we have evaluated in their collection, in detail, in the Materials and Methods.
We have defined above the availability of quality data as a “pillar” of modeling work. A key aspect in the development of models for the prediction of human clearance (CL), volume of distribution (VD), and absolute oral bioavailability is that data used are obtained from studies based on intravenously administered doses. We also note here that definitions vary and the methods of calculation adopted need to be consistent. We offer, as an example, the “variable” nature of VD, which may be taken as steady-state VD (indicated as Vss or VDss), central VD (generally indicated as Vc or VDc), or terminal phase VD (generally indicated as VDβ or VDz, with or without the letter D). This is another aspect we have concentrated our attention toward, as in the previous report, and we discuss it in the Materials and Methods. As an example, the extensive set of human pharmacokinetic data reported in Appendix II of The Pharmacological Basis of Therapeutics (Goodman and Gilman, 2017) is frequently cited as a source of data for model construction. This data set, highly curated by many successive groups of very experienced scientists in the field, was not intended for the development of structure-pharmacokinetic relationships. It aimed, instead, at offering a guide for dosing regimens through understanding of the pharmacokinetics basis to health care professionals and medical students. As such, and in many cases, the data have been reported using oral administration with volume parameters that include terminal phase data and not only VDss. The performance of models may likely be confounded if this set is used.
The objectives of this study were 3-fold. First, we sought to greatly expand on the already existing large publicly available database we reported earlier (Obach et al., 2008) of carefully scrutinized human pharmacokinetic parameters, which would be easy to consult and reference and could be used by scientists at early stages of drug research for the construction of predictive pharmacokinetics models. Second, we aimed to gain some insight into the relationships between chemical properties, as derived from structural attributes (i.e., computed descriptors), and human pharmacokinetic parameters. Third, we aimed to try to infer, with the addition of “time data” represented by the year of first disclosure, whether “temporal trends” and/or changes could be discerned across the many decades of drug and pharmacokinetic research. This is similar to the development of several notable guidelines, such as the “rule of 5” (Lipinski, 1997) or the rotatable bond limits described by Veber et al. (2002) stemming from similar observations, although the aim of both reports was to derive guidelines toward improved oral absorption.
In pursuit of the first objective, we have carefully examined a vast set of scientific literature (up to very recent publications) for the human pharmacokinetic parameters clearance in plasma (CLp), half-life (t1/2), VDss, and mean residence time (MRT) measured after intravenous administration. By obtaining human intravenous pharmacokinetic data for additional compounds, we were able to approximately double the size of the original data set. Plasma protein binding data are included for the majority of these compounds, albeit not for all of them, despite very extensive data searches through multiple sources. This database, now composed of 1352 compounds, is provided as a data table and as an Excel spreadsheet (with the Chemical Abstracts Service number, simplified molecular-input line-entry system descriptor, original reference, and all of the computed parameters used in this work) as Supplemental Material accessible to all readers. There are no biologics, large proteins, or monoclonal antibodies in the present or previous data set, but several compounds, represented largely by therapeutic peptides, have a molecular weight (MW) in excess of 600 Da and up to 7150 Da (mipomersen). The database should offer a solid starting point to scientists aiming to develop correlative analysis and/or computational models toward the prediction of human pharmacokinetics of new compounds. Furthermore, using this database, we examined the relationships between human intravenous pharmacokinetic parameters [VDss,u, VDss, CLp,u, CLp, MRT, fu, and t1/2] and various computed fundamental physicochemical parameters [logD, charge, polar surface area (PSA), etc.] and we offer initial and general considerations on their span and median values. Here, fu denotes the fraction unbound in plasma and the subscript “u” has the usual meaning of unbound.
Materials and Methods
As readers will recognize, there are several ways to report data, which often are without specification (as mentioned above for VD) of the modality of calculation. This leads to the use of various units and symbols, whether a compartmental analysis is adopted versus a noncompartmental approach. Therefore, author reports of the description and details concerning methods can vary considerably. The examination of each individual report and the extraction of fundamental pharmacokinetic parameters required careful scrutiny. The result of these efforts, however, has been the extension of the original data set to more than a doubled number of compounds from our previous publication (Obach et al., 2008). This work was conducted using the same criteria adopted for our previous work, which may be consulted for a detailed account of calculation methods and equations used (Obach et al., 2008). In addition, we searched SciFinder for the date of the first published report for each molecule. The dates entered may or may not reflect the actual discovery (e.g., because disclosure in a patent may have followed a few years after synthesis or isolation) nor might they always be expected to represent the introduction into therapy, which may have followed at a significantly later point. However, using a general scheme of “binning” by decade, we sought potentially useful and informative temporal trends. As in the previous case, we did not include any data from oral, intramuscular, or any other dosing route wherein the total dose may not have entered the systemic circulation, since the calculation of pharmacokinetic parameters CL and VD requires the dose available to the central (plasma) compartment. Intravenous data were either from rapid bolus injection or infusions. In addition, we tried to carefully identify compounds, or dose ranges, where nonlinearity was reported, and we used the lowest dose or lowest range where linearity was observed and good analytical data were reported. 
We did not check on nor did we consider the relatively rare event of reversible metabolism (e.g., N-oxide formation and reduction) having a significant impact on the overall values for any of the compounds, and we used the data as the average values reported by the authors.
Careful evaluation of reported VDss calculations (this being a volume term that is more generally related to an overall distributional behavior) was performed along the same lines described in the equations detailed in the previous work, and we refer the reader to that work (Obach et al., 2008). As an additional source, prescribing information and/or biopharmaceutics reviews available online through the U.S. Food and Drug Administration website (in which the agency reviewed and approved the data listed) were sometimes the only source of VDss data. As in the previous work, a 70 kg average weight was assumed if not reported or a midpoint of the range given was taken. Similarly, when data were reported using body surface area, we used in all cases a 70 kg value against a body surface area of 1.73 m2. We also extensively searched for protein binding values in human plasma (or serum). The notes and comments in the full set of data (available in the Supplemental Material) offer more specific indication on particular compounds where, for example, digitization of concentration versus time plots was performed to calculate otherwise unavailable pharmacokinetic parameters.
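The normalization conventions described above can be sketched in a few lines. This is only an illustration under the stated reference values (70 kg and 1.73 m2); the function names are hypothetical and are not taken from the original work or its Supplemental Material.

```python
# Reference values stated in the text: 70 kg body weight against 1.73 m^2
# body surface area. Function names below are illustrative assumptions.
REF_WEIGHT_KG = 70.0
REF_BSA_M2 = 1.73


def total_to_per_kg(total_value: float, weight_kg: float = REF_WEIGHT_KG) -> float:
    """Normalize a whole-body value (e.g., VDss in liters) to a per-kg value."""
    return total_value / weight_kg


def per_m2_to_per_kg(value_per_m2: float) -> float:
    """Convert a value reported per m^2 of body surface area to a per-kg value."""
    return value_per_m2 * REF_BSA_M2 / REF_WEIGHT_KG


def l_per_h_to_ml_per_min(cl_l_per_h: float) -> float:
    """Convert clearance from l/h to ml/min."""
    return cl_l_per_h * 1000.0 / 60.0


# Example: a clearance of 42 l/h in a subject assumed to weigh 70 kg.
cl_ml_min_kg = total_to_per_kg(l_per_h_to_ml_per_min(42.0))
```

When a study reports a weight range rather than a single value, the midpoint of the range would be passed as `weight_kg`, in line with the convention described above.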
Some of the physicochemical parameters calculated for all compounds and described in the Results were calculated via MoKa software (version 2.6.6; Molecular Discovery, Hertfordshire, UK). These parameters include logP and logD7.4 as well as pKa values to determine the ionization state, the latter assigned on the basis of the most abundant species at pH 7.4. In a few cases (13 of 1352 compounds), the physicochemical descriptors we sought could not be calculated because the molecules were relatively large (e.g., mipomersen, MW 7150 Da) or contained metal ions (e.g., Gd or Pt). These compounds are included in the data table of pharmacokinetic parameters and general characteristics but are omitted from descriptor statistics considerations. The number of rotatable bonds (NRB), the number of hydrogen bond donor (HBD) and hydrogen bond acceptor (HBA) atoms, and N and O atom PSA (in Å2) were calculated via Vortex software (version 2017.08.69034.51-s, Herts, UK).
Results
Characteristics of the Pharmacokinetic Values
As described in the Materials and Methods, we undertook extensive mining of the scientific literature. Re-analysis of concentration versus time data was performed in some cases, allowing the collection of human intravenous pharmacokinetic parameters for a total of 1352 compounds. In some cases, we could not find a complete set of pharmacokinetic parameters but the compound was included, as there was either clearance or VDss and we deemed the information useful. In addition, we were able to find plasma protein binding data for many of the compounds (920 of 1352). These parameters are reported in Supplemental Table 1 and in a spreadsheet containing all of the values along with literature references, comments, and notes (both files are included in the Supplemental Material). The data span considerable ranges (Fig. 1). Secretin (a peptide) had the lowest VDss value (0.03 l/kg), whereas hydroxychloroquine had the highest value (700 l/kg). As previously observed, the majority of values (1171 of 1315 data points; 89%) fell between 0.1 and 10 l/kg, as shown in Table 1. The mean and median values across all data were 3.8 and 0.9 l/kg, essentially identical to the values observed in the previous work (4.2 and 0.96 l/kg, respectively). Forty-three percent of the compounds (561 of 1315) had VDss values at or below 0.7 l/kg (Table 1), generally considered to be the value for total body water, and the percentage was essentially identical to the value reported for the set of 670 compounds (41%). Finally, 8% of compounds (108 of 1315) had VDss values of 10 l/kg or greater, an indication of extensive partitioning into tissues, a percentage identical to the previously reported value. Thus, the general statistics and characteristics of the VDss values are the same as in our previous report from 10 years ago (Obach et al., 2008).
Distribution of human pharmacokinetic values for the 1352 compounds included in this analysis. (A) VDss. (B) Clearance. (C) t1/2. (D) fu.
Characteristics of the human intravenous pharmacokinetic parameters and plasma protein binding for 1352 compounds in the data set
Plasma clearance values (Fig. 1B) ranged from 0.004 ml/min per kilogram for 7-hydroxystaurosporine to 1070 ml/min per kilogram for artesunate (the compounds with the lowest and highest clearance in this set, respectively); these limits are held by the same two compounds as in the previous work, with mean and median values of 12.2 and 4.5 ml/min per kilogram, respectively. Both values did increase slightly from values of 10 and 4 ml/min per kilogram, respectively, in the previous work. This study had a very similar proportion to what was previously observed, in which 68% of the compounds (919 of 1350) resided in a range between 1 and 15 ml/min per kilogram as shown in Table 1. In addition, 16% (215 of 1350) possessed clearance values below 1 ml/min per kilogram (very low clearance), which is identical to the value of 16% reported for the previous set. In this set, 135 compounds had CL values greater than liver blood flow (Table 1) taken as 21 ml/min per kilogram, as opposed to 56 compounds in the previous set, which is slightly more than double the number of the compounds. However, considering the doubling of the data set, this cannot be construed as an indication of a trend toward a higher “acceptance” and progression to clinical studies of high-clearance compounds. Regardless of the statistical implications, these values are suggestive of the possibility of blood-to-plasma ratios greater than unity or of extrahepatic clearance mechanisms. Several compounds possessing high CL values can be classified among systemic and short-acting anesthetics (e.g., propofol), local anesthetics (e.g., articaine or prilocaine), pain medications (e.g., remifentanil, butorphanol, dezocine, or pentazocine), or cytotoxic cancer chemotherapeutics (e.g., amifostine, laromustine, or carmustine), which are drug classes that are frequently administered via the intravenous route to optimize therapy or use. 
The latter few compounds exert their action by acute cytotoxicity, which would be dangerous if prolonged; thus, the short half-life is beneficial toward controlling side effects. Prodrugs, on the other hand, are by definition compounds in which a pharmacologically active metabolite is formed from the parent and they thus are expected, and desired, to have high CL (e.g., dolasetron, esmolol, etc.) to yield the active moiety as rapidly and completely as possible.
Terminal phase half-life values ranged from 1.2 minutes (perflutren) to 56 days (almitrin), as shown in Fig. 1C, with 63% (843 of 1335) residing between 1 and 12 hours (Table 1). The bisphosphonates may be prone to underestimated half-lives, since they sequester into bone and, by doing so, would not be detectable in blood or plasma. The average t1/2 was 17.1 hours and the median value was 4.5 hours. Half-life values, being a derived parameter, do not lend themselves to a direct correlation with physiologic properties represented by volume of total body water or hepatic or renal blood flow. However, they can, if there is no considerable difference between pharmacodynamics and pharmacokinetics, be classified into ranges largely based on dosing frequency values. In pharmaceutical research, scientists very often seek therapeutics that are amenable to once-a-day dosing, which is considered convenient from a patient compliance standpoint. When we look at t1/2 (or MRT) and use the above requirement, approximately three-fourths of the compounds in the data set (1015 of 1335) have t1/2 values below 12 hours (Table 1). Therefore, they would likely require a more frequent dosing regimen. Plotting MRT versus clearance (plot not shown), we estimated that a value of <1 ml/min per kilogram for clearance would be needed to achieve at least an MRT value of 4 hours for acidic compounds, because acidic compounds (with significant exceptions) yield a generally lower VDss.
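The MRT estimate for acidic compounds can be illustrated with the standard relationship MRT = VDss/CL for intravenous dosing. The sketch below is not the authors' calculation: it assumes a VDss of 0.2 l/kg (the approximate median for acids given in the Results) and uses hypothetical function names.

```python
def mrt_hours(vdss_l_per_kg: float, cl_ml_per_min_per_kg: float) -> float:
    """Mean residence time (h) from MRT = VDss / CL after intravenous dosing."""
    vdss_ml_per_kg = vdss_l_per_kg * 1000.0
    return vdss_ml_per_kg / cl_ml_per_min_per_kg / 60.0


def max_cl_for_target_mrt(vdss_l_per_kg: float, target_mrt_h: float) -> float:
    """Largest clearance (ml/min per kg) that still gives the target MRT."""
    return vdss_l_per_kg * 1000.0 / (target_mrt_h * 60.0)


# Assumed VDss of 0.2 l/kg for a typical acidic compound.
mrt_at_unit_cl = mrt_hours(0.2, 1.0)          # about 3.3 h at 1 ml/min per kg
cl_limit = max_cl_for_target_mrt(0.2, 4.0)    # about 0.83 ml/min per kg
```

Under this assumption, a clearance of 1 ml/min per kilogram gives an MRT slightly short of 4 hours, which is consistent with the <1 ml/min per kilogram estimate quoted above.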
The range and distribution of plasma protein binding values are provided in Fig. 1D and Table 1. The values ranged between no binding (several compounds) and 0.0002 fraction unbound (amiodarone), with mean and median free fractions of 0.35 and 0.2, respectively. Only about 60% of the compounds (560 of 920) in the set had fu values greater than 0.1, and about 12%, or one in eight compounds (111 of 920), could be considered highly bound (fu < 0.01; Table 1).
Characteristics of the Computed Physicochemical Values
The 1352 compounds in this data set span a wide range of fundamental computed physicochemical characteristics (Fig. 2). The typical drug-like space for MW (200–600 Da) is illustrated in Fig. 2A and is covered by 78% of the compounds (1060 of 1352), with a median value of 371 Da and a range from 42 (cyanamide) to 7150 Da (mipomersen). Furthermore, the median and mean NRBs were calculated as 5 and 8, respectively, and the median and mean values of calculated PSA were 89 and 130 Å2 (PSA counting nitrogen and oxygen atoms only), which are below the upper limit reported (10 for NRB and 140 Å2 for PSA) by Veber et al. (2002) for compounds with a good probability of being orally bioavailable in rats. The binning is illustrated in Fig. 2, B and C, for NRB and PSA, respectively.
Distribution of computed physicochemical properties for the 1352 compounds included in this analysis. (A) MW. (B) NRB. (C) PSA (N and O atoms only). (D) Ionization state. (E) logP. (F) logD7.4. (G) Number of HBAs. (H) Number of HBDs.
Among the compounds for which the pKa could be calculated (some large peptides and ion-containing compounds were not amenable to computation), there were 313 anionic, 472 cationic, 457 neutral, and 97 zwitterionic compounds (Fig. 2D). These were categorized by calculating the most abundant species (anionic for acids, cationic for bases, both for zwitterions) at pH 7.4, using the calculated pKa values from MoKa.
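A much simplified version of this assignment can be written with the Henderson-Hasselbalch relationship. The sketch below is an assumption-laden illustration, not the procedure implemented in MoKa: it treats each ionizable group independently, takes the pKa values as given inputs, and uses hypothetical function names.

```python
PH = 7.4  # pH at which the most abundant species is assessed


def fraction_ionized_acid(pka: float, ph: float = PH) -> float:
    """Fraction of an acidic group present as the anion at the given pH."""
    return 1.0 / (1.0 + 10.0 ** (pka - ph))


def fraction_ionized_base(pka: float, ph: float = PH) -> float:
    """Fraction of a basic group (pKa of the conjugate acid) present as the cation."""
    return 1.0 / (1.0 + 10.0 ** (ph - pka))


def charge_class(acid_pkas, base_pkas, ph: float = PH) -> str:
    """Assign a charge class from the predominantly ionized groups.

    Simplification: groups are treated independently, so microspecies
    equilibria of multiprotic compounds are ignored.
    """
    has_anion = any(fraction_ionized_acid(p, ph) > 0.5 for p in acid_pkas)
    has_cation = any(fraction_ionized_base(p, ph) > 0.5 for p in base_pkas)
    if has_anion and has_cation:
        return "zwitterionic"
    if has_anion:
        return "anionic"
    if has_cation:
        return "cationic"
    return "neutral"
```

For instance, a carboxylic acid with a pKa near 4 would be classed as anionic and an aliphatic amine with a pKa near 9.5 as cationic, whereas a weak acid with a pKa of 10 would remain neutral at pH 7.4.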
Thus, zwitterionic compounds represent well below 10% of the entire data set. The same was true for our previous data set (Obach et al., 2008); however, although the basic and acidic compounds essentially doubled (from 267 and 159, respectively) and neutral compounds almost tripled (from 173), zwitterions increased only by 50%.
Lipophilicity, expressed as clogP and clogD7.4, yielded 2 and 0.7 as median values, whereas the corresponding mean values were 1.6 for clogP and 0.1 for clogD7.4. Figure 2, E and F, shows the bins and number of compounds for clogP and clogD7.4. The median and mean values for numbers of HBAs were 6 and 9, whereas the corresponding values for HBDs were 2 and 3 to 4; the binning is shown in Fig. 2, G and H. The median and mean clogP values, and similarly the number of HBAs and HBDs, although coming from compounds dosed intravenously in all cases, were below the limits set (5, 10, and 5, respectively) by the well-known Lipinski rule of 5 (Lipinski et al., 2001). It could be argued that the rule of 5 sets a threshold for potential issues with oral absorption, whereas these compounds were all dosed intravenously. However, it is true that most of them were chosen for development as oral drugs, for which intravenous data were also generated.
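The comparison with these guidelines can be expressed as a simple check. The sketch below uses the limits quoted in the text (clogP 5, HBA 10, HBD 5; NRB 10 and PSA 140 Å2) together with the MW limit of 500 Da from the rule of 5, which is not quoted above and is added here as an assumption; the function names are hypothetical.

```python
def rule_of_5_violations(mw: float, clogp: float, hbd: int, hba: int) -> int:
    """Count rule-of-5 violations (MW > 500, clogP > 5, HBD > 5, HBA > 10)."""
    checks = (mw > 500.0, clogp > 5.0, hbd > 5, hba > 10)
    return sum(checks)


def within_veber_limits(nrb: int, psa: float) -> bool:
    """True if NRB <= 10 and PSA <= 140 (limits quoted from Veber et al., 2002)."""
    return nrb <= 10 and psa <= 140.0


# Median values reported for the data set: MW 371 Da, clogP 2, HBD 2, HBA 6.
median_violations = rule_of_5_violations(mw=371.0, clogp=2.0, hbd=2, hba=6)
```

A "median compound" of the data set therefore shows no rule-of-5 violations, whereas a large peptide with many donors and acceptors would accumulate several.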
Trends in the Data Set
VDss versus Physicochemical Properties.
The data set was mined for discernible trends between the computed physicochemical properties and VDss values in humans. No single physicochemical descriptor (property) yielded a relationship that could be, on its own, predictive of VDss. However, as observed in the past, “composite” contributions to VDss by various physicochemical properties could be observed through trends in the data. Overlap and scatter in the data, however, made observation of trends difficult when values were partitioned by means (as exemplified in Fig. 3), but they could be observed when using median values. VDss values yielded observable trends with clogD7.4, PSA, and number of HBAs and HBDs (Fig. 3, B, C, E, and F), as well as charge type (Fig. 3A), which have varying degrees of relatedness to each other. High PSA, high numbers of HBAs/HBDs, and low lipophilicity offer recognizable trends: median VDss values trend higher for low PSA, higher for low numbers of HBAs/HBDs, and higher with increasing lipophilicity (respective panels in Fig. 3). Acidic compounds (anions) yield generally lower median VDss values than zwitterionic and neutral compounds, which are, in turn, lower than basic compounds (Fig. 3A). However, there is a very large overlap and the trend can only be taken as fairly broad, as it is possible to encounter many acidic compounds that show a fairly sizable VDss value, often well above total body water, taken as reference at a value of approximately 0.7 l/kg. A median value of around 0.2 l/kg in VDss for the acids was observed, but an upward trend was discernible for free VDss with increasing logD7.4 for bases, neutrals, and zwitterions (see Fig. 6B).
Relationship between median VDss values and computed physicochemical parameters. The median value is indicated by the horizontal line in the gray box, and the lower and upper limits of the box represent the first and third quartiles, respectively. The black points represent the compounds. (A) VDss vs. ionization state. (B) VDss vs. logD7.4. (C) VDss vs. PSA. (D) VDss vs. NRB. (E) VDss vs. number of HBAs. (F) VDss vs. number of HBDs.
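The summary statistics shown by each box (first quartile, median, and third quartile per category) can be reproduced along the following lines. This is a minimal sketch using the Python standard library; the quartile interpolation method is an assumption, since the software used for the figures may compute quartiles differently, and the function name is hypothetical.

```python
from collections import defaultdict
from statistics import quantiles


def box_stats_by_group(records):
    """Return {group: (q1, median, q3)} from an iterable of (group, value) pairs.

    Each group should contain at least two values, since quartiles of a
    single data point are not meaningful.
    """
    grouped = defaultdict(list)
    for group, value in records:
        grouped[group].append(value)
    stats = {}
    for group, values in grouped.items():
        q1, median, q3 = quantiles(values, n=4, method="inclusive")
        stats[group] = (q1, median, q3)
    return stats
```

Applied to (charge class, VDss) pairs from the supplemental spreadsheet, this would yield the three values drawn for each box in panel A; medians rather than means are used because of the wide, skewed spread of the data.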
CL versus Physicochemical Properties.
Discernible trends, as in the case of VDss, could be readily observed between clearance and some of the physicochemical properties. However, the quantitative prediction of clearance on the basis of any single property was not supported by our observations, since we could not find a strong enough relationship allowing such prediction. Decreases in median CL were observed with increases in PSA or with an increase in HBAs and HBDs (Fig. 4, C, E, and F). Only a weak trend could be observed between median CL and lipophilicity. Free clearance showed generally a more discernible trend toward higher values for basic and neutral compounds than for acids or zwitterions (Fig. 6D, lower right) although data overlap to a great extent.
Relationship between median clearance values and computed physicochemical parameters. The median value is indicated by the horizontal line in the gray box, and the lower and upper limits of the box represent the first and third quartiles, respectively. The black points represent the compounds. (A) Clearance vs. ionization state. (B) Clearance vs. logD7.4. (C) Clearance vs. PSA. (D) Clearance vs. NRB. (E) Clearance vs. number of HBAs. (F) Clearance vs. number of HBDs.
Protein Binding versus Physicochemical Properties.
Two relationships, one between protein binding and lipophilicity and one between binding and charge class, could be discerned (Fig. 5, A and B). Lipophilicity showed an increasing (direct) trend across all charge types. Another observation, as may be expected, was a discernible lower median fu value for anionic (acidic) compounds versus basic and neutral compounds and the much higher median fu value for zwitterionic compounds. This may be due to their amphiprotic nature, resulting in a possible lower affinity for albumin or α1-acid glycoprotein as well as other plasma proteins. However, as shown in Fig. 5A, there is a great deal of overlap among all classes, due to a wide distribution and, contrary to the accepted perception of a significantly lower fu for acidic compounds, this cannot be generalized.
Relationship between median fraction unbound in plasma (fu) values and computed physicochemical parameters. The median value is indicated by the horizontal line in the gray box, and the lower and upper limits of the box represent the first and third quartiles, respectively. The black points represent the compounds. (A) fu vs. ionization state. (B) fu vs. logD7.4. (C) fu vs. PSA. (D) fu vs. NRB. (E) fu vs. number of HBAs. (F) fu vs. number of HBDs.
Free VDss and CL versus logD7.4.
We examined VDss data after correcting for free fractions for all compounds with available fu (i.e., free VDss = VDss/fu), and this transformation greatly expands the range of values, together with offering a perhaps better indication of the extent of tissue binding (Fig. 6, A and B).
Relationship between total and free VDss or total and free clearance values and computed logD7.4. Acidic compounds are represented by red dots, basic compounds by blue dots, neutral compounds by green dots, and zwitterionic compounds by yellow dots. (A) Total VDss. (B) Free VDss. (C) Total clearance. (D) Free clearance.
Total and free VDss were plotted versus logD7.4 (logD7.4 range, −6 to 6). There was no apparent trend using total values (Fig. 6A, red dots) for acidic compounds, whereas an upward trend was recognizable for neutral and basic compounds. Once the values were corrected via fu, a clearer trend was apparent (Fig. 6B), which reveals a shift toward higher values of free VDss with logD7.4 and more apparent for acidic compounds. The data are therefore suggestive of plasma protein binding having the tendency to dominate the distribution behavior of negatively charged compounds.
CL values were similarly examined before and after correction for fu (i.e., free CL = CL/fu), as shown in Fig. 6, C and D. As in the case of VDss correction, the data sets are smaller, reflecting the lower number of compounds for which we found fu values. At first glance, the effect seems to parallel the trend observed for VDss but the profile seems flatter for clearance using total values than in the case of total VDss. However, the potential involvement of uptake and efflux transporters notwithstanding (Waters and Lombardo, 2010), it should be borne in mind that VDss is generally dominated by physicochemical properties, largely fraction ionized and lipophilicity, even after removal of the protein binding aspect (Lombardo et al., 2002, 2004). As opposed to VD, CL is largely dependent on affinities and intrinsic activities for specific enzymes and transporters. Affinities and intrinsic activities will be somewhat dependent on basic physicochemical properties, but also on the interactions of specific substituents and fragments with macromolecules involved in drug metabolism and disposition.
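The two corrections (free VDss = VDss/fu and free CL = CL/fu) amount to the same operation, sketched below. The function name is hypothetical, and returning no value when fu is unavailable is an assumption that mirrors the smaller data sets obtained after correction.

```python
from typing import Optional


def unbound_value(total: float, fu: Optional[float]) -> Optional[float]:
    """Return the free (unbound) parameter total/fu, or None if fu is unusable.

    fu is the fraction unbound in plasma and must lie in (0, 1]; fu = 1
    corresponds to no binding, in which case total and free values coincide.
    """
    if fu is None or not (0.0 < fu <= 1.0):
        return None
    return total / fu


# A compound with VDss = 1 l/kg and fu = 0.1 has a free VDss of 10 l/kg.
free_vdss = unbound_value(1.0, 0.1)
```

Because fu in this data set reaches values as low as 0.0002, the correction can increase a parameter by up to 5000-fold, which explains the much wider range of the free values.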
Time-Dependent Variations.
A plot of year of first appearance, binned according to the year and colored by VDss (Fig. 7, left), CL (Fig. 7, right), fraction unbound in plasma (Fig. 8, left), and clogD7.4 (Fig. 8, right) does not seem to indicate any particular trend for the first two properties (using the same number of compounds with fu data available), whereas some increase toward more lipophilic compounds and lower fu (as they are generally inversely correlated) could be discerned, as shown in Fig. 8. In particular, the bin representing compounds with clogD7.4 between −1 and 0 (Fig. 8, right) was reduced to a very low 4% in the period from 2000 to the present, possibly influenced by the significantly lower number of total compounds we could retrieve from the literature.
Distribution of total VDss and clearance values utilizing value ranges reported in Table 1 against vertical bins based on year of first disclosure. The bins after 1960 span approximately 2 decades, and the colors indicate property range values in ascending order from blue to red or brown (bottom to top).
Trends in the distribution of fu and computed logD7.4 values utilizing the ranges reported in Table 1 for fu and the same ranges as in Fig. 2 for computed logD7.4. The value ranges and frequency of compounds are reported against vertical bins based on year of first disclosure. The bins after 1960 span approximately 2 decades, and the colors indicate property range values in ascending order from blue to red or brown (bottom to top).
A plot of year of first appearance in the literature (Fig. 9) binned as approximately 2 decades per section, and with compounds reported prior to 1960 as one bin, shows a significant trend, with compounds having a MW above the median value of the present set (371 Da) increasing steadily. The rate was 20% above the previous 20 years starting from 1960 to 1980 and reached 80% in the period from 2000 to the present. This finding was also coupled with a significant reduction in compounds reported, which showed a decrease by a 3-fold margin from the 2 decades spanning 1980–2000.
Trend of MW utilizing the median value (371 Da) of the entire data set as threshold vs. vertical bins reflecting years of first disclosure.
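The temporal binning behind this analysis can be sketched as follows. The bin boundaries are an assumption (the text describes bins of approximately 2 decades after 1960 without stating how boundary years are assigned), and the function names are hypothetical; the MW threshold of 371 Da is the median of the data set.

```python
MW_MEDIAN_DA = 371.0  # median MW of the data set, used as threshold


def disclosure_bin(year: int) -> str:
    """Assign a year of first disclosure to a period (boundaries assumed)."""
    if year < 1960:
        return "before 1960"
    if year < 1980:
        return "1960-1979"
    if year < 2000:
        return "1980-1999"
    return "2000-present"


def fraction_above_threshold(compounds, threshold: float = MW_MEDIAN_DA):
    """Return {period: fraction of compounds with MW above the threshold}.

    `compounds` is an iterable of (year_of_first_disclosure, mw) pairs.
    """
    counts = {}
    for year, mw in compounds:
        period = disclosure_bin(year)
        above, total = counts.get(period, (0, 0))
        counts[period] = (above + (1 if mw > threshold else 0), total + 1)
    return {period: above / total for period, (above, total) in counts.items()}
```

Run on the year and MW columns of the supplemental spreadsheet, the returned fractions would correspond to the proportions plotted for each period.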
Discussion
In this work, we sought to expand the human pharmacokinetic data set that we originally described in 2008 (Obach et al., 2008). The number of compounds in the current set is approximately doubled, and thus merited a renewed evaluation of the overall trends and relationships between the pharmacokinetic parameters and fundamental physicochemical properties. As before, the set of 1352 compounds encompasses a wide variety of drugs in a broad range of therapeutic areas and, consequently, a wide variety of structural characteristics, pharmacokinetic values, and physicochemical descriptors. The data were carefully curated (as described in detail in the Materials and Methods) and strictly from intravenous administration. Thus, these data should not only be of use for this analysis, but they are also available (in the Supplemental Material) for others to use to develop other relationships and models. Overall, trends in the pharmacokinetic parameters between the original data set and this doubled data set were the same; ranges, means, and medians were largely unchanged. In this work, we also examined the impact of the year of first disclosure, as reported in SciFinder, although those data do not necessarily reflect the year of discovery (likely earlier) or the introduction of the compound into therapy, which may have happened at a significantly later point. It is possible that, by binning the time ranges (see the Results) by 2 decades (or overall before 1960), we may have attenuated such differences and grouped compounds in reasonably comparable “periods.” We will discuss time-related findings below.
Among the pharmacokinetic parameters collected, VDss is the one that has the most marked relationships with physicochemical properties. VD is largely a function of differential partitioning between plasma and other tissues, which in turn is a function of nonspecific binding to tissue components and plasma proteins, such as albumin. Such nonspecific interactions are largely dependent on the physicochemical characteristics of the drug. Charge state has an influence, with cationic compounds generally showing greater VDss values, but there is considerable overlap, which shows that other factors such as lipophilicity also have an influence (Fig. 3). The relationship to lipophilicity becomes stronger when VDss is corrected for plasma free fraction (Fig. 6). VDss shows the same proportion of compounds (43%) with values < 0.7 l/kg, taken as total body water, whereas only a small proportion is confined to blood volume (taken as 0.1 l/kg). Therefore, a large proportion does exceed the total body water value, but this is only an indication of the resulting “distributional average” of the compound in the body, and it does not inform about the presence of the compound in a particular organ or at an intended target.
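As a minimal sketch of the two operations mentioned above, the following assumes that correcting for plasma free fraction means dividing the total parameter by fu (the conventional definition of an unbound parameter; the exact procedure used for Fig. 6 is not restated here). The function names, the example values, and the handling of values that fall exactly on a reference volume are hypothetical.

```python
# Reference volumes quoted in the text, in l/kg.
BLOOD_VOLUME = 0.1
TOTAL_BODY_WATER = 0.7


def unbound(value, fu):
    """Convert a total parameter (VDss or CL) to its free counterpart.

    Assumes the conventional correction: total value divided by the
    fraction unbound in plasma (fu).
    """
    if not 0 < fu <= 1:
        raise ValueError("fu must be in the interval (0, 1]")
    return value / fu


def distribution_class(vdss):
    """Place a VDss value (l/kg) relative to the two reference volumes.

    Boundary handling (<= for blood volume, < for body water) is an assumption.
    """
    if vdss <= BLOOD_VOLUME:
        return "confined to blood volume"
    if vdss < TOTAL_BODY_WATER:
        return "below total body water"
    return "at or above total body water"


# Hypothetical compound: VDss = 2.0 l/kg, fu = 0.25.
vdss_total = 2.0
vdss_free = unbound(vdss_total, 0.25)
print(vdss_free, distribution_class(vdss_total))
```

The same `unbound` helper applies to clearance, which is how free CL would be obtained from total CL under this assumption.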
In Fig. 6, we show the trend and correlation between computed logD7.4 with either VDss or clearance, both total (Fig. 6, A and B) and free (Fig. 6, C and D). We note that although there is a seemingly clearer trend in the case of the free value of both parameters, the one for VDss starts being more noticeable even when examined via total VDss, whereas the corresponding plot for total clearance does not allow discernment of much of a trend. In fact, Fig. 3B (median and distribution of total VDss) does show an increment of the median value for total VDss versus computed logD7.4, whereas the corresponding plot (Fig. 4B) for total clearance does not. If we compare more broadly Fig. 3, B–F, for VDss and the corresponding Fig. 4, B–F, for clearance, we note that not much of a correlation is discernible between (total) clearance and the computed descriptors shown. At the same time (Fig. 3), VDss shows trends between median values and computed descriptors, which are opposite, as expected, in the case of logD7.4 versus PSA as well as HBAs and HBDs. In the case of the correlation of free parameters with computed logD7.4, the VDss plot (Fig. 6B) seems to yield a higher positive slope than does the plot for free clearance versus computed logD7.4 (Fig. 6D).
Unlike VD, clearance is driven by interactions between drugs and the drug-metabolizing enzymes and/or drug transporters involved in their clearance, as well as plasma protein binding. The interaction of individual drugs with enzymes and transporters is more a function of specific ligand-protein interactions as opposed to nonspecific interactions; thus, relationships between gross physicochemical properties and clearance should not be as apparent as they are for VD (see Fig. 4). Compounds with low free fractions could also have lower CL, so a slight relationship between free CL and lipophilicity can be observed after correction of CL to free CL (Fig. 6). Overall, relationships between free CL and physicochemical parameters are not nearly as discernible as they are for free VD. In addition, investigators have reported computational models in which they used continuous physicochemical descriptors for models of VD (Gleeson et al., 2006; Berellini et al., 2009; Lombardo and Jing, 2016), while they found it necessary to use structural descriptors (i.e., fragments) to improve the predictive power of clearance models even though the prediction of the general clearance mechanism (metabolic vs. renal vs. biliary) did not require the latter descriptors for a good performance (Berellini et al., 2012; Lombardo et al., 2014). Therefore, lipophilicity alone does not describe the clearance behavior of drug compounds, although it is certainly an important component for the elimination of xenobiotics via more polar and water-soluble compounds.
In general, the behavior of metabolic enzymes (and transporters) can be quite complex, and many examples in which identical lipophilicity yields a very different clearance outcome can be found in the literature. Smith (1997) pointed out that clearance differences are related to the propensity toward N-demethylation in a small series of benzodiazepines rather than bulk lipophilicity. Similarly, Stepan et al. (2011) described the discovery of a sizable series of γ-secretase inhibitors, where substitution, regioisomerism, and stereochemistry were responsible for large variations in scaled in vitro hepatic clearance in many cases, whereas changes in experimental ElogD were barely discernible or not at all measurable. Due to the complexity, redundancy, and promiscuity of metabolic enzymes and transporters, layered upon selectivity and safety considerations, the outcome is clearly very complex and multidimensional. Along the same lines are the comments of other researchers such as Broccatelli et al. (2018, p. 524), who stated that “t1/2 optimization via lipophilicity reduction without addressing a metabolic soft-spot is unlikely to work.”
In addition to the pharmacokinetic parameters, we also collected plasma protein binding data for as many of the compounds in the data set as available. This was done primarily to be able to correct total VD and CL values to free values for comparison with physicochemical properties. However, with these data, we also could compare free fraction values to physicochemical properties. Plasma binding is mostly driven by albumin (at approximately 600 μM or 42 g/l; Davies and Morris, 1993) and α1-acid glycoprotein (at approximately 43 μM or 1.8 g/l; Davies and Morris, 1993), with the former mostly associated with binding anionic drugs and the latter, primarily but not exclusively, associated with binding cationic drugs (Meijer and Van der Sluijs, 1987; Israili and Dayton, 2001; Ghuman et al., 2005; Kremer et al., 1988). Protein binding correlated with lipophilicity (Fig. 5). We also pointed out in the Results that the generally accepted notion of a much higher binding for acidic compounds is not strongly supported by “clustering” for anionic, cationic, and neutral molecules, although the anionic compounds, acidic in nature, do seem to show a lower overall median fu than basic (cationic) compounds. There is a great deal of overlap and many “exceptions” exist to the perceived much greater binding of anionic compounds to plasma proteins. Zwitterionic compounds do show a discernibly different median, likely due to their ability to interact with a broader set of proteins but, once more, with a great deal of overlap and not much clustering. Lipophilicity (Fig. 5B) and number of HBDs (Fig. 5F) do show a negative (increasing logD7.4 decreases fu) and a positive (increasing number of HBDs increases fu) trend, respectively, although very few compounds are present in the 7–10 bin for HBDs and a much lower median is observed for the uppermost bin, probably due to the presence of larger molecules and opposing factors. The flexibility of a molecule, expressed as the NRB (Fig. 5D), does seem to show a negative correlation, perhaps only for molecules within the lower three bins, and flexibility may play a detrimental role toward free fraction, similar to the effect on absorption. However, this effect, if real, manifests itself well before the classic threshold outlined by Veber et al. (2002) of 10 rotatable bonds. We also note that the original observation was reported with the aim of exploring the effect of flexibility on absorption, and the data set was based on permeability across artificial membranes, so the significance in this context is not clear.
Finally, we also gathered data to indicate whether human pharmacokinetic, and even physicochemical, properties have been changing over time. We used the date of first disclosure of a compound, which is not an entirely accurate description of when a compound was first synthesized or discovered but it offered the best surrogate for the analysis. What is interesting is that while lipophilicity of the drugs in our data set increased in more recent years (Fig. 8, right), values of CL and VD have generally remained constant (Fig. 7), albeit VD values have increased a bit. We also highlight the findings illustrated in Fig. 9, which show a significant trend toward higher MW in more recent times. A possible explanation for this behavior may be represented by the exploration of different and more complex drug space, such as protein-protein interactions, and thus the pursuit of larger molecules needed to disrupt shallower protein-protein interaction clefts. This may include exploration of peptide drugs with perhaps a significant deviation from the earlier oral drug paradigm, to achieve modulation of otherwise inaccessible therapeutic targets. Another possibility is represented by the expansion of techniques and trends in combinatorial chemistry, which has pushed the MW of compound libraries upward across the industry, perhaps most notably in the 1980s and 1990s. This may have manifested a bit later with larger compounds entering clinical trials. Increasing MW and lipophilicity in new drugs may be due to the desire to impart increased potency and/or greater target selectivity. Such desired properties can require the generation of larger, more lipophilic drugs. As mentioned above, increased lipophilicity can yield increased plasma protein binding, but also increased tissue binding and increased metabolic intrinsic clearance. These properties may all “cancel” each other out and thereby CL and VD may not change, as shown in Fig. 7.
In conclusion, we have summarized a human pharmacokinetic data set that is, to our knowledge, currently the largest of its kind. We have exhaustively searched the scientific literature and other sources for bona fide human intravenous pharmacokinetic studies and scrutinized the methods and data presented. These data have proven valuable in examining relationships between fundamental human pharmacokinetic parameters VDss and CL with various basic physicochemical properties. The data set in this report approximately doubles our previous report from a decade ago (Obach et al., 2008), yet the relationships between human pharmacokinetic parameters and physicochemical properties have remained largely unchanged. These data (available in the Supplemental Material) can be used and mined by others interested in deriving relationships between structure and human pharmacokinetics. Our own efforts are ongoing to establish whether trends relating particular structural entities and human pharmacokinetic parameters can be determined.
Authorship Contributions
Participated in research design: Lombardo, Berellini, Obach.
Conducted experiments: Lombardo, Berellini, Obach.
Performed data analysis: Lombardo, Berellini, Obach.
Wrote or contributed to the writing of the manuscript: Lombardo, Berellini, Obach.
Note Added in Proof
After further data refinement for new work on the dataset, we found and corrected some errors in the data reported in the case of some compounds in the Fast Forward version published August 16, 2018. These changes have no impact on the analysis or conclusion. The supplemental material providing the data has now been corrected.
Footnotes
- Received June 8, 2018.
- Accepted August 9, 2018.
This article has supplemental material available at dmd.aspetjournals.org.
Abbreviations
- CL, clearance
- fu, fraction unbound in plasma
- HBA, hydrogen bond acceptor
- HBD, hydrogen bond donor
- MRT, mean residence time
- MW, molecular weight
- NRB, number of rotatable bonds
- PSA, polar surface area
- t1/2, half-life
- VD, volume of distribution
Copyright © 2018 by The American Society for Pharmacology and Experimental Therapeutics