Abstract
Drug-induced liver injury (DILI) is one of the most important reasons for drug development failure at both preapproval and postapproval stages. There has been increased interest in developing predictive in vivo, in vitro, and in silico models to identify compounds that cause idiosyncratic hepatotoxicity. In the current study, we applied a machine learning approach, Bayesian modeling with extended connectivity fingerprints and other interpretable descriptors. The model, which was developed and internally validated using a training set of 295 compounds, was then applied for external validation to a test set of 237 compounds, which is large relative to the training set. The resulting concordance of 60%, sensitivity of 56%, and specificity of 67% were comparable to the results for internal validation. The Bayesian model with extended connectivity functional class fingerprints of maximum diameter 6 (ECFC_6) and interpretable descriptors suggested several substructures that are chemically reactive and may also be important for DILI-causing compounds, e.g., ketones, diols, and α-methyl styrene type structures. Using SMILES Arbitrary Target Specification (SMARTS) filters published by several pharmaceutical companies, we evaluated whether such reactive substructures could be readily detected by any of the published filters. The most stringent filters used in this study, such as the Abbott alerts, which capture thiol traps and other compounds, may be of use in identifying DILI-causing compounds (sensitivity 67%). A significant outcome of the present study is that we provide predictions for many compounds that cause DILI by using the knowledge available from previous studies. These computational models may represent cost-effective selection criteria before in vitro or in vivo experimental studies.
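The validation statistics quoted in the abstract (concordance, sensitivity, and specificity) follow from a 2 × 2 confusion matrix of observed versus predicted DILI labels. The following is a minimal Python sketch of those standard definitions, not the authors' Discovery Studio protocol; the function name `classification_metrics` and the toy labels are hypothetical and are not data from the study.

```python
from typing import Dict, Sequence


def classification_metrics(actual: Sequence[bool],
                           predicted: Sequence[bool]) -> Dict[str, float]:
    """Return concordance, sensitivity, and specificity for binary labels.

    True means DILI-positive. Hypothetical helper for illustration only.
    """
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")

    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a and p)          # true positives
    tn = sum(1 for a, p in pairs if not a and not p)  # true negatives
    fp = sum(1 for a, p in pairs if not a and p)      # false positives
    fn = sum(1 for a, p in pairs if a and not p)      # false negatives

    def ratio(num: int, den: int) -> float:
        # Report NaN rather than raising when a class is absent.
        return num / den if den else float("nan")

    return {
        "concordance": ratio(tp + tn, len(pairs)),  # overall agreement
        "sensitivity": ratio(tp, tp + fn),          # positives recovered
        "specificity": ratio(tn, tn + fp),          # negatives recovered
    }


# Toy labels (assumed for illustration; not compounds from the study).
toy_actual = [True, True, True, False, False]
toy_predicted = [True, True, False, False, True]
toy_metrics = classification_metrics(toy_actual, toy_predicted)
```

Under these definitions, sensitivity is the fraction of DILI-positive compounds that are flagged and specificity is the fraction of DILI-negative compounds that are cleared, so the two can diverge from the overall concordance when the classes are unbalanced.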
Footnotes
S.E. consults for various pharmaceutical and software companies including Merck, although he did not receive any payment for this study. J.J.X. is currently employed by Merck, was previously employed by Pfizer, and has stock ownership in both companies as well as in other biopharmaceutical companies.
The structures of all compounds in the test and training sets as well as the set of recently approved drugs are available in sdf format online, and the Bayesian model protocols used in Discovery Studio are available from the authors upon request.
Article, publication date, and citation information can be found at http://dmd.aspetjournals.org.
doi:10.1124/dmd.110.035113.
The online version of this article (available at http://dmd.aspetjournals.org) contains supplemental material.
ABBREVIATIONS: DILI, drug-induced liver injury; HIAT, human hepatocyte imaging assay technology; ECFC_6, extended connectivity functional class fingerprints of maximum diameter 6; SMARTS, SMILES Arbitrary Target Specification; FCFP, functional class fingerprint; ROC, receiver operator characteristic; PCA, principal component analysis; AUC, area under the curve.
Received June 22, 2010.
Accepted September 15, 2010.
Copyright © 2010 by The American Society for Pharmacology and Experimental Therapeutics