ADMET Evaluation in Drug Discovery. 17. Development of Quantitative and Qualitative Prediction Models for Chemical-Induced Respiratory Toxicity

As a dangerous endpoint, respiratory toxicity can cause serious adverse health effects and even death. It is also a long-standing concern in occupational and environmental protection. The pharmaceutical and chemical industries have a strong need for precise and convenient computational tools to evaluate the respiratory toxicity of compounds as early as possible. Most of the reported theoretical models were developed from respiratory toxicity datasets covering a single symptom, such as respiratory sensitisation, and therefore these models may not afford reliable predictions for toxic compounds with other respiratory symptoms, such as pneumonia or rhinitis. In this study, based on a diverse dataset of mouse intraperitoneal respiratory toxicity characterised by multiple symptoms, a number of quantitative and qualitative prediction models with high reliability were developed by machine learning approaches. First, a four-tier dimension reduction strategy was employed to find an optimal set of twenty molecular descriptors for model building. Then, six machine learning approaches were used to develop the prediction models: relevance vector machine (RVM), support vector machine (SVM), regularised random forest (RRF), eXtreme gradient boosting (XGBoost), naïve Bayes (NB) and linear discriminant analysis (LDA). Among all of the models, the SVM regression model gives the most accurate quantitative predictions for the test set (q2ext = 0.707), and the XGBoost classification model achieves the most accurate qualitative predictions for the test set (MCC of 0.644, AUC of 0.893 and global accuracy of 82.62%). The application domains were analysed, and all of the tested compounds fall within the application domain coverage. In addition, the structural features of the compounds and important fragments with large prediction errors were analysed.
In conclusion, the SVM regression model and the XGBoost classification model can be employed as accurate prediction tools for respiratory toxicity.
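
For readers unfamiliar with the classification metrics quoted above, the following is a minimal sketch of how global accuracy and the Matthews correlation coefficient (MCC) are computed from a binary confusion matrix. The function name `classification_metrics` and the confusion counts are hypothetical and chosen only for illustration; they are not taken from the study.

```python
import math


def classification_metrics(tp, tn, fp, fn):
    """Return (accuracy, MCC) for a binary confusion matrix.

    tp/tn/fp/fn are counts of true positives, true negatives,
    false positives and false negatives.
    """
    total = tp + tn + fp + fn
    accuracy = (tp + tn) / total
    # MCC denominator: square root of the product of the four marginal sums.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention, MCC is reported as 0 when any marginal sum is zero.
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return accuracy, mcc


# Hypothetical counts for illustration only (not data from the paper).
acc, mcc = classification_metrics(tp=40, tn=45, fp=5, fn=10)
print(f"accuracy = {acc:.2%}, MCC = {mcc:.3f}")
```

Unlike accuracy, MCC uses all four cells of the confusion matrix, so it stays informative when the toxic and non-toxic classes are unbalanced.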

Authors: Lei T, Chen F, Liu H, Sun H, Kang Y, Li D, Li Y, Hou T.; Full Source: Molecular Pharmaceutics. 2017 Jun 9. doi: 10.1021/acs.molpharmaceut.7b00317. [Epub ahead of print]