The methods used were as follows: decision trees, one-nearest-neighbour (1-NN) and k-nearest-neighbour (k-NN) approaches, support vector machines (SVM), and partial least squares projections to latent structures (PLS). The first four methods induce non-linear models, whereas PLS is a linear method. When using PLS we created both linear and non-linear models; in the latter case the dataset included cross terms derived from kinase and inhibitor descriptions. The predictive abilities for new inhibitor-kinase combinations and new kinases, as assessed by outer-loop cross-validation, are presented in Table 1. The most predictive models were obtained using SVM, where for all three z-scale-based description methods the P2 values fell in the range 0.70 to 0.73 and the P2kin values in the range 0.67 to 0.70.
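To make the idea of cross terms concrete, here is a minimal sketch that assumes they are formed as pairwise products of kinase and inhibitor descriptors. The text does not spell out the construction, so the product form, the function names `cross_terms` and `build_row`, and the descriptor values are illustrative assumptions.

```python
def cross_terms(kinase_desc, inhibitor_desc):
    """Return every pairwise product of a kinase and an inhibitor descriptor."""
    return [k * i for k in kinase_desc for i in inhibitor_desc]


def build_row(kinase_desc, inhibitor_desc, with_cross_terms=True):
    """Assemble one model input row for a single kinase-inhibitor pair.

    Without cross terms the row holds only the two descriptor blocks, so a
    linear method can model additive effects only. With cross terms the same
    linear method can also fit kinase-inhibitor interaction effects.
    """
    row = list(kinase_desc) + list(inhibitor_desc)
    if with_cross_terms:
        row += cross_terms(kinase_desc, inhibitor_desc)
    return row


# Hypothetical descriptor values, for illustration only.
kinase = [0.5, -1.0]
inhibitor = [2.0, 1.0, -0.5]

linear_row = build_row(kinase, inhibitor, with_cross_terms=False)
nonlinear_row = build_row(kinase, inhibitor)

print(len(linear_row), len(nonlinear_row))
```

With p kinase descriptors and q inhibitor descriptors, the row grows from p + q to p + q + p*q columns, which is why cross terms are practical mainly with compact descriptions.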

The PLS and k-NN models performed almost as well. Models based on AAC DC descriptors performed clearly worse than the z-scale-based descriptions, but here too the SVM model was the most predictive, with P2 being 0.68 and P2kin being 0.64, whereas the values of these parameters for the PLS model were only 0.58 and 0.53. The inferior performance of the AAC DC descriptions is not surprising. In fact, it seems quite unlikely that the fraction of any single dipeptide would show significant correlation with the functional properties of the kinases. Such correlations, however, can become evident for larger sets of dipeptide combinations, giving an advantage to the SVM model, which through its non-linear kernel can approximate highly complex interaction effects between the descriptors.
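As an illustration of what a composition-based description contains, the sketch below assumes that AAC and DC stand for amino acid composition and dipeptide composition, meaning the fractions of single residues and of adjacent residue pairs in a sequence. The function names and the toy sequence are hypothetical.

```python
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def amino_acid_composition(seq):
    """Fraction of each standard amino acid in the sequence (20 values)."""
    n = len(seq)
    return {a: seq.count(a) / n for a in AMINO_ACIDS}


def dipeptide_composition(seq):
    """Fraction of each adjacent residue pair in the sequence (400 values)."""
    pairs = [seq[i:i + 2] for i in range(len(seq) - 1)]
    counts = {a + b: 0 for a, b in product(AMINO_ACIDS, repeat=2)}
    for pair in pairs:
        if pair in counts:  # skip pairs containing non-standard residues
            counts[pair] += 1
    return {dp: c / len(pairs) for dp, c in counts.items()}


# Toy sequence, not a real kinase.
seq = "MKKLGA"
aac = amino_acid_composition(seq)
dc = dipeptide_composition(seq)

print(len(aac), len(dc))
```

Each of the 400 dipeptide fractions is sparse and individually carries little signal, which is consistent with the argument that only combinations of them are likely to be informative.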

The difference between the performances of the SVM and PLS models is even larger when proteins are described by CTD or by SO PAA descriptors, the P2kin for PLS models using these two sets of descriptors being, respectively, 0.45 and 0.44, compared to 0.60 and 0.63 for the SVM models. For any set of descriptors the k-NN method outperformed 1-NN. However, the optimal number of neighbours found by the cross-validation inner loop was quite low, ranging in all cases from 3 to 5. The predictions of k-NN models are thus based on local subsets of the data set, and for this reason it would be problematic to use these models to draw any general conclusions on the molecular properties that determine kinase-inhibitor complementarity. Finally, as expected, PLS modelling without kinase-inhibitor cross terms explained only a minor part of the activity variation, the P2kin for all three z-scale-based models being 0.32. This result shows that the non-linear part, which describes kinase inhibitor selectivity, dominates over the linear part, which describes the average activity of a ligand for the protein series and the average activity of all ligands for a particular protein.
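The split between a linear part and a selectivity part can be illustrated with a small sketch. It assumes a complete ligand-by-kinase activity matrix and fits a purely additive model (ligand mean plus kinase mean minus grand mean). This is a simplified stand-in for the descriptor-based PLS model without cross terms, and the fraction it reports is a fitted R2, not the cross-validated P2 discussed in the text. All names and values are hypothetical.

```python
def additive_fit(activity):
    """Fit activity[i][j] as ligand mean + kinase mean - grand mean."""
    n_lig, n_kin = len(activity), len(activity[0])
    grand = sum(sum(row) for row in activity) / (n_lig * n_kin)
    lig_mean = [sum(row) / n_kin for row in activity]
    kin_mean = [sum(activity[i][j] for i in range(n_lig)) / n_lig
                for j in range(n_kin)]
    return [[lig_mean[i] + kin_mean[j] - grand for j in range(n_kin)]
            for i in range(n_lig)]


def explained_fraction(activity, fitted):
    """Fraction of the activity variation reproduced by the fitted values."""
    flat = [v for row in activity for v in row]
    mean = sum(flat) / len(flat)
    ss_total = sum((v - mean) ** 2 for v in flat)
    ss_resid = sum((a - f) ** 2
                   for row_a, row_f in zip(activity, fitted)
                   for a, f in zip(row_a, row_f))
    return 1.0 - ss_resid / ss_total


# Rows are ligands, columns are kinases; values are made up.
additive = [[1.0, 2.0], [3.0, 4.0]]   # no selectivity: effects simply add
selective = [[1.0, 0.0], [0.0, 1.0]]  # each ligand prefers a different kinase

r2_additive = explained_fraction(additive, additive_fit(additive))
r2_selective = explained_fraction(selective, additive_fit(selective))

print(r2_additive, r2_selective)
```

In the purely selective toy matrix the additive model explains none of the variation, which mirrors the reasoning that a low value for the model without cross terms points to selectivity as the dominant source of variation.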
