Details

Tutorials in Chemoinformatics




1st ed.

by: Alexandre Varnek

89,99 €

Publisher: Wiley
Format: PDF
Published: 14 June 2017
ISBN/EAN: 9781119137979
Language: English
Number of pages: 496

DRM-protected eBook; to read it you need, for example, Adobe Digital Editions and an Adobe ID.

Descriptions

<p><b>30 tutorials and more than 100 exercises in chemoinformatics, supported by online software and data sets</b></p> <p>Chemoinformatics is widely used in both academic and industrial chemical and biochemical research worldwide. Yet, until this unique guide, there were no books offering practical exercises in chemoinformatics methods. <i>Tutorials in Chemoinformatics</i> contains more than 100 exercises in 30 tutorials exploring key topics and methods in the field. It takes an applied approach to the subject with a strong emphasis on problem-solving and computational methodologies.</p> <p>Each tutorial is self-contained and contains exercises for students to work through using a variety of software packages. The majority of the tutorials are divided into three sections devoted to theoretical background, algorithm description and software applications, respectively, with the last section providing step-by-step software instructions. Throughout, three types of software tools are used: in-house programs developed by the authors, open-source programs and commercial programs which are available for free or at a modest cost to academics. 
The in-house software and data sets are available on a dedicated companion website.</p> <p>Key topics and methods covered in <i>Tutorials in Chemoinformatics</i> include:</p> <ul> <li>Data curation and standardization</li> <li>Development and use of chemical databases</li> <li>Structure encoding by molecular descriptors, text strings and binary fingerprints</li> <li>The design of diverse and focused libraries</li> <li>Chemical data analysis and visualization</li> <li>Structure-property/activity modeling (QSAR/QSPR)</li> <li>Ensemble modeling approaches, including bagging, boosting, stacking and random subspaces</li> <li>3D pharmacophore modeling and pharmacological profiling using shape analysis</li> <li>Protein-ligand docking</li> <li>Implementation of algorithms in a high-level programming language</li> </ul> <p><i>Tutorials in Chemoinformatics</i> is an ideal supplementary text for advanced undergraduate and graduate courses in chemoinformatics, bioinformatics, computational chemistry, computational biology, medicinal chemistry and biochemistry. It is also a valuable working resource for medicinal chemists, academic researchers and industrial chemists looking to enhance their chemoinformatics skills.</p>
<p>List of Contributors xv</p> <p>Preface xvii</p> <p>About the Companion Website xix<i>            </i></p> <p><b>Part 1 Chemical Databases 1</b></p> <p><b>1 Data Curation 3<br /> </b><i>Gilles Marcou and Alexandre Varnek</i></p> <p>Theoretical Background 3</p> <p>Software 5</p> <p>Step‐by‐Step Instructions 7</p> <p>Conclusion 34</p> <p>References 36</p> <p><b>2 Relational Chemical Databases: Creation, Management, and Usage 37<br /> </b><i>Gilles Marcou and Alexandre Varnek</i></p> <p>Theoretical Background 37</p> <p>Step‐by‐Step Instructions 41</p> <p>Conclusion 65</p> <p>References 65</p> <p><b>3 Handling of Markush Structures 67<br /> </b><i>Timur Madzhidov, Ramil Nugmanov, and Alexandre Varnek</i></p> <p>Theoretical Background 67</p> <p>Step‐by‐Step Instructions 68</p> <p>Conclusion 73</p> <p>References 73</p> <p><b>4 Processing of SMILES, InChI, and Hashed Fingerprints 75<br /> </b><i>João Montargil Aires de Sousa</i></p> <p>Theoretical Background 75</p> <p>Algorithms 76</p> <p>Step‐by‐Step Instructions 78</p> <p>Conclusion 80</p> <p>References 81</p> <p><b>Part 2 Library Design 83</b></p> <p><b>5 Design of Diverse and Focused Compound Libraries 85<br /> </b><i>Antonio de la Vega de Leon, Eugen Lounkine, Martin Vogt, and Jürgen Bajorath</i></p> <p>Introduction 85</p> <p>Data Acquisition 86</p> <p>Implementation 86</p> <p>Compound Library Creation 87</p> <p>Compound Library Analysis 90</p> <p>Normalization of Descriptor Values 91</p> <p>Visualizing Descriptor Distributions 92</p> <p>Decorrelation and Dimension Reduction 94</p> <p>Partitioning and Diverse Subset Calculation 95</p> <p>Partitioning 95</p> <p>Diverse Subset Selection 97</p> <p>Combinatorial Libraries 98</p> <p>Combinatorial Enumeration of Compounds 98</p> <p>Retrosynthetic Approaches to Library Design 99</p> <p>References 101</p> <p><b>Part 3 Data Analysis and Visualization 103</b></p> <p><b>6 Hierarchical Clustering in R 105<br /> </b><i>Martin Vogt and Jürgen Bajorath</i></p> <p>Theoretical 
Background 105</p> <p>Algorithms 106</p> <p>Instructions 107</p> <p>Hierarchical Clustering Using Fingerprints 108</p> <p>Hierarchical Clustering Using Descriptors 111</p> <p>Visualization of the Data Sets 113</p> <p>Alternative Clustering Methods 116</p> <p>Conclusion 117</p> <p>References 118</p> <p><b>7 Data Visualization and Analysis Using Kohonen Self‐Organizing Maps 119<br /> </b><i>João Montargil Aires de Sousa</i></p> <p>Theoretical Background 119</p> <p>Algorithms 120</p> <p>Instructions 121</p> <p>Conclusion 126</p> <p>References 126</p> <p><b>Part 4 Obtaining and Validation QSAR/QSPR Models 127</b></p> <p><b>8 Descriptors Generation Using the CDK Toolkit and Web Services 129<br /> </b><i>João Montargil Aires de Sousa</i></p> <p>Theoretical Background 129</p> <p>Algorithms 130</p> <p>Step‐by‐Step Instructions 131</p> <p>Conclusion 133</p> <p>References 134</p> <p><b>9 QSPR Models on Fragment Descriptors 135<br /> </b><i>Vitaly Solov’ev and Alexandre Varnek</i></p> <p>Abbreviations 135</p> <p>Data 136</p> <p>ISIDA_QSPR Input 137</p> <p>Data Split Into Training and Test Sets 139</p> <p>Substructure Molecular Fragment (SMF) Descriptors 139</p> <p>Regression Equations 142</p> <p>Forward and Backward Stepwise Variable Selection 142</p> <p>Parameters of Internal Model Validation 143</p> <p>Applicability Domain (AD) of the Model 143</p> <p>Storage and Retrieval Modeling Results 144</p> <p>Analysis of Modeling Results 144</p> <p>Root‐Mean Squared Error (RMSE) Estimation 148</p> <p>Setting the Parameters 151</p> <p>Analysis of n‐Fold Cross‐Validation Results 151</p> <p>Loading Structure‐Data File 153</p> <p>Descriptors and Fitting Equation 154</p> <p>Variables Selection 155</p> <p>Consensus Model 155</p> <p>Model Applicability Domain 155</p> <p>n‐Fold External Cross‐Validation 155</p> <p>Saving and Loading of the Consensus Modeling Results 155</p> <p>Statistical Parameters of the Consensus Model 156</p> <p>Consensus Model Performance as a Function of Individual 
Models Acceptance Threshold 157</p> <p>Building Consensus Model on the Entire Data Set 158</p> <p>Loading Input Data 159</p> <p>Loading Selected Models and Choosing their Applicability Domain 160</p> <p>Reporting Predicted Values 160</p> <p>Analysis of the Fragments Contributions 161</p> <p>References 161</p> <p><b>10 Cross‐Validation and the Variable Selection Bias 163<br /> </b><i>Igor I. Baskin, Gilles Marcou, Dragos Horvath, and Alexandre Varnek</i></p> <p>Theoretical Background 163</p> <p>Step‐by‐Step Instructions 165</p> <p>Conclusion 172</p> <p>References 173</p> <p><b>11 Classification Models 175<br /> </b><i>Igor I. Baskin, Gilles Marcou, Dragos Horvath, and Alexandre Varnek</i></p> <p>Theoretical Background 176</p> <p>Algorithms 178</p> <p>Step‐by‐Step Instructions 180</p> <p>Conclusion 191</p> <p>References 192</p> <p><b>12 Regression Models 193<br /> </b><i>Igor I. Baskin, Gilles Marcou, Dragos Horvath, and Alexandre Varnek</i></p> <p>Theoretical Background 194</p> <p>Step‐by‐Step Instructions 197</p> <p>Conclusion 207</p> <p>References 208</p> <p><b>13 Benchmarking Machine‐Learning Methods 209<br /> </b><i>Igor I. Baskin, Gilles Marcou, Dragos Horvath, and Alexandre Varnek</i></p> <p>Theoretical Background 209</p> <p>Step‐by‐Step Instructions 210</p> <p>Conclusion 222</p> <p>References 222</p> <p><b>14 Compound Classification Using the scikit‐learn Library 223<br /> </b><i>Jenny Balfer, Jürgen Bajorath, and Martin Vogt</i></p> <p>Theoretical Background 224</p> <p>Algorithms 225</p> <p>Step‐by‐Step Instructions 230</p> <p>Naïve Bayes 230</p> <p>Decision Tree 231</p> <p>Support Vector Machine 234</p> <p>Notes on Provided Code 237</p> <p>Conclusion 238</p> <p>References 239</p> <p><b>Part 5 Ensemble Modeling 241</b></p> <p><b>15 Bagging and Boosting of Classification Models 243<br /> </b><i>Igor I. 
Baskin, Gilles Marcou, Dragos Horvath, and Alexandre Varnek</i></p> <p>Theoretical Background 243</p> <p>Algorithm 244</p> <p>Step by Step Instructions 245</p> <p>Conclusion 247</p> <p>References 247</p> <p><b>16 Bagging and Boosting of Regression Models 249<br /> </b><i>Igor I. Baskin, Gilles Marcou, Dragos Horvath, and Alexandre Varnek</i></p> <p>Theoretical Background 249</p> <p>Algorithm 249</p> <p>Step‐by‐Step Instructions 250</p> <p>Conclusion 255</p> <p>References 255</p> <p><b>17 Instability of Interpretable Rules 257<br /> </b><i>Igor I. Baskin, Gilles Marcou, Dragos Horvath, and Alexandre Varnek</i></p> <p>Theoretical Background 257</p> <p>Algorithm 258</p> <p>Step‐by‐Step Instructions 258</p> <p>Conclusion 261</p> <p>References 261</p> <p><b>18 Random Subspaces and Random Forest 263<br /> </b><i>Igor I. Baskin, Gilles Marcou, Dragos Horvath, and Alexandre Varnek</i></p> <p>Theoretical Background 264</p> <p>Algorithm 264</p> <p>Step‐by‐Step Instructions 265</p> <p>Conclusion 269</p> <p>References 269</p> <p><b>19 Stacking 271<br /> </b><i>Igor I. Baskin, Gilles Marcou, Dragos Horvath, and Alexandre Varnek</i></p> <p>Theoretical Background 271</p> <p>Algorithm 272</p> <p>Step‐by‐Step Instructions 273</p> <p>Conclusion 277</p> <p>References 278</p> <p><b>Part 6 3D Pharmacophore Modeling 279</b></p> <p><b>20 3D Pharmacophore Modeling Techniques in Computer‐Aided Molecular Design Using LigandScout 281<br /> </b><i>Thomas Seidel, Sharon D. 
Bryant, Gökhan Ibis, Giulio Poli, and Thierry Langer</i></p> <p>Introduction 281</p> <p>Theory: 3D Pharmacophores 283</p> <p>Representation of Pharmacophore Models 283</p> <p>Hydrogen‐Bonding Interactions 285</p> <p>Hydrophobic Interactions 285</p> <p>Aromatic and Cation‐π Interactions 286</p> <p>Ionic Interactions 286</p> <p>Metal Complexation 286</p> <p>Ligand Shape Constraints 287</p> <p>Pharmacophore Modeling 288</p> <p>Manual Pharmacophore Construction 288</p> <p>Structure‐Based Pharmacophore Models 289</p> <p>Ligand‐Based Pharmacophore Models 289</p> <p>3D Pharmacophore‐Based Virtual Screening 291</p> <p>3D Pharmacophore Creation 291</p> <p>Annotated Database Creation 291</p> <p>Virtual Screening‐Database Searching 292</p> <p>Hit‐List Analysis 292</p> <p>Tutorial: Creating 3D‐Pharmacophore Models Using LigandScout 294</p> <p>Creating Structure‐Based Pharmacophores From a Ligand‐Protein Complex 294</p> <p>Description: Create a Structure‐Based Pharmacophore Model 296</p> <p>Create a Shared Feature Pharmacophore Model From Multiple Ligand‐Protein Complexes 296</p> <p>Description: Create a Shared Feature Pharmacophore and Align it to Ligands 297</p> <p>Create Ligand‐Based Pharmacophore Models 298</p> <p>Description: Ligand‐Based Pharmacophore Model Creation 300</p> <p>Tutorial: Pharmacophore‐Based Virtual Screening Using LigandScout 301</p> <p>Virtual Screening, Model Editing, and Viewing Hits in the Target Active Site 301</p> <p>Description: Virtual Screening and Pharmacophore Model Editing 302</p> <p>Analyzing Screening Results with Respect to the Binding Site 303</p> <p>Description: Analyzing Hits in the Active Site Using LigandScout 305</p> <p>Parallel Virtual Screening of Multiple Databases Using LigandScout 305</p> <p>Virtual Screening in the Screening Perspective of LigandScout 306</p> <p>Description: Virtual Screening Using LigandScout 306</p> <p>Conclusions 307</p> <p>Acknowledgments 307</p> <p>References 307</p> <p><b>Part 7 The Protein 3D‐Structures in 
Virtual Screening 311</b></p> <p><b>21 The Protein 3D‐Structures in Virtual Screening 313<br /> </b><i>Inna Slynko and Esther Kellenberger</i></p> <p>Introduction 313</p> <p>Description of the Example Case 314</p> <p>Thrombin and Blood Coagulation 314</p> <p>Active Thrombin and Inactive Prothrombin 314</p> <p>Thrombin as a Drug Target 314</p> <p>Thrombin Three‐Dimensional Structure: The 1OYT PDB File 315</p> <p>Modeling Suite 315</p> <p>Overall Description of the Input Data Available on the Editor Website 315</p> <p>Exercise 1: Protein Analysis and Preparation 316</p> <p>Step 1: Identification of Molecules Described in the 1OYT PDB File 316</p> <p>Step 2: Protein Quality Analysis of the Thrombin/Inhibitor PDB Complex Using MOE Geometry Utility 320</p> <p>Step 3: Preparation of the Protein for Drug Design Applications 321</p> <p>Step 4: Description of the Protein‐Ligand Binding Mode 325</p> <p>Step 5: Detection of Protein Cavities 328</p> <p>Exercise 2: Retrospective Virtual Screening Using the Pharmacophore Approach 330</p> <p>Step 1: Description of the Test Library 332</p> <p>Step 2.1: Pharmacophore Design, Overview 333</p> <p>Step 2.2: Pharmacophore Design, Flexible Alignment of Three Thrombin Inhibitors 334</p> <p>Step 2.3: Pharmacophore Design, Query Generation 335</p> <p>Step 3: Pharmacophore Search 337</p> <p>Exercise 3: Retrospective Virtual Screening Using the Docking Approach 341</p> <p>Step 1: Description of the Test Library 341</p> <p>Step 2: Preparation of the Input 341</p> <p>Step 3: Re‐Docking of the Crystallographic Ligand 341</p> <p>Step 4: Virtual Screening of a Database 345</p> <p>General Conclusion 350</p> <p>References 351</p> <p><b>Part 8 Protein‐Ligand Docking 353</b></p> <p><b>22 Protein‐Ligand Docking 355<br /> </b><i>Inna Slynko, Didier Rognan, and Esther Kellenberger</i></p> <p>Introduction 355</p> <p>Description of the Example Case 356</p> <p>Methods 356</p> <p>Ligand Preparation 359</p> <p>Protein Preparation 359</p> <p>Docking 
Parameters 360</p> <p>Description of Input Data Available on the Editor Website 360</p> <p>Exercises 362</p> <p>A Quick Start with LeadIT 362</p> <p>Re‐Docking of Tacrine into AChE 362</p> <p>Preparation of AChE From 1ACJ PDB File 362</p> <p>Docking of Neutral Tacrine, then of Positively Charged Tacrine 363</p> <p>Docking of Positively Charged Tacrine in AChE in Presence of Water 365</p> <p>Cross‐Docking of Tacrine‐Pyridone and Donepezil Into AChE 366</p> <p>Preparation of AChE From 1ACJ PDB File 366</p> <p>Cross‐Docking of Tacrine‐Pyridone Inhibitor and Donepezil in AChE in Presence of Water 367</p> <p>Re‐Docking of Donepezil in AChE in Presence of Water 370</p> <p>General Conclusions 372</p> <p>Annex: Screen Captures of LeadIT Graphical Interface 372</p> <p>References 375</p> <p><b>Part 9 Pharmacophorical Profiling Using Shape Analysis 377</b></p> <p><b>23 Pharmacophorical Profiling Using Shape Analysis 379<br /> </b><i>Jérémy Desaphy, Guillaume Bret, Inna Slynko, Didier Rognan, and Esther Kellenberger</i></p> <p>Introduction 379</p> <p>Description of the Example Case 380</p> <p>Aim and Context 380</p> <p>Description of the Searched Data Set 381</p> <p>Description of the Query 381</p> <p>Methods 381</p> <p>Rocs 381</p> <p>VolSite and Shaper 384</p> <p>Other Programs for Shape Comparison 384</p> <p>Description of Input Data Available on the Editor Website 385</p> <p>Exercises 387</p> <p>Preamble: Practical Considerations 387</p> <p>Ligand Shape Analysis 387</p> <p>What are ROCS Output Files? 
387</p> <p>Binding Site Comparison 388</p> <p>Conclusions 390</p> <p>References 391</p> <p><b>Part 10 Algorithmic Chemoinformatics 393</b></p> <p><b>24 Algorithmic Chemoinformatics 395<br /> </b><i>Martin Vogt, Antonio de la Vega de Leon, and Jürgen Bajorath</i></p> <p>Introduction 395</p> <p>Similarity Searching Using Data Fusion Techniques 396</p> <p>Introduction to Virtual Screening 396</p> <p>The Three Pillars of Virtual Screening 397</p> <p>Molecular Representation 397</p> <p>Similarity Function 397</p> <p>Search Strategy (Data Fusion) 397</p> <p>Fingerprints 397</p> <p>Count Fingerprints 397</p> <p>Fingerprint Representations 399</p> <p>Bit Strings 399</p> <p>Feature Lists 399</p> <p>Generation of Fingerprints 399</p> <p>Similarity Metrics 402</p> <p>Search Strategy 404</p> <p>Completed Virtual Screening Program 405</p> <p>Benchmarking VS Performance 406</p> <p>Scoring the Scorers 407</p> <p>How to Score 407</p> <p>Multiple Runs and Reproducibility 408</p> <p>Adjusting the VS Program for Benchmarking 408</p> <p>Analyzing Benchmark Results 410</p> <p>Conclusion 414</p> <p>Introduction to Chemoinformatics Toolkits 415</p> <p>Theoretical Background 415</p> <p>A Note on Graph Theory 416</p> <p>Basic Usage: Creating and Manipulating Molecules in RDKit 417</p> <p>Creation of Molecule Objects 417</p> <p>Molecule Methods 418</p> <p>Atom Methods 418</p> <p>Bond Methods 419</p> <p>An Example: Hill Notation for Molecules 419</p> <p>Canonical SMILES: The Canon Algorithm 420</p> <p>Theoretical Background 420</p> <p>Recap of SMILES Notation 420</p> <p>Canonical SMILES 421</p> <p>Building a SMILES String 422</p> <p>Canonicalization of SMILES 425</p> <p>The Initial Invariant 427</p> <p>The Iteration Step 428</p> <p>Summary 431</p> <p>Substructure Searching: The Ullmann Algorithm 432</p> <p>Theoretical Background 432</p> <p>Backtracking 433</p> <p>A Note on Atom Order 436</p> <p>The Ullmann Algorithm 436</p> <p>Sample Runs 440</p> <p>Summary 441</p> <p>Atom Environment 
Fingerprints 441</p> <p>Theoretical Background 441</p> <p>Implementation 443</p> <p>The Hashing Function 443</p> <p>The Initial Atom Invariant 444</p> <p>The Algorithm 444</p> <p>Summary 447</p> <p>References 447</p> <p>Index 449</p>
<p><b>Edited by</b></p> <p><b>Alexandre Varnek, PhD,</b> is a professor of theoretical chemistry at the University of Strasbourg, France, where he heads the Laboratory of Chemoinformatics and is director of two MSc programs: Chemoinformatics and In Silico Drug Design. Professor Varnek's research focuses on developing new approaches and tools for virtual screening and in silico design of new compounds and chemical reactions.</p>
