Details

Pattern Recognition in Computational Molecular Biology



Techniques and Approaches
Wiley Series in Bioinformatics, 1st ed.

by: Mourad Elloumi, Costas Iliopoulos, Jason T. L. Wang, Albert Y. Zomaya, Yi Pan

123,99 €

Publisher: Wiley
Format: PDF
Published: 30.11.2015
ISBN/EAN: 9781119078852
Language: English
Number of pages: 656

DRM-protected eBook; to read it you need, for example, Adobe Digital Editions and an Adobe ID.

Descriptions

<p><b>A comprehensive overview of high-performance pattern recognition techniques and approaches to Computational Molecular Biology</b></p> <p>This book surveys the development of pattern recognition techniques and approaches in Computational Molecular Biology. Providing broad coverage of the field, the authors present fundamental and technical information on these techniques and approaches and discuss their related problems. The text consists of twenty-nine chapters, organized into seven parts: <i>Pattern Recognition in Sequences</i>, <i>Pattern Recognition in Secondary Structures</i>, <i>Pattern Recognition in Tertiary Structures</i>, <i>Pattern Recognition in Quaternary Structures</i>, <i>Pattern Recognition in Microarrays</i>, <i>Pattern Recognition in Phylogenetic Trees</i>, and <i>Pattern Recognition in Biological Networks</i>.</p> <ul> <li>Surveys the development of techniques and approaches on pattern recognition in biomolecular data</li> <li>Discusses pattern recognition in primary, secondary, tertiary and quaternary structures, as well as microarrays, phylogenetic trees and biological networks</li> <li>Includes case studies and examples to further illustrate the concepts discussed in the book</li> </ul> <p><i>Pattern Recognition in Computational Molecular Biology: Techniques and Approaches</i> is a reference for practitioners and professional researchers in Computer Science, Life Science, and Mathematics. It also serves as supplementary reading for graduate students and young researchers interested in Computational Molecular Biology.</p>
<p>LIST OF CONTRIBUTORS xxi</p> <p>PREFACE xxvii</p> <p><b>I PATTERN RECOGNITION IN SEQUENCES 1</b></p> <p><b>1 COMBINATORIAL HAPLOTYPING PROBLEMS 3</b><br /><i>Giuseppe Lancia</i></p> <p>1.1 Introduction / 3</p> <p>1.2 Single Individual Haplotyping / 5</p> <p>1.2.1 The Minimum Error Correction Model / 8</p> <p>1.2.2 Probabilistic Approaches and Alternative Models / 10</p> <p>1.3 Population Haplotyping / 12</p> <p>1.3.1 Clark’s Rule / 14</p> <p>1.3.2 Pure Parsimony / 15</p> <p>1.3.3 Perfect Phylogeny / 19</p> <p>1.3.4 Disease Association / 21</p> <p>1.3.5 Other Models / 22</p> <p>References / 23</p> <p><b>2 ALGORITHMIC PERSPECTIVES OF THE STRING BARCODING PROBLEMS 28</b><br /><i>Sima Behpour and Bhaskar DasGupta</i></p> <p>2.1 Introduction / 28</p> <p>2.2 Summary of Algorithmic Complexity Results for Barcoding Problems / 32</p> <p>2.2.1 Average Length of Optimal Barcodes / 33</p> <p>2.3 Entropy-Based Information Content Technique for Designing Approximation Algorithms for String Barcoding Problems / 34</p> <p>2.4 Techniques for Proving Inapproximability Results for String Barcoding Problems / 36</p> <p>2.4.1 Reductions from Set Covering Problem / 36</p> <p>2.4.2 Reduction from Graph-Coloring Problem / 38</p> <p>2.5 Heuristic Algorithms for String Barcoding Problems / 39</p> <p>2.5.1 Entropy-Based Method with a Different Measure for Information Content / 39</p> <p>2.5.2 Balanced Partitioning Approach / 40</p> <p>2.6 Conclusion / 40</p> <p>Acknowledgments / 41</p> <p>References / 41</p> <p><b>3 ALIGNMENT-FREE MEASURES FOR WHOLE-GENOME COMPARISON 43</b><br /><i>Matteo Comin and Davide Verzotto</i></p> <p>3.1 Introduction / 43</p> <p>3.2 Whole-Genome Sequence Analysis / 44</p> <p>3.2.1 Background on Whole-Genome Comparison / 44</p> <p>3.2.2 Alignment-Free Methods / 45</p> <p>3.2.3 Average Common Subword / 46</p> <p>3.2.4 Kullback–Leibler Information Divergence / 47</p> <p>3.3 Underlying Approach / 47</p> <p>3.3.1 Irredundant Common Subwords / 48</p> <p>3.3.2 
Underlying Subwords / 49</p> <p>3.3.3 Efficient Computation of Underlying Subwords / 50</p> <p>3.3.4 Extension to Inversions and Complements / 53</p> <p>3.3.5 A Distance-Like Measure Based on Underlying Subwords / 53</p> <p>3.4 Experimental Results / 54</p> <p>3.4.1 Genome Data sets and Reference Taxonomies / 54</p> <p>3.4.2 Whole-Genome Phylogeny Reconstruction / 56</p> <p>3.5 Conclusion / 61</p> <p>Author’s Contributions / 62</p> <p>Acknowledgments / 62</p> <p>References / 62</p> <p><b>4 A MAXIMUM LIKELIHOOD FRAMEWORK FOR MULTIPLE SEQUENCE LOCAL ALIGNMENT 65</b><br /><i>Chengpeng Bi</i></p> <p>4.1 Introduction / 65</p> <p>4.2 Multiple Sequence Local Alignment / 67</p> <p>4.2.1 Overall Objective Function / 67</p> <p>4.2.2 Maximum Likelihood Model / 68</p> <p>4.3 Motif Finding Algorithms / 70</p> <p>4.3.1 DEM Motif Algorithm / 70</p> <p>4.3.2 WEM Motif Finding Algorithm / 70</p> <p>4.3.3 Metropolis Motif Finding Algorithm / 72</p> <p>4.3.4 Gibbs Motif Finding Algorithm / 73</p> <p>4.3.5 Pseudo-Gibbs Motif Finding Algorithm / 74</p> <p>4.4 Time Complexity / 75</p> <p>4.5 Case Studies / 75</p> <p>4.5.1 Performance Evaluation / 76</p> <p>4.5.2 CRP Binding Sites / 76</p> <p>4.5.3 Multiple Motifs in Helix–Turn–Helix Protein Structure / 78</p> <p>4.6 Conclusion / 80</p> <p>References / 81</p> <p><b>5 GLOBAL SEQUENCE ALIGNMENT WITH A BOUNDED NUMBER OF GAPS 83</b><br /><i>Carl Barton, Tomáš Flouri, Costas S. Iliopoulos, and Solon P. 
Pissis</i></p> <p>5.1 Introduction / 83</p> <p>5.2 Definitions and Notation / 85</p> <p>5.3 Problem Definition / 87</p> <p>5.4 Algorithms / 88</p> <p>5.5 Conclusion / 94</p> <p>References / 95</p> <p><b>II PATTERN RECOGNITION IN SECONDARY STRUCTURES 97</b></p> <p><b>6 A SHORT REVIEW ON PROTEIN SECONDARY STRUCTURE PREDICTION METHODS 99</b><br /><i>Renxiang Yan, Jiangning Song, Weiwen Cai, and Ziding Zhang</i></p> <p>6.1 Introduction / 99</p> <p>6.2 Representative Protein Secondary Structure Prediction Methods / 102</p> <p>6.2.1 Chou–Fasman / 103</p> <p>6.2.2 GOR / 104</p> <p>6.2.3 PHD / 104</p> <p>6.2.4 PSIPRED / 104</p> <p>6.2.5 SPINE-X / 105</p> <p>6.2.6 PSSpred / 105</p> <p>6.2.7 Meta Methods / 105</p> <p>6.3 Evaluation of Protein Secondary Structure Prediction Methods / 106</p> <p>6.3.1 Measures / 106</p> <p>6.3.2 Benchmark / 106</p> <p>6.3.3 Performances / 107</p> <p>6.4 Conclusion / 110</p> <p>Acknowledgments / 110</p> <p>References / 111</p> <p><b>7 A GENERIC APPROACH TO BIOLOGICAL SEQUENCE SEGMENTATION PROBLEMS: APPLICATION TO PROTEIN SECONDARY STRUCTURE PREDICTION 114</b><br /><i>Yann Guermeur and Fabien Lauer</i></p> <p>7.1 Introduction / 114</p> <p>7.2 Biological Sequence Segmentation / 115</p> <p>7.3 MSVMpred / 117</p> <p>7.3.1 Base Classifiers / 117</p> <p>7.3.2 Ensemble Methods / 118</p> <p>7.3.3 Convex Combination / 119</p> <p>7.4 Postprocessing with a Generative Model / 119</p> <p>7.5 Dedication to Protein Secondary Structure Prediction / 120</p> <p>7.5.1 Biological Problem / 121</p> <p>7.5.2 MSVMpred2 / 121</p> <p>7.5.3 Hidden Semi-Markov Model / 122</p> <p>7.5.4 Experimental Results / 122</p> <p>7.6 Conclusions and Ongoing Research / 125</p> <p>Acknowledgments / 126</p> <p>References / 126</p> <p><b>8 STRUCTURAL MOTIF IDENTIFICATION AND RETRIEVAL: A GEOMETRICAL APPROACH 129</b><br /><i>Virginio Cantoni, Marco Ferretti, Mirto Musci, and Nahumi Nugrahaningsih</i></p> <p>8.1 Introduction / 129</p> <p>8.2 A Few Basic Concepts / 130</p> <p>8.2.1 
Hierarchy of Protein Structures / 130</p> <p>8.2.2 Secondary Structure Elements / 131</p> <p>8.2.3 Structural Motifs / 132</p> <p>8.2.4 Available Sources for Protein Data / 134</p> <p>8.3 State of the Art / 135</p> <p>8.3.1 Protein Structure Motif Search / 135</p> <p>8.3.2 Promotif / 136</p> <p>8.3.3 Secondary-Structure Matching / 137</p> <p>8.3.4 Multiple Structural Alignment by Secondary Structures / 138</p> <p>8.4 A Novel Geometrical Approach to Motif Retrieval / 138</p> <p>8.4.1 Secondary Structures Cooccurrences / 138</p> <p>8.4.2 Cross Motif Search / 143</p> <p>8.4.3 Complete Cross Motif Search / 146</p> <p>8.5 Implementation Notes / 149</p> <p>8.5.1 Optimizations / 149</p> <p>8.5.2 Parallel Approaches / 150</p> <p>8.6 Conclusions and Future Work / 151</p> <p>Acknowledgment / 152</p> <p>References / 152</p> <p><b>9 GENOME-WIDE SEARCH FOR PSEUDOKNOTTED NONCODING RNAs: A COMPARATIVE STUDY 155</b><br /><i>Meghana Vasavada, Kevin Byron, Yang Song, and Jason T.L. Wang</i></p> <p>9.1 Introduction / 155</p> <p>9.2 Background / 156</p> <p>9.2.1 Noncoding RNAs and Their Secondary Structures / 156</p> <p>9.2.2 Pseudoknotted ncRNA Search Tools / 157</p> <p>9.3 Methodology / 157</p> <p>9.4 Results and Interpretation / 161</p> <p>9.5 Conclusion / 162</p> <p>References / 163</p> <p><b>III PATTERN RECOGNITION IN TERTIARY STRUCTURES 165</b></p> <p><b>10 MOTIF DISCOVERY IN PROTEIN 3D-STRUCTURES USING GRAPH MINING TECHNIQUES 167</b><br /><i>Wajdi Dhifli and Engelbert Mephu Nguifo</i></p> <p>10.1 Introduction / 167</p> <p>10.2 From Protein 3D-Structures to Protein Graphs / 169</p> <p>10.2.1 Parsing Protein 3D-Structures into Graphs / 169</p> <p>10.3 Graph Mining / 172</p> <p>10.4 Subgraph Mining / 173</p> <p>10.5 Frequent Subgraph Discovery / 173</p> <p>10.5.1 Problem Definition / 174</p> <p>10.5.2 Candidates Generation / 176</p> <p>10.5.3 Frequent Subgraph Discovery Approaches / 177</p> <p>10.5.4 Variants of Frequent Subgraph Mining: Closed and Maximal Subgraphs / 178</p> 
<p>10.6 Feature Selection / 179</p> <p>10.6.1 Relevance of a Feature / 179</p> <p>10.7 Feature Selection for Subgraphs / 180</p> <p>10.7.1 Problem Statement / 180</p> <p>10.7.2 Mining Top-k Subgraphs / 180</p> <p>10.7.3 Clustering-Based Subgraph Selection / 181</p> <p>10.7.4 Sampling-Based Approaches / 181</p> <p>10.7.5 Approximate Subgraph Mining / 181</p> <p>10.7.6 Discriminative Subgraph Selection / 182</p> <p>10.7.7 Other Significant Subgraph Selection Approaches / 182</p> <p>10.8 Discussion / 183</p> <p>10.9 Conclusion / 185</p> <p>Acknowledgments / 185</p> <p>References / 186</p> <p><b>11 FUZZY AND UNCERTAIN LEARNING TECHNIQUES FOR THE ANALYSIS AND PREDICTION OF PROTEIN TERTIARY STRUCTURES 190</b><br /><i>Chinua Umoja, Xiaxia Yu, and Robert Harrison</i></p> <p>11.1 Introduction / 190</p> <p>11.2 Genetic Algorithms / 192</p> <p>11.2.1 GA Model Selection in Protein Structure Prediction / 196</p> <p>11.2.2 Common Methodology / 198</p> <p>11.3 Supervised Machine Learning Algorithm / 201</p> <p>11.3.1 Artificial Neural Networks / 201</p> <p>11.3.2 ANNs in Protein Structure Prediction / 202</p> <p>11.3.3 Support Vector Machines / 203</p> <p>11.4 Fuzzy Application / 204</p> <p>11.4.1 Fuzzy Logic / 204</p> <p>11.4.2 Fuzzy SVMs / 204</p> <p>11.4.3 Adaptive-Network-Based Fuzzy Inference Systems / 205</p> <p>11.4.4 Fuzzy Decision Trees / 206</p> <p>11.5 Conclusion / 207</p> <p>References / 208</p> <p><b>12 PROTEIN INTER-DOMAIN LINKER PREDICTION 212</b><br /><i>Maad Shatnawi, Paul D. Yoo, and Sami Muhaidat</i></p> <p>12.1 Introduction / 212</p> <p>12.2 Protein Structure Overview / 213</p> <p>12.3 Technical Challenges and Open Issues / 214</p> <p>12.4 Prediction Assessment / 215</p> <p>12.5 Current Approaches / 216</p> <p>12.5.1 DomCut / 216</p> <p>12.5.2 Scooby-Domain / 217</p> <p>12.5.3 FIEFDom / 218</p> <p>12.5.4 Chatterjee et al. 
(2009) / 219</p> <p>12.5.5 Drop / 219</p> <p>12.6 Domain Boundary Prediction Using Enhanced General Regression Network / 220</p> <p>12.6.1 Multi-Domain Benchmark Data Set / 220</p> <p>12.6.2 Compact Domain Profile / 221</p> <p>12.6.3 The Enhanced Semi-Parametric Model / 222</p> <p>12.6.4 Training, Testing, and Validation / 225</p> <p>12.6.5 Experimental Results / 226</p> <p>12.7 Inter-Domain Linkers Prediction Using Compositional Index and Simulated Annealing / 227</p> <p>12.7.1 Compositional Index / 228</p> <p>12.7.2 Detecting the Optimal Set of Threshold Values Using Simulated Annealing / 229</p> <p>12.7.3 Experimental Results / 230</p> <p>12.8 Conclusion / 232</p> <p>References / 233</p> <p><b>13 PREDICTION OF PROLINE CIS–TRANS ISOMERIZATION 236</b><br /><i>Paul D. Yoo, Maad Shatnawi, Sami Muhaidat, Kamal Taha, and Albert Y. Zomaya</i></p> <p>13.1 Introduction / 236</p> <p>13.2 Methods / 238</p> <p>13.2.1 Evolutionary Data Set Construction / 238</p> <p>13.2.2 Protein Secondary Structure Information / 239</p> <p>13.2.3 Method I: Intelligent Voting / 239</p> <p>13.2.4 Method II: Randomized Meta-Learning / 241</p> <p>13.2.5 Model Validation and Testing / 242</p> <p>13.2.6 Parameter Tuning / 242</p> <p>13.3 Model Evaluation and Analysis / 243</p> <p>13.4 Conclusion / 245</p> <p>References / 245</p> <p><b>IV PATTERN RECOGNITION IN QUATERNARY STRUCTURES 249</b></p> <p><b>14 PREDICTION OF PROTEIN QUATERNARY STRUCTURES 251</b><br /><i>Akbar Vaseghi, Maryam Faridounnia, Soheila Shokrollahzade, Samad Jahandideh, and Kuo-Chen Chou</i></p> <p>14.1 Introduction / 251</p> <p>14.2 Protein Structure Prediction / 255</p> <p>14.2.1 Secondary Structure Prediction / 255</p> <p>14.2.2 Modeling of Tertiary Structure / 256</p> <p>14.3 Template-Based Predictions / 257</p> <p>14.3.1 Homology Modeling / 257</p> <p>14.3.2 Threading Methods / 257</p> <p>14.3.3 Ab initio Modeling / 257</p> <p>14.4 Critical Assessment of Protein Structure Prediction / 258</p> <p>14.5 Quaternary Structure 
Prediction / 258</p> <p>14.6 Conclusion / 261</p> <p>Acknowledgments / 261</p> <p>References / 261</p> <p><b>15 COMPARISON OF PROTEIN QUATERNARY STRUCTURES BY GRAPH APPROACHES 266</b><br /><i>Sheng-Lung Peng and Yu-Wei Tsay</i></p> <p>15.1 Introduction / 266</p> <p>15.2 Similarity in the Graph Model / 268</p> <p>15.2.1 Graph Model for Proteins / 270</p> <p>15.3 Measuring Structural Similarity via MCES / 272</p> <p>15.3.1 Problem Formulation / 273</p> <p>15.3.2 Constructing P-Graphs / 274</p> <p>15.3.3 Constructing Line Graphs / 276</p> <p>15.3.4 Constructing Modular Graphs / 276</p> <p>15.3.5 Maximum Clique Detection / 277</p> <p>15.3.6 Experimental Results / 277</p> <p>15.4 Protein Comparison via Graph Spectra / 279</p> <p>15.4.1 Graph Spectra / 279</p> <p>15.4.2 Matrix Selection / 281</p> <p>15.4.3 Graph Cospectrality and Similarity / 283</p> <p>15.4.4 Cospectral Comparison / 283</p> <p>15.4.5 Experimental Results / 284</p> <p>15.5 Conclusion / 287</p> <p>References / 287</p> <p><b>16 STRUCTURAL DOMAINS IN PREDICTION OF BIOLOGICAL PROTEIN–PROTEIN INTERACTIONS 291</b><br /><i>Mina Maleki, Michael Hall, and Luis Rueda</i></p> <p>16.1 Introduction / 291</p> <p>16.2 Structural Domains / 293</p> <p>16.3 The Prediction Framework / 293</p> <p>16.4 Feature Extraction and Prediction Properties / 294</p> <p>16.4.1 Physicochemical Properties / 296</p> <p>16.4.2 Domain-Based Properties / 298</p> <p>16.5 Feature Selection / 299</p> <p>16.5.1 Filter Methods / 299</p> <p>16.5.2 Wrapper Methods / 301</p> <p>16.6 Classification / 301</p> <p>16.6.1 Linear Dimensionality Reduction / 301</p> <p>16.6.2 Support Vector Machines / 303</p> <p>16.6.3 k-Nearest Neighbor / 303</p> <p>16.6.4 Naive Bayes / 304</p> <p>16.7 Evaluation and Analysis / 304</p> <p>16.8 Results and Discussion / 304</p> <p>16.8.1 Analysis of the Prediction Properties / 304</p> <p>16.8.2 Analysis of Structural DDIs / 307</p> <p>16.9 Conclusion / 309</p> <p>References / 310</p> <p><b>V PATTERN RECOGNITION IN 
MICROARRAYS 315</b></p> <p><b>17 CONTENT-BASED RETRIEVAL OF MICROARRAY EXPERIMENTS 317</b><br /><i>Hasan Oğul</i></p> <p>17.1 Introduction / 317</p> <p>17.2 Information Retrieval: Terminology and Background / 318</p> <p>17.3 Content-Based Retrieval / 320</p> <p>17.4 Microarray Data and Databases / 322</p> <p>17.5 Methods for Retrieving Microarray Experiments / 324</p> <p>17.6 Similarity Metrics / 327</p> <p>17.7 Evaluating Retrieval Performance / 329</p> <p>17.8 Software Tools / 330</p> <p>17.9 Conclusion and Future Directions / 331</p> <p>Acknowledgment / 332</p> <p>References / 332</p> <p><b>18 EXTRACTION OF DIFFERENTIALLY EXPRESSED GENES IN MICROARRAY DATA 335</b><br /><i>Tiratha Raj Singh, Brigitte Vannier, and Ahmed Moussa</i></p> <p>18.1 Introduction / 335</p> <p>18.2 From Microarray Image to Signal / 336</p> <p>18.2.1 Signal from Oligo DNA Array Image / 336</p> <p>18.2.2 Signal from Two-Color cDNA Array / 337</p> <p>18.3 Microarray Signal Analysis / 337</p> <p>18.3.1 Absolute Analysis and Replicates in Microarrays / 338</p> <p>18.3.2 Microarray Normalization / 339</p> <p>18.4 Algorithms for DE Gene Selection / 339</p> <p>18.4.1 Within–Between DE Gene (WB-DEG) Selection Algorithm / 340</p> <p>18.4.2 Comparison of the WB-DEGs with Two Classical DE Gene Selection Methods on Latin Square Data / 341</p> <p>18.5 Gene Ontology Enrichment and Gene Set Enrichment Analysis / 343</p> <p>18.6 Conclusion / 345</p> <p>References / 345</p> <p><b>19 CLUSTERING AND CLASSIFICATION TECHNIQUES FOR GENE EXPRESSION PROFILE PATTERN ANALYSIS 347</b><br /><i>Emanuel Weitschek, Giulia Fiscon, Valentina Fustaino, Giovanni Felici, and Paola Bertolazzi</i></p> <p>19.1 Introduction / 347</p> <p>19.2 Transcriptome Analysis / 348</p> <p>19.3 Microarrays / 349</p> <p>19.3.1 Applications / 349</p> <p>19.3.2 Microarray Technology / 350</p> <p>19.3.3 Microarray Workflow / 350</p> <p>19.4 RNA-Seq / 351</p> <p>19.5 Benefits and Drawbacks of RNA-Seq and Microarray Technologies / 353</p> <p>19.6 
Gene Expression Profile Analysis / 356</p> <p>19.6.1 Data Definition / 356</p> <p>19.6.2 Data Analysis / 357</p> <p>19.6.3 Normalization and Background Correction / 357</p> <p>19.6.4 Genes Clustering / 359</p> <p>19.6.5 Experiment Classification / 361</p> <p>19.6.6 Software Tools for Gene Expression Profile Analysis / 362</p> <p>19.7 Real Case Studies / 364</p> <p>19.8 Conclusions / 367</p> <p>References / 368</p> <p><b>20 MINING INFORMATIVE PATTERNS IN MICROARRAY DATA 371</b><br /><i>Li Teng</i></p> <p>20.1 Introduction / 371</p> <p>20.2 Patterns with Similarity / 373</p> <p>20.2.1 Similarity Measurement / 374</p> <p>20.2.2 Clustering / 376</p> <p>20.2.3 Biclustering / 379</p> <p>20.2.4 Types of Biclusters / 380</p> <p>20.2.5 Measurement of the Homogeneity / 383</p> <p>20.2.6 Biclustering Algorithms with Different Searching Schemes / 387</p> <p>20.3 Conclusion / 391</p> <p>References / 391</p> <p><b>21 ARROW PLOT AND CORRESPONDENCE ANALYSIS MAPS FOR VISUALIZING THE EFFECTS OF BACKGROUND CORRECTION AND NORMALIZATION METHODS ON MICROARRAY DATA 394</b><br /><i>Carina Silva, Adelaide Freitas, Sara Roque, and Lisete Sousa</i></p> <p>21.1 Overview / 394</p> <p>21.1.1 Background Correction Methods / 395</p> <p>21.1.2 Normalization Methods / 396</p> <p>21.1.3 Literature Review / 397</p> <p>21.2 Arrow Plot / 399</p> <p>21.2.1 DE Genes Versus Special Genes / 399</p> <p>21.2.2 Definition and Properties of the ROC Curve / 400</p> <p>21.2.3 AUC and Degenerate ROC Curves / 401</p> <p>21.2.4 Overlapping Coefficient / 402</p> <p>21.2.5 Arrow Plot Construction / 403</p> <p>21.3 Significance Analysis of Microarrays / 404</p> <p>21.4 Correspondence Analysis / 405</p> <p>21.4.1 Basic Principles / 405</p> <p>21.4.2 Interpretation of CA Maps / 406</p> <p>21.5 Impact of the Preprocessing Methods / 407</p> <p>21.5.1 Class Prediction Context / 408</p> <p>21.5.2 Class Comparison Context / 408</p> <p>21.6 Conclusions / 412</p> <p>Acknowledgments / 413</p> <p>References / 413</p> <p><b>VI 
PATTERN RECOGNITION IN PHYLOGENETIC TREES 417</b></p> <p><b>22 PATTERN RECOGNITION IN PHYLOGENETICS: TREES AND NETWORKS 419</b><br /><i>David A. Morrison</i></p> <p>22.1 Introduction / 419</p> <p>22.2 Networks and Trees / 420</p> <p>22.3 Patterns and Their Processes / 424</p> <p>22.4 The Types of Patterns / 427</p> <p>22.5 Fingerprints / 431</p> <p>22.6 Constructing Networks / 433</p> <p>22.7 Multi-Labeled Trees / 435</p> <p>22.8 Conclusion / 436</p> <p>References / 437</p> <p><b>23 DIVERSE CONSIDERATIONS FOR SUCCESSFUL PHYLOGENETIC TREE RECONSTRUCTION: IMPACTS FROM MODEL MISSPECIFICATION, RECOMBINATION, HOMOPLASY, AND PATTERN RECOGNITION 439</b><br /><i>Diego Mallo, Agustín Sánchez-Cobos, and Miguel Arenas</i></p> <p>23.1 Introduction / 440</p> <p>23.2 Overview on Methods and Frameworks for Phylogenetic Tree Reconstruction / 440</p> <p>23.2.1 Inferring Gene Trees / 441</p> <p>23.2.2 Inferring Species Trees / 442</p> <p>23.3 Influence of Substitution Model Misspecification on Phylogenetic Tree Reconstruction / 445</p> <p>23.4 Influence of Recombination on Phylogenetic Tree Reconstruction / 446</p> <p>23.5 Influence of Diverse Evolutionary Processes on Species Tree Reconstruction / 447</p> <p>23.6 Influence of Homoplasy on Phylogenetic Tree Reconstruction: The Goals of Pattern Recognition / 449</p> <p>23.7 Concluding Remarks / 449</p> <p>Acknowledgments / 450</p> <p>References / 450</p> <p><b>24 AUTOMATED PLAUSIBILITY ANALYSIS OF LARGE PHYLOGENIES 457</b><br /><i>David Dao, Tomáš Flouri, and Alexandros Stamatakis</i></p> <p>24.1 Introduction / 457</p> <p>24.2 Preliminaries / 459</p> <p>24.3 A Naïve Approach / 462</p> <p>24.4 Toward a Faster Method / 463</p> <p>24.5 Improved Algorithm / 467</p> <p>24.5.1 Preprocessing / 467</p> <p>24.5.2 Computing Lowest Common Ancestors / 468</p> <p>24.5.3 Constructing the Induced Tree / 468</p> <p>24.5.4 Final Remarks / 471</p> <p>24.6 Implementation / 473</p> <p>24.6.1 Preprocessing / 473</p> <p>24.6.2 Reconstruction / 473</p> 
<p>24.6.3 Extracting Bipartitions / 474</p> <p>24.7 Evaluation / 474</p> <p>24.7.1 Test Data Sets / 474</p> <p>24.7.2 Experimental Results / 475</p> <p>24.8 Conclusion / 479</p> <p>Acknowledgment / 481</p> <p>References / 481</p> <p><b>25 A NEW FAST METHOD FOR DETECTING AND VALIDATING HORIZONTAL GENE TRANSFER EVENTS USING PHYLOGENETIC TREES AND AGGREGATION FUNCTIONS 483</b><br /><i>Dunarel Badescu, Nadia Tahiri, and Vladimir Makarenkov</i></p> <p>25.1 Introduction / 483</p> <p>25.2 Methods / 485</p> <p>25.2.1 Clustering Using Variability Functions / 485</p> <p>25.2.2 Other Variants of Clustering Functions Implemented in the Algorithm / 487</p> <p>25.2.3 Description of the New Algorithm / 488</p> <p>25.2.4 Time Complexity / 491</p> <p>25.3 Experimental Study / 491</p> <p>25.3.1 Implementation / 491</p> <p>25.3.2 Synthetic Data / 491</p> <p>25.3.3 Real Prokaryotic (Genomic) Data / 495</p> <p>25.4 Results and Discussion / 501</p> <p>25.4.1 Analysis of Synthetic Data / 501</p> <p>25.4.2 Analysis of Prokaryotic Data / 502</p> <p>25.5 Conclusion / 502</p> <p>References / 503</p> <p><b>VII PATTERN RECOGNITION IN BIOLOGICAL NETWORKS 505</b></p> <p><b>26 COMPUTATIONAL METHODS FOR MODELING BIOLOGICAL INTERACTION NETWORKS 507</b><br /><i>Christos Makris and Evangelos Theodoridis</i></p> <p>26.1 Introduction / 507</p> <p>26.2 Measures/Metrics / 508</p> <p>26.3 Models of Biological Networks / 511</p> <p>26.4 Reconstructing and Partitioning Biological Networks / 511</p> <p>26.5 PPI Networks / 513</p> <p>26.6 Mining PPI Networks—Interaction Prediction / 517</p> <p>26.7 Conclusions / 519</p> <p>References / 519</p> <p><b>27 BIOLOGICAL NETWORK INFERENCE AT MULTIPLE SCALES: FROM GENE REGULATION TO SPECIES INTERACTIONS 525</b><br /><i>Andrej Aderhold, V Anne Smith, and Dirk Husmeier</i></p> <p>27.1 Introduction / 525</p> <p>27.2 Molecular Systems / 528</p> <p>27.3 Ecological Systems / 528</p> <p>27.4 Models and Evaluation / 529</p> <p>27.4.1 Notations / 529</p> <p>27.4.2 Sparse 
Regression and the LASSO / 530</p> <p>27.4.3 Bayesian Regression / 530</p> <p>27.4.4 Evaluation Metric / 531</p> <p>27.5 Learning Gene Regulation Networks / 532</p> <p>27.5.1 Nonhomogeneous Bayesian Regression / 533</p> <p>27.5.2 Gradient Estimation / 534</p> <p>27.5.3 Simulated Bio-PEPA Data / 534</p> <p>27.5.4 Real mRNA Expression Profile Data / 535</p> <p>27.5.5 Method Evaluation and Learned Networks / 536</p> <p>27.6 Learning Species Interaction Networks / 540</p> <p>27.6.1 Regression Model of Species Interactions / 540</p> <p>27.6.2 Multiple Global Change-Points / 541</p> <p>27.6.3 Mondrian Process Change-Points / 542</p> <p>27.6.4 Synthetic Data / 544</p> <p>27.6.5 Simulated Population Dynamics / 544</p> <p>27.6.6 Real World Plant Data / 546</p> <p>27.6.7 Method Evaluation and Learned Networks / 546</p> <p>27.7 Conclusion / 550</p> <p>References / 550</p> <p><b>28 DISCOVERING CAUSAL PATTERNS WITH STRUCTURAL EQUATION MODELING: APPLICATION TO TOLL-LIKE RECEPTOR SIGNALING PATHWAY IN CHRONIC LYMPHOCYTIC LEUKEMIA 555</b><br /><i>Athina Tsanousa, Stavroula Ntoufa, Nikos Papakonstantinou, Kostas Stamatopoulos, and Lefteris Angelis</i></p> <p>28.1 Introduction / 555</p> <p>28.2 Toll-Like Receptors / 557</p> <p>28.2.1 Basics / 557</p> <p>28.2.2 Structure and Signaling of TLRs / 558</p> <p>28.2.3 TLR Signaling in Chronic Lymphocytic Leukemia / 559</p> <p>28.3 Structural Equation Modeling / 560</p> <p>28.3.1 Methodology of SEM Modeling / 560</p> <p>28.3.2 Assumptions / 561</p> <p>28.3.3 Estimation Methods / 562</p> <p>28.3.4 Missing Data / 562</p> <p>28.3.5 Goodness-of-Fit Indices / 563</p> <p>28.3.6 Other Indications of a Misspecified Model / 565</p> <p>28.4 Application / 566</p> <p>28.5 Conclusion / 580</p> <p>References / 581</p> <p><b>29 ANNOTATING PROTEINS WITH INCOMPLETE LABEL INFORMATION 585</b><br /><i>Guoxian Yu, Huzefa Rangwala, and Carlotta Domeniconi</i></p> <p>29.1 Introduction / 585</p> <p>29.2 Related Work / 587</p> <p>29.3 Problem Formulation / 589</p> 
<p>29.3.1 The Algorithm / 591</p> <p>29.4 Experimental Setup / 592</p> <p>29.4.1 Data sets / 592</p> <p>29.4.2 Comparative Methods / 593</p> <p>29.4.3 Experimental Protocol / 594</p> <p>29.4.4 Evaluation Criteria / 594</p> <p>29.5 Experimental Analysis / 596</p> <p>29.5.1 Replenishing Missing Functions / 596</p> <p>29.5.2 Predicting Unlabeled Proteins / 600</p> <p>29.5.3 Component Analysis / 604</p> <p>29.5.4 Run Time Analysis / 604</p> <p>29.6 Conclusions / 605</p> <p>Acknowledgments / 606</p> <p>References / 606</p> <p>INDEX 609</p>
<p><b>Mourad Elloumi, PhD</b>, is Professor in Computer Science at the University of Tunis-El Manar, Tunisia. Dr. Elloumi is the author/co-author of more than 50 publications in international journals and conference proceedings related to Algorithmics, Computational Molecular Biology, and Knowledge Discovery and Data Mining.</p> <p><b>Costas S. Iliopoulos, PhD,</b> is Professor of Algorithm Design at King's College London, UK. Dr. Iliopoulos has co-authored over 300 peer-reviewed articles in pattern matching and combinatorics of strings. He serves on the editorial board of the <i>Journal of Discrete Algorithms</i>, <i>Computer Mathematics & Combinatorial Computing,</i> and <i>System Biology & Biomedical Technologies</i>.</p> <p><b>Jason T. L. Wang, PhD,</b> is Professor of Computer Science at the New Jersey Institute of Technology, USA. Dr. Wang has published extensively on Data Mining and Computational Molecular Biology, and has been a member of program committees for over 200 conferences and workshops in these and related areas.</p> <p><b>Albert Y. Zomaya, PhD, </b>is the Chair Professor of High Performance Computing & Networking in the School of Information Technologies, University of Sydney, Australia. Dr. Zomaya has published more than 500 scientific papers and articles and is author, co-author or editor of more than 20 books. Dr. Zomaya is a Fellow of AAAS, IEEE, and IET.</p>
