Details

Molecular Data Analysis Using R




1st edition

by: Csaba Ortutay, Zsuzsanna Ortutay

89,99 €

Publisher: Wiley-Blackwell
Format: PDF
Published: 29 December 2016
ISBN/EAN: 9781119165033
Language: English
Number of pages: 352

DRM-protected eBook; to read it you need, for example, Adobe Digital Editions and an Adobe ID.

Descriptions

This book addresses the difficulties that wet lab researchers experience with the statistical analysis of molecular biology data. The authors explain how to use R and Bioconductor to analyze experimental data in molecular biology. The content is based on two university courses for bioinformatics and experimental biology students (Biological Data Analysis with R and High-throughput Data Analysis with R). The material is divided into chapters according to the experimental methods used in laboratories.

Key features include:
• Broad appeal: the authors target their material to researchers at several levels, ensuring that the basics are always covered.
• First book to explain how to use R and Bioconductor for the analysis of several types of experimental data in the field of molecular biology.
• Focuses on R and Bioconductor, which are widely used for data analysis. One great benefit of R and Bioconductor is the vast user community and very active discussion, in addition to the practice of sharing code. Further, R is the platform for implementing new analysis approaches, so novel methods are available early to R users.
<p>Foreword</p> <p>Preface</p> <p>Acknowledgements</p> <p>About the Companion Website</p>
<p><b>1 Introduction to R statistical environment</b></p> <p>Why R?</p> <p>Installing R</p> <p>Interacting with R</p> <p>Graphical interfaces and integrated development environment (IDE) integration</p> <p>Scripting and sourcing</p> <p>The R history and the R environment file</p> <p>Packages and package repositories</p> <p>Comprehensive R Archive Network</p> <p>Bioconductor</p> <p>Working with data</p> <p>Basic operations in R</p> <p>Some basics of graphics in R</p> <p>Getting help in R</p> <p>Files for practicing</p> <p>Study exercises and questions</p> <p>References</p> <p>Webliography</p>
<p><b>2 Simple sequence analysis</b></p> <p>Sequence files</p> <p>FASTA sequence format</p> <p>GenBank flat file format</p> <p>Reading sequence files into R</p> <p>Obtaining sequences from remote databases</p> <p>Seqinr package</p> <p>Ape package</p> <p>Descriptive statistics of nucleotide sequences</p> <p>Descriptive statistics of proteins</p> <p>Aligned sequences</p> <p>Visualization of genes and transcripts in a professional way</p> <p>Files for practicing</p> <p>Study exercises and questions</p> <p>References</p> <p>Webliography</p> <p>Packages</p>
<p><b>3 Annotating gene groups</b></p> <p>Enrichment analysis: an overview</p> <p>Overview of two different methods</p> <p>Enrichment analysis results</p> <p>Common aspects of the two different approaches</p> <p>Overrepresentation analysis</p> <p>Hypergeometric test using GOstats</p> <p>ORA analysis using topGO</p> <p>Enrichment analysis of microarray sets with topGO</p> <p>Gene set enrichment analysis</p> <p>GSEA with R</p> <p>Files for practicing</p> <p>Study exercises and questions</p> <p>References</p> <p>Webliography</p> <p>Packages</p>
<p><b>4 Next-generation sequencing: introduction and genomic applications</b></p> <p>High-throughput sequencing background</p> <p>Experimental background</p> <p>Single-end and paired-end sequencing reads</p> <p>Assemble reads</p> <p>How many reads? Depth of coverage</p> <p>Storing data in files</p> <p>FASTQ</p> <p>SAM and BAM files</p> <p>Variant call format files</p> <p>General data analysis workflow</p> <p>Data processing considerations</p> <p>Quality checking and screening read sequences</p> <p>Quality checking for one file</p> <p>Quality inspection for multiple files in a project</p> <p>Quality filtering of FASTQ files</p> <p>Handling alignment files and genomic variants</p> <p>Alignment and variation visualization</p> <p>Simple handling of VCF files</p> <p>Genomic applications: low- and medium-depth sequencing</p> <p>Aneuploidity sequencing and copy number variation identification</p> <p>SNP identification and validation</p> <p>Exome sequencing</p> <p>Genomic region resequencing</p> <p>Full genome and metagenome sequencing</p> <p>Files for practicing</p> <p>Study exercises and questions</p> <p>References</p> <p>Webliography</p> <p>Packages</p>
<p><b>5 Quantitative transcriptomics: qRT-PCR</b></p> <p>Transcriptome</p> <p>Polymerase chain reaction</p> <p>Standards for qPCR</p> <p>R packages</p> <p>Understanding delta Ct</p> <p>Calculation of delta Ct</p> <p>Requirements for real delta Ct calculations</p> <p>Absolute quantification</p> <p>Value prediction, the professional way</p> <p>Relative quantification using the ddCt method</p> <p>Comparison of two conditions</p> <p>Comparison of multiple experimental conditions</p> <p>Quality control with melting curve</p> <p>Files for practicing</p> <p>Study exercises and questions</p> <p>References</p> <p>Webliography</p> <p>Packages</p>
<p><b>6 Advanced transcriptomics: gene expression microarrays</b></p> <p>Microarray analysis: probes and samples</p> <p>Experimental background</p> <p>Archiving and publishing microarray data</p> <p>Minimum information standard</p> <p>Data preprocessing</p> <p>Accessing data from CEL files</p> <p>Quality control</p> <p>Normalization</p> <p>Differential gene expression</p> <p>Annotating results</p> <p>Creating normalized expression set from Illumina data</p> <p>Automated data access from GEO</p> <p>Files for practicing</p> <p>Study exercises and questions</p> <p>References</p> <p>Webliography</p> <p>Packages</p>
<p><b>7 Next-generation sequencing in transcriptomics: RNA-seq experiments</b></p> <p>High-throughput RNA sequencing background</p> <p>Experimental background</p> <p>RNA-seq applications</p> <p>Differential expression with different resolutions</p> <p>Preparing count tables</p> <p>Alignment files to read counts</p> <p>Differential expression in simple comparison</p> <p>A naive t-test approach</p> <p>Single factor analysis with edgeR</p> <p>Differential expression with DESeq</p> <p>Complex experimental arrangements</p> <p>Experimental factors and design matrix</p> <p>GLM with edgeR</p> <p>GLMs with DESeq</p> <p>Heatmap visualization</p> <p>Files for practicing</p> <p>Study exercises and questions</p> <p>References</p> <p>Webliography</p> <p>Packages</p>
<p><b>8 Deciphering the regulome: from ChIP to ChIP-seq</b></p> <p>Chromatin immunoprecipitation</p> <p>Experimental background</p> <p>Fragment analysis</p> <p>ChIP data in ENCODE</p> <p>ChIP with tiling microarrays</p> <p>High-throughput sequencing of ChIP fragments</p> <p>Connecting annotation to peaks</p> <p>Analysis of binding site motifs</p> <p>Files for practicing</p> <p>Study exercises and questions</p> <p>References</p> <p>Webliography</p> <p>Packages</p>
<p><b>9 Inferring regulatory and other networks from gene expression data</b></p> <p>Gene regulatory networks</p> <p>Data for gene network inference</p> <p>Reconstruction of co-expression networks</p> <p>Gene regulatory network inference focusing of master regulators</p> <p>Integrated interpretation of genes with GeneAnswers</p> <p>Files for practicing</p> <p>Study exercises and questions</p> <p>References</p> <p>Packages</p>
<p><b>10 Analysis of biological networks</b></p> <p>A gentle introduction to networks</p> <p>Networks and their components and features</p> <p>Random networks</p> <p>Biological networks</p> <p>Files for storing network information</p> <p>Important network metrics in biology</p> <p>Distance-based measures</p> <p>Degree and related measures</p> <p>Vulnerability</p> <p>Community structure of a network</p> <p>Graph visualization</p> <p>Cytoscape</p> <p>Files for practicing</p> <p>Study exercises and questions</p> <p>References</p> <p>Webliography</p> <p>Packages</p>
<p><b>11 Proteomics: mass spectrometry</b></p> <p>Mass spectrometry and proteomics: why and how?</p> <p>File formats for MS data</p> <p>Accessing the raw data of published studies</p> <p>Identification of peptides in the samples</p> <p>Peptide mass fingerprinting</p> <p>Peptide identification by using MS/MS spectra</p> <p>Quantitative proteomics</p> <p>Getting protein-specific annotation</p> <p>Files for practicing</p> <p>Study exercises and questions</p> <p>References</p> <p>Webliography</p> <p>Packages</p>
<p><b>12 Measuring protein abundance with ELISA</b></p> <p>Enzyme-linked immunosorbent assays</p> <p>Accessing ELISA data</p> <p>Concentration calculation with a standard curve</p> <p>Preparing reference data</p> <p>Fitting linear model</p> <p>Fitting of a logistic model</p> <p>Concentration calculations by employing models</p> <p>Comparative calculations using concentrations</p> <p>Files for practicing</p> <p>Study exercises and questions</p> <p>References</p> <p>Packages</p>
<p><b>13 Flow cytometry: counting and sorting stained cells</b></p> <p>Theoretical aspects of flow cytometry</p> <p>Experiment types: diagnosis versus discovery</p> <p>Measurement arrangements</p> <p>Fluorescent dyes</p> <p>Tubes versus plates</p> <p>Instruments</p> <p>What about data?</p> <p>Files</p> <p>Workflows</p> <p>Data preprocessing</p> <p>Handling all samples together</p> <p>Compensation</p> <p>Quality assurance</p> <p>Using workflow objects and transformation</p> <p>Normalization</p> <p>Cell population identification</p> <p>Manual gating</p> <p>Automatic gating</p> <p>Relating cell populations to external variables</p> <p>Reporting results</p> <p>MIFlowCyt</p> <p>FlowRepository.org</p> <p>Files for practicing</p> <p>Study exercises and questions</p> <p>References</p> <p>Webliography</p> <p>Packages</p>
<p>Glossary</p> <p>Index</p>
<p><b>Csaba Ortutay</b> is a bioinformatician from Finland who has taught several bioinformatics courses at different European universities (Finland, Ireland, and Hungary) for over a decade. He is also active as a researcher publishing in the field of computational immunology.</p> <p><b>Zsuzsanna Ortutay</b> is a molecular immunologist at the University of Tampere, Finland, frequently utilizing diverse molecular lab methods.</p>
