Details

The Big R-Book



From Data Science to Learning Machines and Big Data
1st ed.

by: Philippe J. S. De Brouwer

111,99 €

Publisher: Wiley
Format: PDF
Published: 29 September 2020
ISBN/EAN: 9781119632764
Language: English
Number of pages: 928

DRM-protected eBook; to read it you need, for example, Adobe Digital Editions and an Adobe ID.

Descriptions

<p><b>Introduces professionals and scientists to statistics and machine learning using the programming language R</b></p> <p>Written by and for practitioners, this book provides an overall introduction to R, focusing on tools and methods commonly used in data science, and placing emphasis on practice and business use. It covers a wide range of topics in a single volume, including big data, databases, statistical machine learning, data wrangling, data visualization, and the reporting of results. The topics covered are all important for someone with a science/math background that is looking to quickly learn several practical technologies to enter or transition to the growing field of data science. </p> <p><i>The Big R-Book for Professionals: From Data Science to Learning Machines and Reporting with R</i> includes nine parts, starting with an introduction to the subject and followed by an overview of R and elements of statistics. The third part revolves around data, while the fourth focuses on data wrangling. Part 5 teaches readers about exploring data. In Part 6 we learn to build models, Part 7 introduces the reader to the reality in companies, Part 8 covers reports and interactive applications and finally Part 9 introduces the reader to big data and performance computing. 
It also includes some helpful appendices.</p> <ul> <li>Provides a practical guide for non-experts with a focus on business users</li> <li>Contains a unique combination of topics including an introduction to R, machine learning, mathematical models, data wrangling, and reporting</li> <li>Uses a practical tone and integrates multiple topics in a coherent framework</li> <li>Demystifies the hype around machine learning and AI by enabling readers to understand the provided models and program them in R</li> <li>Shows readers how to visualize results in static and interactive reports</li> <li>Supplementary materials include PDF slides based on the book’s content, as well as all the extracted R-code, and are available to everyone on a Wiley Book Companion Site</li> </ul> <p><i>The Big R-Book</i> is an excellent guide for science, technology, engineering, or mathematics students who wish to make a successful transition from the academic world to the professional. It will also appeal to all young data scientists, quantitative analysts, and analytics professionals, as well as those who make mathematical models.</p>
<p>Foreword xxv</p> <p>About the Author xxvii</p> <p>Acknowledgements xxix</p> <p>Preface xxxi</p> <p>About the Companion Site xxxv</p> <p><b>I Introduction 1</b></p> <p><b>1 The Big Picture with Kondratiev and Kardashev 3</b></p> <p><b>2 The Scientific Method and Data 7</b></p> <p><b>3 Conventions 11</b></p> <p><b>II Starting with R and Elements of Statistics 19</b></p> <p><b>4 The Basics of R 21</b></p> <p>4.1 Getting Started with R 23</p> <p>4.2 Variables 26</p> <p>4.3 Data Types 28</p> <p>4.3.1 The Elementary Types 28</p> <p>4.3.2 Vectors 29</p> <p>4.3.3 Accessing Data from a Vector 29</p> <p>4.3.4 Matrices 32</p> <p>4.3.5 Arrays 38</p> <p>4.3.6 Lists 41</p> <p>4.3.7 Factors 45</p> <p>4.3.8 Data Frames 49</p> <p>4.3.9 Strings or the Character-type 54</p> <p>4.4 Operators 57</p> <p>4.4.1 Arithmetic Operators 57</p> <p>4.4.2 Relational Operators 57</p> <p>4.4.3 Logical Operators 58</p> <p>4.4.4 Assignment Operators 59</p> <p>4.4.5 Other Operators 61</p> <p>4.5 Flow Control Statements 63</p> <p>4.5.1 Choices 63</p> <p>4.5.2 Loops 65</p> <p>4.6 Functions 69</p> <p>4.6.1 Built-in Functions 69</p> <p>4.6.2 Help with Functions 69</p> <p>4.6.3 User-defined Functions 70</p> <p>4.6.4 Changing Functions 70</p> <p>4.6.5 Creating Function with Default Arguments 71</p> <p>4.7 Packages 72</p> <p>4.7.1 Discovering Packages in R 72</p> <p>4.7.2 Managing Packages in R 73</p> <p>4.8 Selected Data Interfaces 75</p> <p>4.8.1 CSV Files 75</p> <p>4.8.2 Excel Files 79</p> <p>4.8.3 Databases 79</p> <p><b>5 Lexical Scoping and Environments 81</b></p> <p>5.1 Environments in R 81</p> <p>5.2 Lexical Scoping in R 83</p> <p><b>6 The Implementation of OO 87</b></p> <p>6.1 Base Types 89</p> <p>6.2 S3 Objects 91</p> <p>6.2.1 Creating S3 Objects 94</p> <p>6.2.2 Creating Generic Methods 96</p> <p>6.2.3 Method Dispatch 97</p> <p>6.2.4 Group Generic Functions 98</p> <p>6.3 S4 Objects 100</p> <p>6.3.1 Creating S4 Objects 100</p> <p>6.3.2 Using S4 Objects 101</p> <p>6.3.3 Validation of Input 105</p> 
<p>6.3.4 Constructor functions 107</p> <p>6.3.5 The Data slot 108</p> <p>6.3.6 Recognising Objects, Generic Functions, and Methods 108</p> <p>6.3.7 Creating S4 Generics 110</p> <p>6.3.8 Method Dispatch 111</p> <p>6.4 The Reference Class, refclass, RC or R5 Model 113</p> <p>6.4.1 Creating RC Objects 113</p> <p>6.4.2 Important Methods and Attributes 117</p> <p>6.5 Conclusions about the OO Implementation 119</p> <p><b>7 Tidy R with the Tidyverse 121</b></p> <p>7.1 The Philosophy of the Tidyverse 121</p> <p>7.2 Packages in the Tidyverse 124</p> <p>7.2.1 The Core Tidyverse 124</p> <p>7.2.2 The Non-core Tidyverse 125</p> <p>7.3 Working with the Tidyverse 127</p> <p>7.3.1 Tibbles 127</p> <p>7.3.2 Piping with R 132</p> <p>7.3.3 Attention Points When Using the Pipe 133</p> <p>7.3.4 Advanced Piping 134</p> <p>7.3.5 Conclusion 137</p> <p><b>8 Elements of Descriptive Statistics 139</b></p> <p>8.1 Measures of Central Tendency 139</p> <p>8.1.1 Mean 139</p> <p>8.1.2 The Median 142</p> <p>8.1.3 The Mode 143</p> <p>8.2 Measures of Variation or Spread 145</p> <p>8.3 Measures of Covariation 147</p> <p>8.3.1 The Pearson Correlation 147</p> <p>8.3.2 The Spearman Correlation 148</p> <p>8.3.3 Chi-square Tests 149</p> <p>8.4 Distributions 150</p> <p>8.4.1 Normal Distribution 150</p> <p>8.4.2 Binomial Distribution 153</p> <p>8.5 Creating an Overview of Data Characteristics 155</p> <p><b>9 Visualisation Methods 159</b></p> <p>9.1 Scatterplots 161</p> <p>9.2 Line Graphs 163</p> <p>9.3 Pie Charts 165</p> <p>9.4 Bar Charts 167</p> <p>9.5 Boxplots 171</p> <p>9.6 Violin Plots 173</p> <p>9.7 Histograms 176</p> <p>9.8 Plotting Functions 179</p> <p>9.9 Maps and Contour Plots 180</p> <p>9.10 Heat-maps 181</p> <p>9.11 Text Mining 184</p> <p>9.11.1 Word Clouds 184</p> <p>9.11.2 Word Associations 188</p> <p>9.12 Colours in R 191</p> <p><b>10 Time Series Analysis 197</b></p> <p>10.1 Time Series in R 197</p> <p>10.1.1 The Basics of Time Series in R 197</p> <p>10.2 Forecasting 200</p> <p>10.2.1 Moving 
Average 200</p> <p>10.2.2 Seasonal Decomposition 206</p> <p><b>11 Further Reading 211</b></p> <p><b>III Data Import 213</b></p> <p><b>12 A Short History of Modern Database Systems 215</b></p> <p><b>13 RDBMS 219</b></p> <p><b>14 SQL 223</b></p> <p>14.1 Designing the Database 223</p> <p>14.2 Building the Database Structure 226</p> <p>14.2.1 Installing a RDBMS 226</p> <p>14.2.2 Creating the Database 228</p> <p>14.2.3 Creating the Tables and Relations 229</p> <p>14.3 Adding Data to the Database 235</p> <p>14.4 Querying the Database 239</p> <p>14.4.1 The Basic Select Query 239</p> <p>14.4.2 More Complex Queries 240</p> <p>14.5 Modifying the Database Structure 244</p> <p>14.6 Selected Features of SQL 249</p> <p>14.6.1 Changing Data 249</p> <p>14.6.2 Functions in SQL 249</p> <p><b>15 Connecting R to an SQL Database 253</b></p> <p><b>IV Data Wrangling 257</b></p> <p><b>16 Anonymous Data 261</b></p> <p><b>17 Data Wrangling in the tidyverse 265</b></p> <p>17.1 Importing the Data 266</p> <p>17.1.1 Importing from an SQL RDBMS 266</p> <p>17.1.2 Importing Flat Files in the Tidyverse 267</p> <p>17.2 Tidy Data 275</p> <p>17.3 Tidying Up Data with tidyr 277</p> <p>17.3.1 Splitting Tables 278</p> <p>17.3.2 Convert Headers to Data 281</p> <p>17.3.3 Spreading One Column Over Many 284</p> <p>17.3.4 Split One Columns into Many 285</p> <p>17.3.5 Merge Multiple Columns Into One 286</p> <p>17.3.6 Wrong Data 287</p> <p>17.4 SQL-like Functionality via dplyr 288</p> <p>17.4.1 Selecting Columns 288</p> <p>17.4.2 Filtering Rows 289</p> <p>17.4.3 Joining 290</p> <p>17.4.4 Mutating Data 293</p> <p>17.4.5 Set Operations 296</p> <p>17.5 String Manipulation in the tidyverse 299</p> <p>17.5.1 Basic String Manipulation 300</p> <p>17.5.2 Pattern Matching with Regular Expressions 302</p> <p>17.6 Dates with lubridate 314</p> <p>17.6.1 ISO 8601 Format 315</p> <p>17.6.2 Time-zones 317</p> <p>17.6.3 Extract Date and Time Components 318</p> <p>17.6.4 Calculating with Date-times 319</p> <p>17.7 Factors with 
Forcats 325</p> <p><b>18 Dealing with Missing Data 333</b></p> <p>18.1 Reasons for Data to be Missing 334</p> <p>18.2 Methods to Handle Missing Data 336</p> <p>18.2.1 Alternative Solutions to Missing Data 336</p> <p>18.2.2 Predictive Mean Matching (PMM) 338</p> <p>18.3 R Packages to Deal with Missing Data 339</p> <p>18.3.1 mice 339</p> <p>18.3.2 missForest 340</p> <p>18.3.3 Hmisc 341</p> <p><b>19 Data Binning 343</b></p> <p>19.1 What is Binning and Why Use It 343</p> <p>19.2 Tuning the Binning Procedure 347</p> <p>19.3 More Complex Cases: Matrix Binning 352</p> <p>19.4 Weight of Evidence and Information Value 359</p> <p>19.4.1 Weight of Evidence (WOE) 359</p> <p>19.4.2 Information Value (IV) 359</p> <p>19.4.3 WOE and IV in R 359</p> <p><b>20 Factoring Analysis and Principle Components 363</b></p> <p>20.1 Principle Components Analysis (PCA) 364</p> <p>20.2 Factor Analysis 368</p> <p><b>V Modelling 373</b></p> <p><b>21 Regression Models 375</b></p> <p>21.1 Linear Regression 375</p> <p>21.2 Multiple Linear Regression 379</p> <p>21.2.1 Poisson Regression 379</p> <p>21.2.2 Non-linear Regression 381</p> <p>21.3 Performance of Regression Models 384</p> <p>21.3.1 Mean Square Error (MSE) 384</p> <p>21.3.2 <i>R</i>-Squared 384</p> <p>21.3.3 Mean Average Deviation (MAD) 386</p> <p><b>22 Classification Models 387</b></p> <p>22.1 Logistic Regression 388</p> <p>22.2 Performance of Binary Classification Models 390</p> <p>22.2.1 The Confusion Matrix and Related Measures 391</p> <p>22.2.2 ROC 393</p> <p>22.2.3 The AUC 396</p> <p>22.2.4 The Gini Coefficient 397</p> <p>22.2.5 Kolmogorov-Smirnov (KS) for Logistic Regression 398</p> <p>22.2.6 Finding an Optimal Cut-off 399</p> <p><b>23 Learning Machines 405</b></p> <p>23.1 Decision Tree 407</p> <p>23.1.1 Essential Background 407</p> <p>23.1.2 Important Considerations 412</p> <p>23.1.3 Growing Trees with the Package rpart 414</p> <p>23.1.4 Evaluating the Performance of a Decision Tree 424</p> <p>23.2 Random Forest 428</p> <p>23.3 Artificial 
Neural Networks (ANNs) 434</p> <p>23.3.1 The Basics of ANNs in R 434</p> <p>23.3.2 Neural Networks in R 436</p> <p>23.3.3 The Work-flow for Fitting a NN 438</p> <p>23.3.4 Cross Validate the NN 444</p> <p>23.4 Support Vector Machine 447</p> <p>23.4.1 Fitting a SVM in R 447</p> <p>23.4.2 Optimizing the SVM 449</p> <p>23.5 Unsupervised Learning and Clustering 450</p> <p>23.5.1 k-Means Clustering 450</p> <p>23.5.2 Visualizing Clusters in Three Dimensions 462</p> <p>23.5.3 Fuzzy Clustering 464</p> <p>23.5.4 Hierarchical Clustering 466</p> <p>23.5.5 Other Clustering Methods 468</p> <p><b>24 Towards a Tidy Modelling Cycle with modelr 469</b></p> <p>24.1 Adding Predictions 470</p> <p>24.2 Adding Residuals 471</p> <p>24.3 Bootstrapping Data 472</p> <p>24.4 Other Functions of modelr 474</p> <p><b>25 Model Validation 475</b></p> <p>25.1 Model Quality Measures 476</p> <p>25.2 Predictions and Residuals 477</p> <p>25.3 Bootstrapping 479</p> <p>25.3.1 Bootstrapping in Base R 479</p> <p>25.3.2 Bootstrapping in the tidyverse with modelr 481</p> <p>25.4 Cross-Validation 483</p> <p>25.4.1 Elementary Cross Validation 483</p> <p>25.4.2 Monte Carlo Cross Validation 486</p> <p>25.4.3 <i>k</i>-Fold Cross Validation 488</p> <p>25.4.4 Comparing Cross Validation Methods 489</p> <p>25.5 Validation in a Broader Perspective 492</p> <p><b>26 Labs 495</b></p> <p>26.1 Financial Analysis with quantmod 495</p> <p>26.1.1 The Basics of quantmod 495</p> <p>26.1.2 Types of Data Available in quantmod 496</p> <p>26.1.3 Plotting with quantmod 497</p> <p>26.1.4 The quantmod Data Structure 500</p> <p>26.1.5 Support Functions Supplied by quantmod 502</p> <p>26.1.6 Financial Modelling in quantmod 504</p> <p><b>27 Multi Criteria Decision Analysis (MCDA) 511</b></p> <p>27.1 What and Why 511</p> <p>27.2 General Work-flow 513</p> <p>27.3 Identify the Issue at Hand: Steps 1 and 2 516</p> <p>27.4 Step 3: the Decision Matrix 518</p> <p>27.4.1 Construct a Decision Matrix 518</p> <p>27.4.2 Normalize the Decision 
Matrix 520</p> <p>27.5 Step 4: Delete Inefficient and Unacceptable Alternatives 521</p> <p>27.5.1 Unacceptable Alternatives 521</p> <p>27.5.2 Dominance – Inefficient Alternatives 521</p> <p>27.6 Plotting Preference Relationships 524</p> <p>27.7 Step 5: MCDA Methods 526</p> <p>27.7.1 Examples of Non-compensatory Methods 526</p> <p>27.7.2 The Weighted Sum Method (WSM) 527</p> <p>27.7.3 Weighted Product Method (WPM) 530</p> <p>27.7.4 ELECTRE 530</p> <p>27.7.5 PROMethEE 540</p> <p>27.7.6 PCA (Gaia) 553</p> <p>27.7.7 Outranking Methods 557</p> <p>27.7.8 Goal Programming 558</p> <p>27.8 Summary MCDA 561</p> <p><b>VI Introduction to Companies 563</b></p> <p><b>28 Financial Accounting (FA) 567</b></p> <p>28.1 The Statements of Accounts 568</p> <p>28.1.1 Income Statement 568</p> <p>28.1.2 Net Income: The P&L statement 568</p> <p>28.1.3 Balance Sheet 569</p> <p>28.2 The Value Chain 571</p> <p>28.3 Further, Terminology 573</p> <p>28.4 Selected Financial Ratios 575</p> <p><b>29 Management Accounting 583</b></p> <p>29.1 Introduction 583</p> <p>29.1.1 Definition of Management Accounting (MA) 583</p> <p>29.1.2 Management Information Systems (MIS) 584</p> <p>29.2 Selected Methods in MA 585</p> <p>29.2.1 Cost Accounting 585</p> <p>29.2.2 Selected Cost Types 587</p> <p>29.3 Selected Use Cases of MA 590</p> <p>29.3.1 Balanced Scorecard 590</p> <p>29.3.2 Key Performance Indicators (KPIs) 591</p> <p><b>30 Asset Valuation Basics 597</b></p> <p>30.1 Time Value of Money 598</p> <p>30.1.1 Interest Basics 598</p> <p>30.1.2 Specific Interest Rate Concepts 598</p> <p>30.1.3 Discounting 600</p> <p>30.2 Cash 601</p> <p>30.3 Bonds 602</p> <p>30.3.1 Features of a Bond 602</p> <p>30.3.2 Valuation of Bonds 604</p> <p>30.3.3 Duration 606</p> <p>30.4 The Capital Asset Pricing Model (CAPM) 610</p> <p>30.4.1 The CAPM Framework 610</p> <p>30.4.2 The CAPM and Risk 612</p> <p>30.4.3 Limitations and Shortcomings of the CAPM 612</p> <p>30.5 Equities 614</p> <p>30.5.1 Definition 614</p> <p>30.5.2 Short History 
614</p> <p>30.5.3 Valuation of Equities 615</p> <p>30.5.4 Absolute Value Models 616</p> <p>30.5.5 Relative Value Models 625</p> <p>30.5.6 Selection of Valuation Methods 630</p> <p>30.5.7 Pitfalls in Company Valuation 631</p> <p>30.6 Forwards and Futures 638</p> <p>30.7 Options 640</p> <p>30.7.1 Definitions 640</p> <p>30.7.2 Commercial Aspects 642</p> <p>30.7.3 Short History 643</p> <p>30.7.4 Valuation of Options at Maturity 644</p> <p>30.7.5 The Black and Scholes Model 649</p> <p>30.7.6 The Binomial Model 654</p> <p>30.7.7 Dependencies of the Option Price 660</p> <p>30.7.8 The Greeks 664</p> <p>30.7.9 Delta Hedging 665</p> <p>30.7.10 Linear Option Strategies 667</p> <p>30.7.11 Integrated Option Strategies 674</p> <p>30.7.12 Exotic Options 678</p> <p>30.7.13 Capital Protected Structures 680</p> <p><b>VII Reporting 683</b></p> <p><b>31 A Grammar of Graphics with ggplot2 687</b></p> <p>31.1 The Basics of ggplot2 688</p> <p>31.2 Over-plotting 692</p> <p>31.3 Case Study for ggplot2 696</p> <p><b>32 R Markdown 699</b></p> <p><b>33 knitr and LATEX 703</b></p> <p><b>34 An Automated Development Cycle 707</b></p> <p><b>35 Writing and Communication Skills 709</b></p> <p><b>36 Interactive Apps 713</b></p> <p>36.1 Shiny 715</p> <p>36.2 Browser Born Data Visualization 719</p> <p>36.2.1 HTML-widgets 719</p> <p>36.2.2 Interactive Maps with leaflet 720</p> <p>36.2.3 Interactive Data Visualisation with ggvis 721</p> <p>36.2.4 googleVis 723</p> <p>36.3 Dashboards 725</p> <p>36.3.1 The Business Case: a Diversity Dashboard 726</p> <p>36.3.2 A Dashboard with flexdashboard 731</p> <p>36.3.3 A Dashboard with shinydashboard 737</p> <p><b>VIII Bigger and Faster R 741</b></p> <p><b>37 Parallel Computing 743</b></p> <p>37.1 Combine foreach and doParallel 745</p> <p>37.2 Distribute Calculations over LAN with Snow 748</p> <p>37.3 Using the GPU 752</p> <p>37.3.1 Getting Started with gpuR 754</p> <p>37.3.2 On the Importance of Memory use 757</p> <p>37.3.3 Conclusions for GPU Programming 759</p> 
<p><b>38 R and Big Data 761</b></p> <p>38.1 Use a Powerful Server 763</p> <p>38.1.1 Use R on a Server 763</p> <p>38.1.2 Let the Database Server do the Heavy Lifting 763</p> <p>38.2 Using more Memory than we have RAM 765</p> <p><b>39 Parallelism for Big Data 767</b></p> <p>39.1 Apache Hadoop 769</p> <p>39.2 Apache Spark 771</p> <p>39.2.1 Installing Spark 771</p> <p>39.2.2 Running Spark 773</p> <p>39.2.3 SparkR 776</p> <p>39.2.4 sparklyr 788</p> <p>39.2.5 SparkR or sparklyr 791</p> <p><b>40 The Need for Speed 793</b></p> <p>40.1 Benchmarking 794</p> <p>40.2 Optimize Code 797</p> <p>40.2.1 Avoid Repeating the Same 797</p> <p>40.2.2 Use Vectorisation where Appropriate 797</p> <p>40.2.3 Pre-allocating Memory 799</p> <p>40.2.4 Use the Fastest Function 800</p> <p>40.2.5 Use the Fastest Package 801</p> <p>40.2.6 Be Mindful about Details 802</p> <p>40.2.7 Compile Functions 804</p> <p>40.2.8 Use C or C++ Code in R 806</p> <p>40.2.9 Using a C++ Source File in R 809</p> <p>40.2.10 Call Compiled C++ Functions in R 811</p> <p>40.3 Profiling Code 812</p> <p>40.3.1 The Package profr 813</p> <p>40.3.2 The Package proftools 813</p> <p>40.4 Optimize Your Computer 817</p> <p><b>IX Appendices 819</b></p> <p><b>A Create your own R Package 821</b></p> <p>A.1 Creating the Package in the R Console 823</p> <p>A.2 Update the Package Description 825</p> <p>A.3 Documenting the Functions 826</p> <p>A.4 Loading the Package 827</p> <p>A.5 Further Steps 828</p> <p><b>B Levels of Measurement 829</b></p> <p>B.1 Nominal Scale 829</p> <p>B.2 Ordinal Scale 830</p> <p>B.3 Interval Scale 831</p> <p>B.4 Ratio Scale 832</p> <p><b>C Trademark Notices 833</b></p> <p>C.1 General Trademark Notices 834</p> <p>C.2 R-Related Notices 835</p> <p>C.2.1 Crediting Developers of R Packages 835</p> <p>C.2.2 The R-packages used in this Book 835</p> <p><b>D Code Not Shown in the Body of the Book 839</b></p> <p><b>E Answers to Selected Questions 845</b></p> <p>Bibliography 859</p> <p>Nomenclature 869</p> <p>Index 881 </p>
<p><b>PHILIPPE J.S. DE BROUWER, P<small>H</small>D,</b> is director at HSBC, guest professor at four universities and MBA programs (University of Warsaw, Jagiellonian University, Krakow School of Business and AGH University of Science and Technology) and honorary consul for Belgium in Krakow. As a professor, he builds bridges not only between universities and the industry, but also across disciplines. He teaches mathematicians leadership skills and non-mathematicians coding. As a scientist, he tries to combine research on financial markets, psychology, and investments to the benefit of the investor. As an honorary consul he is passionate about serving the community and helping initiatives grow.
<p><b>Introduces professionals and scientists to statistics, machine learning, and big data using the programming language R</b> <p>Written by and for practitioners, this book provides an overall introduction to R, focusing on tools and methods commonly used in data science, and placing emphasis on practice and business use. It covers a wide range of topics in a single volume, including big data, databases, statistical machine learning, data wrangling, data visualization, and the reporting of results. The topics covered are all important for someone with a science/math background that is looking to quickly learn several practical technologies to enter or transition to the growing field of data science. <p><i>The Big R-Book: From Data Science to Learning Machines and Big Data</i> includes nine parts, starting with an introduction to the subject and followed by an overview of R and elements of statistics. The third part revolves around data, while the fourth focuses on data wrangling and exploring data. In Part 5 we learn to build models, Part 6 introduces the reader to the reality in companies, Part 7 covers reports and interactive applications and Part 8 introduces the reader to big data and performance computing. The appendices focus on specialist topics such as building your own extension for R, answers to questions that appear throughout the book, etc. 
<ul> <li>Provides a practical guide for non-experts with a focus on business users</li> <li>Contains a unique combination of topics including an introduction to R, machine learning, multi criteria decision analysis, mathematical models, data wrangling, and reporting</li> <li>Uses a practical tone and integrates multiple topics in a coherent framework</li> <li>Demystifies the hype around machine learning and AI by enabling readers to understand the models and program them in R</li> <li>Shows readers how to visualize results in reports and dynamic websites</li> <li>Supplementary materials include PDF slides based on the book's content on a Wiley Instructor-only Book Companion Site, as well as all the extracted R-code available to everyone on a Wiley Student Book Companion Site</li> </ul> <p><i>The Big R-Book</i> is an excellent guide for science, technology, engineering, or mathematics students and graduates who wish to make a successful transition from the academic world to the professional. It will also appeal to all young data scientists, quantitative analysts, and analytics professionals, as well as those who make mathematical models or review them.

You might also be interested in these products:

Statistics for Microarrays
by: Ernst Wit, John McClure
PDF ebook
90,99 €