Univariate, Bivariate, and Multivariate Statistics Using R

Quantitative Tools for Data Analysis and Data Science
1st edition

By: Daniel J. Denis

107,99 €

Publisher: Wiley
Format: PDF
Published: 16.04.2020
ISBN/EAN: 9781119549956
Language: English
Pages: 384

DRM-protected eBook; to read it you will need e.g. Adobe Digital Editions and an Adobe ID.

Description

<p><b>A practical source for performing essential statistical analyses and data management tasks in R</b></p> <p><i>Univariate, Bivariate, and Multivariate Statistics Using R</i> offers a practical and very user-friendly introduction to R software, covering a range of statistical methods featured in data analysis and data science. The author, a noted expert in quantitative teaching, has written a quick go-to reference for performing essential statistical analyses and data management tasks in R. Requiring only minimal prior knowledge, the book introduces the concepts needed for an immediate yet clear understanding of the statistics essential to interpreting software output.</p> <p>The author explores univariate, bivariate, and multivariate statistical methods, as well as select nonparametric tests. Altogether, it is a hands-on manual covering the applied statistics and essential R computing skills needed to write theses, dissertations, and research publications. The book is comprehensive in its coverage of univariate through multivariate procedures, while serving as a friendly and gentle introduction to R software for the newcomer. 
This important resource:</p> <ul> <li>Offers an introductory, concise guide to the computational tools useful for making sense of data with R statistical software</li> <li>Provides a resource for students and professionals in the social, behavioral, and natural sciences</li> <li>Emphasizes the computational tools used in the discovery of empirical patterns</li> <li>Features a variety of popular statistical analyses and data management tasks that can be applied immediately to research projects</li> <li>Shows how to apply statistical analysis in R to data sets so readers can quickly get started performing essential tasks in data analysis and data science</li> </ul> <p>Written for students, professionals, and researchers primarily in the social, behavioral, and natural sciences, <i>Univariate, Bivariate, and Multivariate Statistics Using R</i> offers an easy-to-use guide for performing data analysis <i>fast</i>, with an emphasis on drawing conclusions from empirical observations. The book can also serve as a primary or secondary textbook for courses in data analysis or data science, or other courses in which quantitative methods are featured.</p>
<p>Preface xiii</p> <p><b>1 Introduction to Applied Statistics </b><b>1</b></p> <p>1.1 The Nature of Statistics and Inference 2</p> <p>1.2 A Motivating Example 3</p> <p>1.3 What About “Big Data”? 4</p> <p>1.4 Approach to Learning R 7</p> <p>1.5 Statistical Modeling in a Nutshell 7</p> <p>1.6 Statistical Significance Testing and Error Rates 10</p> <p>1.7 Simple Example of Inference Using a Coin 11</p> <p>1.8 Statistics is for Messy Situations 13</p> <p>1.9 Type I versus Type II Errors 14</p> <p>1.10 Point Estimates and Confidence Intervals 15</p> <p>1.11 So What Can We Conclude from One Confidence Interval? 18</p> <p>1.12 Variable Types 19</p> <p>1.13 Sample Size, Statistical Power, and Statistical Significance 22</p> <p>1.14 How “<i>p </i>< 0.05” Happens 23</p> <p>1.15 Effect Size 25</p> <p>1.16 The Verdict on Significance Testing 26</p> <p>1.17 Training versus Test Data 27</p> <p>1.18 How to Get the Most Out of This Book 28</p> <p>Exercises 29</p> <p><b>2 Introduction to R and Computational Statistics </b><b>31</b></p> <p>2.1 How to Install R on Your Computer 34</p> <p>2.2 How to Do Basic Mathematics with R 35</p> <p>2.2.1 Combinations and Permutations 38</p> <p>2.2.2 Plotting Curves Using curve() 39</p> <p>2.3 Vectors and Matrices in R 41</p> <p>2.4 Matrices in R 44</p> <p>2.4.1 The Inverse of a Matrix 47</p> <p>2.4.2 Eigenvalues and Eigenvectors 49</p> <p>2.5 How to Get Data into R 52</p> <p>2.6 Merging Data Frames 55</p> <p>2.7 How to Install a Package in R, and How to Use It 55</p> <p>2.8 How to View the Top, Bottom, and “Some” of a Data File 58</p> <p>2.9 How to Select Subsets from a Dataframe 60</p> <p>2.10 How R Deals with Missing Data 62</p> <p>2.11 Using ls( ) to See Objects in the Workspace 63</p> <p>2.12 Writing Your Own Functions 65</p> <p>2.13 Writing Scripts 65</p> <p>2.14 How to Create Factors in R 66</p> <p>2.15 Using the table() Function 67</p> <p>2.16 Requesting a Demonstration Using the example() Function 68</p> <p>2.17 Citing R in Publications 
69</p> <p>Exercises 69</p> <p><b>3 Exploring Data with R: Essential Graphics and Visualization </b><b>71</b></p> <p>3.1 Statistics, R, and Visualization 71</p> <p>3.2 R’s plot() Function 73</p> <p>3.3 Scatterplots and Depicting Data in Two or More Dimensions 77</p> <p>3.4 Communicating Density in a Plot 79</p> <p>3.5 Stem-and-Leaf Plots 85</p> <p>3.6 Assessing Normality 87</p> <p>3.7 Box-and-Whisker Plots 89</p> <p>3.8 Violin Plots 95</p> <p>3.9 Pie Graphs and Charts 97</p> <p>3.10 Plotting Tables 98</p> <p>Exercises 99</p> <p><b>4 Means, Correlations, Counts: Drawing Inferences Using Easy-to-Implement Statistical Tests </b><b>101</b></p> <p>4.1 Computing <i>z </i>and Related Scores in R 101</p> <p>4.2 Plotting Normal Distributions 105</p> <p>4.3 Correlation Coefficients in R 106</p> <p>4.4 Evaluating Pearson’s <i>r </i>for Statistical Significance 110</p> <p>4.5 Spearman’s Rho: A Nonparametric Alternative to Pearson 111</p> <p>4.6 Alternative Correlation Coefficients in R 113</p> <p>4.7 Tests of Mean Differences 114</p> <p>4.7.1 <i>t</i>-Tests for One Sample 114</p> <p>4.7.2 Two-Sample <i>t</i>-Test 115</p> <p>4.7.3 Was the Welch Test Necessary? 117</p> <p>4.7.4 <i>t</i>-Test via Linear Model Set-up 118</p> <p>4.7.5 Paired-Samples <i>t</i>-Test 118</p> <p>4.8 Categorical Data 120</p> <p>4.8.1 Binomial Test 120</p> <p>4.8.2 Categorical Data Having More Than Two Possibilities 123</p> <p>4.9 Radar Charts 126</p> <p>4.10 Cohen’s Kappa 127</p> <p>Exercises 129</p> <p><b>5 Power Analysis and Sample Size Estimation Using R </b><b>131</b></p> <p>5.1 What is Statistical Power? 131</p> <p>5.2 Does That Mean Power and Huge Sample Sizes Are “Bad?” 133</p> <p>5.3 Should I Be Estimating Power or Sample Size? 134</p> <p>5.4 How Do I Know What the Effect Size Should Be? 
135</p> <p>5.4.1 Ways of Setting Effect Size in Power Analyses 135</p> <p>5.5 Power for <i>t</i>-Tests 136</p> <p>5.5.1 Example: Treatment versus Control Experiment 137</p> <p>5.5.2 Extremely Small Effect Size 138</p> <p>5.6 Estimating Power for a Given Sample Size 140</p> <p>5.7 Power for Other Designs – The Principles Are the Same 140</p> <p>5.7.1 Power for One-Way ANOVA 141</p> <p>5.7.2 Converting <i>R</i><sup>2</sup> to <i>f </i>143</p> <p>5.8 Power for Correlations 143</p> <p>5.9 Concluding Thoughts on Power 145</p> <p>Exercises 146</p> <p><b>6 Analysis of Variance: Fixed Effects, Random Effects, Mixed Models, and Repeated Measures </b><b>147</b></p> <p>6.1 Revisiting <i>t</i>-Tests 147</p> <p>6.2 Introducing the Analysis of Variance (ANOVA) 149</p> <p>6.2.1 Achievement as a Function of Teacher 149</p> <p>6.3 Evaluating Assumptions 152</p> <p>6.3.1 Inferential Tests for Normality 153</p> <p>6.3.2 Evaluating Homogeneity of Variances 154</p> <p>6.4 Performing the ANOVA Using aov() 156</p> <p>6.4.1 The Analysis of Variance Summary Table 157</p> <p>6.4.2 Obtaining Treatment Effects 158</p> <p>6.4.3 Plotting Results of the ANOVA 159</p> <p>6.4.4 Post Hoc Tests on the Teacher Factor 159</p> <p>6.5 Alternative Way of Getting ANOVA Results via lm() 161</p> <p>6.5.1 Contrasts in lm() versus Tukey’s HSD 163</p> <p>6.6 Factorial Analysis of Variance 163</p> <p>6.6.1 Why Not Do Two One-Way ANOVAs? 163</p> <p>6.7 Example of Factorial ANOVA 166</p> <p>6.7.1 Graphing Main Effects and Interaction in the Same Plot 171</p> <p>6.8 Should Main Effects Be Interpreted in the Presence of Interaction? 
172</p> <p>6.9 Simple Main Effects 173</p> <p>6.10 Random Effects ANOVA and Mixed Models 175</p> <p>6.10.1 A Rationale for Random Factors 176</p> <p>6.10.2 One-Way Random Effects ANOVA in R 177</p> <p>6.11 Mixed Models 180</p> <p>6.12 Repeated-Measures Models 181</p> <p>Exercises 186</p> <p><b>7 Simple and Multiple Linear Regression </b><b>189</b></p> <p>7.1 Simple Linear Regression 190</p> <p>7.2 Ordinary Least-Squares Regression 192</p> <p>7.3 Adjusted <i>R</i><sup>2</sup> 198</p> <p>7.4 Multiple Regression Analysis 199</p> <p>7.5 Verifying Model Assumptions 202</p> <p>7.6 Collinearity Among Predictors and the Variance Inflation Factor 206</p> <p>7.7 Model-Building and Selection Algorithms 209</p> <p>7.7.1 Simultaneous Inference 209</p> <p>7.7.2 Hierarchical Regression 210</p> <p>7.7.2.1 Example of Hierarchical Regression 211</p> <p>7.8 Statistical Mediation 214</p> <p>7.9 Best Subset and Forward Regression 217</p> <p>7.9.1 How Forward Regression Works 218</p> <p>7.10 Stepwise Selection 219</p> <p>7.11 The Controversy Surrounding Selection Methods 221</p> <p>Exercises 223</p> <p><b>8 Logistic Regression and the Generalized Linear Model </b><b>225</b></p> <p>8.1 The “Why” Behind Logistic Regression 225</p> <p>8.2 Example of Logistic Regression in R 229</p> <p>8.3 Introducing the Logit: The Log of the Odds 232</p> <p>8.4 The Natural Log of the Odds 233</p> <p>8.5 From Logits Back to Odds 235</p> <p>8.6 Full Example of Logistic Regression 236</p> <p>8.6.1 Challenger O-ring Data 236</p> <p>8.7 Logistic Regression on Challenger Data 240</p> <p>8.8 Analysis of Deviance Table 241</p> <p>8.9 Predicting Probabilities 242</p> <p>8.10 Assumptions of Logistic Regression 243</p> <p>8.11 Multiple Logistic Regression 244</p> <p>8.12 Training Error Rate Versus Test Error Rate 247</p> <p>Exercises 248</p> <p><b>9 Multivariate Analysis of Variance (MANOVA) and Discriminant Analysis </b><b>251</b></p> <p>9.1 Why Conduct MANOVA? 
252</p> <p>9.2 Multivariate Tests of Significance 254</p> <p>9.3 Example of MANOVA in R 257</p> <p>9.4 Effect Size for MANOVA 259</p> <p>9.5 Evaluating Assumptions in MANOVA 261</p> <p>9.6 Outliers 262</p> <p>9.7 Homogeneity of Covariance Matrices 263</p> <p>9.7.1 What if the Box-M Test Had Suggested a Violation? 264</p> <p>9.8 Linear Discriminant Function Analysis 265</p> <p>9.9 Theory of Discriminant Analysis 266</p> <p>9.10 Discriminant Analysis in R 267</p> <p>9.11 Computing Discriminant Scores Manually 270</p> <p>9.12 Predicting Group Membership 271</p> <p>9.13 How Well Did the Discriminant Function Analysis Do? 272</p> <p>9.14 Visualizing Separation 275</p> <p>9.15 Quadratic Discriminant Analysis 276</p> <p>9.16 Regularized Discriminant Analysis 278</p> <p>Exercises 278</p> <p><b>10 Principal Component Analysis </b><b>281</b></p> <p>10.1 Principal Component Analysis Versus Factor Analysis 282</p> <p>10.2 A Very Simple Example of PCA 283</p> <p>10.2.1 Pearson’s 1901 Data 284</p> <p>10.2.2 Assumptions of PCA 286</p> <p>10.2.3 Running the PCA 288</p> <p>10.2.4 Loadings in PCA 290</p> <p>10.3 What Are the Loadings in PCA? 292</p> <p>10.4 Properties of Principal Components 293</p> <p>10.5 Component Scores 294</p> <p>10.6 How Many Components to Keep? 
295</p> <p>10.6.1 The Scree Plot as an Aid to Component Retention 295</p> <p>10.7 Principal Components of USA Arrests Data 297</p> <p>10.8 Unstandardized Versus Standardized Solutions 301</p> <p>Exercises 304</p> <p><b>11 Exploratory Factor Analysis </b><b>307</b></p> <p>11.1 Common Factor Analysis Model 308</p> <p>11.2 A Technical and Philosophical Pitfall of EFA 310</p> <p>11.3 Factor Analysis Versus Principal Component Analysis on the Same Data 311</p> <p>11.3.1 Demonstrating the Non-Uniqueness Issue 311</p> <p>11.4 The Issue of Factor Retention 314</p> <p>11.5 Initial Eigenvalues in Factor Analysis 315</p> <p>11.6 Rotation in Exploratory Factor Analysis 316</p> <p>11.7 Estimation in Factor Analysis 318</p> <p>11.8 Example of Factor Analysis on the Holzinger and Swineford Data 318</p> <p>11.8.1 Obtaining Initial Eigenvalues 323</p> <p>11.8.2 Making Sense of the Factor Solution 324</p> <p>Exercises 325</p> <p><b>12 Cluster Analysis </b><b>327</b></p> <p>12.1 A Simple Example of Cluster Analysis 329</p> <p>12.2 The Concepts of Proximity and Distance in Cluster Analysis 332</p> <p>12.3 <i>k</i>-Means Cluster Analysis 332</p> <p>12.4 Minimizing Criteria 333</p> <p>12.5 Example of <i>k</i>-Means Clustering in R 334</p> <p>12.5.1 Plotting the Data 335</p> <p>12.6 Hierarchical Cluster Analysis 339</p> <p>12.7 Why Clustering is Inherently Subjective 343</p> <p>Exercises 344</p> <p><b>13 Nonparametric Tests </b><b>347</b></p> <p>13.1 Mann–Whitney <i>U </i>Test 348</p> <p>13.2 Kruskal–Wallis Test 349</p> <p>13.3 Nonparametric Test for Paired Comparisons and Repeated Measures 351</p> <p>13.3.1 Wilcoxon Signed-Rank Test and Friedman Test 351</p> <p>13.4 Sign Test 354</p> <p>Exercises 356</p> <p>References 359</p> <p>Index 363</p>
<p><b>DANIEL J. DENIS, P<small>H</small>D,</b> is Professor of Quantitative Psychology in the Department of Psychology at the University of Montana. He is the author of <i>Applied Univariate, Bivariate, and Multivariate Statistics</i> and <i>SPSS Data Analysis for Univariate, Bivariate, and Multivariate Statistics</i>, both published by Wiley.</p>
