Details

Handbook of Regression Analysis With Applications in R



Wiley Series in Probability and Statistics, 2nd ed.

by: Samprit Chatterjee, Jeffrey S. Simonoff

100,99 €

Publisher: Wiley
Format: PDF
Published: 27 July 2020
ISBN/EAN: 9781119392477
Language: English
Number of pages: 384

DRM-protected eBook; to read it you need, for example, Adobe Digital Editions and an Adobe ID.

Descriptions

<p><b>Handbook and reference guide for students and practitioners of statistical regression-based analyses in R</b></p> <p><i>Handbook of Regression Analysis with Applications in R,</i> Second Edition is a comprehensive and up-to-date guide to conducting complex regressions in the R statistical programming language. The authors' thorough treatment of "classical" regression analysis in the first edition is complemented here by their discussion of more advanced topics including time-to-event survival data and longitudinal and clustered data.</p> <p>The book further pays particular attention to methods that have become prominent in the last few decades as increasingly large data sets have made new techniques and applications possible. These include:</p> <ul> <li>Regularization methods</li> <li>Smoothing methods</li> <li>Tree-based methods</li> </ul> <p>In the new edition of the <i>Handbook,</i> the data analyst's toolkit is explored and expanded. Examples are drawn from a wide variety of real-life applications and data sets. All the utilized R code and data are available via an author-maintained website.</p> <p>Of interest to undergraduate and graduate students taking courses in statistics and regression, the <i>Handbook of Regression Analysis</i> will also be invaluable to practicing data scientists and statisticians.</p>
<p>Preface to the Second Edition</p> <p>Preface to the First Edition</p>
<p><b>Part I The Multiple Linear Regression Model</b></p>
<p><b>1 Multiple Linear Regression</b></p> <p>1.1 Introduction</p> <p>1.2 Concepts and Background Material</p> <p>1.2.1 The Linear Regression Model</p> <p>1.2.2 Estimation Using Least Squares</p> <p>1.2.3 Assumptions</p> <p>1.3 Methodology</p> <p>1.3.1 Interpreting Regression Coefficients</p> <p>1.3.2 Measuring the Strength of the Regression Relationship</p> <p>1.3.3 Hypothesis Tests and Confidence Intervals for <i>β</i></p> <p>1.3.4 Fitted Values and Predictions</p> <p>1.3.5 Checking Assumptions Using Residual Plots</p> <p>1.4 Example—Estimating Home Prices</p> <p>1.5 Summary</p>
<p><b>2 Model Building</b></p> <p>2.1 Introduction</p> <p>2.2 Concepts and Background Material</p> <p>2.2.1 Using Hypothesis Tests to Compare Models</p> <p>2.2.2 Collinearity</p> <p>2.3 Methodology</p> <p>2.3.1 Model Selection</p> <p>2.3.2 Example—Estimating Home Prices (continued)</p> <p>2.4 Indicator Variables and Modeling Interactions</p> <p>2.4.1 Example—Electronic Voting and the 2004 Presidential Election</p> <p>2.5 Summary</p>
<p><b>Part II Addressing Violations of Assumptions</b></p>
<p><b>3 Diagnostics for Unusual Observations</b></p> <p>3.1 Introduction</p> <p>3.2 Concepts and Background Material</p> <p>3.3 Methodology</p> <p>3.3.1 Residuals and Outliers</p> <p>3.3.2 Leverage Points</p> <p>3.3.3 Influential Points and Cook’s Distance</p> <p>3.4 Example—Estimating Home Prices (continued)</p> <p>3.5 Summary</p>
<p><b>4 Transformations and Linearizable Models</b></p> <p>4.1 Introduction</p> <p>4.2 Concepts and Background Material: The Log-Log Model</p> <p>4.3 Concepts and Background Material: Semilog Models</p> <p>4.3.1 Logged Response Variable</p> <p>4.3.2 Logged Predictor Variable</p> <p>4.4 Example—Predicting Movie Grosses After One Week</p> <p>4.5 Summary</p>
<p><b>5 Time Series Data and Autocorrelation</b></p> <p>5.1 Introduction</p> <p>5.2 Concepts and Background Material</p> <p>5.3 Methodology: Identifying Autocorrelation</p> <p>5.3.1 The Durbin-Watson Statistic</p> <p>5.3.2 The Autocorrelation Function (ACF)</p> <p>5.3.3 Residual Plots and the Runs Test</p> <p>5.4 Methodology: Addressing Autocorrelation</p> <p>5.4.1 Detrending and Deseasonalizing</p> <p>5.4.2 Example—e-Commerce Retail Sales</p> <p>5.4.3 Lagging and Differencing</p> <p>5.4.4 Example—Stock Indexes</p> <p>5.4.5 Generalized Least Squares (GLS): The Cochrane-Orcutt Procedure</p> <p>5.4.6 Example—Time Intervals Between Old Faithful Geyser Eruptions</p> <p>5.5 Summary</p>
<p><b>Part III Categorical Predictors</b></p>
<p><b>6 Analysis of Variance</b></p> <p>6.1 Introduction</p> <p>6.2 Concepts and Background Material</p> <p>6.2.1 One-Way ANOVA</p> <p>6.2.2 Two-Way ANOVA</p> <p>6.3 Methodology</p> <p>6.3.1 Codings for Categorical Predictors</p> <p>6.3.2 Multiple Comparisons</p> <p>6.3.3 Levene’s Test and Weighted Least Squares</p> <p>6.3.4 Membership in Multiple Groups</p> <p>6.4 Example—DVD Sales of Movies</p> <p>6.5 Higher-Way ANOVA</p> <p>6.6 Summary</p>
<p><b>7 Analysis of Covariance</b></p> <p>7.1 Introduction</p> <p>7.2 Methodology</p> <p>7.2.1 Constant Shift Models</p> <p>7.2.2 Varying Slope Models</p> <p>7.3 Example—International Grosses of Movies</p> <p>7.4 Summary</p>
<p><b>Part IV Non-Gaussian Regression Models</b></p>
<p><b>8 Logistic Regression</b></p> <p>8.1 Introduction</p> <p>8.2 Concepts and Background Material</p> <p>8.2.1 The Logit Response Function</p> <p>8.2.2 Bernoulli and Binomial Random Variables</p> <p>8.2.3 Prospective and Retrospective Designs</p> <p>8.3 Methodology</p> <p>8.3.1 Maximum Likelihood Estimation</p> <p>8.3.2 Inference, Model Comparison, and Model Selection</p> <p>8.3.3 Goodness-of-Fit</p> <p>8.3.4 Measures of Association and Classification Accuracy</p> <p>8.3.5 Diagnostics</p> <p>8.4 Example—Smoking and Mortality</p> <p>8.5 Example—Modeling Bankruptcy</p> <p>8.6 Summary</p>
<p><b>9 Multinomial Regression</b></p> <p>9.1 Introduction</p> <p>9.2 Concepts and Background Material</p> <p>9.2.1 Nominal Response Variable</p> <p>9.2.2 Ordinal Response Variable</p> <p>9.3 Methodology</p> <p>9.3.1 Estimation</p> <p>9.3.2 Inference, Model Comparisons, and Strength of Fit</p> <p>9.3.3 Lack of Fit and Violations of Assumptions</p> <p>9.4 Example—City Bond Ratings</p> <p>9.5 Summary</p>
<p><b>10 Count Regression</b></p> <p>10.1 Introduction</p> <p>10.2 Concepts and Background Material</p> <p>10.2.1 The Poisson Random Variable</p> <p>10.2.2 Generalized Linear Models</p> <p>10.3 Methodology</p> <p>10.3.1 Estimation and Inference</p> <p>10.3.2 Offsets</p> <p>10.4 Overdispersion and Negative Binomial Regression</p> <p>10.4.1 Quasi-likelihood</p> <p>10.4.2 Negative Binomial Regression</p> <p>10.5 Example—Unprovoked Shark Attacks in Florida</p> <p>10.6 Other Count Regression Models</p> <p>10.7 Poisson Regression and Weighted Least Squares</p> <p>10.7.1 Example—International Grosses of Movies (continued)</p> <p>10.8 Summary</p>
<p><b>11 Models for Time-to-Event (Survival) Data</b></p> <p>11.1 Introduction</p> <p>11.2 Concepts and Background Material</p> <p>11.2.1 The Nature of Survival Data</p> <p>11.2.2 Accelerated Failure Time Models</p> <p>11.2.3 The Proportional Hazards Model</p> <p>11.3 Methodology</p> <p>11.3.1 The Kaplan-Meier Estimator and the Log-Rank Test</p> <p>11.3.2 Parametric (Likelihood) Estimation</p> <p>11.3.3 Semiparametric (Partial Likelihood) Estimation</p> <p>11.3.4 The Buckley-James Estimator</p> <p>11.4 Example—The Survival of Broadway Shows (continued)</p> <p>11.5 Left-Truncated/Right-Censored Data and Time-Varying Covariates</p> <p>11.5.1 Left-Truncated/Right-Censored Data</p> <p>11.5.2 Example—The Survival of Broadway Shows (continued)</p> <p>11.5.3 Time-Varying Covariates</p> <p>11.5.4 Example—Female Heads of Government</p> <p>11.6 Summary</p>
<p><b>Part V Other Regression Models</b></p>
<p><b>12 Nonlinear Regression</b></p> <p>12.1 Introduction</p> <p>12.2 Concepts and Background Material</p> <p>12.3 Methodology</p> <p>12.3.1 Nonlinear Least Squares Estimation</p> <p>12.3.2 Inference for Nonlinear Regression Models</p> <p>12.4 Example—Michaelis-Menten Enzyme Kinetics</p> <p>12.5 Summary</p>
<p><b>13 Models for Longitudinal and Nested Data</b></p> <p>13.1 Introduction</p> <p>13.2 Concepts and Background Material</p> <p>13.2.1 Nested Data and ANOVA</p> <p>13.2.2 Longitudinal Data and Time Series</p> <p>13.2.3 Fixed Effects Versus Random Effects</p> <p>13.3 Methodology</p> <p>13.3.1 The Linear Mixed Effects Model</p> <p>13.3.2 The Generalized Linear Mixed Effects Model</p> <p>13.3.3 Generalized Estimating Equations</p> <p>13.3.4 Nonlinear Mixed Effects Models</p> <p>13.4 Example—Tumor Growth in a Cancer Study</p> <p>13.5 Example—Unprovoked Shark Attacks in the United States</p> <p>13.6 Summary</p>
<p><b>14 Regularization Methods and Sparse Models</b></p> <p>14.1 Introduction</p> <p>14.2 Concepts and Background Material</p> <p>14.2.1 The Bias–Variance Tradeoff</p> <p>14.2.2 Large Numbers of Predictors and Sparsity</p> <p>14.3 Methodology</p> <p>14.3.1 Forward Stepwise Regression</p> <p>14.3.2 Ridge Regression</p> <p>14.3.3 The Lasso</p> <p>14.3.4 Other Regularization Methods</p> <p>14.3.5 Choosing the Regularization Parameter(s)</p> <p>14.3.6 More Structured Regression Problems</p> <p>14.3.7 Cautions About Regularization Methods</p> <p>14.4 Example—Human Development Index</p> <p>14.5 Summary</p>
<p><b>Part VI Nonparametric and Semiparametric Models</b></p>
<p><b>15 Smoothing and Additive Models</b></p> <p>15.1 Introduction</p> <p>15.2 Concepts and Background Material</p> <p>15.2.1 The Bias–Variance Tradeoff</p> <p>15.2.2 Smoothing and Local Regression</p> <p>15.3 Methodology</p> <p>15.3.1 Local Polynomial Regression</p> <p>15.3.2 Choosing the Bandwidth</p> <p>15.3.3 Smoothing Splines</p> <p>15.3.4 Multiple Predictors, the Curse of Dimensionality, and Additive Models</p> <p>15.4 Example—Prices of German Used Automobiles</p> <p>15.5 Local and Penalized Likelihood Regression</p> <p>15.5.1 Example—The Bechdel Rule and Hollywood Movies</p> <p>15.6 Using Smoothing to Identify Interactions</p> <p>15.6.1 Example—Estimating Home Prices (continued)</p> <p>15.7 Summary</p>
<p><b>16 Tree-Based Models</b></p> <p>16.1 Introduction</p> <p>16.2 Concepts and Background Material</p> <p>16.2.1 Recursive Partitioning</p> <p>16.2.2 Types of Trees</p> <p>16.3 Methodology</p> <p>16.3.1 CART</p> <p>16.3.2 Conditional Inference Trees</p> <p>16.3.3 Ensemble Methods</p> <p>16.4 Examples</p> <p>16.4.1 Estimating Home Prices (continued)</p> <p>16.4.2 Example—Courtesy in Airplane Travel</p> <p>16.5 Trees for Other Types of Data</p> <p>16.5.1 Trees for Nested and Longitudinal Data</p> <p>16.5.2 Survival Trees</p> <p>16.6 Summary</p>
<p>Bibliography</p> <p>Index</p>
<p><b>Samprit Chatterjee, PhD,</b> is Professor Emeritus of Statistics at New York University. A Fellow of the American Statistical Association, Dr. Chatterjee has been a Fulbright scholar in both Kazakhstan and Mongolia. He is the coauthor of multiple editions of <i>Regression Analysis By Example</i>, <i>Sensitivity Analysis in Linear Regression</i>, <i>A Casebook for a First Course in Statistics and Data Analysis</i>, and the first edition of <i>Handbook of Regression Analysis</i>, all published by Wiley.</p> <p><b>Jeffrey S. Simonoff, PhD,</b> is Professor of Statistics at the Leonard N. Stern School of Business of New York University. He is a Fellow of the American Statistical Association, a Fellow of the Institute of Mathematical Statistics, and an Elected Member of the International Statistical Institute. He has authored, coauthored, or coedited more than one hundred articles and seven books on the theory and applications of statistics.</p>
