Details

An Elementary Introduction to Statistical Learning Theory




Wiley Series in Probability and Statistics, Volume 853, 1st ed.

by: Sanjeev Kulkarni, Gilbert Harman

109,99 €

Publisher: Wiley
Format: EPUB
Published: 9 June 2011
ISBN/EAN: 9781118023464
Language: English
Number of pages: 232

DRM-protected eBook; to read it you need, for example, Adobe Digital Editions and an Adobe ID.

Descriptions

<b>A thought-provoking look at statistical learning theory and its role in understanding human learning and inductive reasoning</b><br /> <br /> <p>A joint endeavor from leading researchers in the fields of philosophy and electrical engineering, <i>An Elementary Introduction to Statistical Learning Theory</i> is a comprehensive and accessible primer on the rapidly evolving fields of statistical pattern recognition and statistical learning theory. Explaining these areas at a level and in a way that is not often found in other books on the topic, the authors present the basic theory behind contemporary machine learning and uniquely utilize its foundations as a framework for philosophical thinking about inductive inference.</p> <p>Promoting the fundamental goal of statistical learning, knowing what is achievable and what is not, this book demonstrates the value of a systematic methodology when used along with the needed techniques for evaluating the performance of a learning system. First, an introduction to machine learning is presented that includes brief discussions of applications such as image recognition, speech recognition, medical diagnostics, and statistical arbitrage. To enhance accessibility, two chapters on relevant aspects of probability theory are provided. Subsequent chapters feature coverage of topics such as the pattern recognition problem, optimal Bayes decision rule, the nearest neighbor rule, kernel rules, neural networks, support vector machines, and boosting.</p> <p>Appendices throughout the book explore the relationship between the discussed material and related topics from mathematics, philosophy, psychology, and statistics, drawing insightful connections between problems in these areas and statistical learning theory. 
All chapters conclude with a summary section, a set of practice questions, and a reference section that supplies historical notes and additional resources for further study.</p> <p><i>An Elementary Introduction to Statistical Learning Theory</i> is an excellent book for courses on statistical learning theory, pattern recognition, and machine learning at the upper-undergraduate and graduate levels. It also serves as an introductory reference for researchers and practitioners in the fields of engineering, computer science, philosophy, and cognitive science who would like to further their knowledge of the topic.</p>
<p>Preface</p>
<p><b>1 Introduction: Classification, Learning, Features, and Applications</b></p> <p>1.1 Scope</p> <p>1.2 Why Machine Learning?</p> <p>1.3 Some Applications</p> <p>1.3.1 Image Recognition</p> <p>1.3.2 Speech Recognition</p> <p>1.3.3 Medical Diagnosis</p> <p>1.3.4 Statistical Arbitrage</p> <p>1.4 Measurements, Features, and Feature Vectors</p> <p>1.5 The Need for Probability</p> <p>1.6 Supervised Learning</p> <p>1.7 Summary</p> <p>1.8 Appendix: Induction</p> <p>1.9 Questions</p> <p>1.10 References</p>
<p><b>2 Probability</b></p> <p>2.1 Probability of Some Basic Events</p> <p>2.2 Probabilities of Compound Events</p> <p>2.3 Conditional Probability</p> <p>2.4 Drawing Without Replacement</p> <p>2.5 A Classic Birthday Problem</p> <p>2.6 Random Variables</p> <p>2.7 Expected Value</p> <p>2.8 Variance</p> <p>2.9 Summary</p> <p>2.10 Appendix: Interpretations of Probability</p> <p>2.11 Questions</p> <p>2.12 References</p>
<p><b>3 Probability Densities</b></p> <p>3.1 An Example in Two Dimensions</p> <p>3.2 Random Numbers in [0,1]</p> <p>3.3 Density Functions</p> <p>3.4 Probability Densities in Higher Dimensions</p> <p>3.5 Joint and Conditional Densities</p> <p>3.6 Expected Value and Variance</p> <p>3.7 Laws of Large Numbers</p> <p>3.8 Summary</p> <p>3.9 Appendix: Measurability</p> <p>3.10 Questions</p> <p>3.11 References</p>
<p><b>4 The Pattern Recognition Problem</b></p> <p>4.1 A Simple Example</p> <p>4.2 Decision Rules</p> <p>4.3 Success Criterion</p> <p>4.4 The Best Classifier: Bayes Decision Rule</p> <p>4.5 Continuous Features and Densities</p> <p>4.6 Summary</p> <p>4.7 Appendix: Uncountably Many</p> <p>4.8 Questions</p> <p>4.9 References</p>
<p><b>5 The Optimal Bayes Decision Rule</b></p> <p>5.1 Bayes Theorem</p> <p>5.2 Bayes Decision Rule</p> <p>5.3 Optimality and Some Comments</p> <p>5.4 An Example</p> <p>5.5 Bayes Theorem and Decision Rule with Densities</p> <p>5.6 Summary</p> <p>5.7 Appendix: Defining Conditional Probability</p> <p>5.8 Questions</p> <p>5.9 References</p>
<p><b>6 Learning from Examples</b></p> <p>6.1 Lack of Knowledge of Distributions</p> <p>6.2 Training Data</p> <p>6.3 Assumptions on the Training Data</p> <p>6.4 A Brute Force Approach to Learning</p> <p>6.5 Curse of Dimensionality, Inductive Bias, and No Free Lunch</p> <p>6.6 Summary</p> <p>6.7 Appendix: What Sort of Learning?</p> <p>6.8 Questions</p> <p>6.9 References</p>
<p><b>7 The Nearest Neighbor Rule</b></p> <p>7.1 The Nearest Neighbor Rule</p> <p>7.2 Performance of the Nearest Neighbor Rule</p> <p>7.3 Intuition and Proof Sketch of Performance</p> <p>7.4 Using more Neighbors</p> <p>7.5 Summary</p> <p>7.6 Appendix: When People use Nearest Neighbor Reasoning</p> <p>7.6.1 Who Is a Bachelor?</p> <p>7.6.2 Legal Reasoning</p> <p>7.6.3 Moral Reasoning</p> <p>7.7 Questions</p> <p>7.8 References</p>
<p><b>8 Kernel Rules</b></p> <p>8.1 Motivation</p> <p>8.2 A Variation on Nearest Neighbor Rules</p> <p>8.3 Kernel Rules</p> <p>8.4 Universal Consistency of Kernel Rules</p> <p>8.5 Potential Functions</p> <p>8.6 More General Kernels</p> <p>8.7 Summary</p> <p>8.8 Appendix: Kernels, Similarity, and Features</p> <p>8.9 Questions</p> <p>8.10 References</p>
<p><b>9 Neural Networks: Perceptrons</b></p> <p>9.1 Multilayer Feedforward Networks</p> <p>9.2 Neural Networks for Learning and Classification</p> <p>9.3 Perceptrons</p> <p>9.3.1 Threshold</p> <p>9.4 Learning Rule for Perceptrons</p> <p>9.5 Representational Capabilities of Perceptrons</p> <p>9.6 Summary</p> <p>9.7 Appendix: Models of Mind</p> <p>9.8 Questions</p> <p>9.9 References</p>
<p><b>10 Multilayer Networks</b></p> <p>10.1 Representation Capabilities of Multilayer Networks</p> <p>10.2 Learning and Sigmoidal Outputs</p> <p>10.3 Training Error and Weight Space</p> <p>10.4 Error Minimization by Gradient Descent</p> <p>10.5 Backpropagation</p> <p>10.6 Derivation of Backpropagation Equations</p> <p>10.6.1 Derivation for a Single Unit</p> <p>10.6.2 Derivation for a Network</p> <p>10.7 Summary</p> <p>10.8 Appendix: Gradient Descent and Reasoning toward Reflective Equilibrium</p> <p>10.9 Questions</p> <p>10.10 References</p>
<p><b>11 PAC Learning</b></p> <p>11.1 Class of Decision Rules</p> <p>11.2 Best Rule from a Class</p> <p>11.3 Probably Approximately Correct Criterion</p> <p>11.4 PAC Learning</p> <p>11.5 Summary</p> <p>11.6 Appendix: Identifying Indiscernibles</p> <p>11.7 Questions</p> <p>11.8 References</p>
<p><b>12 VC Dimension</b></p> <p>12.1 Approximation and Estimation Errors</p> <p>12.2 Shattering</p> <p>12.3 VC Dimension</p> <p>12.4 Learning Result</p> <p>12.5 Some Examples</p> <p>12.6 Application to Neural Nets</p> <p>12.7 Summary</p> <p>12.8 Appendix: VC Dimension and Popper Dimension</p> <p>12.9 Questions</p> <p>12.10 References</p>
<p><b>13 Infinite VC Dimension</b></p> <p>13.1 A Hierarchy of Classes and Modified PAC Criterion</p> <p>13.2 Misfit Versus Complexity Trade-Off</p> <p>13.3 Learning Results</p> <p>13.4 Inductive Bias and Simplicity</p> <p>13.5 Summary</p> <p>13.6 Appendix: Uniform Convergence and Universal Consistency</p> <p>13.7 Questions</p> <p>13.8 References</p>
<p><b>14 The Function Estimation Problem</b></p> <p>14.1 Estimation</p> <p>14.2 Success Criterion</p> <p>14.3 Best Estimator: Regression Function</p> <p>14.4 Learning in Function Estimation</p> <p>14.5 Summary</p> <p>14.6 Appendix: Regression Toward the Mean</p> <p>14.7 Questions</p> <p>14.8 References</p>
<p><b>15 Learning Function Estimation</b></p> <p>15.1 Review of the Function Estimation/Regression Problem</p> <p>15.2 Nearest Neighbor Rules</p> <p>15.3 Kernel Methods</p> <p>15.4 Neural Network Learning</p> <p>15.5 Estimation with a Fixed Class of Functions</p> <p>15.6 Shattering, Pseudo-Dimension, and Learning</p> <p>15.7 Conclusion</p> <p>15.8 Appendix: Accuracy, Precision, Bias, and Variance in Estimation</p> <p>15.9 Questions</p> <p>15.10 References</p>
<p><b>16 Simplicity</b></p> <p>16.1 Simplicity in Science</p> <p>16.1.1 Explicit Appeals to Simplicity</p> <p>16.1.2 Is the World Simple?</p> <p>16.1.3 Mistaken Appeals to Simplicity</p> <p>16.1.4 Implicit Appeals to Simplicity</p> <p>16.2 Ordering Hypotheses</p> <p>16.2.1 Two Kinds of Simplicity Orderings</p> <p>16.3 Two Examples</p> <p>16.3.1 Curve Fitting</p> <p>16.3.2 Enumerative Induction</p> <p>16.4 Simplicity as Simplicity of Representation</p> <p>16.4.1 Fix on a Particular System of Representation?</p> <p>16.4.2 Are Fewer Parameters Simpler?</p> <p>16.5 Pragmatic Theory of Simplicity</p> <p>16.6 Simplicity and Global Indeterminacy</p> <p>16.7 Summary</p> <p>16.8 Appendix: Basic Science and Statistical Learning Theory</p> <p>16.9 Questions</p> <p>16.10 References</p>
<p><b>17 Support Vector Machines</b></p> <p>17.1 Mapping the Feature Vectors</p> <p>17.2 Maximizing the Margin</p> <p>17.3 Optimization and Support Vectors</p> <p>17.4 Implementation and Connection to Kernel Methods</p> <p>17.5 Details of the Optimization Problem</p> <p>17.5.1 Rewriting Separation Conditions</p> <p>17.5.2 Equation for Margin</p> <p>17.5.3 Slack Variables for Nonseparable Examples</p> <p>17.5.4 Reformulation and Solution of Optimization</p> <p>17.6 Summary</p> <p>17.7 Appendix: Computation</p> <p>17.8 Questions</p> <p>17.9 References</p>
<p><b>18 Boosting</b></p> <p>18.1 Weak Learning Rules</p> <p>18.2 Combining Classifiers</p> <p>18.3 Distribution on the Training Examples</p> <p>18.4 The Adaboost Algorithm</p> <p>18.5 Performance on Training Data</p> <p>18.6 Generalization Performance</p> <p>18.7 Summary</p> <p>18.8 Appendix: Ensemble Methods</p> <p>18.9 Questions</p> <p>18.10 References</p>
<p>Bibliography</p> <p>Author Index</p> <p>Subject Index</p>
<p>“The main focus of the book is on the ideas behind basic principles of learning theory and I can strongly recommend the book to anyone who wants to comprehend these ideas.” (<i>Mathematical Reviews</i>, 1 January 2013)</p> <p>“It also serves as an introductory reference for researchers and practitioners in the fields of engineering, computer science, philosophy, and cognitive science that would like to further their knowledge of the topic.” (<i>Zentralblatt MATH</i>, 2012)</p>
<p>SANJEEV KULKARNI, PhD, is Professor in the Department of Electrical Engineering at Princeton University, where he is also an affiliated faculty member in the Department of Operations Research and Financial Engineering and the Department of Philosophy. Dr. Kulkarni has published widely on statistical pattern recognition, nonparametric estimation, machine learning, information theory, and other areas. A Fellow of the IEEE, he was awarded Princeton University's President's Award for Distinguished Teaching in 2007.</p> <p>GILBERT HARMAN, PhD, is James S. McDonnell Distinguished University Professor in the Department of Philosophy at Princeton University. A Fellow of the Cognitive Science Society, he is the author of more than fifty published articles in his areas of research interest, which include ethics, statistical learning theory, psychology of reasoning, and logic.</p>
