
CONTENTS

PREFACE

PREFACE TO THE FIRST EDITION

CHAPTER 1: GENERALITIES

1.1 WHY ROBUST PROCEDURES?

1.2 WHAT SHOULD A ROBUST PROCEDURE ACHIEVE?

1.3 QUALITATIVE ROBUSTNESS

1.4 QUANTITATIVE ROBUSTNESS

1.5 INFINITESIMAL ASPECTS

1.6 OPTIMAL ROBUSTNESS

1.7 PERFORMANCE COMPARISONS

1.8 COMPUTATION OF ROBUST ESTIMATES

1.9 LIMITATIONS TO ROBUSTNESS THEORY

CHAPTER 2: THE WEAK TOPOLOGY AND ITS METRIZATION

2.1 GENERAL REMARKS

2.2 THE WEAK TOPOLOGY

2.3 LÉVY AND PROHOROV METRICS

2.4 THE BOUNDED LIPSCHITZ METRIC

2.5 FRÉCHET AND GÂTEAUX DERIVATIVES

2.6 HAMPEL’S THEOREM

CHAPTER 3: THE BASIC TYPES OF ESTIMATES

3.1 GENERAL REMARKS

3.2 MAXIMUM LIKELIHOOD TYPE ESTIMATES (M-ESTIMATES)

3.3 LINEAR COMBINATIONS OF ORDER STATISTICS (L-ESTIMATES)

3.4 ESTIMATES DERIVED FROM RANK TESTS (R-ESTIMATES)

3.5 ASYMPTOTICALLY EFFICIENT M-, L-, AND R-ESTIMATES

CHAPTER 4: ASYMPTOTIC MINIMAX THEORY FOR ESTIMATING LOCATION

4.1 GENERAL REMARKS

4.2 MINIMAX BIAS

4.3 MINIMAX VARIANCE: PRELIMINARIES

4.4 DISTRIBUTIONS MINIMIZING FISHER INFORMATION

4.5 DETERMINATION OF F0 BY VARIATIONAL METHODS

4.6 ASYMPTOTICALLY MINIMAX M-ESTIMATES

4.7 ON THE MINIMAX PROPERTY FOR L- AND R-ESTIMATES

4.8 REDESCENDING M-ESTIMATES

4.9 QUESTIONS OF ASYMMETRIC CONTAMINATION

CHAPTER 5: SCALE ESTIMATES

5.1 GENERAL REMARKS

5.2 M-ESTIMATES OF SCALE

5.3 L-ESTIMATES OF SCALE

5.4 R-ESTIMATES OF SCALE

5.5 ASYMPTOTICALLY EFFICIENT SCALE ESTIMATES

5.6 DISTRIBUTIONS MINIMIZING FISHER INFORMATION FOR SCALE

5.7 MINIMAX PROPERTIES

CHAPTER 6: MULTIPARAMETER PROBLEMS—IN PARTICULAR JOINT ESTIMATION OF LOCATION AND SCALE

6.1 GENERAL REMARKS

6.2 CONSISTENCY OF M-ESTIMATES

6.3 ASYMPTOTIC NORMALITY OF M-ESTIMATES

6.4 SIMULTANEOUS M-ESTIMATES OF LOCATION AND SCALE

6.5 M-ESTIMATES WITH PRELIMINARY ESTIMATES OF SCALE

6.6 QUANTITATIVE ROBUSTNESS OF JOINT ESTIMATES OF LOCATION AND SCALE

6.7 THE COMPUTATION OF M-ESTIMATES OF SCALE

6.8 STUDENTIZING

CHAPTER 7: REGRESSION

7.1 GENERAL REMARKS

7.2 THE CLASSICAL LINEAR LEAST SQUARES CASE

7.3 ROBUSTIZING THE LEAST SQUARES APPROACH

7.4 ASYMPTOTICS OF ROBUST REGRESSION ESTIMATES

7.5 CONJECTURES AND EMPIRICAL RESULTS

7.6 ASYMPTOTIC COVARIANCES AND THEIR ESTIMATION

7.7 CONCOMITANT SCALE ESTIMATES

7.8 COMPUTATION OF REGRESSION M-ESTIMATES

7.9 THE FIXED CARRIER CASE: WHAT SIZE h_i?

7.10 ANALYSIS OF VARIANCE

7.11 L1-ESTIMATES AND MEDIAN POLISH

7.12 OTHER APPROACHES TO ROBUST REGRESSION

CHAPTER 8: ROBUST COVARIANCE AND CORRELATION MATRICES

8.1 GENERAL REMARKS

8.2 ESTIMATION OF MATRIX ELEMENTS THROUGH ROBUST VARIANCES

8.3 ESTIMATION OF MATRIX ELEMENTS THROUGH ROBUST CORRELATION

8.4 AN AFFINELY EQUIVARIANT APPROACH

8.5 ESTIMATES DETERMINED BY IMPLICIT EQUATIONS

8.6 EXISTENCE AND UNIQUENESS OF SOLUTIONS

8.7 INFLUENCE FUNCTIONS AND QUALITATIVE ROBUSTNESS

8.8 CONSISTENCY AND ASYMPTOTIC NORMALITY

8.9 BREAKDOWN POINT

8.10 LEAST INFORMATIVE DISTRIBUTIONS

8.11 SOME NOTES ON COMPUTATION

CHAPTER 9: ROBUSTNESS OF DESIGN

9.1 GENERAL REMARKS

9.2 MINIMAX GLOBAL FIT

9.3 MINIMAX SLOPE

CHAPTER 10: EXACT FINITE SAMPLE RESULTS

10.1 GENERAL REMARKS

10.2 LOWER AND UPPER PROBABILITIES AND CAPACITIES

10.3 ROBUST TESTS

10.4 SEQUENTIAL TESTS

10.5 THE NEYMAN–PEARSON LEMMA FOR 2-ALTERNATING CAPACITIES

10.6 ESTIMATES DERIVED FROM TESTS

10.7 MINIMAX INTERVAL ESTIMATES

CHAPTER 11: FINITE SAMPLE BREAKDOWN POINT

11.1 GENERAL REMARKS

11.2 DEFINITION AND EXAMPLES

11.3 INFINITESIMAL ROBUSTNESS AND BREAKDOWN

11.4 MALICIOUS VERSUS STOCHASTIC BREAKDOWN

CHAPTER 12: INFINITESIMAL ROBUSTNESS

12.1 GENERAL REMARKS

12.2 HAMPEL’S INFINITESIMAL APPROACH

12.3 SHRINKING NEIGHBORHOODS

CHAPTER 13: ROBUST TESTS

13.1 GENERAL REMARKS

13.2 LOCAL STABILITY OF A TEST

13.3 TESTS FOR GENERAL PARAMETRIC MODELS IN THE MULTIVARIATE CASE

13.4 ROBUST TESTS FOR REGRESSION AND GENERALIZED LINEAR MODELS

CHAPTER 14: SMALL SAMPLE ASYMPTOTICS

14.1 GENERAL REMARKS

14.2 SADDLEPOINT APPROXIMATION FOR THE MEAN

14.3 SADDLEPOINT APPROXIMATION OF THE DENSITY OF M-ESTIMATORS

14.4 TAIL PROBABILITIES

14.5 MARGINAL DISTRIBUTIONS

14.6 SADDLEPOINT TEST

14.7 RELATIONSHIP WITH NONPARAMETRIC TECHNIQUES

14.8 APPENDIX

CHAPTER 15: BAYESIAN ROBUSTNESS

15.1 GENERAL REMARKS

15.2 DISPARATE DATA AND PROBLEMS WITH THE PRIOR

15.3 MAXIMUM LIKELIHOOD AND BAYES ESTIMATES

15.4 SOME ASYMPTOTIC THEORY

15.5 MINIMAX ASYMPTOTIC ROBUSTNESS ASPECTS

15.6 NUISANCE PARAMETERS

15.7 WHY THERE IS NO FINITE SAMPLE BAYESIAN ROBUSTNESS THEORY

REFERENCES

INDEX


To the memory of
John W. Tukey

PREFACE

When Wiley asked me to undertake a revision of Robust Statistics for a second edition, I was at first very reluctant to do so. My own interests had begun to shift toward data analysis in the late 1970s, and I had ceased to work in robustness shortly after the publication of the first edition. Not only was I now out of contact with the forefront of current work, but I also disagreed with some of the directions that work had taken and was not overly keen to enter into polemics. Back in the 1960s, robustness theory had been created to correct the instability problems of the “optimal” procedures of classical mathematical statistics. At that time, in order to make robustness acceptable within the paradigms then prevalent in statistics, it had been indispensable to create optimally robust (i.e., minimax) alternative procedures. Ironically, by the 1980s, “optimal” robustness began to run into analogous instability problems. In particular, while a high breakdown point clearly is desirable, the (still) fashionable striving for the highest possible breakdown point is, in my opinion, misguided: it is not only overly pessimistic, but, even worse, it disregards the central stability aspect of robustness.

But an update clearly was necessary. After the closure date of the first edition, there had been important developments not only with regard to the breakdown point, on which I have added a chapter, but also in the areas of infinitesimal robustness, robust tests, and small sample asymptotics. In many places, it would suffice to update bibliographical references, so the manuscript of the second edition could be based on a re-keyed version of the first. Other aspects deserved a more extended discussion. I was fortunate to persuade Elvezio Ronchetti, who had been one of the prime researchers working in the two last-mentioned areas (robust tests and small sample asymptotics), to collaborate and add the corresponding Chapters 13 and 14. Also, I extended the discussion of regression, and I decided to add a chapter on Bayesian robustness—even though, or perhaps because, I am not a Bayesian (or only rarely so). Among other minor changes, since most readers of the first edition had appreciated the General Remarks at the beginning of the chapters, I have expanded some of them and also elsewhere devoted more space to an informal discussion of motivations.

The new edition still has no pretensions of being encyclopedic. Like the first, it is centered on a robustness concept based on minimax asymptotic variance and on M-estimation, complemented by some exact finite sample results. Much of the material of the first edition is just as valid as it was in 1980. Deliberately, such parts were left intact, except that bibliographical references had to be added. Also, I hope that my own perspective has improved with an increased temporal and professional distance. Although this improved perspective has not affected the mathematical facts, it has sometimes sharpened their interpretation.

Special thanks go to Amy Hendrickson for her patient help with the Wiley LaTeX macros and the various quirks of TeX.

PETER J. HUBER

Klosters
November 2008

PREFACE TO THE FIRST EDITION

The present monograph is the first systematic, book-length exposition of robust statistics. The technical term “robust” was coined only in 1953 (by G. E. P. Box), and the subject matter acquired recognition as a legitimate topic for investigation only in the mid-sixties, but it certainly never was a revolutionary new concept. Among the leading scientists of the late nineteenth and early twentieth century, there were several practicing statisticians (to name but a few: the astronomer S. Newcomb, the astrophysicist A. S. Eddington, and the geophysicist H. Jeffreys) who had a perfectly clear, operational understanding of the idea; they knew the dangers of long-tailed error distributions, they proposed probability models for gross errors, and they even invented excellent robust alternatives to the standard estimates, which were rediscovered only recently. But for a long time theoretical statisticians tended to shun the subject as being inexact and “dirty.” My 1964 paper may have helped to dispel such prejudices. Amusingly (and disturbingly), it seems that lately a kind of bandwagon effect has evolved, that the pendulum has swung to the other extreme, and that “robust” has now become a magic word, which is invoked in order to add respectability.

This book gives a solid foundation in robustness to both the theoretical and the applied statistician. The treatment is theoretical, but the stress is on concepts, rather than on mathematical completeness. The level of presentation is deliberately uneven: in some chapters simple cases are treated with mathematical rigor; in others the results obtained in the simple cases are transferred by analogy to more complicated situations (like multiparameter regression and covariance matrix estimation), where proofs are not always available (or are available only under unrealistically severe assumptions). Also selected numerical algorithms for computing robust estimates are described and, where possible, convergence proofs are given.

Chapter 1 gives a general introduction and overview; it is a must for every reader. Chapter 2 contains an account of the formal mathematical background behind qualitative and quantitative robustness, which can be skipped (or skimmed) if the reader is willing to accept certain results on faith. Chapter 3 introduces and discusses the three basic types of estimates (M-, L-, and R-estimates), and Chapter 4 treats the asymptotic minimax theory for location estimates; both chapters again are musts. The remaining chapters branch out in different directions and are fairly independent and self-contained; they can be read or taught in more or less any order.

The book does not contain exercises—I found it hard to invent a sufficient number of problems in this area that were neither trivial nor too hard—so it does not satisfy some of the formal criteria for a textbook. Nevertheless I have successfully used various stages of the manuscript as such in graduate courses.

The book also has no pretensions of being encyclopedic. I wanted to cover only those aspects and tools that I personally considered to be the most important ones. Some omissions and gaps are simply due to the fact that I currently lack time to fill them in, but do not want to procrastinate any longer (the first draft for this book goes back to 1972). Others are intentional. For instance, adaptive estimates were excluded because I would now prefer to classify them with nonparametric rather than with robust statistics, under the heading of nonparametric efficient estimation. The so-called Bayesian approach to robustness confounds the subject with admissible estimation in an ad hoc parametric supermodel, and still lacks reliable guidelines on how to select the supermodel and the prior so that we end up with something robust. The coverage of L- and R-estimates was cut back from earlier plans because they do not generalize well and get awkward to compute and to handle in multiparameter situations.

A large part of the final draft was written when I was visiting Harvard University in the fall of 1977; my thanks go to the students, in particular to P. Rosenbaum and Y. Yoshizoe, who then sat in my seminar course and provided many helpful comments.

PETER J. HUBER

Cambridge, Massachusetts
July 1980