
WILEY SERIES IN PROBABILITY AND STATISTICS

Established by Walter A. Shewhart and Samuel S. Wilks

Editors: David J. Balding, Noel A. C. Cressie, Garrett M. Fitzmaurice, Geof H. Givens, Harvey Goldstein, Geert Molenberghs, David W. Scott, Adrian F. M. Smith, Ruey S. Tsay

Editors Emeriti: J. Stuart Hunter, Iain M. Johnstone, Joseph B. Kadane, Jozef L. Teugels

The Wiley Series in Probability and Statistics is well established and authoritative. It covers many topics of current research interest in both pure and applied statistics and probability theory. Written by leading statisticians and institutions, the titles span both state-of-the-art developments in the field and classical methods.

Reflecting the wide range of current research in statistics, the series encompasses applied, methodological and theoretical statistics, ranging from applications and new techniques made possible by advances in computerized practice to rigorous treatment of theoretical approaches.

This series provides essential and invaluable reading for all statisticians, whether in academia, industry, government, or research.

A complete list of titles in this series can be found at http://www.wiley.com/go/wsps

Statistical Analysis with Missing Data



Roderick J. A. Little

Richard D. Remington Distinguished University Professor of
Biostatistics, Professor of Statistics, and Research Professor,
Institute for Social Research, at the University of Michigan


Donald B. Rubin

Professor at Yau Mathematical Sciences Center, Tsinghua
University; Murray Shusterman Senior Research Fellow, Fox
School of Business, at Temple University; and Professor Emeritus,
at Harvard University



3rd Edition
















Preface to the Third Edition

There has been tremendous growth in the literature on statistical methods for handling missing data, and associated software, since the publication of the second edition of “Statistical Analysis with Missing Data” in 2002. Attempting to cover this literature comprehensively would add excessively to the length of the book and also change its character. Therefore, our additions have focused mainly on work with which we have been associated and about which we can write with some authority. The main changes from the second edition are as follows:

  1. Concerning theory, we have changed the “obs” and “mis” notation for observed and missing data, which, though intuitive, caused some confusion because subscripting data by “obs” was not intended to imply conditioning on the pattern of observed values. We now use subscript (0) to denote observed values and subscript (1) to denote missing values, which is in fact similar to the notation employed by Rubin's original (1976a) paper. We have also been more specific about assumptions for ignoring the missing data mechanism for likelihood-based/Bayesian analyses and asymptotic frequentist analysis; the latter involves changing missing data patterns in repeated sampling. These changes reflect material in Mealli and Rubin (2015). A definition of “partially missing at random” and ignorability for parameter subsets has been added, based on Little et al. (2016a).
  2. Data previously termed “not missing at random” are now called “missing not at random,” which we think is clearer.
  3. Applications place greater emphasis on multiple imputation rather than direct computation of the posterior distribution of parameters. This new emphasis reflects the expansion of flexible software for multiple imputation, which makes the method attractive to applied statisticians.
  4. We have added a number of missing data applications to measurement error, disclosure limitation, robust inference, and clinical trial data.
  5. Chapter 15, on missing not at random data, has been completely revamped, including a number of new applications to subsample regression and sensitivity analysis.
  6. A number of minor errors in the previous edition have been corrected, although (as in all books), some probably remain and other new ones may have crept in – for which we apologize.

The ideal of using a consistent notation across all chapters, avoiding the use of the same symbol to mean different concepts, proved too hard given the range of topics covered. However, we have tried to maintain a consistent notation within chapters, and have defined new uses of common letters as they arise. We hope that different uses of the same symbol across chapters are not too confusing, and we welcome suggestions for improvements.

Part I
Overview and Basic Approaches