Wiley Logo

WILEY SERIES IN PROBABILITY AND STATISTICS

Established by Walter A. Shewhart and Samuel S. Wilks

Editors: David J. Balding, Noel A. C. Cressie, Garrett M. Fitzmaurice, Geof H. Givens, Harvey Goldstein, Geert Molenberghs, David W. Scott, Adrian F. M. Smith, Ruey S. Tsay

Editors Emeriti: J. Stuart Hunter, Iain M. Johnstone, Joseph B. Kadane, Jozef L. Teugels

The Wiley Series in Probability and Statistics is well established and authoritative. It covers many topics of current research interest in both pure and applied statistics and probability theory. Written by leading statisticians and institutions, the titles span both state-of-the-art developments in the field and classical methods.

Reflecting the wide range of current research in statistics, the series encompasses applied, methodological and theoretical statistics, ranging from applications and new techniques made possible by advances in computerized practice to rigorous treatment of theoretical approaches. This series provides essential and invaluable reading for all statisticians, whether in academia, industry, government, or research.

A complete list of titles in this series can be found at http://www.wiley.com/go/wsps

MEASURING AGREEMENT

Models, Methods, and Applications



Pankaj K. Choudhary

The University of Texas at Dallas


Haikady N. Nagaraja

The Ohio State University










To:

My parents, and Swati, Aalo, and Arushi – PKC

Jyothi – HNN

Preface

This book presents statistical models and methods for analyzing common types of data collected in method comparison experiments and illustrates their application through detailed case studies. The main aim of these trials is to evaluate agreement between two or more methods of measurement. Although such studies are particularly abundant in health-related fields, they are also conducted in other disciplines, including metrology, ecology, and social and behavioral sciences.

Currently, at least six books cover the topic of agreement evaluation, including von Eye and Mun (2004), Carstensen (2010), Dunn (2004), Shoukri (2010), Broemeling (2009), and Lin et al. (2011). Of these, the first focuses exclusively on categorical data, and the second on continuous data. The others consider both types of data with varying levels of depth and choice of topics. Our book also considers both but with a primary focus on continuous data and one chapter devoted to categorical data. By providing chapter-length treatments of the common types of continuous data, it offers comprehensive coverage of the topic, and its scope is broader than that of any other book currently available. It does not, however, offer a complete survey of the literature. For example, measurement error models, Bayesian methods, and approaches based on generalized estimating equations are not included.

Essentially two principles guided us while writing this book. The first was to view the analysis of method comparison data as a two-step procedure where, in step 1, an adequate model for the data is found, and in step 2, inferential techniques are applied for appropriate functions of the parameters of the model found in step 1. For modeling of data, we primarily rely on mixed-effects models because they capture dependence in a subject’s measurements in an intuitively appealing manner by means of random subject effects; and they also offer a unified framework for dealing with a variety of data types. Besides, they can be fit by the maximum likelihood method using any commonly available statistical software package. For inference, we use the standard large-sample theory and invoke a bootstrap approach whenever the sample size seems too small for the asymptotic methods to be accurate. The second principle was to strive to make the presentation accessible to a wide audience while at the same time making the book theoretically rigorous and self-contained with necessary technical details and references. We have attempted to strike this balance by separating the technical details from the methodological descriptions, forgoing the references in favor of a bibliographic note at the end of each chapter, and by presenting detailed analyses of several real datasets.

The book is organized into twelve chapters. The first eleven are concerned with continuous data while the last covers categorical data. Chapter 1 provides a general introduction to studies comparing two measurement methods and discusses key concepts and statistical issues and tools involved in their analysis. Chapter 2 introduces various measures of agreement for continuous data. Chapter 3 describes mixed-effects models in general and presents the large-sample approach for inference. It provides the technical foundation for the rest of the book and can be skipped by a reader interested in applications. Chapters 4 through 9 consider continuous data collected from various types of experiments, with study designs becoming increasingly complex. In order, these chapters are devoted to designs with paired measurements, repeated measurements, heteroscedastic measurements, more than two methods, covariates, and longitudinal data. Chapter 10 presents a nonparametric approach for data that do not satisfy assumptions of a mixed-effects model. Chapter 11 considers sample size determination for designing a method comparison study with continuous data. Chapter 12 takes up the question of agreement with categorical data.

Even though the presentation is self-contained, some statistical background is expected from the readers. Familiarity with basic statistical concepts such as maximum likelihood estimation, hypothesis testing, confidence intervals, correlation, and linear regression is necessary. A prior introduction to mixed-effects models and linear algebra will enhance the understanding of the technical details.

The free statistical software R (R Core Team, 2015) has been used to perform all the computations and to generate all the graphics presented in this book. However, the R code is not presented. Much of the code and many of the datasets used here are publicly available at the companion website: http://www.utdallas.edu/~pankaj/agreement_book/

Some familiarity with R programming is assumed for following the code and understanding the output produced. In addition to the base and graphics packages of R, the following packages and their dependencies have been used in preparing this book: lattice (Sarkar, 2008), latticeExtra (Sarkar and Andrews, 2013), Matrix (Bates and Maechler, 2015), mvtnorm (Genz et al., 2015), multcomp (Hothorn et al., 2008), nlme (Pinheiro et al., 2015), numDeriv (Gilbert and Varadhan, 2015), tikzDevice (Sharpsteen and Bracken, 2015), and xtable (Dahl, 2016).

The book is targeted primarily towards two groups of researchers. The first consists of biomedical and social and behavioral scientists interested in the development and validation of measurement methods. The second includes statisticians engaged in the design and analysis of method comparison studies and in the development of associated statistical methodologies. It can also serve as a textbook for a semester-long special topics course at the graduate level. With that purpose, we have incorporated numerous theoretical and data-centric exercises at the end of the chapters that expand on the material covered in the main body. These exercises provide practice for mastering methodological details and applying the results.

We appreciate the support of our institutions as we marched through this project, as well as their outstanding library and computing facilities. We thank all those scientists whose dedicated research we were able to highlight in this work. We thank our long-time friends and colleagues for their advice and encouragement, including Professors Babis Papachristou (Rowan University), Michael Baron (American University), Vladimir Dragovic and Vish Ramakrishna (UT Dallas), and Tom Santner and Doug Wolfe (Ohio State University). We thank Professor Phill Cassey (University of Adelaide) for introducing us to applications in ecology and providing datasets, and Professor Chaitra Nagaraja (Fordham University) for producing the plots in Chapter 12. We also thank Professors Huiman Barnhart (Duke University), Douglas Hawkins (University of Minnesota), Vernon Chinchilli (Pennsylvania State University), and Michael Haber (Emory University) for sharing their datasets.

We are grateful to Professors Mohamed Shoukri (King Faisal Specialist Hospital and Research Centre) and Tony Ng (Southern Methodist University) for reading an earlier draft of the manuscript and providing valuable comments. We thank Susanne Steitz-Filler, Allison McGinniss, and Melissa Yanuzzi from John Wiley for guiding the project from start to finish and for their patience and perseverance. We invite the input of our readers on the coverage and presentation here as well as on the companion website as there is always room for improvement.

This book would not have been possible without the support of our family members. They gracefully sacrificed their time with us to allow us to work on a project that seemed to take forever. We take this opportunity to thank them all from the bottom of our hearts.

P. K. Choudhary & H. N. Nagaraja

Richardson, Texas

Columbus, Ohio

July, 2017