CONTENTS
PREFACE
ACKNOWLEDGMENTS
ACRONYMS
CHAPTER 1: INTRODUCTION TO BAYESIAN INFERENCE
1.1 INTRODUCTION: BAYESIAN MODELING IN THE 21ST CENTURY
1.2 DEFINITION OF STATISTICAL MODELS
1.3 BAYES THEOREM
1.4 MODEL-BASED BAYESIAN INFERENCE
1.5 INFERENCE USING CONJUGATE PRIOR DISTRIBUTIONS
1.6 NONCONJUGATE ANALYSIS
Problems
CHAPTER 2: MARKOV CHAIN MONTE CARLO ALGORITHMS IN BAYESIAN INFERENCE
2.1 SIMULATION, MONTE CARLO INTEGRATION, AND THEIR IMPLEMENTATION IN BAYESIAN INFERENCE
2.2 MARKOV CHAIN MONTE CARLO METHODS
2.3 POPULAR MCMC ALGORITHMS
2.4 SUMMARY AND CLOSING REMARKS
Problems
CHAPTER 3: WinBUGS SOFTWARE: INTRODUCTION, SETUP, AND BASIC ANALYSIS
3.1 INTRODUCTION AND HISTORICAL BACKGROUND
3.2 THE WinBUGS ENVIRONMENT
3.3 PRELIMINARIES ON USING WinBUGS
3.4 BUILDING BAYESIAN MODELS IN WinBUGS
3.5 COMPILING THE MODEL AND SIMULATING VALUES
3.6 BASIC OUTPUT ANALYSIS USING THE SAMPLE MONITOR TOOL
3.7 SUMMARIZING THE PROCEDURE
3.8 CHAPTER SUMMARY AND CONCLUDING COMMENTS
Problems
CHAPTER 4: WinBUGS SOFTWARE: ILLUSTRATION, RESULTS, AND FURTHER ANALYSIS
4.1 A COMPLETE EXAMPLE OF RUNNING MCMC IN WinBUGS FOR A SIMPLE MODEL
4.2 FURTHER OUTPUT ANALYSIS USING THE INFERENCE MENU
4.3 MULTIPLE CHAINS
4.4 CHANGING THE PROPERTIES OF A FIGURE
4.5 OTHER TOOLS AND MENUS
4.6 SUMMARY AND CONCLUDING REMARKS
Problems
CHAPTER 5: INTRODUCTION TO BAYESIAN MODELS: NORMAL MODELS
5.1 GENERAL MODELING PRINCIPLES
5.2 MODEL SPECIFICATION IN NORMAL REGRESSION MODELS
5.3 USING VECTORS AND MULTIVARIATE PRIORS IN NORMAL REGRESSION MODELS
5.4 ANALYSIS OF VARIANCE MODELS
Problems
CHAPTER 6: INCORPORATING CATEGORICAL VARIABLES IN NORMAL MODELS AND FURTHER MODELING ISSUES
6.1 ANALYSIS OF VARIANCE MODELS USING DUMMY VARIABLES
6.2 ANALYSIS OF COVARIANCE MODELS
6.3 A BIOASSAY EXAMPLE
6.4 FURTHER MODELING ISSUES
6.5 CLOSING REMARKS
Problems
CHAPTER 7: INTRODUCTION TO GENERALIZED LINEAR MODELS: BINOMIAL AND POISSON DATA
7.1 INTRODUCTION
7.2 PRIOR DISTRIBUTIONS
7.3 POSTERIOR INFERENCE
7.4 POISSON REGRESSION MODELS
7.5 BINOMIAL RESPONSE MODELS
7.6 MODELS FOR CONTINGENCY TABLES
Problems
CHAPTER 8: MODELS FOR POSITIVE CONTINUOUS DATA, COUNT DATA, AND OTHER GLM-BASED EXTENSIONS
8.1 MODELS WITH NONSTANDARD DISTRIBUTIONS
8.2 MODELS FOR POSITIVE CONTINUOUS RESPONSE VARIABLES
8.3 ADDITIONAL MODELS FOR COUNT DATA
8.4 FURTHER GLM-BASED MODELS AND EXTENSIONS
Problems
CHAPTER 9: BAYESIAN HIERARCHICAL MODELS
9.1 INTRODUCTION
9.2 SOME SIMPLE EXAMPLES
9.3 THE GENERALIZED LINEAR MIXED MODEL FORMULATION
9.4 DISCUSSION, CLOSING REMARKS, AND FURTHER READING
Problems
CHAPTER 10: THE PREDICTIVE DISTRIBUTION AND MODEL CHECKING
10.1 INTRODUCTION
10.2 ESTIMATING THE PREDICTIVE DISTRIBUTION FOR FUTURE OR MISSING OBSERVATIONS USING MCMC
10.3 USING THE PREDICTIVE DISTRIBUTION FOR MODEL CHECKING
10.4 USING CROSS-VALIDATION PREDICTIVE DENSITIES FOR MODEL CHECKING, EVALUATION, AND COMPARISON
10.5 ILLUSTRATION OF A COMPLETE PREDICTIVE ANALYSIS: NORMAL REGRESSION MODELS
10.6 DISCUSSION
Problems
CHAPTER 11: BAYESIAN MODEL AND VARIABLE EVALUATION
11.1 PRIOR PREDICTIVE DISTRIBUTIONS AS MEASURES OF MODEL COMPARISON: POSTERIOR MODEL ODDS AND BAYES FACTORS
11.2 SENSITIVITY OF THE POSTERIOR MODEL PROBABILITIES: THE LINDLEY-BARTLETT PARADOX
11.3 COMPUTATION OF THE MARGINAL LIKELIHOOD
11.4 COMPUTATION OF THE MARGINAL LIKELIHOOD USING WinBUGS
11.5 BAYESIAN VARIABLE SELECTION USING GIBBS-BASED METHODS
11.6 POSTERIOR INFERENCE USING THE OUTPUT OF BAYESIAN VARIABLE SELECTION SAMPLERS
11.7 IMPLEMENTATION OF GIBBS VARIABLE SELECTION IN WinBUGS USING AN ILLUSTRATIVE EXAMPLE
11.8 THE CARLIN–CHIB METHOD
11.9 REVERSIBLE JUMP MCMC (RJMCMC)
11.10 USING POSTERIOR PREDICTIVE DENSITIES FOR MODEL EVALUATION
11.11 INFORMATION CRITERIA
11.12 DISCUSSION AND FURTHER READING
Problems
APPENDIX A: MODEL SPECIFICATION VIA DIRECTED ACYCLIC GRAPHS: THE DOODLE MENU
A.1 INTRODUCTION: STARTING WITH DOODLE
A.2 NODES
A.3 EDGES
A.4 PANELS
A.5 A SIMPLE EXAMPLE
APPENDIX B: THE BATCH MODE: RUNNING A MODEL IN THE BACKGROUND USING SCRIPTS
B.1 INTRODUCTION
B.2 BASIC COMMANDS: COMPILING AND RUNNING THE MODEL
APPENDIX C: CHECKING CONVERGENCE USING CODA/BOA
C.1 INTRODUCTION
C.2 A SHORT HISTORICAL REVIEW
C.3 DIAGNOSTICS IMPLEMENTED BY CODA/BOA
C.4 A FIRST LOOK AT CODA/BOA
C.5 A SIMPLE EXAMPLE
APPENDIX D: NOTATION SUMMARY
D.1 MCMC
D.2 SUBSCRIPTS AND INDICES
D.3 PARAMETERS
D.4 RANDOM VARIABLES AND DATA
D.5 SAMPLE ESTIMATES
D.6 SPECIAL FUNCTIONS, VECTORS, AND MATRICES
D.7 DISTRIBUTIONS
D.8 DISTRIBUTION-RELATED NOTATION
D.9 NOTATION USED IN ANOVA AND ANCOVA
D.10 VARIABLE AND MODEL SPECIFICATION
D.11 DEVIANCE INFORMATION CRITERION (DIC)
D.12 PREDICTIVE MEASURES
REFERENCES
INDEX
WILEY SERIES IN COMPUTATIONAL STATISTICS
Consulting Editors:
Paolo Giudici
University of Pavia, Italy
Geof H. Givens
Colorado State University, USA
Bani K. Mallick
Texas A&M University, USA
Wiley Series in Computational Statistics comprises practical guides and cutting-edge research books on new developments in computational statistics. It features quality authors with a strong applications focus. The texts in the series provide detailed coverage of statistical concepts, methods and case studies in areas at the interface of statistics, computing, and numerics.
With sound motivation and a wealth of practical examples, the books show in concrete terms how to select and to use appropriate ranges of statistical computing techniques in particular fields of study. Readers are assumed to have a basic understanding of introductory terminology.
The series concentrates on applications of computational methods in statistics to fields of bioinformatics, genomics, epidemiology, business, engineering, finance and applied statistics.
A complete list of titles in this series appears at the end of the volume.
Copyright © 2009 by John Wiley & Sons, Inc. All rights reserved.
Published by John Wiley & Sons, Inc., Hoboken, New Jersey.
Published simultaneously in Canada.
No part of this publication may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, electronic, mechanical, photocopying, recording, scanning, or otherwise, except as permitted under Section 107 or 108 of the 1976 United States Copyright Act, without either the prior written permission of the Publisher, or authorization through payment of the appropriate per-copy fee to the Copyright Clearance Center, Inc., 222 Rosewood Drive, Danvers, MA 01923, (978) 750-8400, fax (978) 750-4470, or on the web at www.copyright.com. Requests to the Publisher for permission should be addressed to the Permissions Department, John Wiley & Sons, Inc., 111 River Street, Hoboken, NJ 07030, (201) 748-6011, fax (201) 748-6008, or online at http://www.wiley.com/go/permission.
Limit of Liability/Disclaimer of Warranty: While the publisher and author have used their best efforts in preparing this book, they make no representations or warranties with respect to the accuracy or completeness of the contents of this book and specifically disclaim any implied warranties of merchantability or fitness for a particular purpose. No warranty may be created or extended by sales representatives or written sales materials. The advice and strategies contained herein may not be suitable for your situation. You should consult with a professional where appropriate. Neither the publisher nor author shall be liable for any loss of profit or any other commercial damages, including but not limited to special, incidental, consequential, or other damages.
For general information on our other products and services or for technical support, please contact our Customer Care Department within the United States at (800) 762-2974, outside the United States at (317) 572-3993 or fax (317) 572-4002.
Wiley also publishes its books in a variety of electronic formats. Some content that appears in print may not be available in electronic format. For information about Wiley products, visit our web site at www.wiley.com.
Library of Congress Cataloging-in-Publication Data is available.
Ntzoufras, Ioannis, 1973-
Bayesian modeling using WinBUGS / Ioannis Ntzoufras.
p. cm.
Includes bibliographical references and index.
ISBN 978-0-470-14114-4 (pbk.)
1. Bayesian statistical decision theory. 2. WinBUGS. I. Title.
QA279.5.N89 2009
519.5′42—dc22 2008033316
To Ioanna and our baby daughter
PREFACE
Since the mid-1980s, the development of widely accessible powerful computers and the implementation of Markov chain Monte Carlo (MCMC) methods have led to an explosion of interest in Bayesian statistics and modeling. This was followed by extensive research into new Bayesian methodologies, leading to the practical application of complicated models across a wide range of sciences. During the late 1990s, BUGS came to the foreground. BUGS was free software that could fit complicated models in a relatively easy manner, using standard MCMC methods. Since 1998 or so, WinBUGS, the Windows version of BUGS, has earned great popularity among researchers in diverse scientific fields. As a result, the need has grown for an introductory book on Bayesian models and their implementation via WinBUGS.
The objective of the present book is to offer an introduction to the principles of Bayesian modeling, with emphasis on model building and model implementation using WinBUGS. Detailed examples are provided, ranging from very simple to more advanced and realistic ones. Generalized linear models (GLMs), which are familiar to most students and researchers, are discussed. Details concerning model building, prior specification, writing the WinBUGS code, and the analysis and interpretation of the WinBUGS output are also provided. Because of the introductory character of the book, I have focused on elementary models, starting from normal regression models and moving to generalized linear models. Even more advanced readers, familiar with such models, may benefit from the Bayesian implementation using WinBUGS.
Basic knowledge of probability theory and statistics is assumed. Computations that could not be performed in WinBUGS are illustrated using R. Therefore, a minimal knowledge of R is also required.
This manuscript can be used as the main textbook in a second-level course in Bayesian statistics focusing on modeling and/or computation. Alternatively, it can serve as a companion (to a main textbook) in an introductory course in Bayesian statistics. Finally, because of its structure, postgraduate students and other researchers can complete a self-taught tutorial course on Bayesian modeling by following the material of this book.
All datasets and code used in the book are available on the book’s webpage: www.stat-athens.aueb.gr/~jbn/winbugs_book.
IOANNIS NTZOUFRAS
Athens, Greece
June 29, 2008
ACKNOWLEDGMENTS
I am indebted to the people at Wiley publications for their understanding and assistance during the preparation of the manuscript. Acknowledgments are due to the anonymous referees. Their suggestions and comments led to a substantial improvement of the present book. I would particularly like to thank Dimitris Fouskakis, colleague and good friend, for his valuable comments on an early version of chapters 1-6 and 10-11. I am also grateful to Professor Brani Vidakovic for proposing and motivating this book. Last but not least, I wish to thank my wife Ioanna for her love, support, and patience during the writing of this book as well as for her suggestions on the manuscript.
I. N.
ACRONYMS
ACF | Autocorrelation function |
AIC | Akaike information criterion |
ANOVA | Analysis of variance |
ANCOVA | Analysis of covariance |
AR | Attributable risk |
BF | Bayes factor |
BIC | Bayes information criterion |
BOA | Bayesian output analysis (R package) |
BP | Bivariate Poisson |
BOD | Biological oxygen demand (data variable in example 6.3) |
BUGS | Bayesian inference using Gibbs sampling (software) |
CDF | Cumulative distribution function |
COD | Chemical oxygen demand (data variable in example 6.3) |
CODA | Convergence diagnostics and output analysis software for Gibbs sampling analysis (R package) |
CPO | Conditional predictive ordinate |
CR | Corner (constraint) |
CV | Cross-validation |
CV-1 | Leave-one-out cross-validation |
DAG | Directed acyclic graph |
DI | Dispersion index |
DIBP | Diagonal inflated bivariate Poisson distribution |
DIC | Deviance information criterion |
GLM | Generalized linear model |
GP | Generalized Poisson |
GVS | Gibbs variable selection |
ICPO | Inverse conditional predictive ordinate |
i.i.d. | Independent identically distributed |
LS | Logarithmic score |
MAP | Maximum a posteriori |
MP model | Median probability model |
MCMC | Markov chain Monte Carlo |
MCE | Monte Carlo error |
ML | Maximum likelihood |
MLE | Maximum-likelihood estimate/estimator |
NB | Negative binomial |
OR | Odds ratio |
PBF | Posterior Bayes factor |
PD | Poisson difference |
p.d.f. | Probability density function |
PO | Posterior model odds |
PPO | Posterior predictive ordinate |
RJMCMC | Reversible jump Markov chain Monte Carlo |
RR | Relative risk |
SD | Standard deviation |
SE | Standard error |
SSVS | Stochastic search variable selection |
STZ | Sum-to-zero (constraint) |
TS | Total solids (data variable in example 6.3) |
TVS | Total volatile solids (data variable in example 6.3) |
WinBUGS | Windows version of BUGS (software) |
ZI | Zero inflated |
ZID | Zero inflated distribution |
ZIP | Zero inflated Poisson distribution |
ZINB | Zero inflated negative binomial distribution |
ZIGP | Zero inflated generalized Poisson distribution |
ZIBP | Zero inflated bivariate Poisson distribution |