This edition first published 2016 © 2016 by John Wiley & Sons Ltd
Registered office:
John Wiley & Sons, Ltd, The Atrium, Southern Gate, Chichester,
West Sussex, PO19 8SQ, UK
111 River Street, Hoboken, NJ 07030-5774, USA
For details of our global editorial offices, for customer services and for information about how to apply for permission to reuse the copyright material in this book please see our website at www.wiley.com/wiley-blackwell.
The right of the author to be identified as the author of this work has been asserted in accordance with the UK Copyright, Designs and Patents Act 1988.
All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, electronic, mechanical, photocopying, recording or otherwise, except as permitted by the UK Copyright, Designs and Patents Act 1988, without the prior permission of the publisher.
Designations used by companies to distinguish their products are often claimed as trademarks. All brand names and product names used in this book are trade names, service marks, trademarks or registered trademarks of their respective owners. The publisher is not associated with any product or vendor mentioned in this book.
Limit of Liability/Disclaimer of Warranty: While the publisher and author(s) have used their best efforts in preparing this book, they make no representations or warranties with respect to the accuracy or completeness of the contents of this book and specifically disclaim any implied warranties of merchantability or fitness for a particular purpose. It is sold on the understanding that the publisher is not engaged in rendering professional services and neither the publisher nor the author shall be liable for damages arising herefrom. If professional advice or other expert assistance is required, the services of a competent professional should be sought.
Library of Congress Cataloging-in-Publication Data
Gierliński, Marek, author.
Understanding statistical error: a primer for biologists / Marek Gierliński.
p. ; cm.
Includes bibliographical references and index.
ISBN 978-1-119-10691-3 (pbk.)
I. Title.
[DNLM: 1. Statistics as Topic. 2. Analysis of Variance. 3. Biostatistics. 4. Computational
Biology–methods. 5. Probability. 6. Statistical Distributions. WA 950]
R853.S7
610.72′7–dc23
2015024748
A catalogue record for this book is available from the British Library.
Wiley also publishes its books in a variety of electronic formats. Some content that appears in print may not be available in electronic books.
Cover image: © Lonely_/iStockphoto
Errors, like straws, upon the surface flow;
He who would search for pearls must dive below
—John Dryden (1631–1700)
Almost every nonfiction book is preceded by an ‘introduction’, a ‘preface’, a ‘foreword’ or sometimes a combination of these. If you are (un)lucky, you might find a note from the Editor, a foreword followed by the preface to the first edition, a preface to the second edition and a general introduction. There, first of all, you can read about how great the author is. Next, you will find that the book is unique and better than all other books on the topic written so far. Then, the author will delve into a painstakingly detailed description of each chapter, which, by the way, can be found in the table of contents. Finally, there is time for compulsory acknowledgements to all the family and friends whom the author forced into reading his or her magnum opus. There is no escaping; forewords, prefaces and introductions are everywhere. Stanisław Lem once wrote a book consisting entirely of forewords (Lem 1979).
People usually skip all of these intros as they are boring, pretentious, self-righteous and useless. All right, are you still with me? If you managed to get this far, you might be one of the few who actually read introductions. Very well, then. I'll try to be brief, to the point and not too conceited.
As the title suggests, the book is about error analysis, with emphasis on applications in biology or, more generally, in the life sciences. Since the time of the great Ronald Fisher, statistics has become an inherent part of biology. Very few numerical results from either biological or medical studies can make their way into publication without confirming their statistical significance. One way of doing this is by providing a p-value from a statistical test, or – roughly speaking – a probability of being wrong in a particular statement. That is what this book is not about.
The other way of assessing the significance of a result is by finding its inherent error, or uncertainty. In my mind, a numerical result quoted without any kind of uncertainty is meaningless. Hence, it is good to know how to calculate errors. And that is what the book is about.
Here I discuss various aspects of error analysis: a bit of theoretical background and practical ways of calculating confidence intervals, but also graphical presentation of error bars and quoting numbers with errors. I put emphasis on intuition and understanding rather than practical computational recipes, although I give exact formulae for types of errors. Beware: this is not a comprehensive book on statistics; rather, it focuses on practical understanding of uncertainty analysis. You can find more details in the table of contents, right after the introduction.
This book is written for an inquisitive biologist who wants to improve his or her understanding of data analysis. While a biologist is my target reader, the book may be useful for anyone who deals with numerical data and wants to learn more about how to evaluate and compare measurements. If you calculate various types of errors using a software package and you would like to find out where these errors come from, this book is for you. If you use standard deviations, standard errors and confidence intervals, but you are not sure what they really mean, this book is for you. If you struggle with finding errors of the median or correlation coefficient, this book is for you. Or, perhaps you are just curious and would like to learn a few basic things about uncertainty analysis – this book is also for you.
Although a few books take a purely intuitive approach (e.g. Motulsky 2010), I believe that it is very difficult to do statistics without maths. Plain English explanations cannot replace the strict precision of a mathematical equation. A simple derivation can explain where a given formula came from. Hence, there is maths in this book. Not very complex, not very extensive, but maths there is.
Needless to say, equations are required in practical applications, so if you need to find a particular uncertainty not provided by the statistical software you normally use, you can employ equations from this book. They can be easily encoded, either in any programming language or even in a computer spreadsheet. The mathematics in this book is quite basic; it doesn't really go beyond the level taught in a typical secondary school. Most equations contain simple algebra and sums. The most advanced operator I use is a derivative.
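As a rough illustration of what encoding an equation might look like, here is a minimal sketch in Python. It assumes the usual textbook definitions of the sample standard deviation (with n − 1 in the denominator) and the standard error of the mean (standard deviation divided by the square root of n). The function names and the sample numbers are hypothetical and chosen only for this example.

```python
import math


def mean(values):
    """Arithmetic mean: the sum of the values divided by their count."""
    return sum(values) / len(values)


def standard_deviation(values):
    """Sample standard deviation, assuming the n - 1 denominator."""
    m = mean(values)
    squared_deviations = sum((x - m) ** 2 for x in values)
    return math.sqrt(squared_deviations / (len(values) - 1))


def standard_error(values):
    """Standard error of the mean: standard deviation over sqrt(n)."""
    return standard_deviation(values) / math.sqrt(len(values))


# Hypothetical measurements, invented for illustration only.
measurements = [2, 4, 4, 4, 5, 5, 7, 9]

print("mean =", mean(measurements))
print("SD   =", round(standard_deviation(measurements), 3))
print("SE   =", round(standard_error(measurements), 3))
```

Each function contains nothing beyond a sum, a division and a square root, so the same calculation could just as well be typed into a spreadsheet.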
I don't want to scare potential readers away. This is not a mathematical textbook! I apply equations only when necessary and I always try to accompany them with an intuitive explanation. Often, I show the results of a computer simulation to illustrate the meaning of a concept or formula. I have also made a few simplifications and approximations here and there at the expense of mathematical correctness. I hope this makes the maths in this book much easier to understand.
I need to finish with a caveat. This is a book written primarily for biologists, not for mathematicians or physicists. Hence, there are no mathematical proofs, some derivations are not strict and there is a general lack of mathematical rigour. A mathematician might scowl at the content of this book, so if you are one, please shut your eyes now.
I would like to thank Professor Angus Lamond, who carefully read the manuscript from cover to cover and gave me a great many invaluable comments. Being a biologist, he helped me to better understand my target reader (you!). He also helped me with my English, which is not my first language.