Contents
Cover
Title Page
Copyright
Preface
Part I: Fuzzy Information
1: Fuzzy data
1.1 One-dimensional fuzzy data
1.2 Vector-valued fuzzy data
1.3 Fuzziness and variability
1.4 Fuzziness and errors
1.5 Problems
2: Fuzzy numbers and fuzzy vectors
2.1 Fuzzy numbers and characterizing functions
2.2 Vectors of fuzzy numbers and fuzzy vectors
2.3 Triangular norms
2.4 Problems
3: Mathematical operations for fuzzy quantities
3.1 Functions of fuzzy variables
3.2 Addition of fuzzy numbers
3.3 Multiplication of fuzzy numbers
3.4 Mean value of fuzzy numbers
3.5 Differences and quotients
3.6 Fuzzy valued functions
3.7 Problems
Part II: Descriptive Statistics for Fuzzy Data
4: Fuzzy samples
4.1 Minimum of fuzzy data
4.2 Maximum of fuzzy data
4.3 Cumulative sum for fuzzy data
4.4 Problems
5: Histograms for fuzzy data
5.1 Fuzzy frequency of a fixed class
5.2 Fuzzy frequency distributions
5.3 Axonometric diagram of the fuzzy histogram
5.4 Problems
6: Empirical distribution functions
6.1 Fuzzy valued empirical distribution function
6.2 Fuzzy empirical fractiles
6.3 Smoothed empirical distribution function
6.4 Problems
7: Empirical correlation for fuzzy data
7.1 Fuzzy empirical correlation coefficient
7.2 Problems
Part III: Foundations of Statistical Inference With Fuzzy Data
8: Fuzzy probability distributions
8.1 Fuzzy probability densities
8.2 Probabilities based on fuzzy probability densities
8.3 General fuzzy probability distributions
8.4 Problems
9: A law of large numbers
9.1 Fuzzy random variables
9.2 Fuzzy probability distributions induced by fuzzy random variables
9.3 Sequences of fuzzy random variables
9.4 Law of large numbers for fuzzy random variables
9.5 Problems
10: Combined fuzzy samples
10.1 Observation space and sample space
10.2 Combination of fuzzy samples
10.3 Statistics of fuzzy data
10.4 Problems
Part IV: Classical Statistical Inference for Fuzzy Data
11: Generalized point estimators
11.1 Estimators based on fuzzy samples
11.2 Sample moments
11.3 Problems
12: Generalized confidence regions
12.1 Confidence functions
12.2 Fuzzy confidence regions
12.3 Problems
13: Statistical tests for fuzzy data
13.1 Test statistics and fuzzy data
13.2 Fuzzy p-values
13.3 Problems
Part V: Bayesian Inference and Fuzzy Information
14: Bayes’ theorem and fuzzy information
14.1 Fuzzy a priori distributions
14.2 Updating fuzzy a priori distributions
14.3 Problems
15: Generalized Bayes’ theorem
15.1 Likelihood function for fuzzy data
15.2 Bayes’ theorem for fuzzy a priori distribution and fuzzy data
15.3 Problems
16: Bayesian confidence regions
16.1 Bayesian confidence regions based on fuzzy data
16.2 Fuzzy HPD-regions
16.3 Problems
17: Fuzzy predictive distributions
17.1 Discrete case
17.2 Discrete models with continuous parameter space
17.3 Continuous case
17.4 Problems
18: Bayesian decisions and fuzzy information
18.1 Bayesian decisions
18.2 Fuzzy utility
18.3 Discrete state space
18.4 Continuous state space
18.5 Problems
Part VI: Regression Analysis and Fuzzy Information
19: Classical regression analysis
19.1 Regression models
19.2 Linear regression models with Gaussian dependent variables
19.3 General linear models
19.4 Nonidentical variances
19.5 Problems
20: Regression models and fuzzy data
20.1 Regression models and fuzzy data
20.2 Generalized estimators for linear regression models based on the extension principle
20.3 Generalized confidence regions for parameters
20.4 Prediction in fuzzy regression models
20.5 Problems
21: Bayesian regression analysis
21.1 Calculation of a posteriori distributions
21.2 Bayesian confidence regions
21.3 Probabilities of hypotheses
21.4 Predictive distributions
21.5 A posteriori Bayes estimators for regression parameters
21.6 Bayesian regression with Gaussian distributions
21.7 Problems
22: Bayesian regression analysis and fuzzy information
22.1 Fuzzy estimators of regression parameters
22.2 Generalized Bayesian confidence regions
22.3 Fuzzy predictive distributions
22.4 Problems
Part VII: Fuzzy Time Series
23: Mathematical concepts
23.1 Support functions of fuzzy quantities
23.2 Distances of fuzzy quantities
23.3 Generalized Hukuhara difference
24: Descriptive methods for fuzzy time series
24.1 Moving averages
24.2 Filtering
24.3 Exponential smoothing
24.4 Components model
24.5 Difference filters
24.6 Generalized Holt–Winter method
24.7 Presentation in the frequency domain
25: More on fuzzy random variables and fuzzy random vectors
25.1 Basics
25.2 Expectation and variance of fuzzy random variables
25.3 Covariance and correlation
25.4 Further results
26: Stochastic methods in fuzzy time series analysis
26.1 Linear approximation and prediction
26.2 Remarks concerning Kalman filtering
Part VIII: Appendices
A1: List of symbols and abbreviations
A2: Solutions to the problems
Chapter 1
Chapter 2
Chapter 3
Chapter 4
Chapter 5
Chapter 6
Chapter 7
Chapter 8
Chapter 9
Chapter 10
Chapter 11
Chapter 12
Chapter 13
Chapter 14
Chapter 15
Chapter 16
Chapter 17
Chapter 18
Chapter 19
Chapter 20
Chapter 21
Chapter 22
A3: Glossary
A4: Related literature
References
Index
This edition first published 2011
© 2011 John Wiley & Sons, Ltd
Registered office
John Wiley & Sons Ltd, The Atrium, Southern Gate, Chichester, West Sussex, PO19 8SQ, United Kingdom
For details of our global editorial offices, for customer services and for information about how to apply for permission to reuse the copyright material in this book please see our website at www.wiley.com.
The right of the author to be identified as the author of this work has been asserted in accordance with the Copyright, Designs and Patents Act 1988.
All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, electronic, mechanical, photocopying, recording or otherwise, except as permitted by the UK Copyright, Designs and Patents Act 1988, without the prior permission of the publisher.
Wiley also publishes its books in a variety of electronic formats. Some content that appears in print may not be available in electronic books.
Designations used by companies to distinguish their products are often claimed as trademarks. All brand names and product names used in this book are trade names, service marks, trademarks or registered trademarks of their respective owners. The publisher is not associated with any product or vendor mentioned in this book. This publication is designed to provide accurate and authoritative information in regard to the subject matter covered. It is sold on the understanding that the publisher is not engaged in rendering professional services. If professional advice or other expert assistance is required, the services of a competent professional should be sought.
Library of Congress Cataloging-in-Publication Data
Viertl, R. (Reinhard)
Statistical methods for fuzzy data / Reinhard Viertl.
p. cm.
Includes bibliographical references and index.
ISBN 978-0-470-69945-4 (cloth)
1. Fuzzy measure theory. 2. Fuzzy sets. 3. Mathematical statistics. I. Title.
QA312.5.V54 2010
515′.42–dc22
2010031105
A catalogue record for this book is available from the British Library.
Print ISBN: 978-0-470-69945-4
ePDF ISBN: 978-0-470-97442-1
oBook ISBN: 978-0-470-97441-4
ePub ISBN: 978-0-470-97456-8
Preface
Statistics is concerned with the analysis of data and the estimation of probability distributions and stochastic models. A quantitative description of data is therefore essential for statistics.
In standard statistics, data are assumed to be numbers, vectors or classical functions. In applications, however, real data are frequently not precise numbers or vectors but more or less imprecise, also called fuzzy. It is important to note that this kind of uncertainty is different from errors; it is the imprecision of individual observations or measurements.
Whereas counting data can be precise, possibly biased by errors, measurement data of continuous quantities such as length, time, volume, concentrations of poisons, amounts of chemicals released to the environment and others are never precise real numbers but are always connected with imprecision.
In measurement analysis, statistical models are usually used to describe data uncertainty. But statistical models describe variability, not the imprecision of individual measurement results. Therefore other models are necessary to quantify the imprecision of measurement results.
For a special kind of data, e.g. data from digital instruments, interval arithmetic can be used to describe the propagation of data imprecision in statistical inference. But there are data of a more general form than intervals, e.g. data obtained from analog instruments or data from oscillographs, or graphical data like color intensity pictures. Therefore it is necessary to have a more general numerical model to describe measurement data.
The most up-to-date concept for this is special fuzzy subsets of the set ℝ of real numbers, or special fuzzy subsets of the k-dimensional Euclidean space ℝ^k in the case of vector quantities. These special fuzzy subsets are called nonprecise numbers in the one-dimensional case and nonprecise vectors in the k-dimensional case for k > 1. Nonprecise numbers are defined by so-called characterizing functions and nonprecise vectors by so-called vector-characterizing functions. These are generalizations of indicator functions of classical sets in standard set theory. The concept of fuzzy numbers from fuzzy set theory is too restrictive to describe real data. Therefore nonprecise numbers are introduced.
Because fuzzy data require a quantitative description, statistical methods must be adapted to the situation of fuzzy data. This is possible, and generalized statistical procedures for fuzzy data are described in this book.
There are also other approaches for the analysis of fuzzy data. Here an approach from the viewpoint of applications is used. Other approaches are mentioned in Appendix A4.
Besides fuzziness of data there is also fuzziness of a priori distributions in Bayesian statistics. So-called fuzzy probability distributions can be used to model nonprecise a priori knowledge concerning parameters in statistical models.
In the text the necessary foundations of fuzzy models are explained and basic statistical analysis methods for fuzzy samples are described. These include generalized classical statistical procedures as well as generalized Bayesian inference procedures.
A software system for statistical analysis of fuzzy data (AFD) is under development. Some procedures are already available, and others are in progress. The available software can be obtained from the author.
Last but not least I want to thank all persons who contributed to this work: Dr D. Hareter, Mr H. Schwarz, Mrs D. Vater, Dr I. Meliconi, H. Kay, P. Sinha-Sahay and B. Kaur from Wiley for the excellent cooperation, and my wife Dorothea for preparing the files for the last two parts of this book.
I hope the readers will enjoy the text.
Reinhard Viertl
Vienna, Austria
July 2010
Part I
FUZZY INFORMATION
Fuzzy information is a special kind of information and information is an omnipresent word in our society. But in general there is no precise definition of information.
However, in the context of statistics, which is connected to uncertainty, a possible definition of information is the following: Information is everything which has influence on the assessment of uncertainty by an analyst. This uncertainty can be of different types: data uncertainty, nondeterministic quantities, model uncertainty, and uncertainty of a priori information.
Measurement results and observational data are special forms of information. Such data are frequently not precise numbers but more or less nonprecise, also called fuzzy. Such data will be considered in the first chapter.
Another kind of information is probabilities. Standard probability theory considers probabilities to be numbers. Often this is not realistic, and in a more general approach probabilities are considered to be so-called fuzzy numbers.
The idea of generalized sets was originally published in Menger (1951) and the term ‘fuzzy set’ was coined in Zadeh (1965).
1
Fuzzy data
All kinds of data which cannot be presented as precise numbers or cannot be precisely classified are called nonprecise or fuzzy. Examples are data in the form of linguistic descriptions like high temperature, low flexibility and high blood pressure. Also, precision measurement results of continuous variables are not precise numbers but always more or less fuzzy.
1.1 One-dimensional fuzzy data
Measurement results of one-dimensional continuous quantities are frequently idealized to be numbers times a measurement unit. However, real measurement results of continuous quantities are never precise numbers but are always connected with uncertainty. Usually this uncertainty is considered to be statistical in nature, but this is not appropriate, since statistical models describe variability. For a single measurement result there is no variability; therefore another method is necessary to model the measurement uncertainty of individual measurement results. The best up-to-date mathematical model for this is so-called fuzzy numbers, which are described in Section 2.1 [cf. Viertl (2002)].
Examples of one-dimensional fuzzy data are lifetimes of biological units, length measurements, volume measurements, height of a tree, water levels in lakes and rivers, speed measurements, mass measurements, concentrations of dangerous substances in environmental media, and so on.
A special kind of one-dimensional fuzzy data is data in the form of intervals [a;b] ⊆ ℝ. Such data are generated by digital measurement equipment, because it displays only a finite number of digits.
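As a minimal sketch of how a digital reading corresponds to an interval, the following Python function maps a displayed value to the interval of values consistent with it. The function name `reading_to_interval` is hypothetical, and the sketch assumes (this is not stated in the text) that the instrument rounds to the nearest displayed digit; an instrument that truncates would give a different interval.

```python
from decimal import Decimal


def reading_to_interval(display: str) -> tuple[Decimal, Decimal]:
    """Return the interval [a; b] of values consistent with a displayed reading.

    Assumption (not from the text): the instrument rounds to the nearest
    displayed digit, so the true value lies within half a unit of the
    last displayed digit on either side of the reading.
    """
    value = Decimal(display)
    # Exponent of the last displayed digit, e.g. -2 for "3.14".
    exponent = value.as_tuple().exponent
    # Half of one unit in the last displayed digit, e.g. 0.005 for "3.14".
    half_step = Decimal(1).scaleb(exponent) / 2
    return value - half_step, value + half_step


a, b = reading_to_interval("3.14")
print(f"reading 3.14 corresponds to the interval [{a}; {b}]")
```

The reading is passed as a string so that the number of displayed digits is preserved; a float would lose trailing zeros such as those in "3.10".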
1.2 Vector-valued fuzzy data
Many statistical data are multivariate, i.e. ideally the corresponding measurement results are real vectors (x1, … , xk) ∈ ℝ^k. In applications such data are frequently not precise vectors but to some degree fuzzy. A mathematical model for this kind of data is so-called fuzzy vectors, which are formalized in Section 2.2.
Examples of vector-valued fuzzy data are locations of objects in space like positions of ships on radar screens, space–time data, and multivariate nonprecise data in the form of vectors (x1*,…,xn*) of fuzzy numbers xi*.
1.3 Fuzziness and variability
In statistics frequently so-called stochastic quantities (also called random variables) are observed, where the observed results are fuzzy. In this situation two kinds of uncertainty are present: Variability, which can be modeled by probability distributions, also called stochastic models, and fuzziness, which can be modeled by fuzzy numbers and fuzzy vectors, respectively. It is important to note that these are two different kinds of uncertainty. Moreover it is necessary to describe fuzziness of data in order to obtain realistic results from statistical analysis. In Figure 1.1 the situation is graphically outlined.
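The distinction between the two kinds of uncertainty can be illustrated with interval data, the simplest fuzzy data mentioned in Section 1.1. The sketch below is only an illustration under simplifying assumptions, not the method developed in this book. It uses the spread of the interval midpoints as a crude stand-in for variability between observations, and the mean interval width as a crude stand-in for the fuzziness of individual observations. The function names and the sample values are hypothetical.

```python
from statistics import mean, stdev

# One observation is an interval (lower bound, upper bound).
Interval = tuple[float, float]


def variability_of_midpoints(sample: list[Interval]) -> float:
    """Sample standard deviation of the interval midpoints.

    Describes how much the observations differ from one another.
    """
    midpoints = [(a + b) / 2 for a, b in sample]
    return stdev(midpoints)


def mean_width(sample: list[Interval]) -> float:
    """Average interval width.

    Describes how imprecise each individual observation is.
    """
    return mean(b - a for a, b in sample)


# Hypothetical water-level readings in metres, each known only as an interval.
sample = [(1.0, 1.2), (2.0, 2.2), (3.0, 3.2)]
print("variability between observations:", variability_of_midpoints(sample))
print("imprecision of single observations:", mean_width(sample))
```

The two summaries can change independently: a more precise instrument narrows the intervals without changing how much the quantity itself varies.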
Real data are also subject to a third kind of uncertainty: errors. These are the subject of Section 1.4.
1.4 Fuzziness and errors
In standard statistics errors are modeled in the following way. The observation y of a stochastic quantity is not its true value x, but x superimposed by a quantity e, called error, i.e.
y = x + e.
The error is considered as the realization of another stochastic quantity. These kinds of errors are denoted as random errors.
For one-dimensional quantities, all three quantities x, y, and e are, after the experiment, real numbers. This idealization is not suitable for continuous variables, however, because the observed values y are fuzzy.
It is important to note that all three kinds of uncertainty are present in real data. Therefore it is necessary to generalize the mathematical operations for real numbers to the situation of fuzzy numbers.
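For interval data, the simplest case of fuzzy data, such a generalization is given by standard interval arithmetic: the sum of [a1;b1] and [a2;b2] is [a1+a2; b1+b2]. The following Python sketch shows addition and the mean value of intervals under that rule. The class `Interval` and the function `interval_mean` are hypothetical names chosen for illustration; the operations for general fuzzy numbers are the subject of Chapter 3.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Interval:
    """Closed interval [lower; upper] used as the simplest fuzzy datum."""

    lower: float
    upper: float

    def __add__(self, other: "Interval") -> "Interval":
        # Interval addition: add the lower bounds and the upper bounds.
        return Interval(self.lower + other.lower, self.upper + other.upper)

    def scale(self, factor: float) -> "Interval":
        # Restricted to nonnegative factors so the bounds keep their order.
        if factor < 0:
            raise ValueError("factor must be nonnegative")
        return Interval(self.lower * factor, self.upper * factor)


def interval_mean(data: list[Interval]) -> Interval:
    """Mean value of a nonempty list of intervals: sum, then scale by 1/n."""
    total = data[0]
    for item in data[1:]:
        total = total + item
    return total.scale(1 / len(data))


observations = [Interval(1.0, 2.0), Interval(3.0, 4.0)]
print(interval_mean(observations))
```

The mean of interval data is itself an interval, so the imprecision of the individual observations is carried through to the result instead of being discarded.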
1.5 Problems
a. Find examples of fuzzy numerical data which are not given in Section 1.1 and Section 1.2.
b. Work out the difference between stochastic uncertainty and fuzziness of individual observations.
c. Make clear how data in the form of intervals are obtained by digital measurement devices.
d. What do X-ray pictures and data from satellite photographs have in common?