This edition first published 2017
© 2017 John Wiley & Sons, Ltd

All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, electronic, mechanical, photocopying, recording or otherwise, except as permitted by law. Advice on how to obtain permission to reuse material from this title is available at http://www.wiley.com/go/permissions.

The right of Bangjun Lei, Dick de Ridder, David M. J. Tax, Ferdinand van der Heijden, Guangzhu Xu, Ming Feng, Yaobin Zou to be identified as the authors of this work has been asserted in accordance with law.

Registered Offices

John Wiley & Sons, Inc., 111 River Street, Hoboken, NJ 07030, USA
John Wiley & Sons, Ltd., The Atrium, Southern Gate, Chichester, West Sussex, PO19 8SQ, UK

Editorial Office
The Atrium, Southern Gate, Chichester, West Sussex, PO19 8SQ, UK

For details of our global editorial offices, customer services, and more information about Wiley products visit us at www.wiley.com.

Wiley also publishes its books in a variety of electronic formats and by print-on-demand. Some content that appears in standard print versions of this book may not be available in other formats.

Limit of Liability/Disclaimer of Warranty:
MATLAB® is a trademark of The MathWorks, Inc. and is used with permission. The MathWorks does not warrant the accuracy of the text or exercises in this book. This work's use or discussion of MATLAB® software or related products does not constitute endorsement or sponsorship by The MathWorks of a particular pedagogical approach or particular use of the MATLAB® software. While the publisher and authors have used their best efforts in preparing this work, they make no representations or warranties with respect to the accuracy or completeness of the contents of this work and specifically disclaim all warranties, including without limitation any implied warranties of merchantability or fitness for a particular purpose. No warranty may be created or extended by sales representatives, written sales materials or promotional statements for this work. The fact that an organization, website, or product is referred to in this work as a citation and/or potential source of further information does not mean that the publisher and authors endorse the information or services the organization, website, or product may provide or recommendations it may make. This work is sold with the understanding that the publisher is not engaged in rendering professional services. The advice and strategies contained herein may not be suitable for your situation. You should consult with a specialist where appropriate. Further, readers should be aware that websites listed in this work may have changed or disappeared between when this work was written and when it is read. Neither the publisher nor authors shall be liable for any loss of profit or any other commercial damages, including but not limited to special, incidental, consequential, or other damages.

Library of Congress Cataloging-in-Publication Data

Names: Heijden, Ferdinand van der. | Lei, Bangjun, 1973– author. | Xu, Guangzhu, 1979– author.
| Ming, Feng, 1957– author. | Zou, Yaobin, 1978– author. | Ridder, Dick de, 1971– author. | Tax,
David M. J., 1973– author.
Title: Classification, parameter estimation, and state estimation : an engineering approach using
MATLAB / Bangjun Lei, Guangzhu Xu, Ming Feng, Yaobin Zou, Ferdinand van der Heijden,
Dick de Ridder, David M. J. Tax.
Description: Second edition. | Hoboken, NJ, USA : John Wiley & Sons, Inc., 2017. | Revised edition
of: Classification, parameter estimation, and state estimation : an engineering approach using
MATLAB / F. van der Heijden … [et al.]. 2004. | Includes bibliographical references and index.
Identifiers: LCCN 2016059294 (print) | LCCN 2016059809 (ebook) | ISBN 9781119152439 (cloth)
| ISBN 9781119152446 (pdf) | ISBN 9781119152453 (epub)
Subjects: LCSH: Engineering mathematics--Data processing. | MATLAB. | Measurement--Data
processing. | Estimation theory--Data processing.
Classification: LCC TA331 .C53 2017 (print) | LCC TA331 (ebook) | DDC 681/.2--dc23
LC record available at https://lccn.loc.gov/2016059294

Cover Design: Wiley
Cover Images: neural network © maxuser/Shutterstock; digital circuit board
© Powderblue/Shutterstock

CONTENTS

Preface

Note

Acknowledgements

About the Companion Website

1 Introduction

1.1 The Scope of the Book
1.2 Engineering
1.3 The Organization of the Book
1.4 Changes from First Edition
1.5 References
Note

2 PRTools Introduction

2.1 Motivation
2.2 Essential Concepts
2.3 PRTools Organization Structure and Implementation
2.4 Some Details about PRTools
2.5 Selected Bibliography

3 Detection and Classification

3.1 Bayesian Classification
3.2 Rejection
3.3 Detection: The Two-Class Case
3.4 Selected Bibliography
Exercises

4 Parameter Estimation

4.1 Bayesian Estimation
4.2 Performance Estimators
4.3 Data Fitting
4.4 Overview of the Family of Estimators
4.5 Selected Bibliography
Exercises
Notes

5 State Estimation

5.1 A General Framework for Online Estimation
5.2 Infinite Discrete-Time State Variables
5.3 Finite Discrete-Time State Variables
5.4 Mixed States and the Particle Filter
5.5 Genetic State Estimation
5.6 State Estimation in Practice
5.7 Selected Bibliography
Exercises

6 Supervised Learning

6.1 Training Sets
6.2 Parametric Learning
6.3 Non-parametric Learning
6.4 Adaptive Boosting – Adaboost
6.5 Convolutional Neural Networks (CNNs)
6.6 Empirical Evaluation
6.7 Selected Bibliography
Exercises
Note

7 Feature Extraction and Selection

7.1 Criteria for Selection and Extraction
7.2 Feature Selection
7.3 Linear Feature Extraction
7.4 References
Exercises

8 Unsupervised Learning

8.1 Feature Reduction
8.2 Clustering
8.3 References
Exercises
Note

9 Worked Out Examples

9.1 Example on Image Classification with PRTools
9.2 Boston Housing Classification Problem
9.3 Time-of-Flight Estimation of an Acoustic Tone Burst
9.4 Online Level Estimation in a Hydraulic System
9.5 References

A Topics Selected from Functional Analysis

A.1 Linear Spaces
A.2 Metric Spaces
A.3 Orthonormal Systems and Fourier Series
A.4 Linear Operators
A.5 Selected Bibliography
Notes

B Topics Selected from Linear Algebra and Matrix Theory

B.1 Vectors and Matrices
B.2 Convolution
B.3 Trace and Determinant
B.4 Differentiation of Vector and Matrix Functions
B.5 Diagonalization of Self-Adjoint Matrices
B.6 Singular Value Decomposition (SVD)
B.7 Selected Bibliography
Note

C Probability Theory

C.1 Probability Theory and Random Variables
C.2 Bivariate Random Variables
C.3 Random Vectors
C.4 Selected Bibliography
Notes

D Discrete-Time Dynamic Systems

D.1 Discrete-Time Dynamic Systems
D.2 Linear Systems
D.3 Linear Time-Invariant Systems
Selected Bibliography

Index

EULA

List of Illustrations

Chapter 1

Figure 1.1 Licence plate recognition: a classification problem with noisy measurements.
Figure 1.2 Position measurement: a parameter estimation problem handling uncertainties.
Figure 1.3 Assessment of water levels in a water management system: a state estimation problem (the data are obtained from a scale model).
Figure 1.4 Relations between the subjects.
Figure 1.5 An elementary step in the design process (Finkelstein and Finkelstein, 1994).

Chapter 2

Figure 2.1 The workflow of PRTools.

Chapter 3

Figure 3.1 Pattern classification.
Figure 3.2 Classification of mechanical parts: (a) image of various objects, (b) scatter diagram.
Figure 3.3 Statistical pattern classification.
Figure 3.4 Probability densities of the measurements shown in Figure 3.3. (a) The 3D plot of the unconditional density together with a 2D contour plot of this density on the ground plane. (b) 2D contour plots of the conditional probability densities.
Figure 3.5 Bayes classification. (a) With prior probabilities: P(bolt) = 0.21, P(nut) = 0.30, P(ring) = 0.29, and P(scrap) = 0.20. (b) With increased prior probability for scrap: P(scrap) = 0.50. (c) With uniform cost function.
Figure 3.6 Bayes’ decision function with the uniform cost function (MAP classification).
Figure 3.7 Minimum Mahalanobis distance classification. (a) Scatter diagram with contour plot of the conditional probability densities. (b) Decision boundaries.
Figure 3.8 Decision boundary of a minimum distance classifier.
Figure 3.9 Minimum distance classification. (a) Scatter diagram with a contour plot of the conditional probability densities. (b) Decision boundaries.
Figure 3.10 Classification of objects with equal expectation vectors. (a) Rotational symmetric conditional probability densities. (b) Conditional probability densities with different orientations (see text).
Figure 3.11 Bayes’ classification with the reject option included.
Figure 3.12 The conditional probability densities of the log-likelihood ratio in the Gaussian case with C₁ = C₂ = C.
Figure 3.13 Performance of a detector in the Gaussian case with equal covariance matrices. (a) P_miss, P_det and P_fa versus the threshold T. (b) P_det versus P_fa as a parametric plot of T.
Figure 3.14 Quality inspection system for the recycling of bottles.
Figure 3.15 Acquired images of two different bottles. (a) Image of the mouth of a new bottle. (b) Image of the mouth of an older bottle with clearly visible intrusions.
Figure 3.16 Estimated performance of the bottle inspector. (a) The conditional probability densities of the log-likelihood ratio. (b) The ROC curve.

Chapter 4

Figure 4.1 Parameter estimation.
Figure 4.2 Different realizations of the backscattering coefficient and its corresponding measurement.
Figure 4.3 Parameter estimation.
Figure 4.4 Probability densities for the backscattering coefficient. (a) Prior density p(x). (b) Conditional density p(z|x) with N_probes = 8. The two axes have been scaled with x and 1/x, respectively, to obtain invariance with respect to x.
Figure 4.5 Three different Bayesian estimators.
Figure 4.6 MAP estimation, ML estimation and linear MMSE estimation.
Figure 4.7 The bias and the variance of the various estimators in the backscattering problem.
Figure 4.8 LS estimation of the diameter D and the position y₀ of a blood vessel. (a) X-ray image of the blood vessel. (b) Cross-section of the image together with a fitted profile. (c) The sum of least squares errors as a function of the diameter and the position.
Figure 4.9 A robust error norm and its derivative.
Figure 4.10 LS estimation of the diameter D and the position y₀. (a) Cross-section of the image together with a profile fitted with the LS norm. (b) The LS norm as a function of the diameter and the position.
Figure 4.11 Robust estimation of the diameter D and the position y₀. (a) Cross-section of the image together with a profile fitted with a robust error norm. (b) The robust error norm as a function of the diameter and the position.
Figure 4.12 Determination of a calibration curve by means of polynomial regression.
Figure 4.13 A family tree of estimators.

Chapter 5

Figure 5.1 A density control system for the process industry.
Figure 5.2 An overview of online estimation.
Figure 5.3 Motion tracking: both system and measurement errors are white noises with covariance 1.
Figure 5.4 Motion tracking: measurement errors are white noise with covariance 4.
Figure 5.5 Linearized Kalman filtering applied to the volume density estimation problem.
Figure 5.6 Extended Kalman filtering for the volume density estimation problem.
Figure 5.7 Iterated extended Kalman filtering for the volume density estimation problem. (a) Measurements. (b) Results from the EKF. (c) Results from the iterated EKF (number of iterations = 20).
Figure 5.8 A three-state ergodic Markov model.
Figure 5.9 A four-state left–right model.
Figure 5.10 Licence plate detection.
Figure 5.11 Definitions of the measurements associated with a video line.
Figure 5.12 States and measurements of a video line.
Figure 5.13 Online state estimation.
Figure 5.14 Detected licence plate pixels using online estimation.
Figure 5.15 Offline state estimation.
Figure 5.16 Detected licence plate pixels using offline estimation. (a) Individually estimated states. (b) Jointly estimated states.
Figure 5.17 Representation of a probability density. (a) A density p(x). (b) The proposal density q(x). (c) The 40 samples of q(x). (d) Importance sampling of p(x) using the 40 samples of q(x). (e) Selected samples from (d) as an equally weighted sample representation of p(x).
Figure 5.18 Application of particle filtering to the density estimation problem. (a) Real states and measurements. (b) The particles obtained at i = 511. (c) Results.
Figure 5.19 A general GA-based state estimator.
Figure 5.20 GA-based state estimation for the jet engine compressor model, Ψ_J(0) = 0.5, R_J(0) = 0.1 and Φ_J(0) = 0.8. Plot 1 corresponds to Ψ_J, plot 2 corresponds to Φ_J and plot 3 corresponds to R_J. The dotted lines in the right two plots indicate the actual state, while the solid lines indicate the state estimates. The dash–dot line in plot 1 indicates the reference input for Ψ_J (adopted from James et al., 2000).
Figure 5.21 Design stages for state estimators.

Chapter 6

Figure 6.1 Training sets. (a) Labelled. (b) Unlabelled.
Figure 6.2 Classification assuming Gaussian distributions. (a) Linear decision boundaries. (b) Quadratic decision boundaries.
Figure 6.3 Parzen estimation of a density function using 50 samples: (a) σ_h = 1 and (b) σ_h = 0.2.
Figure 6.4 Probability densities of the measurements shown in Figure 6.1. (a) The 3D plot of the Parzen estimate of the unconditional density together with a 2D contour plot of this density on the ground plane. The parameter σ_h was set to 0.0486. (b) The resulting decision boundaries. (c) Same as (a), but with σ_h set to 0.0176. (d) Same as (b), for the density estimate shown in (c).
Figure 6.5 Application of κ-NNR classification: (a) κ = 7, (b) κ = 1.
Figure 6.6 Application of editing and condensing. (a) Edited training set. (b) Edited and condensed training set.
Figure 6.7 Training by means of performance optimization.
Figure 6.8 The perceptron.
Figure 6.9 Application of two linear classifiers. (a) Linear perceptron. (b) Least squares error classifier.
Figure 6.10 The linear support vector classifier.
Figure 6.11 Application of two support vector classifiers. (a) Polynomial kernel, d = 2, C = 100. (b) Gaussian kernel, σ = 0.1, C = 100.
Figure 6.12 A two-layer feedforward neural network with two input dimensions and one output (for presentation purposes, not all connections have been drawn).
Figure 6.13 Application of Adaboost with a linear perceptron as the weak classifier. (a) 200 base classifiers. (b) The classification result of the strong classifier obtained by combining the 200 weak classifiers.
Figure 6.14 Application of Adaboost with the decision stump as a weak classifier. (a) 200 base classifiers. (b) The classification result of the strong classifier obtained by combining the 200 weak classifiers.
Figure 6.15 An illustration of CNN architecture (from Wikipedia). Source: Lei, https://en.wikipedia.org/wiki/Convolutional_neural_network#/media/File:Typical_cnn.png. Used under CC-BY-SA 3.0 https://en.wikipedia.org/wiki/Wikipedia:Text_of_Creative_Commons_Attribution-ShareAlike_3.0_Unported_License.
Figure 6.16 Illustration of max pooling.
Figure 6.17 Application of two neural networks. (a) One hidden layer of five units. (b) Learn curves of (a). (c) Two hidden layers of 100 units each. (d) Learn curves of (c).
Figure 6.18 Required number of test samples.

Chapter 7

Figure 7.1 Error rates versus dimension of measurement space.
Figure 7.2 The interclass/intraclass distance.
Figure 7.3 Error bounds and the true minimum error for the Gaussian case (C₁ = C₂). (a) The minimum error rate with some bounds given by the Chernoff distance. In this example the bound with s = 0.5 (Bhattacharyya upper bound) is the most tight. The figure also shows the Bhattacharyya lower bound. (b) The Chernoff bound with dependence on s.
Figure 7.4 A top-down tree structure for feature selection.
Figure 7.5 A bottom-up tree structure for feature selection.
Figure 7.6 Character classifications for licence plate recognition. (a) Character sets from licence plates, before and after normalization. (b) Selected features. The number of features is 18 and 50, respectively.
Figure 7.7 Feature extraction.
Figure 7.8 Linear feature extractions with equal expectation vectors. (a) Covariance matrices with decision function. (b) Whitening of ω₁ samples. (c) Decorrelation of ω₂ samples. (d) Decision function based on one linear feature.
Figure 7.9 Feature extraction based on the interclass/intraclass distance (see Figure 7.2). (a) The within and between scatters after simultaneous decorrelation. (b) Linear feature extraction.
Figure 7.10 Feature extraction in the licence plate application. (a) The inter/intra distance as a function of D. (b) First 24 eigenvectors in W are depicted as 15 × 11 pixel images.

Chapter 8

Figure 8.1 Principal component analyses.
Figure 8.2 Application of PCA to image compression.
Figure 8.3 MDS applied to a matrix of geodesic distances of world cities.
Figure 8.4 Input points before kernel PCA.
Figure 8.5 Output after kernel PCA with k(x, y) = (x^Ty + 1)². The three groups are distinguishable using the first component only (https://en.wikipedia.org/wiki/File:Kernel_pca_output.png).
Figure 8.6 Output after kernel PCA with a Gaussian kernel (https://en.wikipedia.org/wiki/File:Kernel_pca_output_gaussian.png). Source: Lei, https://en.wikipedia.org/wiki/Kernel_principal_component_analysis. Used under CC-BY-SA 3.0 https://en.wikipedia.org/wiki/Wikipedia:Text_of_Creative_Commons_Attribution-ShareAlike_3.0_Unported_License.
Figure 8.7 Kernel PCA methodology. A training set is mapped from input space I to feature space F via a non-linear function ϕ. PCA is performed in F to determine the principal directions defining the kernel PCA space (learned space): oval area. Any element of I can then be mapped to F and projected on the kernel PCA space via P^lϕ, where l refers to the first l eigenvectors used to build the KPCA space.
Figure 8.8 The example of reducing dimensionality of a 2D circle by using KPCA.
Figure 8.9 Hierarchical clustering of the data in Table 8.1.
Figure 8.10 The development from K = N_S clusters to K = 1 cluster. (a) Single-link clustering. (b) Complete-link clustering.
Figure 8.11 Hierarchical clustering with two different clustering types. (a) Single-link clustering. (b) Complete-link clustering.
Figure 8.12 The development of the cluster means during 10 update steps of the K-means algorithm.
Figure 8.13 Two results of K-means clustering applied to the ‘mechanical parts’ dataset.
Figure 8.14 Two results of the EM algorithm for the mixture of Gaussians estimation.
Figure 8.15 The development of a one-dimensional self-organizing map, trained on a two-dimensional uniform distribution: (a) initialization; (b) to (d) after 10, 25 and 100 iterations, respectively.
Figure 8.16 A SOM that visualizes the effects of a highlight. (a) RGB image of an illuminated surface with a highlight (= glossy spot). (b) Scatter diagram of RGB samples together with a one-dimensional SOM.
Figure 8.17 Trained generative topographic mappings. (a) K = 14, M = 10, σ_ϕ = 0.2 and λ = 0. (b) K = 14, M = 5, σ_ϕ = 0.2 and λ = 0. (c) K = 14, M = 10, σ_ϕ = 0.2 and λ = 0.01.

Chapter 9

Figure 9.1 The Delft Image Database, 10 pictures of 40 subjects each. Source: The Delft Image Database.
Figure 9.2 The ATT face database (formerly the ORL database), 10 pictures of 40 subjects each. Source: The ATT face Database.
Figure 9.3 The eigenfaces corresponding to the first 25 eigenvectors.
Figure 9.4 Projections on the first two eigenvectors.
Figure 9.5 The classification error by the 1NN rule as a function of the number of eigenvectors.
Figure 9.6 The total Kimia dataset.
Figure 9.7 The selected classes.
Figure 9.8 The computed contours.
Figure 9.9 Scatterplot.
Figure 9.10 Scatterplots of the Boston Housing dataset. The left subplot shows features STATUS and INDUSTRY, where the discrete nature of INDUSTRY can be spotted. In the right subplot, the dataset is first scaled to unit variance, after which it is projected on to its first two principal components.
Figure 9.11 Performance of a polynomial kernel SVC (left, as a function of the degree of the polynomial) and a radial basis function kernel SVC (right, as a function of the basis function width).
Figure 9.12 Performance of neural networks with one or two hidden layers as a function of the number of units per hidden layer, trained using bpxnc (left) and neurc (right).
Figure 9.13 Setup of a sensory system for acoustic distance measurements.
Figure 9.14 A record of the nominal response h(t).
Figure 9.15 ToF measurements based on thresholding operations.
Figure 9.16 ToF estimation by fitting a one-sided parabola to the foot of the envelope.
Figure 9.17 Matched filtering.
Figure 9.18 ML estimation based on covariance models.
Figure 9.19 Eigenvalues, weights and filter responses of the covariance model-based estimator.
Figure 9.20 Results of the estimator based on covariance models.
Figure 9.21 Training the estimators.
Figure 9.22 A simple hydraulic system consisting of two connected tanks.
Figure 9.23 Measured levels of two interconnected tanks.
Figure 9.24 Results from the linearized Kalman filter.
Figure 9.25 Results from the extended Kalman filter.
Figure 9.26 Results from particle filtering.
Figure 9.27 Calibration curve for the second sensor.
Figure 9.28 Results from particle filtering after applying a linearity correction.

Preface

Information processing has always been an important factor in the development of human society and its role is still increasing. The inventions of advanced information devices paved the way for achievements in a diversity of fields like trade, navigation, agriculture, industry, transportation and communication. The term ‘information device’ refers here to systems for the sensing, acquisition, processing and outputting of information from the real world. Usually, they are measurement systems. Sensing and acquisition provide us with signals that bear a direct relation to some of the physical properties of the sensed object or process. Often, the information of interest is hidden in these signals. Signal processing is needed to reveal the information and to transform it into an explicit form. Further, in the past 10 years image processing (together with intelligent computer vision) has developed rapidly. There have been substantial new developments in, for example, machine learning methods (such as Adaboost and its variants, deep learning, etc.) and in parameter estimation methods such as particle filtering.

The three topics discussed in this book, classification, parameter estimation and state estimation, share a common factor in the sense that each topic provides the theory and methodology for the functional design of the signal processing part of an information device. The major distinction between the topics is the type of information that is outputted. In classification problems the output is discrete, that is a class, a label or a category. In estimation problems, it is a real-valued scalar or vector. Since these problems occur either in a static or in a dynamic setting, actually four different topics can be distinguished. The term state estimation refers to the dynamic setting. It covers both discrete and real-valued cases (and sometimes even mixed cases).

The similarity between the topics allows one to use a generic methodology, namely Bayesian decision theory. Our aim is to present this material concisely and efficiently by an integrated treatment of similar topics. We present an overview of the core mathematical constructs and the many resulting techniques. By doing so, we hope that the reader recognizes the connections and the similarities between these constructs, but also becomes aware of the differences. For instance, the phenomenon of overfitting is a threat in all four cases. In a static classification problem it introduces large classification errors, but in the case of dynamic state estimation it may be the cause of unstable behaviour. Further, in this edition, we have made some modifications to accommodate engineering requests on intelligent computer vision.

Our goal is to emphasize the engineering aspects of the matter. Instead of a purely theoretical and rigorous treatment, we aim for the acquisition of skills to bring theoretical solutions to practice. The models that are needed for the application of the Bayesian framework are often not available in practice. This brings in the paradigm of statistical inference, that is, learning from examples. Matlab®¹ is used as a vehicle to implement and to evaluate design concepts.

As alluded to above, the range of application areas is broad. Application fields are found within computer vision, mechanical engineering, electrical engineering, civil engineering, environmental engineering, process engineering, geo-informatics, bio-informatics, information technology, mechatronics, applied physics, and so on. The book is of interest to a range of users, from the first-year graduate-level student up to the experienced professional. The reader should have some background knowledge with respect to linear algebra, dynamic systems and probability theory. Most educational programmes offer courses on these topics as part of undergraduate education. The appendices contain reviews of the relevant material. Another target group is formed by the experienced engineers working in industrial development laboratories. The numerous examples of Matlab® code allow these engineers to quickly prototype their designs.

The book roughly consists of three parts. The first part, Chapter 2, presents an introduction to the PRTools toolbox used throughout this book. The second part, Chapters 3, 4 and 5, covers the theory with respect to classification and estimation problems in the static case, as well as the dynamic case. This part handles problems where it is assumed that accurate models, describing the physical processes, are available. The third part, Chapters 6 up to 8, deals with the more practical situation in which these models are not or only partly available. Either these models must be built using experimental data or these data must be used directly to train methods for estimation and classification. The final chapter, Chapter 9, presents four worked out problems. The selected bibliography has been kept short in order not to overwhelm the reader with an enormous list of references.

The material of the book can be covered by two semester courses. A possibility is to use Chapters 3, 4, 6, 7 and 8 for a one-semester course on Classification and Estimation. This course deals with the static case. An additional one-semester course handles the dynamic case, that is Optimal Dynamic Estimation, and would use Chapter 5. The prerequisites for Chapter 5 are mainly concentrated in Chapter 4. Therefore, it is recommended to include a review of Chapter 4 in the second course. Such a review will make the second course independent from the first one.

Each chapter is closed with a number of exercises. The mark at the end of each exercise indicates whether the exercise is considered easy (‘0’), moderately difficult (‘*’) or difficult (‘**’). Another possibility to acquire practical skills is offered by the projects that accompany the text. These projects are available at the companion website. A project is an extensive task to be undertaken by a group of students. The task is situated within a given theme, for instance, classification using supervised learning, unsupervised learning, parameter estimation, dynamic labelling and dynamic estimation. Each project consists of a set of instructions together with data that should be used to solve the problem.

The use of Matlab® tools is an integrated part of the book. Matlab® offers a number of standard toolboxes that are useful for parameter estimation, state estimation and data analysis. The standard software for classification and unsupervised learning is not complete and not well structured. This motivated us to develop the PRTools software for all classification tasks and related items. PRTools is a Matlab® toolbox for pattern recognition. It is freely available for non-commercial purposes. The version used in the text is compatible with Matlab® Version 5 and higher. It is available from http://37steps.com.

The authors keep an open mind for any suggestions and comments (which should be addressed to cpese@wiley.com). A list of errata and any other additional comments will be made available at the companion website.

Classification, Parameter
Estimation and State Estimation

An Engineering Approach Using MATLAB
