Cover Page

Series Editor

Nikolaos Limnios

Numerical Methods for Inverse Problems

Michel Kern


To my wife Elisabeth, to my children David and Jonathan

Preface

This book studies methods for concretely addressing inverse problems, that is, on a computer. But what is an inverse problem? An inverse problem arises whenever the causes that produced a given effect must be determined, or when we seek to indirectly estimate the parameters of a physical system.

The most common example in the everyday life of many of us comes from the medical field: the medical ultrasound that tells us whether an unborn baby is in good health involves the solution of an inverse problem. A probe, placed on the patient's belly, emits and receives ultrasound waves. These are deflected, and reflected, by the tissues of the fetus. The sensor receives and interprets these echoes to return an image of the contours of these tissues. The image is thus obtained in an indirect manner. We will see further examples throughout this book.

Intuitively, the observation of an effect may not be sufficient to determine its cause. If I walk into a room and notice that the temperature is (nearly) uniform, it is difficult for me to know what the temperature distribution was two hours earlier. The inverse problem of determining the past temperature is said to be “ill-posed”. This contrasts with the question of determining the future evolution of the temperature, which is, in a sense that we will make precise, “well-posed”. As Molière’s character Monsieur Jourdain does when he speaks prose, it is so common to solve well-posed problems that we (almost) do it without thinking.

Solving inverse problems thus requires the mastery of specific techniques and methods. This book presents some of them, chosen for their very general domain of application. It focuses on a small number of methods that will be used in most applications:

  1. – the reformulation of an inverse problem as the minimization of a squared error functional. The reason for this choice is mainly practical: it makes it possible to carry out the computations at a reasonable cost;
  2. – the regularization of ill-posed problems and in particular Tikhonov’s method;
  3. – the use of the singular value decomposition to analyze an ill-posed problem;
  4. – the adjoint state method to calculate the gradient of the functionals to minimize when these are not quadratic.

These tools will help to address many (but not all!) inverse problems that arise in practice. Two limitations should, however, be kept in mind. On the one hand, many inverse problems will call for different techniques (we will mention a few of them). On the other hand, even when the tools presented here can be employed, they are rarely sufficient on their own to completely analyze a complex physical application. Most often, it will be necessary to supplement them with a fine analysis of the particular situation in order to make the most of it (redundancy of the data or lack thereof, fast or slow variation of the parameters sought, etc.).

It is common, in this type of preface, to justify the existence of the book being presented! It is true that the question is legitimate (many books already exist on the subject, as can be seen in the bibliography), and I do not claim any originality regarding the content. Nonetheless, readers may find it useful to have a book that discusses both linear and nonlinear problems. In addition, this book can serve as an introduction to the more advanced literature.

This book is aimed at readers with a rather substantial background in mathematics and scientific computing, equivalent to a master’s degree in applied mathematics. Nevertheless, it is a book with a practical perspective. The methods described are applicable, have been applied, and are often illustrated by numerical examples.

The prerequisites to approach this book are unfortunately more numerous than I would have wished. This is a consequence of the fact that the study of inverse problems calls upon many other areas of mathematics. A working knowledge of (both theoretical and numerical) linear algebra is assumed, as is a familiarity with the language of integration theory. Functional analysis, which is what linear algebra becomes when it abandons the finite-dimensional setting, is ubiquitous, and the appendices serve as reminders of the concepts directly useful in this book. An important part of the examples comes from models based on partial differential equations. Here again, the reader will benefit from a prior knowledge of analysis methods (weak formulations, Sobolev spaces) and of numerical analysis (the finite element method, discretization schemes for differential equations).

Book layout

We start the book with some general remarks on inverse problems. We will introduce the fundamental concept of an ill-posed problem, which is characteristic of inverse problems.

In Chapter 2, we will give several examples of inverse problems, originating from several areas of physics.

An important source of linear inverse problems will be introduced in Chapter 3: the integral equations of the first kind. After outlining the main properties of integral operators, we will show that they lead to ill-posed problems. Finally, we will introduce discretization methods, leading to least squares problems.

The study of these problems is the subject of the subsequent two chapters. In Chapter 4, we will study their mathematical properties in a Hilbertian context: the geometric aspect and the relationship with the normal equations, as well as the questions of existence and uniqueness of solutions. We will also introduce the fundamental tool, both for theoretical analysis and for numerical approximation, namely the singular value decomposition, first for matrices, then for operators between Hilbert spaces. Reminders regarding the numerical aspects of inverse problems can be found in Appendix 1. Techniques for solving ill-posed problems are the subject of Chapter 5, especially Tikhonov’s regularization method and spectral truncation. Tikhonov’s method will first be addressed from a variational perspective, before being further clarified by means of the singular value decomposition. We will discuss the question of the choice of the regularization parameter and will finish with a short introduction to iterative methods.

In the second part, we will discuss nonlinear problems, which are essentially parameter estimation problems in differential or partial differential equations. In Chapter 6, we will see how to formulate identification problems in terms of minimization and explore the main difficulties that can be expected. Appendix 2 contains reminders about the basic numerical methods of optimization. Chapter 7 will address the important technique of the adjoint state, used to compute the gradient of the functional involved in least squares problems. We will see in several examples how to carry out this computation in an efficient way.

We conclude this second part by briefly introducing issues that could not be discussed in this book, giving some bibliographic hints.

The appendices compile reminders on the numerical linear algebra methods for least squares problems and on optimization, as well as some functional analysis results and supplements on linear operators.

Acknowledgments

My thanks go first to Professor Limnios, who suggested I write this book, starting from a first version of course notes that I had published on the Internet. I am grateful to him for giving me the opportunity to publish this course and to give it greater visibility.

The contents of this book owe a lot, and this is an understatement, to Guy Chavent. This book grew out of lecture notes that I had written for a course originally taught by G. Chavent, and for which he trusted me enough to let me replace him. Guy was also my thesis supervisor and the leader of the Inria team where I have spent my entire career. He has been, and remains, a source of inspiration with regard to how to address a scientific problem.

I had the chance to work in the particularly stimulating environment of Inria and to meet colleagues who combine great scientific qualities with endearing personalities. I am thinking especially of Jérôme Jaffré and Jean Roberts. A special mention goes to my colleagues in the Serena team: Hend Benameur, Nathalie Bonte, François Clément, Caroline Japhet, Vincent Martin, Martin Vohralík and Pierre Weis. Thank you for your friendship, and thank you for making our work environment a pleasant and intellectually stimulating one.

I would like to thank all the colleagues who have told me of errors they found in previous versions of the book, the students of the Pôle Universitaire Léonard de Vinci, of Mines–ParisTech and of the École Nationale d’Ingénieurs of Tunis for listening to me and for their questions, as well as the staff of ISTE publishing for their help in seeing the book through to completion.

Michel KERN

February 2016

1
Overview of Inverse Problems

1.1. Direct and inverse problems

According to Keller [KEL 76], two problems are said to be the inverse of one another if the formulation of one of them involves the other. This definition includes a degree of arbitrariness and confers a symmetric role on the two problems under consideration. A more operational definition is that an inverse problem consists of determining the causes knowing the effects. Thus, this problem is the inverse of what is called a “direct problem”, consisting of the deduction of the effects, the causes being known.

This second definition shows that it is more usual to study direct problems. As a matter of fact, since Newton, the notion of causality has been rooted in our scientific subconscious, and at a more prosaic level, we have learned to pose, and then solve, problems for which the causes are given and the objective is to find the effects. This definition also shows that inverse problems may give rise to particular difficulties. We will see further on that it is possible to give a mathematical content to the sentence “the same causes produce the same effects”; in other words, it is reasonable to require that the direct problem be well-posed. On the other hand, it is easy to imagine, and we will see numerous examples, that the same effects may originate from different causes. This observation contains the main difficulty of the study of inverse problems: they can have several solutions, and additional information is needed in order to discriminate between them.

The prediction of the future state of a physical system, knowing its current state, is the typical example of a direct problem. Various inverse problems may be considered: for example, reconstructing the past state of the system knowing its current state (if the system is irreversible), or determining certain parameters of the system, knowing (part of) its evolution. This latter problem is that of parameter identification, which will be our main concern in the following.

A practical challenge of the study of inverse problems is that it often requires a good knowledge of the direct problem, which is reflected in the use of a large variety of both physical and mathematical concepts. The success in solving an inverse problem is based, in general, on elements specific to this problem. However, some techniques present an extended application domain and this book is an introduction to the principal techniques: the regularization of ill-posed problems and the least squares method.

The most important technique is the reformulation of an inverse problem as the minimization of an error functional between the actual measurements and the synthetic measurements (that is, those predicted by the solution of the direct problem). It will be convenient to distinguish between linear and nonlinear problems. It should be noted here that the nonlinearity in question refers to the inverse problem, and that the direct problem itself may or may not be linear.

In the case of linear problems, resorting to linear algebra and to functional analysis allows accurate results as well as efficient algorithms to be obtained. The fundamental tool here is the singular value decomposition of the operator, or of the matrix, under consideration. We will study in detail the regularization method, which consists of replacing the problem under study with a slightly “modified” one that has “better” properties. This will be made precise in Chapters 4 and 5.
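As a concrete preview of these ideas, the short Python sketch below is an illustrative toy of my own (the matrix, the noise level and the regularization parameter are arbitrary choices, not an example from the book). It builds a small ill-conditioned linear system, shows how the naive least squares solution is destroyed by the small singular values, and how Tikhonov regularization, written as a filter on the singular value decomposition, restores a usable solution.

    import numpy as np

    # Illustrative sketch: Tikhonov regularization of an ill-conditioned
    # linear problem, expressed as a filter acting on the SVD.
    rng = np.random.default_rng(1)
    n = 20
    U, _ = np.linalg.qr(rng.standard_normal((n, n)))
    V, _ = np.linalg.qr(rng.standard_normal((n, n)))
    s = 10.0 ** np.linspace(0, -8, n)                # rapidly decaying singular values
    A = U @ np.diag(s) @ V.T

    x_true = V[:, 0] + 0.5 * V[:, 1]                 # a "nice" model to recover
    b = A @ x_true + 1e-6 * rng.standard_normal(n)   # noisy data

    # Naive inversion: the noise is divided by the tiny singular values.
    x_naive = V @ ((U.T @ b) / s)

    # Tikhonov: minimize ||Ax - b||^2 + alpha^2 ||x||^2, i.e. replace 1/s_i
    # by the filtered factor s_i / (s_i^2 + alpha^2).
    alpha = 1e-4
    x_tik = V @ ((s / (s**2 + alpha**2)) * (U.T @ b))

    print(np.linalg.norm(x_naive - x_true))          # large error
    print(np.linalg.norm(x_tik - x_true))            # much smaller error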

Nonlinear problems are more difficult, and fewer general results are available. We will study the application of optimization algorithms to the problems obtained by the reformulation referred to above. A crucial technical ingredient (from the numerical point of view) is the calculation of the gradient of the functional to be minimized. We will study the adjoint state method in Chapter 7; it allows this calculation to be carried out at a cost that is a (small) multiple of that of solving the direct problem.
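To make this cost argument concrete, here is a minimal sketch (a toy problem of my own with a single scalar parameter, not an example taken from the book) of an adjoint-state gradient computation: one solve of the direct problem and one solve of the adjoint (transposed) problem give the exact gradient of the least squares functional, which a finite difference confirms.

    import numpy as np

    # Illustrative sketch: adjoint-state gradient of J(a) = 0.5*||u(a) - d||^2,
    # where u(a) solves the parametrized linear system K(a) u = f.
    rng = np.random.default_rng(0)
    n = 50
    A0 = 2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
    A1 = np.eye(n)
    f = rng.standard_normal(n)                   # right-hand side (direct problem)
    d = rng.standard_normal(n)                   # synthetic "measurements"

    def K(a):
        return A0 + a * A1                       # dK/da = A1

    def J(a):
        u = np.linalg.solve(K(a), f)             # direct problem
        return 0.5 * np.sum((u - d) ** 2)

    def grad_adjoint(a):
        Ka = K(a)
        u = np.linalg.solve(Ka, f)               # one direct solve
        p = np.linalg.solve(Ka.T, u - d)         # one adjoint solve
        return -p @ (A1 @ u)                     # dJ/da = -p^T (dK/da) u

    a = 0.3
    eps = 1e-6
    fd = (J(a + eps) - J(a - eps)) / (2 * eps)   # finite-difference check
    print(grad_adjoint(a), fd)                   # the two values agree closely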

As can be seen, the content of this book primarily aims to present numerical methods for addressing inverse problems. This does not mean that theoretical questions do not exist or are devoid of interest. The deliberate choice not to address them is dictated by the practical orientation of the course, by the author’s taste and knowledge, but also by the high mathematical level that these issues require.

1.2. Well-posed and ill-posed problems

In his famous book, Hadamard [HAD 23] introduced, as early as 1923, the notion of a well-posed problem. It is a problem for which:

  1. – a solution exists;
  2. – the solution is unique;
  3. – the solution depends continuously on the data.

Of course, these concepts must be made precise by the choice of the spaces (and topologies) to which the data and the solution belong.

In the same book, Hadamard suggested (and it was a widespread opinion until recently) that only a well-posed problem could properly model a physical phenomenon. After all, these three conditions seem very natural. In fact, we shall see that inverse problems often fail to satisfy one or other of these conditions, or even all three. Upon reflection, this is not so surprising:

  1. – once a physical model has been established, the available experimental data are generally noisy, and there is no guarantee that these data originate from this model, even for another set of parameters;
  2. – if a solution exists, it is perfectly conceivable (and we will see examples of this) that different parameters may result in the same observations.

The absence of one or other of Hadamard’s three conditions does not have the same importance with respect to our ability to solve (in a sense that remains to be defined) the associated problem:

  1. – the fact that the solution of an inverse problem may not exist is not a serious difficulty. It is usually possible to restore the existence by relaxing the concept of solution (a classic procedure in mathematics);
  2. – the non-uniqueness is a more serious problem. If a problem has several solutions, there should be a means of distinguishing between them. This requires additional information (we speak of a priori information);
  3. – the lack of continuity is probably the most problematic, particularly with a view to an approximate or numerical solution. Lack of continuity means that it is not possible (regardless of the numerical method) to approximate a satisfactory solution of the inverse problem, since the available data will be noisy, and therefore close to, but different from, the exact data.

A problem that is not well-posed within the meaning of the definition above is said to be ill-posed. We now give an example that, although very simple, illustrates the difficulties that may be found in more general situations.

EXAMPLE 1.1.– Differentiation and integration are two problems that are the inverse of each other. It would seem more natural to consider differentiation as the direct problem and integration as the inverse problem. In fact, integration has good mathematical properties that lead us to consider it as the direct problem. Moreover, differentiation is the prototypical ill-posed problem, as we shall see in the following.

Consider the Hilbert space $L^2(]0, 1[)$, and the integral operator $A$ defined by

[1.1] $Af(x) = \displaystyle\int_0^x f(t)\, dt.$

It is easy to see directly that $A$ maps $L^2(]0, 1[)$ continuously into itself (alternatively, Theorem 3.1 can be applied, see Example 3.1). This operator is injective; however, its image is the vector subspace

$\operatorname{Im} A = \left\{ g \in H^1(]0, 1[) : g(0) = 0 \right\},$

where $H^1(]0, 1[)$ is the Sobolev space. In effect, the equation

$Af = g$

is equivalent to

$f = g', \quad g(0) = 0.$

The image of $A$ is not closed in $L^2(]0, 1[)$ (of course, it is closed in $H^1(]0, 1[)$). As a result, the inverse of $A$ is not continuous on $\operatorname{Im} A$, as shown in the following example.

Consider a function $f \in C^1([0, 1])$ and let $n \in \mathbb{N}^*$. Let

$f_n(x) = f(x) + \dfrac{1}{n} \sin(2\pi n^2 x),$

then

$f_n'(x) = f'(x) + 2\pi n \cos(2\pi n^2 x).$

Simple calculations show that

$\| f_n - f \|_{L^2(]0,1[)} = \dfrac{1}{n\sqrt{2}},$

whereas

$\| f_n' - f' \|_{L^2(]0,1[)} = \pi n \sqrt{2}.$

Thus, the difference between $f'$ and $f_n'$ may be arbitrarily large, even though the difference between $f$ and $f_n$ is arbitrarily small. The differentiation operator (the inverse of $A$) is thus not continuous, at least for this choice of norms.

The instability of the inverse is typical of ill-posed problems. A small perturbation over the data (here f) can have an arbitrarily large influence on the result (here f′).
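This instability is easy to observe numerically. The short Python sketch below (the reference function, the grid and the value of n are illustrative choices, not taken from the book) evaluates the perturbation used above and compares the L2 norms of $f_n - f$ and of $f_n' - f'$.

    import numpy as np

    # Illustrative check of example 1.1: a small oscillatory perturbation of f
    # produces a large perturbation of its derivative.
    n = 10
    x = np.linspace(0.0, 1.0, 20001)
    f = np.sin(x)                                    # a smooth reference function
    f_n = f + np.sin(2 * np.pi * n**2 * x) / n       # perturbed data

    df = np.cos(x)                                   # f'
    df_n = df + 2 * np.pi * n * np.cos(2 * np.pi * n**2 * x)   # f_n'

    l2 = lambda g: np.sqrt(np.mean(g**2))            # approximate L2(0,1) norm
    print(l2(f_n - f))     # about 1/(n*sqrt(2)), i.e. 0.07
    print(l2(df_n - df))   # about pi*n*sqrt(2), i.e. 44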

A second class of inverse problems is the estimation of parameters in differential equations. We are going to discuss a very simple example of this situation.

EXAMPLE 1.2.– Consider the elliptic problem in one dimension:

[1.2] $-\dfrac{d}{dx}\!\left( a(x)\, \dfrac{du}{dx}(x) \right) = f(x) \quad \text{in } ]-1, 1[, \qquad u(-1) = u(1) = 0.$

This equation, or similar though more complex ones, arises in several of the examples in the following chapter. In this example, we choose $f = 1$ and the solution $u(x) = \dfrac{1 - x^2}{2}$, which gives $a(x) = 1$.

The direct problem consists of calculating $u$, given $a$ and $f$. For the inverse problem, we shall consider that $f$ is known, and we will try to recover the coefficient $a$ from a measurement of $u$. For this deliberately simplified example, we shall assume that $u$ is measured over the whole interval $]-1, 1[$, which is obviously unrealistic. We shall see that even in this optimistic situation, we are likely to face difficulties.

By integrating equation [1.2], and by dividing by u′, we obtain the following expression for a (assuming that u′ does not vanish, which is not true in our example):

[1.3] $a(x) = \dfrac{C - \displaystyle\int_0^x f(t)\, dt}{u'(x)},$

which yields in our particular case:

[1.4] $a(x) = \dfrac{C - x}{-x} = 1 - \dfrac{C}{x},$

where C is an integration constant.

We can see that even in this particular case, a is not determined by the data, that is u. Of course in this case, it is clear that the “correct” solution corresponds to C = 0, since this is the only value for which a is bounded. In order to be able to discriminate among the various possible solutions, we resort to additional information (usually referred to as a priori information).

In this problem, there are two sources of instability: first, equation [1.3] involves $u'$, and we have just seen that the transition from $u$ to $u'$ causes instability. This is a phenomenon common to linear and nonlinear problems. Second, the division by $u'$ reveals an instability specific to nonlinear problems. If $u'$ vanishes at some point, the division is impossible; if $u'$ is merely small, the division will be a cause of instability.
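A short numerical sketch (the grid and the values of C are chosen here purely for illustration, not taken from the book) makes both phenomena visible: formula [1.3], evaluated on the data of this example, gives a bounded coefficient only for C = 0 and blows up near the point where $u'$ vanishes for any other choice.

    import numpy as np

    # Illustrative sketch of example 1.2: f = 1, measured u(x) = (1 - x^2)/2,
    # so that formula [1.4] gives a(x) = 1 - C/x.
    x = np.linspace(-1.0, 1.0, 2001)
    x = x[np.abs(x) > 0.01]            # stay away from x = 0, where u'(x) = -x vanishes

    u_prime = -x                       # u'(x)
    F = x                              # integral of f = 1 from 0 to x

    for C in (0.0, 0.01, 0.1):
        a = (C - F) / u_prime          # formula [1.3]
        print(f"C = {C}: a ranges over [{a.min():.2f}, {a.max():.2f}]")
    # Only C = 0 yields the bounded coefficient a = 1; any other C blows up near x = 0.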

This book is dedicated to the study of methods allowing a certain degree of stability to be recovered in ill-posed problems. It is, however, necessary to keep in mind this observation from [ENG 96]: “no mathematical trick can make an inherently unstable problem stable”. The methods we are going to introduce in the following will make the problem under consideration stable, but at the price of a modification of the problem being solved (and therefore of its solution).