Contents

Preface

Part One: Overview and Motivation

Chapter One: Introduction to Monte Carlo Methods

1.1 Historical origin of Monte Carlo simulation

1.2 Monte Carlo simulation vs. Monte Carlo sampling

1.3 System dynamics and the mechanics of Monte Carlo simulation

1.4 Simulation and optimization

1.5 Pitfalls in Monte Carlo simulation

1.6 Software tools for Monte Carlo simulation

1.7 Prerequisites

For further reading

References

Chapter Two: Numerical Integration Methods

2.1 Classical quadrature formulas

2.2 Gaussian quadrature

2.3 Extension to higher dimensions: Product rules

2.4 Alternative approaches for high-dimensional integration

2.5 Relationship with moment matching

2.6 Numerical integration in R

For further reading

References

Part Two: Input Analysis: Modeling and Estimation

Chapter Three: Stochastic Modeling in Finance and Economics

3.1 Introductory examples

3.2 Some common probability distributions

3.3 Multivariate distributions: Covariance and correlation

3.4 Modeling dependence with copulas

3.5 Linear regression models: A probabilistic view

3.6 Time series models

3.7 Stochastic differential equations

3.8 Dimensionality reduction

3.9 Risk-neutral derivative pricing

For further reading

References

Chapter Four: Estimation and Fitting

4.1 Basic inferential statistics in R

4.2 Parameter estimation

4.3 Checking the fit of hypothetical distributions

4.4 Estimation of linear regression models by ordinary least squares

4.5 Fitting time series models

4.6 Subjective probability: The Bayesian view

For further reading

References

Part Three: Sampling and Path Generation

Chapter Five: Random Variate Generation

5.1 The structure of a Monte Carlo simulation

5.2 Generating pseudorandom numbers

5.3 The inverse transform method

5.4 The acceptance-rejection method

5.5 Generating normal variates

5.6 Other ad hoc methods

5.7 Sampling from copulas

For further reading

References

Chapter Six: Sample Path Generation for Continuous-Time Models

6.1 Issues in path generation

6.2 Simulating geometric Brownian motion

6.3 Sample paths of short-term interest rates

6.4 Dealing with stochastic volatility

6.5 Dealing with jumps

For further reading

References

Part Four: Output Analysis and Efficiency Improvement

Chapter Seven: Output Analysis

7.1 Pitfalls in output analysis

7.2 Setting the number of replications

7.3 A world beyond averages

7.4 Good and bad news

For further reading

References

Chapter Eight: Variance Reduction Methods

8.1 Antithetic sampling

8.2 Common random numbers

8.3 Control variates

8.4 Conditional Monte Carlo

8.5 Stratified sampling

8.6 Importance sampling

For further reading

References

Chapter Nine: Low-Discrepancy Sequences

9.1 Low-discrepancy sequences

9.2 Halton sequences

9.3 Sobol low-discrepancy sequences

9.4 Randomized and scrambled low-discrepancy sequences

9.5 Sample path generation with low-discrepancy sequences

For further reading

References

Part Five: Miscellaneous Applications

Chapter Ten: Optimization

10.1 Classification of optimization problems

10.2 Optimization model building

10.3 Monte Carlo methods for global optimization

10.4 Direct search and simulation-based optimization methods

10.5 Stochastic programming models

10.6 Stochastic dynamic programming

10.7 Numerical dynamic programming

10.8 Approximate dynamic programming

For further reading

References

Chapter Eleven: Option Pricing

11.1 European-style multidimensional options in the BSM world

11.2 European-style path-dependent options in the BSM world

11.3 Pricing options with early exercise features

11.4 A look outside the BSM world: Equity options under the Heston model

11.5 Pricing interest rate derivatives

For further reading

References

Chapter Twelve: Sensitivity Estimation

12.1 Estimating option greeks by finite differences

12.2 Estimating option greeks by pathwise derivatives

12.3 Estimating option greeks by the likelihood ratio method

For further reading

References

Chapter Thirteen: Risk Measurement and Management

13.1 What is a risk measure?

13.2 Quantile-based risk measures: Value-at-risk

13.3 Issues in Monte Carlo estimation of V@R

13.4 Variance reduction methods for V@R

13.5 Mean–risk models in stochastic programming

13.6 Simulating delta hedging strategies

13.7 The interplay of financial and nonfinancial risks

For further reading

References

Chapter Fourteen: Markov Chain Monte Carlo and Bayesian Statistics

14.1 Acceptance–rejection sampling in Bayesian statistics

14.2 An introduction to Markov chains

14.3 The Metropolis–Hastings algorithm

14.4 A re-examination of simulated annealing

For further reading

References

Index

Handbook in Monte Carlo Simulation

Preface

The aim of this book is to provide a wide class of readers with a low- to intermediate-level treatment of Monte Carlo methods for applications in finance and economics. The target audience consists of students and junior practitioners with a quantitative background, and it includes not only students in economics and finance, but also students in mathematics, statistics, and engineering. In fact, this is the kind of audience I typically deal with in my courses. Not all of these readers have a strong background in statistics, financial economics, or econometrics, which is why I have also included some basic material on stochastic modeling in the early chapters; such material is typically skipped in higher-level books. Clearly, this is not meant as a substitute for a proper treatment, which can be found in the references listed at the end of each chapter. Some level of mathematical maturity is assumed, but the prerequisites are rather low and boil down to the essentials of probability and statistics, as well as some basic programming skills.1 Advanced readers may skip the introductory chapters on modeling and estimation, which are also included as a reminder that no Monte Carlo method, however sophisticated, will yield useful results if the input model is flawed. Indeed, the power and flexibility of such methods may lure us into a false sense of security, making us forget some of their inherent limitations.

Option pricing is certainly a relevant application domain for the techniques we discuss in the book, but this is not meant to be a book on financial engineering. I have also included a significant amount of material on optimization in its many guises, as well as a chapter related to computational Bayesian statistics. I have favored a wide scope over a deeply technical treatment, for which there are already some really excellent and more demanding books. Many of them, however, do not quite help the reader to really “feel” what she is learning, as no ready-to-use code is offered. In order to allow anyone to run the code, play with it, and hopefully come up with some variations on the theme, I have chosen to develop code in R. Readers familiar with my previous book written in MATLAB might wonder whether I have changed my mind. I did not: I never use R in research or consulting, but I use it a lot for teaching. When I started writing the book, I was less than impressed by the lack of an adequate development environment, and some design choices of the language itself left me a bit puzzled. As an example, the * operator in MATLAB multiplies matrices row by column; whenever you want to work elementwise, you use the .* operator, and the dot modifier has a clear and uniform meaning when applied to other operators as well. On the contrary, the operator * works elementwise in R, and row-by-column matrix product is accomplished by the somewhat baroque operator %*%. Furthermore, having to desperately google every time you need to understand a command, because the documentation is a bit poor and you have to find your way through a mess of packages, may be quite frustrating at times. I have also found that some optimization functions are less accurate and less clever in dealing with limit cases than the corresponding MATLAB functions. Having said that, while working on the book, I have come to appreciate R much more. My teaching experience with R has also been fun and rewarding. A free tool with the potential of R is certainly most welcome, and R developers must be praised for offering all of this. Hopefully, the reader will find the R code useful as a starting point for further experimentation. I did not assemble the R code into a package, as this would be extremely misleading: I had no plan to develop an integrated and reliable set of functions. I just use R code to illustrate ideas in concrete terms and to encourage active learning. When appropriate, I have pointed out some programming practices that may help in reducing the computational burden, but as a general rule I have tried to emphasize clarity over efficiency. I have also avoided writing an introduction to R programming, as there are many freely available tutorials (and a few good books2). A reader with some programming experience in any language should be able to make her way through the code, which has been commented where necessary. My assumption is that a reader, when stumbling upon an unknown function, will take advantage of the online help and the examples I provide in order to understand its use and potential. Typically, R library functions are equipped with optional parameters that can be put to good use, but for the sake of conciseness I have refrained from a full description of function inputs.

Book structure

The book is organized in five parts.

1. Part I, Overview and Motivation, consists of two chapters. Chapter 1 provides an introduction to Monte Carlo methods and applications. The different classes of dynamic models that are encountered in simulation are outlined, and due emphasis is placed on pitfalls and limitations of Monte Carlo methods. Chapter 2 deals with numerical integration methods. Numerical integration is quite relevant, as it provided most of the historical motivation to develop Monte Carlo methods; furthermore, there are cases in which one is much better off using good quadrature formulas than throwing random numbers around. Finally, framing Monte Carlo methods within numerical integration provides the necessary discipline to understand and properly use low-discrepancy sequences, sometimes referred to as quasi–Monte Carlo methods.
2. Part II, Input Analysis: Modeling and Estimation, is specifically aimed at students and newcomers, as it includes two introductory chapters dealing with stochastic model building (Chapter 3) and model fitting (Chapter 4). Essentially, in this part of the book we are concerned with modeling the inputs of a Monte Carlo simulation. Many advanced books on Monte Carlo methods for finance skip these concepts and take them for granted. I have preferred to offer a limited treatment for the sake of readers unfamiliar with them, such as students in engineering or practitioners without an econometrics background. Needless to say, space does not allow for a deep treatment, but I believe that it is important to build at least a framework for further study. In order to make this part useful to intermediate readers, too, I have taken each topic as an excuse for a further illustration of R functionalities. Furthermore, some of the more advanced sections, such as those on stochastic calculus, copulas, and Bayesian statistics, may be useful to students in economics and finance as well.
3. Part III, Sampling and Path Generation, is more technical and consists of two chapters. In Chapter 5 we deal with pseudorandom number and variate generation. While it is certainly true that in common practice one takes advantage of reliable generators provided by software tools like R, and there is no need for an overly deep treatment, some basic knowledge is needed in order to select generators and to manage simulation runs properly. We also outline scenario generation using copulas. In Chapter 6 we deal with sample path generation for continuous-time models based on stochastic differential equations. This is an essential tool for any financial engineer and is at the heart of many derivative pricing methods. It is important to point out that this is also relevant for risk managers, insurers, and some economists as well.
4. Part IV, Output Analysis and Efficiency Improvement, looks at the final step of the simulation process. Monte Carlo methods are extremely powerful and flexible; yet, their output may not be quite reliable, and an unreasonable computational effort may be called for, unless suitable countermeasures are taken. Chapter 7 covers very simple, and possibly overlooked, concepts related to confidence intervals. Counterexamples are used to point out the danger of forgetting some underlying assumptions. Chapter 8 deals with variance reduction strategies that are essential in many financial engineering and risk management applications; indeed, the techniques illustrated here are applied in later chapters, too. Chapter 9 deals with low-discrepancy sequences, which are sometimes gathered under the quasi–Monte Carlo nickname. Actually, there is nothing stochastic in low-discrepancy sequences, which should be regarded as deterministic numerical integration strategies. For a certain range of problem dimensionality, they are a good alternative to pseudorandom sampling.
5. Part V, Miscellaneous Applications, includes five more or less interrelated chapters dealing with:
The interplay between optimization and Monte Carlo methods, including stochastic methods for deterministic global optimization, scenario generation for stochastic programming with recourse, and stochastic dynamic programming (Chapter 10)
Option pricing, with an emphasis on variance reduction methods (Chapter 11)
Obtaining sensitivity of performance measures with Monte Carlo simulation (Chapter 12)
Risk measurement and management, with an emphasis on value-at-risk and related risk measures for financial risk management (Chapter 13)
Markov chain Monte Carlo (MCMC) methods, which are relevant for different applications, most notably computational Bayesian statistics (Chapter 14)
There are some logical precedences among these final chapters, but they need not be read in a strict sequential order. Chapter 14 is independent of the others, and the only link is represented by the possibility of regarding simulated annealing, a stochastic approach for both global and combinatorial optimization, as an MCMC method. Stochastic dynamic programming, dealt with in Chapter 10, is needed to understand American-style option pricing in Chapter 11. Measuring option price sensitivity is used as a motivating example in Chapter 12, but the methods outlined there have a much more general applicability, as they can also be used within optimization procedures. Finally, there is certainly a natural link between option pricing and the risk management examples discussed in Chapter 13.

Supplements and R code

The code has been organized in two files per chapter. The first one contains all of the function definitions and the reference to required packages, if any; this file should be sourced before running the scripts, which are included as chunks of code in a second file. An archive including all of the R code will be posted on a webpage. My current URL is

http://staff.polito.it/paolo.brandimarte/

A hopefully short list of errata will be posted there as well. One of the many corollaries of Murphy’s law states that my URL is going to change shortly after publication of the book. An up-to-date link will be maintained on the Wiley webpage:

http://www.wiley.com/

For comments, suggestions, and criticisms, all of which are quite welcome, my e-mail address is

paolo.brandimarte@polito.it

PAOLO BRANDIMARTE

Turin, February 2014

1In case of need, the mathematical prerequisites are covered in my other book: Quantitative Methods: An Introduction for Business Management. Wiley, 2011.

2Unfortunately, I have also run into very bad books using R; hopefully, this one will not contribute to the list.

Part One

Overview and Motivation

Chapter One

Introduction to Monte Carlo Methods

The term Monte Carlo is typically associated with the process of modeling and simulating a system affected by randomness: Several random scenarios are generated, and relevant statistics are gathered in order to assess, e.g., the performance of a decision policy or the value of an asset. Stated as such, it sounds like a fairly easy task from a conceptual point of view, even though some programming craft might be needed. Although it is certainly true that Monte Carlo methods are extremely flexible and valuable tools, quite often the approach of last resort for overly complicated problems that are impervious to a more mathematically elegant treatment, it is also true that running a bad Monte Carlo simulation is very easy. There are several reasons why this may happen:

We are using a wrong model of uncertainty:
– Because we are using an unrealistic probability distribution
– Or because we are missing some link among the underlying risk factors
– Or because some unknown parameters have been poorly estimated
– Or because the very nature of uncertainty in our problem does not lend itself to a stochastic representation
The output estimates are not reliable enough, i.e., the estimator variance is so large that a much larger sample size is required.
There is a systematic error in the estimates, which could be biased low or high.
The way we generate sample paths, possibly discretizing a continuous-time model, induces a non-negligible error.
We are using poor random variate generators.
There is some possibly subtle bug in the computer program implementing the method.

Some of these issues are technical and can be addressed by the techniques that we will explore in this book, but others are more conceptual in nature and point out a few intrinsic and inescapable limitations of Monte Carlo methods; it is wise not to forget them, while exploring the richness and power of the approach.

The best countermeasure, in order to avoid the aforementioned pitfalls, is to build reasonably strong theoretical foundations and to gain a deep understanding of the Monte Carlo approach and its variants. To that end, a good first step is to frame Monte Carlo methods as a numerical integration tool. Indeed, while the term simulation sounds more exciting, the term Monte Carlo sampling is often used, and it is the more appropriate one when we deal with Monte Carlo as a tool for numerical integration or statistical computing. Granted, the idea of simulating financial markets is somewhat more appealing than the idea of computing a multidimensional integral. However, a more conceptual framework helps in understanding powerful methods for reducing, or avoiding altogether, the difficulties related to the variance of random estimators. Some of these methods, such as low-discrepancy sequences and Gaussian quadrature, are actually deterministic in nature, but understanding them requires a view integrating numerical integration and statistical sampling.

In this introductory chapter, we consider first the historical roots of Monte Carlo; we will see in Section 1.1 that some early Monte Carlo methods were actually aimed at solving deterministic problems. Then, in Section 1.2 we compare Monte Carlo sampling and Monte Carlo simulation, showing their deep relationship. Typical simulations deal with dynamic systems evolving in time, and there are three essential kinds of dynamic models:

Continuous-time models
Discrete-time models
Discrete-event models

These model classes are introduced in Section 1.3, where we also illustrate how their nature affects the mechanics of Monte Carlo simulation. In this book, a rather relevant role is played by applications involving optimization. This may sound odd to readers who associate simulation with performance evaluation; on the contrary, there is a multiway interaction between optimization and Monte Carlo methods, which is outlined in Section 1.4.

In this book we illustrate a rather wide range of applications, which may suggest the idea that Monte Carlo methods are almost a panacea. Unfortunately, this power may hide many pitfalls and dangers. In Section 1.5 we aim at making the reader aware of some of these traps. Finally, in Section 1.6 we list a few software tools that are commonly used to implement Monte Carlo simulations, justifying the choice of R as the language of this book, and in Section 1.7 we list prerequisites and references for readers who may need a refresher on some background material.

1.1 Historical origin of Monte Carlo simulation

Monte Carlo methods involve random sampling, but the actual aim is to estimate a deterministic quantity. Indeed, a well-known and early use of Monte Carlo-like methods is Buffon’s needle approach to estimate π.1 The idea, illustrated in Fig. 1.1, is to randomly throw a needle of length l onto a floor consisting of wood strips of width t > l, repeating the experiment n times, and to observe the number of times h that the needle crosses the border between two strips. Let X be the distance from the center of the needle to the closest border; X is a uniform random variable on the range [0, t/2], and its probability density function is 2/t on that interval and 0 outside.2 By a similar token, let θ be the acute angle between the needle and that border; this is another uniformly distributed variable, on the range [0, π/2], with density 2/π on that support. Using elementary trigonometry, it is easy to see that the needle will cross a border when

FIGURE 1.1 An illustration of Buffon’s needle.

\[
X \le \frac{l}{2}\,\sin\theta.
\]

The probability of this event is found by integrating the joint density of the random variables X and θ over the relevant domain. Assuming independence, the joint density is just the product of the two marginals. Therefore, we have to compute a double integral as follows:

\[
P\left\{ X \le \frac{l}{2}\sin\theta \right\}
= \int_0^{\pi/2}\! \int_0^{(l/2)\sin\theta} \frac{2}{t}\cdot\frac{2}{\pi}\, dx\, d\theta
= \frac{4}{t\pi} \int_0^{\pi/2} \frac{l}{2}\sin\theta\, d\theta
= \frac{2l}{t\pi}.
\]

If we want to estimate π, we can use a sample estimate of this probability, i.e., the ratio of h over n:

\[
\frac{h}{n} \approx \frac{2l}{t\pi}
\quad\Longrightarrow\quad
\pi \approx \frac{2\,l\,n}{t\,h}.
\]

We see that the original problem is purely deterministic, but we use Monte Carlo sampling as an estimation tool.
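The experiment itself is easy to simulate. The following sketch is not the book’s code; it is a minimal, hypothetical implementation of Buffon’s approach (note the circularity: sampling θ uniformly on [0, π/2] already uses π, so this is only a didactic check of the formula above):

# Hypothetical sketch of Buffon's needle experiment (not the book's code)
estPiBuffon <- function(n, l = 1, t = 2) {
  X <- runif(n, 0, t/2)               # distance of needle center from closest border
  theta <- runif(n, 0, pi/2)          # acute angle between needle and border
  h <- sum(X <= (l/2) * sin(theta))   # number of crossings
  2 * l * n / (t * h)                 # estimate of pi
}
set.seed(55555)
estPiBuffon(1000000)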

A slightly more practical approach, taking advantage of R, is illustrated in the code of Fig. 1.2. The idea is shooting n random bullets on the square [−1, 1] × [−1, 1] and evaluating the fraction h/n of points falling inside the unit circle x² + y² = 1. Since the ratio between the area of the circle and the area of the square is π/4, we have

FIGURE 1.2 Using R to estimate π.

\[
\frac{h}{n} \approx \frac{\pi}{4}
\quad\Longrightarrow\quad
\pi \approx \frac{4h}{n}.
\]
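The actual code is in Fig. 1.2 and is not reproduced here; a minimal sketch of an estPi function along these lines could look as follows (it need not reproduce the exact numbers below, which depend on how the figure’s code consumes the random number stream):

# A possible implementation; the actual code is in Fig. 1.2
estPi <- function(n) {
  x <- runif(n, -1, 1)                # random bullets on the square [-1,1] x [-1,1]
  y <- runif(n, -1, 1)
  h <- sum(x^2 + y^2 <= 1)            # bullets falling inside the unit circle
  4 * h / n                           # estimate of pi
}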

Running the code,3 we obtain

> set.seed(55555)
> estPi(1000)
[1] 3.132
> estPi(1000)
[1] 3.14
> estPi(1000)
[1] 3.084
> estPi(1000000)
[1] 3.141388
> pi
[1] 3.141593

Clearly, a huge sample size is needed to obtain a fairly accurate estimate, and this is certainly not the smartest way to estimate π.

In more recent times, in the period between 1930 and 1950, Monte Carlo methods were used by the likes of Enrico Fermi, Stanislaw Ulam, and John von Neumann to solve problems related to physics. Indeed, such methods were used in the 1950s when the hydrogen bomb was developed at the Los Alamos laboratories. It was then that the term Monte Carlo was coined after the well-known gambling house. Then, luckily enough, the idea moved to other domains, including operations research and economics. At present, Monte Carlo methods are extremely widespread in many domains, including risk management and financial engineering.

1.2 Monte Carlo simulation vs. Monte Carlo sampling

The terms simulation and sampling are both used with reference to Monte Carlo methods. Indeed, there is no 100% sharp line separating the two concepts, and a couple of examples are the best way to get the point.

Example 1.1 Shortfall probability in wealth management

Consider an investor who allocates her wealth $W_0$ to n risky financial assets, such as stocks, bonds, derivatives, and whatnot. At time t = 0, the price of asset i, i = 1, …, n, is $P_{i0}$, and the (integer) number of shares of asset i in the portfolio is $h_i$. Therefore, the initial wealth is

\[
W_0 = \sum_{i=1}^{n} h_i P_{i0}.
\]

The portfolio is held up to time t = T, and the price of asset i at the end of this investment horizon depends on a set of underlying risk factors, such as the term structure of interest rates, spreads due to credit risk, inflation, oil price, the overall state of the economy, etc. Let $\mathbf{X}$ be a vector random variable, with joint density $f_{\mathbf{X}}(\mathbf{x})$, representing these factors. If we assume that the price of each asset at time T is given by a function of the underlying factors,

\[
P_{iT} = g_i(\mathbf{X}), \qquad i = 1, \ldots, n,
\]

then the terminal wealth is a function of $\mathbf{X}$,

\[
W_T = \sum_{i=1}^{n} h_i\, g_i(\mathbf{X}),
\]

which is itself a random variable. On the one hand, we are interested in the expected value of $W_T$. However, since risk is quite a relevant issue, we might also be interested in the shortfall probability with respect to a target wealth H:

\[
P\{W_T < H\}.
\]

Whatever the case, we are led to the expected value of some function $g(\mathbf{X})$:

\[
\mathrm{E}[g(\mathbf{X})] = \int g(\mathbf{x})\, f_{\mathbf{X}}(\mathbf{x})\, d\mathbf{x}. \tag{1.1}
\]

If we choose $g(\mathbf{X}) = W_T$, we obtain the expected value of future wealth; if we choose the indicator function of the event $\{W_T < H\}$, i.e.,

\[
g(\mathbf{X}) = \mathbf{1}_{\{W_T < H\}}(\mathbf{X}) =
\begin{cases}
1, & \text{if } W_T < H, \\
0, & \text{otherwise},
\end{cases}
\]

then we obtain the shortfall probability. Other risk measures can be selected and will be discussed in Chapter 13. Evaluating the multidimensional integral in Eq. (1.1) is quite difficult, even when we know the analytical form of the involved functions. As we shall see, by using Monte Carlo we may draw a sample of m observations $\mathbf{X}_k$, k = 1, …, m, and estimate the quantities of interest. We are using random sampling as a tool for numerical integration.
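As a concrete, if simplistic, illustration, consider the following sketch; the two-asset model, the lognormal price functions, and all of the numbers are made up for the example:

h <- c(100, 200)                         # shares held in each asset (made up)
P0 <- c(50, 25)                          # initial prices, so that W0 = sum(h*P0)
set.seed(55555)
m <- 100000                              # sample size
X1 <- rnorm(m, mean = 0.05, sd = 0.20)   # log-return of asset 1 (risk factor, made up)
X2 <- rnorm(m, mean = 0.03, sd = 0.10)   # log-return of asset 2 (risk factor, made up)
WT <- h[1]*P0[1]*exp(X1) + h[2]*P0[2]*exp(X2)   # terminal wealth W_T = g(X)
mean(WT)                                 # estimate of E[W_T]
mean(WT < sum(h * P0))                   # estimated shortfall probability, with H = W0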

In the example above, no simulation is involved at all, assuming that we are able to sample directly the random variable X given its density. In other cases, we may have to cope with a more complicated dynamic model that describes the dynamics of the underlying risk factors over the time interval [0, T]. In such a case, we may not be able to sample risk factors directly, and we have to generate a set of sample paths. Dynamic models may be represented by continuous-time stochastic differential equations4 or by relatively simple discrete-time processes as in the following example.

Example 1.2 An autoregressive process

Time series models are quite often used in financial econometrics to describe the evolution of a quantity of interest over time. A simple example of this class of models is an autoregressive (AR) process of order 1 like

\[
X_{t+1} = \mu + X_t + \epsilon_{t+1}, \tag{1.2}
\]

where $X_0$ is given and $\epsilon_t$ is the driving stochastic process, i.e., a sequence of random variables $\epsilon_t$, t = 1, 2, 3, …. In the simplest models, this noise term consists of a sequence of mutually independent and identically distributed normal random variables. We use the notation X ~ N(μ, σ²) to state that X is a normal random variable with expected value μ and variance σ². Unfolding the recursion, we immediately see that

\[
X_t = X_0 + \mu t + \sum_{\tau=1}^{t} \epsilon_\tau,
\]

which implies

\[
X_t \sim \mathsf{N}\left(X_0 + \mu t,\; t\sigma^2\right).
\]

The process described by Eq. (1.2) is very easy to analyze, but it is not the most general AR process of order 1. A more general example is

\[
X_{t+1} = \mu + \alpha X_t + \epsilon_{t+1},
\]

whose properties critically depend on the coefficient α (see Example 1.4). In other cases, we may have a system of such equations, with mutually correlated random driving factors and a more complicated structure, preventing us from obtaining the probability distribution of the variable of interest in an explicit form. In such a case, we may have to simulate the dynamical system.
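The unfolded recursion above is easy to check by brute force. The following sketch, with made-up parameters, simulates the recursion many times and compares sample moments against the theoretical values $X_0 + \mu t$ and $t\sigma^2$ (it assumes the random-walk-with-drift form of Eq. (1.2) as reconstructed above):

set.seed(55555)
mu <- 0.1; sigma <- 0.5; X0 <- 10        # made-up parameters
t <- 100; m <- 10000                     # horizon and number of replications
XT <- replicate(m, X0 + sum(mu + sigma * rnorm(t)))  # unfold the recursion
c(mean(XT), X0 + mu * t)                 # sample mean vs. X0 + mu*t = 20
c(var(XT), sigma^2 * t)                  # sample variance vs. sigma^2*t = 25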

On the basis of the above examples, we may argue that the term sampling could be reserved for those cases in which there is no dynamics to be simulated over time, whereas simulation entails the generation of sample paths. We may need to generate sample paths because the dynamic model is complicated and prevents an analytical solution, or because it is important to monitor the underlying variables over time. For instance, when pricing a European-style option,5 we just need to sample the underlying variable at the option maturity. In the easy case of geometric Brownian motion (GBM for short) this boils down to sampling a lognormal random variable, but we have to resort to sample path generation when we step outside that safe and comforting domain, and we enter the realm populated by stochastic volatilities and price jumps. On the contrary, when we are pricing an American-style option, featuring early exercise opportunities, we need a whole sample path, whatever price model we adopt.

As is usually the case, even this distinction is not so clear-cut. In both settings we are actually evaluating a possibly high-dimensional integral of a function, in order to estimate a probability or an expectation. The function may be given in analytical form, or it may be defined by a possibly complicated black box, requiring several lines of code to be evaluated. Conceptually, there is no difference. Indeed, numerical integration plays such a pivotal role in a full appreciation of Monte Carlo methods that we will devote the whole of Chapter 2 to it.

1.3 System dynamics and the mechanics of Monte Carlo simulation

When we have to describe the dynamic evolution of a system over time, we typically resort to one of the following model classes:

Discrete-time models
Continuous-time models
Discrete-event models

From a technical point of view, as usual, the distinction is not sharp. For instance, even if we choose a continuous-time model, we have to discretize it in some way for the sake of computational feasibility. This implies that we actually simulate a discrete-time approximation, but given the issues involved in sample path generation, which involves nontrivial concepts related to the numerical solution of stochastic differential equations, it is better to stick to the classification above. It is also possible to create somewhat hybrid models, such as stochastic processes including both a diffusion component, which is continuous, and jumps, which are related to discrete-event dynamics.

1.3.1 DISCRETE-TIME MODELS

In discrete-time models we assume that the time horizon [0, T] is discretized in time intervals (time buckets) of width δt. In principle, nothing forbids nonuniform time steps, which may actually be required when dealing with certain financial derivatives; nevertheless, we usually stick to the uniform case for the sake of simplicity. The system is actually observed only at time instants of the form

\[
t_k = k\,\delta t, \qquad k = 0, 1, 2, \ldots, M,
\]

where T = M δt. What happens between two discrete-time instants is not relevant. Typically, we forget about the time bucket size δt, which could be a day, a week, or a month, and we use discrete-time subscripts, like t = 0, 1, 2, …, directly.

Example 1.3 Cash on hand

Consider the cash management problem for a retail bank. During each day t, we observe cash inflows $w_t^+$ and cash outflows $w_t^-$, corresponding to cash deposits and withdrawals by customers, respectively. Hence, if we denote by $H_t$ the cash holding at the end of day t, we have the dynamic equation

\[
H_t = H_{t-1} + w_t^{+} - w_t^{-}.
\]

Such a model makes sense if we are only interested in the balance at the end of each day.
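In R, simulating such a recursion is essentially a one-liner with cumsum(); the inflow and outflow distributions below are made up purely for illustration:

set.seed(55555)
numDays <- 30
wPlus <- rpois(numDays, lambda = 120)    # made-up daily deposits
wMinus <- rpois(numDays, lambda = 100)   # made-up daily withdrawals
H <- 500 + cumsum(wPlus - wMinus)        # H_t = H_{t-1} + w_t^+ - w_t^-, with H_0 = 500
plot(0:numDays, c(500, H), type = "s", xlab = "day t", ylab = "cash holding")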

The AR model of Example 1.2 is a very basic example of time series models. From a technical point of view, simulating a time series model is usually no big deal.

Example 1.4 Simulating a simple AR process

Let us consider the following autoregressive process:

\[
X_{t+1} = \mu + \alpha X_t + \epsilon_{t+1},
\]

where $\epsilon_t \sim \mathsf{N}(0, \sigma^2)$. The R function in Fig. 1.3 can be used for its simulation. In the specific case

\[
X_{t+1} = 8 + 0.8\, X_t + \epsilon_{t+1}, \qquad X_0 = 40, \quad \sigma = 1, \tag{1.3}
\]

we run the function as follows:

> set.seed(55555)
> AR(40,.8, 8, 1, 50)

obtaining the plot in Fig. 1.4.

FIGURE 1.3 R code to simulate a simple autoregressive process.

FIGURE 1.4 Sample path of the AR process of Eq. (1.3).
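The code of Fig. 1.3 is not reproduced here; a possible implementation consistent with the call above is sketched below, where the argument order (X0, alpha, mu, sigma, numSteps) is an assumption of mine:

# Assumed signature: AR(X0, alpha, mu, sigma, numSteps)
AR <- function(X0, alpha, mu, sigma, numSteps) {
  X <- numeric(numSteps + 1)
  X[1] <- X0
  for (t in 1:numSteps)                  # unfold X[t+1] = mu + alpha*X[t] + eps
    X[t+1] <- mu + alpha * X[t] + rnorm(1, 0, sigma)
  plot(0:numSteps, X, type = "l", xlab = "t", ylab = expression(X[t]))
  invisible(X)
}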

In the above examples we had the purely random evolution of a system. In a more general setting, we may have the possibility of influencing, i.e., of partially controlling its dynamics. For instance, in discrete-time stochastic optimization models, we deal with a system modeled by the state transition equation

\[
s_{t+1} = g_t(s_t, x_t, \epsilon_{t+1}), \tag{1.4}
\]

where

$s_t$ is a vector of state variables, observed at the end of time period t.
$x_t$ is a vector of control variables, i.e., decisions made after observing the state $s_t$.
$\epsilon_{t+1}$ is a vector of disturbances, realized after we apply the control variable $x_t$.

If the sequence of disturbances $\epsilon_t$ is intertemporally independent, we obtain a Markov process. Markov processes are relatively easy to represent, simulate, and control. We should also observe that the state need not be a continuous variable. We may have a system with discrete states, resulting in a discrete-time Markov chain.6 As an example, we may consider a Markov chain modeling the migration in the credit rating of a bond issue. Sometimes, we use a discrete-state Markov chain to approximate a continuous-state system by aggregation: We partition the state space with a mutually exclusive and collectively exhaustive collection of subsets and identify each one with a single representative value.
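As a tiny illustration of the credit migration example, the following sketch simulates a discrete-time Markov chain; the rating classes and the one-step transition matrix are made up for the example:

states <- c("AA", "BB", "CC", "Def")       # made-up rating classes
P <- matrix(c(0.90, 0.08, 0.02, 0.00,      # made-up transition probabilities
              0.05, 0.85, 0.08, 0.02,
              0.00, 0.10, 0.80, 0.10,
              0.00, 0.00, 0.00, 1.00),     # default is an absorbing state
            nrow = 4, byrow = TRUE)
set.seed(55555)
numSteps <- 10
s <- 1                                     # start from rating "AA"
path <- states[s]
for (t in 1:numSteps) {
  s <- sample(1:4, size = 1, prob = P[s, ])  # sample the next state
  path <- c(path, states[s])
}
path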

1.3.2 CONTINUOUS-TIME MODELS

The typical representation of a continuous-time model relies on differential equations. We are all familiar with the differential equation F = ma from elementary physics, where F is force, m is mass, and a is acceleration, i.e., the second-order derivative of position with respect to time. As a more financially motivated example, imagine that you deposit an amount of cash B(0) = $B_0$, at time t = 0, in a safe bank account. If r is the continuously compounded interest rate on your deposit,7 the evolution of wealth over time can be represented as

\[
\frac{dB(t)}{dt} = r\, B(t), \tag{1.5}
\]

whose solution, given the initial condition B(0) = $B_0$, is

\[
B(t) = B_0\, e^{rt}.
\]

If the interest rate is not constant, and it is given by r(t), we obtain

\[
B(t) = B_0 \exp\left( \int_0^t r(\tau)\, d\tau \right). \tag{1.6}
\]

This solution is easily obtained by rewriting Eq. (1.5) in the differential form

\[
\frac{dB(t)}{B(t)} = r(t)\, dt,
\]

which, by recalling the derivative of the natural logarithm, can be transformed to

\[
d \log B(t) = r(t)\, dt.
\]

Integrating on the interval [0, t] and taking the exponential to get rid of the logarithm immediately yields Eq. (1.6). In less lucky cases, we have to resort to numerical methods, which typically yield a discrete-time approximation of the continuous-time system.

Monte Carlo methods come into play when randomness is introduced into the differential equation, yielding a stochastic differential equation. A well-known model in this vein is the GBM

\[
dS(t) = \mu S(t)\, dt + \sigma S(t)\, dW(t), \tag{1.7}
\]

which is arguably the most elementary model used to represent the random evolution of stock prices. With respect to the previous case, we have an additional term dW(t), where W(t) is a Wiener process. We will deal later with the underpinnings of this kind of stochastic process,8 but for now suffice it to say that the increment of the Wiener process over a finite time interval [t, t + δt] is a normal random variable:

\[
W(t + \delta t) - W(t) \sim \mathsf{N}(0, \delta t).
\]

Loosely speaking, the differential dW(t) is the limit of this increment for δt → 0. Thus, we might consider simulating random sample paths of GBM by a straightforward discretization scheme:

\[
S(t + \delta t) = S(t) + \mu S(t)\, \delta t + \sigma S(t) \bigl[ W(t + \delta t) - W(t) \bigr].
\]

This is called the Euler discretization scheme, and it immediately leads to the discrete-time process

\[
S(t + \delta t) = S(t) \bigl( 1 + \mu\, \delta t + \sigma \sqrt{\delta t}\; \epsilon(t + \delta t) \bigr),
\]

where $\epsilon(t + \delta t) \sim \mathsf{N}(0, 1)$. This kind of process, in terms of simulation, is not unlike an AR process and is easy to deal with. Unfortunately, this rather naive discretization suffers from a significant inconvenience: We obtain the wrong probability distribution. Indeed, it is easy to see that we generate normally distributed prices; therefore, at least in principle, we may generate a negative price, which does not make sense for limited liability assets like stock shares.

Using tools from stochastic calculus, we will see that we may solve the stochastic differential equation exactly and find an exact discretization:

\[
S(t + \delta t) = S(t) \exp\left\{ \left( \mu - \frac{\sigma^2}{2} \right) \delta t + \sigma \sqrt{\delta t}\; \epsilon(t + \delta t) \right\}. \tag{1.8}
\]

This results in a lognormal distribution, ruling out negative prices. We also notice that μ is related to the drift of the process, i.e., its average rate of change over time, whereas σ plays the role of a volatility.

Example 1.5 Simulating geometric Brownian motion

The R code in Fig. 1.5 can be used to generate sample paths of GBM. The function returns a matrix whose rows correspond to numRepl sample paths starting from initial state s0, over numSteps time steps from 0 to T, for a GBM with parameters μ and σ. The R snapshot of Fig. 1.6 yields the plots in Fig. 1.7. We immediately appreciate the role of the volatility σ.

FIGURE 1.5 R code to generate sample paths of geometric Brownian motion.

FIGURE 1.6 R script to plot GBM sample paths for different volatilities.

FIGURE 1.7 Sample paths of geometric Brownian motion for different levels of volatility.
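The code of Figs. 1.5 and 1.6 is not reproduced here; a sketch of a generator built on the exact discretization of Eq. (1.8) could look as follows, where the function name and the argument order are assumptions of mine:

# Assumed signature: simGBM(s0, mu, sigma, T, numSteps, numRepl)
simGBM <- function(s0, mu, sigma, T, numSteps, numRepl) {
  dt <- T / numSteps
  drift <- (mu - sigma^2/2) * dt             # drift of log-prices over one step
  vol <- sigma * sqrt(dt)                    # volatility over one step
  paths <- matrix(0, nrow = numRepl, ncol = numSteps + 1)
  paths[, 1] <- s0
  for (k in 1:numSteps)                      # apply Eq. (1.8) step by step
    paths[, k+1] <- paths[, k] * exp(drift + vol * rnorm(numRepl))
  paths
}
set.seed(55555)
S <- simGBM(50, 0.1, 0.4, 1, 250, 3)         # three sample paths, sigma = 0.4
matplot(t(S), type = "l", xlab = "time step", ylab = "S(t)")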

This solution is exact in the sense that there is no discretization error in the sample path generation; we are left with sampling errors that are typical of Monte Carlo methods. In other cases, things are not that easy: Even though, in the end, continuous-time models may boil down to discrete-time models, they do have some peculiarities, which may be hard to deal with.

1.3.3 DISCRETE-EVENT MODELS

The GBM sample paths depicted in Fig. 1.7 look noisy enough, but continuous. Indeed, it can be shown that they are continuous, in a stochastic sense, but nowhere differentiable. The sample paths of a discrete-event dynamic system are quite different in nature. It is important to avoid a possible confusion between discrete event and discrete time. Discrete-event systems are continuous-time models, but unlike discrete-time models, there is no constant time step δt. Like discrete-time systems, the state changes only at some time instants, corresponding to the occurrence of some event, but the time between events is in general a continuous random variable.

Perhaps the simplest discrete-event model, to be further discussed in Section 3.2.2, is the Poisson process. This is an instance of a counting process: The value N(t) counts the number of events that occurred during the time interval [0, t], where the times $X_k$ elapsing between events k − 1 and k, k = 1, 2, 3, …, are a sequence of independent and identically distributed exponential variables. By convention, $X_1$ is the time at which the first event occurs after the start time t = 0. A sample path of a Poisson process is illustrated in Fig. 1.8; we see that the process “jumps” whenever an event occurs, so that sample paths are piecewise constant. The underlying exponential distribution is characterized by a single parameter λ that can be interpreted as a “rate,” i.e., the average number of events occurring per unit time.9 This basic Poisson process can be generalized by allowing for a nonconstant rate λ(t), which yields an inhomogeneous Poisson process. We may also allow for arbitrary jump sizes: If we associate a continuous probability distribution with the jump size, we obtain a continuous-state system, whereas the unit jumps of the basic process yield a discrete state space corresponding to integer numbers. This second generalization is called a compound Poisson process, which is a discrete-event, continuous-state model.

FIGURE 1.8 Sample path of the Poisson process.
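Generating such a sample path is easy, since it suffices to cumulate exponential interarrival times; the following sketch (not the book’s code) plots a path on [0, Tmax]:

set.seed(55555)
lambda <- 2; Tmax <- 10
# generate more than enough interarrival times, then keep the events within [0, Tmax]
eventTimes <- cumsum(rexp(5 * lambda * Tmax, rate = lambda))
eventTimes <- eventTimes[eventTimes <= Tmax]
plot(c(0, eventTimes, Tmax), c(0, seq_along(eventTimes), length(eventTimes)),
     type = "s", xlab = "t", ylab = "N(t)")     # piecewise constant path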

Financially motivated examples of such models are:

Models of stock prices allowing for jumps. Pure-jump models have been proposed, but we may also integrate diffusion processes, like GBM, with jumps. Stochastic processes in this more general form are known as Lévy processes.
Models of credit rating migration, where the rating of a bond issue changes across a finite set of states at random times. This is a discrete-event, discrete-state model. Continuous-time Markov chains are an example in this family.

Discrete-event models are the rule in other application domains, like the simulation of manufacturing systems and logistic networks.10 These kinds of simulation problems are, in a sense, dual to the simulation problems faced in economics and finance: The mathematics involved is generally trivial, but there is an incredible complexity in data bookkeeping, resulting in huge computer programs, unless one resorts to a special-purpose simulation tool. On the contrary, when dealing with a stochastic differential equation, the program often consists of just a few lines of code, possibly hiding a quite sophisticated mathematical complexity. In the next section we give a simple example of a discrete-event system, so that the reader can appreciate the mechanics involved.

1.3.3.1 Simulation of discrete-event systems: a single queue

Let us consider a very simple system consisting of a single server (e.g., a bank teller), which must satisfy a set of incoming requests for service (e.g., customers entering a bank). The time between arrivals of consecutive customers is random; we might assume that it is a plain Poisson process or some time-varying process. The time needed to serve each request is random as well. The system dynamics is not difficult to describe:

If the arriving customer finds the server idle, her service starts immediately, and the resource will be freed after service completion; the server state changes from idle to busy.
If the server is busy, the new customer joins a queue of requests waiting for service. The simplest queuing discipline is first-in, first-out (FIFO), i.e., requests are handled according to the order of their arrivals.
When the server terminates a service, it starts servicing the first customer in the queue, if any; otherwise, its state changes from busy to idle.

In order to describe the state of this system, we would certainly use L(t) = 0, 1, 2, …, the number of customers in the system, which includes the one being served, if any, and the state of the server, S(t) ∈ {0, 1}, where we assume that 1 corresponds to busy and 0 to idle. We also need some information about the future departure of the currently served customer, if any. A sample path of this single-server queuing system is depicted in Fig. 1.9. Statistics that we might be interested in collecting are the average queue length or the average server utilization, which are essentially the integrals of the sample paths in Fig. 1.9 divided by the simulation time horizon. We immediately see that some bookkeeping is required, as these are not the familiar sample means, but rather integrals over time. A performance measure that can be evaluated as a sample mean is the average customer waiting time.

FIGURE 1.9 Sample path of queuing system consisting of a single server.

In some specific cases, there is no need to resort to simulation, as the system can be analyzed mathematically. For instance, if the service and the interarrival times are both exponentially distributed, the system is essentially a continuous-time Markov chain, thanks to the memoryless property of the exponential distribution.11 Such a queue is denoted by M/M/1, where the M refers to the memoryless nature of the two stochastic processes involved, customer arrivals and customer service, and the 1 refers to a system consisting of a single server. If we denote the arrival and service rates by λ and μ, it can be shown that the long-run average customer waiting time is

\[
\mathrm{E}[W] = \frac{\rho}{\mu(1 - \rho)}, \tag{1.9}
\]

where ρ = λ/μ is the utilization rate; clearly, stability requires μ > λ: The service rate, i.e., the average number of requests served per unit time, must be larger than the arrival rate. In more complicated cases, typically involving networks of servers, or more difficult distributions, or different queuing disciplines, one may be forced to resort to simulation.

Let us denote by $A_j$, $S_j$, and $W_j$ the arrival time, service time, and waiting time, respectively, for customer j. It is possible to find a recursive equation for the waiting time of customer j. Customer j has to wait if she arrives before the completion time of customer j − 1. Denoting the latter quantity by $C_{j-1}$, we have

\[
W_j = \max\{0,\; C_{j-1} - A_j\}. \tag{1.10}
\]

However, the completion time of customer j − 1 is

\[
C_{j-1} = A_{j-1} + W_{j-1} + S_{j-1}.
\]

Plugging this into Eq. (1.10), we find

\[
W_j = \max\{0,\; W_{j-1} + S_{j-1} - I_j\}, \tag{1.11}
\]

where $I_j \equiv A_j - A_{j-1}$ is the interarrival time between customers (j − 1) and j. We note that, in a more general setting possibly involving priorities, balking, or multiple servers, complicated data bookkeeping would be needed to track customer waiting times, requiring suitable data structures and a corresponding programming effort. In this lucky case, it is easy to implement the recursion in R, as shown in Fig. 1.10. In Fig. 1.11 we also show an R script running the simulator in order to check the results against the exact formula of Eq. (1.9). With ρ = 90.90%, which is a fairly high value for the utilization rate, we observe a lot of variability around the correct value:

FIGURE 1.10 Implementation of Lindley’s recursion for an M/M/1 queue.

FIGURE 1.11 Script to check the function MM1_Queue().
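The code of Fig. 1.10 is not reproduced here; a sketch of an MM1_Queue function implementing Lindley’s recursion (1.11) could be as follows, where the function name and arguments are assumptions of mine (with lambda = 1 and mu = 1.1, for instance, we get ρ ≈ 90.90%, matching the transcript below):

# Assumed signature: MM1_Queue(lambda, mu, numCust); returns average waiting time
MM1_Queue <- function(lambda, mu, numCust) {
  W <- numeric(numCust)                  # W[1] = 0: the first customer does not wait
  for (j in 2:numCust) {
    Sprev <- rexp(1, rate = mu)          # service time of customer j-1
    I <- rexp(1, rate = lambda)          # interarrival time I_j
    W[j] <- max(0, W[j-1] + Sprev - I)   # Lindley's recursion (1.11)
  }
  mean(W)                                # average waiting time
}
lambda <- 1; mu <- 1.1; rho <- lambda/mu # assumed rates, giving rho = 90.90%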

> rho/mu/(1-rho)
[1] 9.090909
> MM1_Queue(lambda, mu, 100000)
[1] 9.026053
> MM1_Queue(lambda,mu,100000)
[1] 10.34844
> MM1_Queue(lambda,mu,100000)
[1] 8.143615

For a low utilization rate, such as ρ = 50%, we get more reliable results:

> rho/mu/(1-rho)
[1] 0.5
> MM1_Queue(lambda,mu, 100000)
[1] 0.497959
> MM1_Queue(lambda,mu,100000)
[1] 0.5044687
> MM1_Queue(lambda,mu,100000)
[1] 0.4961641

We need to quantify the reliability of the above estimates, possibly by confidence intervals, as we show in Chapter 7 on simulation output analysis.

1.4 Simulation and optimization

There is a considerable interplay between Monte Carlo methods and optimization problems. A generic finite-dimensional optimization problem can be stated as

\[
\min_{\mathbf{x} \in S} f(\mathbf{x}), \tag{1.12}
\]

where $\mathbf{x} \in \mathbb{R}^n$ is an n-dimensional vector collecting the decision variables, $S \subseteq \mathbb{R}^n$ is the feasible set, and f is the objective function. The feasible set S is usually represented by a collection of equalities and inequalities. In recent years, there has been astonishing progress in the speed and reliability of commercial software tools to solve wide classes of optimization problems. As a general rule that we shall illustrate in more detail in Chapter 10, convex optimization problems, whereby the function f and the set S are both convex, can be efficiently solved. In particular, large-scale linear programming problems can be solved in a matter of seconds or minutes. Other families of convex problems, such as second-order cone programming models, can be solved rather efficiently by relatively recent interior-point methods.

The joint progress in algorithmic sophistication and hardware speed makes optimization modeling a more and more relevant tool for many practical problems, including those motivated by economic or financial applications. Nevertheless, there are classes of problems that are still a very hard nut to crack. They include the following:

1. Nonconvex problems, i.e., either problems with a nonconvex objective function featuring many local optima, or problems with a nonconvex feasible set, like integer programming and combinatorial optimization problems
2. Stochastic optimization problems involving uncertainty in the problem data
3. Dynamic decision problems affected by uncertainty, which preclude the determination of an optimal static solution, but on the contrary require a dynamic strategy that adapts decisions to the unfolding of uncertainty

We will see much more on these kinds of models, and the related solution algorithms, in Chapter 10. Here we just want to point out that Monte Carlo methods play a key role in tackling these problems, as we illustrate in the next subsections.

1.4.1 NONCONVEX OPTIMIZATION