Chapter 13 Stochastic Optimal Control

The state of the system is represented by a controlled stochastic process. Section 13.2 formulates a stochastic optimal control problem. We shall consider stochastic differential equations of a type known as Itô equations, which are perturbed by Markov diffusion processes. Our goal is to synthesize optimal feedback controls for systems governed by Itô equations.

In Section 13.3, we shall extend the production planning model of Chapter 6. In Section 13.4, we solve an optimal stochastic advertising problem. In Section 13.5, we will introduce investment decisions in the consumption model of Example 1.3, and consider both risk-free and risky investments. In Section 13.6, we will conclude the chapter by mentioning stochastic optimal control problems involving jump Markov processes, which are treated in Sethi and Zhang (1994a, 1994c).

13.2 Stochastic Optimal Control

We assume the state variables to be observable, and we use the dynamic programming or Hamilton-Jacobi-Bellman framework rather than the stochastic maximum principle. The problem is to maximize the expected value of the objective function subject to an Itô stochastic differential equation for the state. The value function V(x,t) then satisfies a dynamic programming recursion over the interval from t to t+dt.
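The formulation can be sketched in its standard one-dimensional form. The symbols F (running payoff), \Phi (terminal payoff), f (drift), and G (diffusion coefficient) are assumed names chosen for this sketch, not notation fixed by the text:

```latex
\begin{aligned}
&\max_{U}\; E\left[\int_0^T F(X_t,U_t,t)\,dt + \Phi(X_T,T)\right] \\
&\text{subject to}\quad dX_t = f(X_t,U_t,t)\,dt + G(X_t,U_t,t)\,dz_t,\qquad X_0 = x_0, \\
&V(x,t) = \max_{u}\; E\left[F(x,u,t)\,dt + V(x+dX_t,\,t+dt)\right].
\end{aligned}
```

The last line is the principle of optimality applied over a short interval: the payoff earned during dt plus the value of continuing optimally from the new state.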

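By Taylor's expansion of V(x + dX_t, t + dt) about (x,t), we obtain first-order terms in dt and dX_t together with second-order terms in (dt)^2, dX_t dt, and (dX_t)^2. From (3.29), we can formally write out the second-order term (dX_t)^2. The multiplication rules of the stochastic calculus are: (dz_t)^2 = dt, dz_t dt = 0, and (dt)^2 = 0.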
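The multiplication rules can be illustrated numerically. The sketch below sums squared Wiener increments over [0, T] on a fine grid: the sum of (dz)^2 should be close to T, while the sum of dz*dt should be negligible. The function name, grid size, and seed are hypothetical choices.

```python
import math
import random

def quadratic_variation(T=1.0, n_steps=100_000, seed=0):
    """Return (sum of dz^2, sum of dz*dt) over a uniform grid on [0, T]."""
    rng = random.Random(seed)
    dt = T / n_steps
    sd = math.sqrt(dt)          # standard deviation of one Wiener increment
    sum_dz2 = 0.0               # accumulates (dz)^2, expected to approach T
    sum_dzdt = 0.0              # accumulates dz*dt, expected to approach 0
    for _ in range(n_steps):
        dz = rng.gauss(0.0, sd)
        sum_dz2 += dz * dz
        sum_dzdt += dz * dt
    return sum_dz2, sum_dzdt

qv, cross = quadratic_variation()
print(f"sum (dz)^2 = {qv:.4f}, sum dz*dt = {cross:.2e}")
```

Refining the grid makes the first sum concentrate more tightly around T, which is the sense in which (dz)^2 behaves like dt.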

13.3 A Stochastic Production Planning Model

X_t = the inventory level at time t (state variable),
U_t = the production rate at time t (control variable),
S = the constant demand rate at time t; S > 0,
T = the length of the planning period,
x̂ = the factory-optimal inventory level,
û = the factory-optimal production level,
x_0 = the initial inventory level,
h = the inventory holding cost coefficient,
c = the production cost coefficient,
B = the salvage value per unit of inventory at time T,
z_t = the standard Wiener process,
σ = the constant diffusion coefficient.

Let V(x,t) be the value function. It satisfies the Hamilton-Jacobi-Bellman (HJB) equation (13.43), with the boundary condition V(x,T) = Bx.
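A sketch of this HJB equation follows. It assumes the state equation dX_t = (U_t - S)dt + \sigma dz_t and the simplification \hat{x} = \hat{u} = 0, h = c = 1; both are assumptions made here to keep the algebra short, and the general coefficients can be carried through in the same way:

```latex
\begin{aligned}
0 &= \max_{u}\left[-(u^2 + x^2) + V_t + V_x\,(u - S) + \tfrac{1}{2}\sigma^2 V_{xx}\right],
\qquad V(x,T) = Bx, \\
u^*(x,t) &= \tfrac{1}{2}\,V_x(x,t).
\end{aligned}
```

The second line comes from setting the derivative of the bracket with respect to u to zero, that is, -2u + V_x = 0.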

Remark 13.1: If the production rate were restricted to be nonnegative, then the optimal control in (13.44) would be replaced by the larger of zero and the unconstrained expression.

13.3.1 Solution for the Production Planning Problem

Since (13.51) must hold for any value of x, the coefficient of each power of x must vanish separately. This yields a system of ordinary differential equations, whose boundary conditions are obtained by comparing (13.47) with the boundary condition V(x,T) = Bx of (13.43).

To solve (13.52), we expand by partial fractions and integrate. Since S is assumed to be a constant, (13.53) can be reduced by a change of variable to a simpler equation, whose solution then follows by integration.
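The closed-form solutions can be checked numerically. The sketch below assumes a quadratic value function V = Q(t)x^2 + R(t)x + M(t) with h = c = 1 and x̂ = û = 0, for which the coefficient equations take the form Q' = 1 - Q^2 with Q(T) = 0 and R' = 2SQ - RQ with R(T) = B. Under that assumption the candidates are Q = (y-1)/(y+1) and R = 2S + 2(B-2S)√y/(y+1), with y = exp(2(t-T)). Function names and parameter values are hypothetical.

```python
import math

def Q(t, T):
    """Candidate Q(t) = (y - 1)/(y + 1), with y = exp(2(t - T))."""
    y = math.exp(2.0 * (t - T))
    return (y - 1.0) / (y + 1.0)

def R(t, T, S, B):
    """Candidate R(t) = 2S + 2(B - 2S) sqrt(y)/(y + 1)."""
    y = math.exp(2.0 * (t - T))
    return 2.0 * S + 2.0 * (B - 2.0 * S) * math.sqrt(y) / (y + 1.0)

def residuals(t, T, S, B, eps=1e-6):
    """Central-difference residuals of Q' = 1 - Q^2 and R' = 2SQ - RQ."""
    dQ = (Q(t + eps, T) - Q(t - eps, T)) / (2.0 * eps)
    dR = (R(t + eps, T, S, B) - R(t - eps, T, S, B)) / (2.0 * eps)
    res_Q = dQ - (1.0 - Q(t, T) ** 2)
    res_R = dR - (2.0 * S * Q(t, T) - R(t, T, S, B) * Q(t, T))
    return res_Q, res_R

T, S, B = 1.0, 2.0, 0.5   # hypothetical parameter values
for t in (0.0, 0.5, 0.9):
    print(t, residuals(t, T, S, B))
```

Both residuals should be near zero at every t, and the terminal values Q(T) = 0 and R(T) = B match the boundary condition V(x,T) = Bx.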

Remark 13.2: The optimal production rate in (13.59) equals the demand rate plus a correction term that depends on the level of inventory and the time remaining until the horizon T. Since (y-1) < 0 for t < T, the optimal production rate is likely to be positive for lower values of x. However, if x is very high, the correction term will become smaller than -S, and the optimal control will be negative. In other words, if the inventory level is too high, the factory can save money by disposing of a part of the inventory, resulting in lower holding costs.

Figure 13.2: A Sample Path of X_t with X_0 = x_0 > 0 and B > 0
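A sample path of this kind can be generated with an Euler-Maruyama discretization. The sketch below assumes the state equation dX_t = (U_t - S)dt + σ dz_t and the feedback rule u*(x,t) = S + [(y-1)x + (B-2S)√y]/(y+1) with y = exp(2(t-T)), which holds under the simplification x̂ = û = 0 and h = c = 1. Function names, parameter values, and the seed are hypothetical.

```python
import math
import random

def optimal_production(x, t, T, S, B):
    """Assumed feedback rule: demand S plus an inventory-dependent correction."""
    y = math.exp(2.0 * (t - T))
    return S + ((y - 1.0) * x + (B - 2.0 * S) * math.sqrt(y)) / (y + 1.0)

def simulate_inventory(x0, T, S, B, sigma, n_steps=1000, seed=1):
    """Euler-Maruyama path of dX = (u* - S) dt + sigma dz on [0, T]."""
    rng = random.Random(seed)
    dt = T / n_steps
    x = x0
    path = [x0]
    for k in range(n_steps):
        t = k * dt
        u = optimal_production(x, t, T, S, B)
        # drift over one step plus a Wiener increment with variance dt
        x += (u - S) * dt + sigma * rng.gauss(0.0, math.sqrt(dt))
        path.append(x)
    return path

path = simulate_inventory(x0=5.0, T=4.0, S=1.0, B=0.5, sigma=0.5)
print(path[0], path[-1])
```

With a large initial inventory, the rule returns a negative production rate, which corresponds to the disposal case described in Remark 13.2.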

13.4 A Stochastic Advertising Problem

In the model, X_t is the market share and U_t is the rate of advertising at time t.

V(x) is the expected value of the discounted profits from time t to infinity. Since T = ∞, the future looks the same from any time t, and therefore the value function does not depend on t. The HJB equation can thus be written in terms of V(x) alone.
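A sketch of the model and its time-independent HJB equation follows, assuming Sethi-type advertising dynamics. The symbols r (response constant), \delta (decay constant), \rho (discount rate), \pi (revenue per unit of market share), and \sigma(x) (diffusion coefficient) are assumed names for this sketch:

```latex
\begin{aligned}
&\max_{U}\; E\left[\int_0^\infty e^{-\rho t}\left(\pi X_t - U_t^2\right)dt\right], \\
&dX_t = \left(r U_t \sqrt{1 - X_t} - \delta X_t\right)dt + \sigma(X_t)\,dz_t,\qquad X_0 = x_0, \\
&\rho V(x) = \max_{u}\left[\pi x - u^2 + V_x\left(r u \sqrt{1-x} - \delta x\right) + \tfrac{1}{2}\sigma^2(x)\,V_{xx}\right].
\end{aligned}
```

Maximizing the bracket over u gives u^* = \tfrac{1}{2} r V_x \sqrt{1-x}, which is a feedback rule once V_x is known.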

We obtain an explicit formula for the optimal feedback control. Under this control, the market share process eventually hovers around an equilibrium level.
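The feedback rule and the equilibrium level can be computed in a few lines. The sketch below assumes the dynamics dX = (rU√(1-X) - δX)dt + σ(X)dz, the profit rate πX - U^2, and a linear value function V(x) = λx + constant, so that λ is the positive root of (r^2/4)λ^2 + (ρ+δ)λ - π = 0. All names and parameter values are hypothetical.

```python
import math

def advertising_solution(r, delta, rho, pi):
    """Return (lam, feedback, x_bar) under the assumed linear value function."""
    a = rho + delta
    # positive root of (r^2/4) lam^2 + (rho + delta) lam - pi = 0
    lam = (math.sqrt(a * a + r * r * pi) - a) / (r * r / 2.0)

    def feedback(x):
        # u*(x) = r * lam * sqrt(1 - x) / 2, defined for 0 <= x <= 1
        return 0.5 * r * lam * math.sqrt(1.0 - x)

    k = 0.5 * r * r * lam
    x_bar = k / (k + delta)     # share at which the controlled drift vanishes
    return lam, feedback, x_bar

r, delta, rho, pi = 1.0, 0.1, 0.05, 2.0   # hypothetical values
lam, u_star, x_bar = advertising_solution(r, delta, rho, pi)
drift_at_x_bar = r * u_star(x_bar) * math.sqrt(1.0 - x_bar) - delta * x_bar
print(lam, x_bar, drift_at_x_bar)
```

The drift evaluated at x_bar should be zero up to rounding, which is why the market share tends to fluctuate around that level.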

13.5 An Optimal Consumption-Investment Problem

Consider investing a part of Rich's wealth in a risky security or stock that earns an expected rate of return α > r. The problem of Rich, now known as Rich Investor, is to allocate his wealth optimally between the risk-free savings account and the risky stock over time, and also to consume over time, so as to maximize his total utility of consumption. The savings account is easy to model.

Modeling the stock is more complicated. In the stock price equation (13.74), α is the average rate of return on the stock, σ is the standard deviation associated with the return, and z_t is a standard Wiener process. The price process P_t given by (13.74) is often referred to as a logarithmic Brownian motion.
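A price process of this type can be simulated directly. The sketch below assumes the price equation has the form dP_t = αP_t dt + σP_t dz_t, and compares an Euler-Maruyama path with the closed-form solution P_T = P_0 exp((α - σ^2/2)T + σ z_T) driven by the same Wiener increments. Function names and parameter values are hypothetical.

```python
import math
import random

def simulate_stock(p0, alpha, sigma, T, n_steps=10_000, seed=2):
    """Return (Euler terminal price, exact terminal price) on one noise path."""
    rng = random.Random(seed)
    dt = T / n_steps
    p_euler = p0
    z = 0.0                      # running value of the Wiener process
    for _ in range(n_steps):
        dz = rng.gauss(0.0, math.sqrt(dt))
        p_euler += alpha * p_euler * dt + sigma * p_euler * dz
        z += dz
    # closed form for the assumed equation, using the same terminal z
    p_exact = p0 * math.exp((alpha - 0.5 * sigma ** 2) * T + sigma * z)
    return p_euler, p_exact

p_euler, p_exact = simulate_stock(p0=100.0, alpha=0.08, sigma=0.2, T=1.0)
print(p_euler, p_exact)
```

The logarithm of the exact price is a Brownian motion with drift, which is the reason for the name logarithmic Brownian motion.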

Notation:

W_t = the wealth at time t,
C_t = the consumption rate at time t,
Q_t = the fraction of the wealth invested in stock at time t,
1-Q_t = the fraction of the wealth kept in the savings account at time t,
U(c) = the utility of consumption when consumption is at the rate c; the function U(c) is assumed to be increasing and concave,
ρ = the rate of discount applied to consumption utility,
B = the bankruptcy parameter to be explained later.

The term αQ_tW_t dt represents the expected return from the risky investment of Q_tW_t dollars during the period from t to t+dt. The term σQ_tW_t dz_t represents the risk involved in investing Q_tW_t dollars in stock. The term (1-Q_t)rW_t dt is the amount of interest earned on the balance of (1-Q_t)W_t dollars in the savings account. Finally, C_t dt represents the amount of consumption during the interval from t to t+dt.
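The four terms combine into a wealth equation that can be simulated term by term. The sketch below assumes dW = αQW dt + σQW dz + (1-Q)rW dt - C dt, with a constant stock fraction q and consumption proportional to wealth, C = kW. That policy is a hypothetical illustration, not the optimal one, and all names and values are assumptions.

```python
import math
import random

def simulate_wealth(w0, q, k, alpha, r, sigma, T, n_steps=5000, seed=3):
    """Euler-Maruyama wealth path; returns (terminal wealth, total consumed)."""
    rng = random.Random(seed)
    dt = T / n_steps
    w = w0
    consumed = 0.0
    for _ in range(n_steps):
        dz = rng.gauss(0.0, math.sqrt(dt))
        c = k * w                               # hypothetical consumption rule
        risky_return = alpha * q * w * dt       # expected return on stock
        risk_term = sigma * q * w * dz          # random part of stock return
        interest = (1.0 - q) * r * w * dt       # interest on savings balance
        w += risky_return + risk_term + interest - c * dt
        consumed += c * dt
        if w <= 0.0:                            # wealth hit zero: stop the path
            w = 0.0
            break
    return w, consumed

w_T, total_c = simulate_wealth(w0=100.0, q=0.5, k=0.04,
                               alpha=0.08, r=0.03, sigma=0.2, T=1.0)
print(w_T, total_c)
```

Setting q = 0 and k = 0 reduces the path to compound interest at rate r, which gives a quick sanity check on the discretization.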

We shall say that Rich goes bankrupt at time T when his wealth falls to zero at that time. T is a random variable, called a stopping time, since it is observed exactly at the instant when wealth falls to zero. Rich's objective function is the expected discounted utility of consumption up to time T, together with a term involving the bankruptcy parameter B. See Sethi (1997a) for a detailed discussion of the bankruptcy parameter B.

We simplify the problem by imposing condition (13.81) and by assuming B = -∞. The condition (13.81) together with B = -∞ implies a strictly positive consumption level at all times and no bankruptcy.
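A sketch of the HJB equation for this problem follows. It assumes the wealth equation dW = \alpha Q W\,dt + \sigma Q W\,dz + (1-Q) r W\,dt - C\,dt and an infinite-horizon discounted utility objective; x denotes current wealth, and c and q are the current consumption rate and stock fraction:

```latex
\begin{aligned}
\rho V(x) &= \max_{c \ge 0,\; q}\left[(\alpha - r)\,q\,x\,V_x + (r x - c)\,V_x
  + \tfrac{1}{2}\,q^2 \sigma^2 x^2\, V_{xx} + U(c)\right], \\
U'(c^*) &= V_x(x), \qquad
q^* = -\,\frac{(\alpha - r)\,V_x(x)}{x\,\sigma^2\,V_{xx}(x)}.
\end{aligned}
```

The two first-order conditions express consumption and the stock fraction as feedback functions of wealth through the derivatives of V.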

Karatzas, Lehoczky, Sethi, and Shreve (1986) assumed that the value function is strictly concave and, therefore, V_x is monotonically decreasing in x. This means that the function c(·) defined in (13.83) has an inverse X(·), in terms of which (13.84) can be rewritten. Note that c(X(c)) = c, c'(X)X'(c) = 1, and therefore, c'(X) = 1/X'(c).

Differentiation with respect to c yields the intended second-order, linear ordinary differential equation. This problem and its many extensions have been studied in great detail. See, e.g., Sethi (1997a).

13.6 Concluding Remarks

For stochastic optimal control applications to manufacturing problems, see Sethi and Zhang (1994a) and Yin and Zhang (1997). For applications to problems in finance, see Sethi (1997a) and Karatzas and Shreve (1998). For applications in marketing, see Tapiero (1988). For applications in economics, including the economics of natural resources, see Derzko and Sethi (1981a) and Malliaris and Brock (1982).
