Integration Methods for Multidimensional Probability Integrals

Integration Methods for MultidimensionalProbability Integrals Abebe Geletu abebe.geletu@tu-ilmenau.de www.tu-ilmenau.de/simulation IFAC2011 Pre-Conference Tutorial August 27-28, 2011, Milano Group of Simulation and Optimal Processes (SOP) Institute for Automation and Systems Engineering Technische Universität Ilmenau

Introduction One-dimensional probability integrals Cubature rules for multidimensional 3.1. Full-grid cubature rules 3.2. Sparse-grid cubature rules Application Conclusions Resources and References Content

1. Introduction Task: to solve the optimization problem Define:

Introduction … The problem CCOPT is equivalently written as • Question: How to solve the problem NLP? • Use either gradient-based or gradient-free optimization algorithms or a combination of both. • (A) Any optimization algorithm requires • value of the objective function • Value of the constraint function • for each given .

Introduction … • The most difficult task in solving CCOPT is the evaluation of the values of the chance constraint • for a given . • Note that is a random variable, since is random. • If is non-linear w.r.t. it is difficult to determine the distribution of the random variable from that of . • Approach : back-projection through a montony relation • Let be a 1D random variable with a known distribution. • Let and is a strictly increasing real valued function. • Then The latteris simpler tocompute!!

Introduction … • In higher dimensions (experimentally or analytically) study the equation • and among find a which has a strict monotony relation with ; so that : • is either strictly increasing or striclty decreasing

Introduction… • (B) Gradient-based algorithms further require • gradient of the objective function and • Gradient of the constraints function • when these derivatives exist. • In both (A) and (B) values and gradient are computed through evaluation of multidimensional integrals of the form • Except in some special cases • these integrals cannot be computed analytically • integrals on higher dimensions are computationally expensive

Introduction… Numerical methods for evaluation of multidimensional integrals. • Fast and efficient evaluation of probability integrals reduces computational expenses in the overall optimization strategy for CCOPT. • Since integration is done with respect to we drop the parameterization with and consider only • for the sake of simplicity.

Introduction … Methods for Multidimensional Integrals Deterministic Methods • Quadrature rules – for 1D integrals • Full or Spars-grid cubature rules – for MD integrals Sampling-based Methods • Monte Carlo Methods • Quasi-Monte Carlo methods • Deterministic integration rules for multidimensional integrals (commonly called cubature rules) are usually constructed from 1D quadrature rules.

2. Qudrature Rules Given the 1D integral where is a finite or an infinite interval and is a non-negative weight function. • The weight function corresponds to a probability density function of a random variable on the set . • represents expected value. Weights and integration domains for some standard 1D probability integrals

Qudrature Rules … Idea: to approximate the integral by a weighted sum • so that • the approximation error • is as small as possible; • the quadrature rule be capable of approximating 1D integrals • for a large class of functions . Important issues: How to generate • quadrature nodes: • corresponding weights: and • the number of function evaluations ?

Qudrature Rules … • The number of nodes is a trade-off between accuracy and efficiency. In general, the use of several nodes may not reduce the approximation error!! Methods for 1D integrals: • Newton-Cotes Formulas • -less accurate for probability integrals • Monte-Carlo and Quasi-Monte Carlo • - mainly used for multidimensional integrals • Gauss quadrature rules and Kronord/Patterson extenstions • - highly accurate and suitable for probability integrals • Clenshaw-Curtis quadrature rule • - suitable for probability integrals • Required property: • All the quadrature weights are non-negative.

Quadrature Rules … • Non-negative quadrature weights have advantages: • the use of non-negative weights reduces the danger of numerical cancellations in ; • in stochastic optimization, if is convex w.r.t. u for each fixed , then • is convex and due to the non-negative of weights the approximation • preserves the convexity.

Orthogonal Polynomials and Gauss Quadrature Rules Ref: Walter Gatushi : Orthogonal Polynomials • For two functions and • defines a scalar product w.r.t. on the set . • A degree n polynomial and a degree m polynomial are orthogonal on w.r.t. if • Different pairs lead to different sets of orthogonal polynomials. • Example: Sets of orthogonal polynomials for standard pair: • Hermit polynomials - • Jacobi polynomials - etc. • Quadrature nodes and corresponding weights • are computed based on orthogonal polynomials with respect to a given pair .

Generation of orthogonal polynomials Ref: Walter Gatushi : Orthogonal Polynomials • Given any pair . Let and all moments • exist and finite. Then • there is a unique set of orthogonal polynomials • corresponding to satisfying • (ii) the set of polynomials are uniquely determined by the three-term recurrence relation using and • where and

Computation of Gauss quadrature nodes and weights Ref: Golub, G.H., Welsch, J.H. Calculation of Gauss quadrature rules. • the computation of the coefficients of the recurrence relation requires algorithms (commonly known as Steleje’s procedures see Gander & Karp for a stable algorithm. • For known recurrence coefficients the N nodes • and weights of a Gauss • quadrature rules are computed from the Jacobi matrix • Note that: all the quadrature weights computed above are nonnegative Gauss quadrature rules have non-negative weights. • the quadrature nodes all lie in the interior of .

Standard Gauss Quadrature Rules Ref: Abramowitz, M. and Stegun, I. A. Handbook of Mathematical Functions with Furmulas, Graphs, and Mathematical Tables • There are software and lookup tables for standard quadrature nodes and corresponding weights. Table : Standard Gauss-quadrature rules NB: All 1D probability integrals corresponding to standard Gauss-quadrature rules can be computed with non-negative quadrature weights.

Gauss Quadrature Rules – Exactness • An N-point Gauss quadrature rule computes all polynomials with degree less or equal to exactly; i.e. • the quadrature rule has a degree of (polynomial) exactness equal to 2N-1 a probability integrals with a polynomial integrand can be computed exactly. • if the integrand is a general function of the uncertain variable, the approximation error depends on the smoothness of the function and the number N of integration nodes; • integration nodes and weights are generated independent of the integrand; • quadrature rules with odd number N of nodes are usually more preferred.

Example: Gauss-Hermite quadrature Ref: Abramowitz, M. and Stegun, I. A. Handbook of Mathematical Functions with Furmulas, Graphs, and Mathematical Tables Gauss-Hermite quadrature nodes and weights 3-Point Gauss-Hermite quadrature Polynomial exactness up to degree 7-Point Gauss-Hermite quadrature Polynomial exactness up to degree

Example: Gauss-Hermite quadrature Suppose to evaluate where Variable transformation :

Example: Gauss-Hermite quadrature Using quadrature nodes based on the weight function we have For special case A 3-Point quadrature is not efficient to compute

Embedded quadrature rules Ref: Trefethen: Is Gauss quadrature better than Clenshaw-Curtis? • Let If the set of nodes for is a • subset of the set of nodes for ; i.e., • the quadature rule is embedded. Clenshaw-Curtis QuadratureRule on • Quadrature nodes • All quadrature weights in are positive; • have polynomial exactness equal to only (less than Gauss-quadrature rules); • it is an embedded quadrature rule Given , construction of requires only additional points; values of already computed for can be reused in saves time.

Advantages of Embedded quadrature rules • Unfortunately, Gaussquadratturerulesarenotembedded. Example: Gauss-Hermite quadrature Someadvantages of embeddedrules: • nodes of lowerdegree quadrature canbeusedwhenconstructinghigherdegreequadratures; • provideseasiererrorestimationfor quadrature rule; • embeddednessis a highlydesiredpropertyfortheconstruction of multidimensional cubaturetechniques, etc.

Kronord and Patterson Extensions Ref: Laurie, D.P. Calculation of Gauss-Krnord quadrature rules. • ExtendGaussquadraturerulesto makethemembedded. Kronord‘s Extension (Gauss-Krnord quadrature) • Given Gauss quadrature nodes , between every two nodes add one new node: • so that the new set of nodes • embeds the former ones. • the new quadrature weights are non-negative • Nodes of lower degree quadrature can be used in constructing higher degree quadratures. Degree of exactenss

Patterson’s Extensions Ref: Patterson, T.N. L. The optimum addition of points to quadrature formula. • To existing Gauss quadrature nodes add p new quadrature nodesso that the resulting rule has a maximum degree of accuracy • and the weights so that the new set of are non-negative. • Hence, • In general, • pre-fixing integration nodes reduces degree of exactness; • construction of embedded Gauss quadrature rules is not a trivial task. (see Laurie 1997)

Quadrature for non-standard integrals Note that: transform integrals on non-standard intervals onto the standard ones. • Examples of some possible transformations: • etc. • transformation is done in such a way the resulting integral is easier to compute; • when possible to try to match the resulting weight function and integration domain, so that available results can be easily used.

Quadrature for non-standard integrals Example This can be applied to a chance constraint as • The transformed integral can be computed using either Gauss-Legendre or Gauss-Chebychev quadrature rule.

From Chance to Expected Value Constraint Ref: Geletu et al: Monotony Analysis and Sparse-Grid Integration for nonlinear Chance Constrained Process Optimization Example: with Monotony relation

From Chance to Expected Value Constraint … where The inner integral can be transformed into using the change of variables

3. Multidimensional probability integrals Problem: Given a (continuous) function and a non-negative weight function how to compute the integral: In many practical applications the indefinite integral does not have analytic expression. The domain of integration commonly has a product form

Multidimensional probability integrals … Standard integration domains Note: Transform non-standard integrals into standard forms. Example: Let be a random variable w.r.t. the probability measure such that and Then represents the expected value of w.r.t. the probability measure .

Multidimensional probability integrals … Assumptions: (A1) The weight (probability density) function can be written as where Assumption (A1) holds true if the are independent random variables. (A2) The domain of integration Example:

Numerical method for multidimensional Integrals Two major approaches • Cubature Techniques (rules) • 3.1. Full-Grid Integration Techniques • 3.2. Sparse-Grid Integration Techniques • Sampling based Techniques will not be discusses here • (IIIA) Monte-Carlo (MC) Integration Techniques • (IIIB) Quasi-Monte-Carlo (QMC) Integration Techniques • Cubature techniques are constructed based on one dimensional quadrature rules. • One-dimensional interpolatory Gauss quadrature rules (and their extensions) are found to be efficient, due to their higher degree of accuracy. • MC methods use randomly generated samples from • QMC methods use sequence of integration nodes from with lower discrepancy.

Suppose assumptions (A1) and (A2 ). Let for 3.1. Full-Grid Cubature techniques are quadrature nodes; corresponding weights; for the one-dimensional integral on with the weight function A full-grid cubature rule to compute I[f] is called full-grid tensor-product of one dimensional quadrature-rules or product rule.

Important questions: How many grid-points are there in the full-grid cubature rule ? That is, the set Is it necessary to use all the grid points in ? Is there redundancy in the full-grid scheme? What is the polynomial (or degree) of exactness of ? How good are full-grid cubature techniques? The number of grid-points (integration nodes) in : If then the number of grid points will be exponential growth!!

Examples – full-grid Techniques Example: For a 5-dimensional integral, a full-grid quadrature rule using 11-quadrature nodes in each dimension uses cubature nodes. • requires large number of function evaluations even for moderate dimensions; • not efficient for problems of higher dimensions. In particular, computationally expensive for stochastic optimization. Example: Let 7-Point 2D Full-grid nodes

Multidimensional polynomials and exactness of cubature techniques • One measure of quality for a cubature rule is related with the largest degree polynomial that it can integrate exactly. • For the variables a monomial of degree in the variables is an expression of the form • where • Example:For two variables • the following are monomials of degree 3 • all monomials in the two • variable of degree less equal to 3 are

Multidimensional polynomials and exactness of cubature techniques … • The number of distinct monomials in variables, degree less than equal to is equal to • A multidimensional polynomialof degree in the variables is a linear combination of monomials of degree less or equal to ; i.e. Hence, Example: (a) (b)

Multidimensional polynomials and exactness of cubature techniques … Ref: Cools, R: Advances in multidimensional Integration. • A cubature rule is said to be exact for a polynomial if • That is, • A cubature rule is said to have a polynomialexactness (or degree of accuracy) if it is exact for all polynomials of degree less than equal to .

Multidimensional polynomials and exactness of cubature techniques … Ref: Cools, R: Advances in multidimensional Integration. Theorem(Cools 2002). A cubature rule constructed as a tensor-product of one-dimensional Gauss-quadrature rules: with degree of exactness of the quadrature rule equal to then the degree of exactness of is equal to In particular, if , then the degree of exactness of will be equal to Good Idea: Let be an arbitrary function. For an accurate evaluation of use a cubature rule with a higher degree of accuracy . If itself is a polynomial with degree less or equal to , then

Multidimensional polynomials and exactness of cubature techniques … Fact:Higher accuracy in computing can be achieved by using a cubature rule with higher degree of exactness. Example: Consider a full-grid 2D cubaturer for the integral Almost equal result from two full-grids

Redundancy in the full-grid integration technique Ref: Davis, P.J., Rabinowitz, P.: Methods of numerical integration. The use of Gauss-Hermite full-grid cubature nodes leads to, too many function evaluations only with a little gain in accuracy. • Redundancy in the full-grid cubature techniques. • The use of cubature nodes can lead the curse of dimensions. Question: How many integration nodes are sufficient to obtain a polynomial exactness ? Answer: Theorem (Möller 1976, Mysovskikh 1968, Tchakaloff 1957) To attain a polynomial exactness equal to , the (optimal) required number of grid points in has lower and upper bounds given by • Known as Möller’s lower bound, while is Mysovskikh’s upper bound • (for unbounded ) or Tchakaloff upper bound (for bounded ).

Efficient cubature rules - represents the largest integer less than or equal to a; Definition(Davis & Rabinowitz) A cubature rule is said to be efficient (optimal) if it uses integration nodes. • represents the smallest integer greater or equal to a; • eg. • Now it is obvious that , for large n.

How many of cubature nodes? • For a 10-dimensiona integral we find the following values • For the computation of a 10–dimensional probability integral, using the full-grid technique with a degree of precision d=7, requires 1,048,576 function evaluations. Unafforable in the conext of stochastic optimization.

Construction of Efficient Cubature Rules Ref: Smolyak, S.:Quadrature and interpolation formulas for tensor products of certain classes of functions. Remark: The Theorems of Möller, Mysovskikh or Tchakaloff are non-constructive. Question: How to construct cubature rules with minimal number of nodes; i.e. number of nodes near or equal to ? If not, rules with number of nodes lying between the bounds and ? • The construction of cubature techniques with minimum number of integration nodes and higher polynomial exactness is still a hot research topic!! In fact, construction of cubature rules is partly an art as well as a science (Cools 1997). In 1963 Smolyak gave a scheme for construction of cubature techniques with number of nodes between and . Leading to a class of cubature rules known as Smolyak’s tensor-product integration rules or sparse-grid integration techniques.

3.2. Sparse-grid integration techniques Recall the integral with assumptions (A1) and (A2). Assumption(A3): For the sake of simplicity, we assume The random variables are independent and identically distributed. • In general, assumption (A3) is not required. Sparse-grid cubature rules can be constructed for independent but non-identically distributed random variables • However, correlated variables need to be de-correlated (or transformed ) for construction a sparse-grid integration rule. Now, according assumption (A3), consider the same quadrature rules on each using , so drop the index .

Sparse-grid integration techniques… Assumption(A4): For each one dimensional cubature rule on , there is a sequence of sets of quadrature nodes with The 1D quadrature rule with nodes is for • If the quadrature nodes in assumption (A4) satisfy the property that, • Then corresponding sequence of quadrature rules is called a nested or embedded quadrature rules. • Such sequence of quadrature rules can be constructed based on • Curtis-Clenshaw, Krnord/Patterson extension rules, etc .

Construction of sparse-grid integration rules Let be a multi-index such that • Smolyak 1963 (also Wasilkowski & Woznikowski 1995 ): • A sparse-grid rule based on the sequence of quadrature rules • for the approximation of the n-dimensional integral with a degree of accuracy is • where • for

Sparse-grid integration rules - Examples • For n=2 and d=7, the sparse grid technique takes the form • Observe that

Sparse-grid integration rules - Examples For n=2 and d=7, the sparse-grids are spread as follows: Fig: with 29 grid-points.

Integration Methods for Multidimensional Probability Integrals

Integration Methods for Multidimensional Probability Integrals

Presentation Transcript

Non-Probability sampling methods

Probability in Methods

Methods for Data Integration

2. Definite Integrals and Numeric Integration

Vector integrals Line integrals Surface integrals Volume integrals Integral theorems

Approximate methods for calculating probability of failure

Multidimensional Gradient Methods in Optimization

Multidimensional Access Methods

Integration by Parts Integration Using Tables of Integrals Numerical Integration

Probability Sampling Methods

Integration Methods

Counting Methods and Probability

INTEGRALS

Integration Methods for Multidimensional Probability Integrals

Numeric Integration Methods

Numeric Integration Methods

Discrete variational derivative methods: Geometric Integration methods for PDEs

Multidimensional Integration Part I

Integration Methods

Numeric Integration Methods

Formal Methods for System Integration