
High-dimensional model representation technique for the solution of stochastic PDEs



  1. High-dimensional model representation technique for the solution of stochastic PDEs Nicholas Zabaras and Xiang Ma Materials Process Design and Control Laboratory Sibley School of Mechanical and Aerospace Engineering, 101 Frank H. T. Rhodes Hall, Cornell University, Ithaca, NY 14853-3801 Email: zabaras@cornell.edu URL: http://mpdc.mae.cornell.edu/ CSE09, SIAM Conference on Computational Science and Engineering, Miami, FL, March 2-6, 2009

  2. Outline of the presentation • Problem definition • Basic concepts of high-dimensional model representation (HDMR) • Adaptive sparse grid collocation (ASGC) method for interpolating component functions • Examples • Conclusions

  3. Motivation All physical systems have inherent associated randomness. SOURCES OF UNCERTAINTIES • Multiscale nature, inherently statistical • Uncertainties in process conditions • Material heterogeneity • Model formulation – approximations, assumptions. Why uncertainty modeling? • Assess product and process reliability • Estimate confidence level in model predictions • Identify relative sources of randomness • Provide robust design solutions

  4. Motivation • Previously developed conventional and adaptive collocation methods are not well suited to high-dimensional problems because their error estimates still depend (logarithmically) on the dimensionality. Although ASGC can alleviate the problem to some extent, it depends on the regularity of the problem and is only effective when some random dimensions are more important than others. • High-dimensional model representation (HDMR) is an efficient tool to capture the model output as a finite number of hierarchical correlated function expansions in terms of the inputs, starting from lower order and proceeding to higher order. However, no closed form exists for implementing the higher-order expansions. We try to develop a general formulation of HDMR for efficient computer implementation and higher accuracy. To represent the component functions of HDMR, either low-dimensional data tables or tensor-product Lagrange polynomial interpolation are currently used, which significantly limits the interpolation accuracy. • In the current work, we combine the strengths of the HDMR and ASGC methods to develop a stochastic dimension reduction method.

  5. Problem definition • Define a complete probability space $(\Omega, \mathcal{F}, P)$. We are interested in finding a stochastic function $u$ such that, $P$-almost everywhere (a.e.) in $\Omega$, the following equations hold: $\mathcal{L}(x, \omega; u) = f(x, \omega)$ in $D$ and $\mathcal{B}(x, \omega; u) = g(x, \omega)$ on $\partial D$, where $x$ are the coordinates in $D$, $\mathcal{L}$ is a (linear/nonlinear) differential operator, and $\mathcal{B}$ is a boundary operator. • In the most general case, the operators $\mathcal{L}$ and $\mathcal{B}$, as well as the driving terms $f$ and $g$, can be assumed random. • In general, we require an infinite number of random variables to completely characterize a stochastic process. This poses a numerical challenge in modeling uncertainty in physical quantities that have spatio-temporal variations, necessitating a reduced-order representation.

  6. Representing randomness • Interpreting random variables: distribution of the random variable, e.g. inlet velocity, inlet temperature. • Interpreting random variables as functions: a random variable maps the sample space of elementary events (the collection of all possible outcomes) to the real line; each outcome is mapped to a corresponding real value. • Correlated data, e.g. presence of impurities, porosity: usually represented with a correlation function – we specifically concentrate on this case. • A general stochastic process is a random field with variations along space and time – a function with domain (Ω, Τ, S).

  7. Representing randomness • Representation of the random process: Karhunen-Loève and Polynomial Chaos expansions. • Reduction from infinite to finite dimensions: depends on the covariance. • The Karhunen-Loève expansion is based on the spectral decomposition of the covariance kernel of the stochastic process: the random process is expanded about its mean using the eigenpairs of the covariance kernel and a set of random variables to be found. • It requires knowledge of the covariance and converges uniformly for any second-order process. • Setting the number of stochastic dimensions N (and accounting for the dependence of the variables) poses an (N+d)-dimensional problem.

  8. Karhunen-Loeve expansion In the K-L expansion (written below), the stochastic process is expanded as a mean function plus a series of deterministic functions multiplied by orthonormal (O.N.) random variables. • Deterministic functions ~ eigenvalues and eigenvectors of the covariance function • Orthonormal random variables ~ type of stochastic process • In practice, we truncate the K-L expansion to the first N terms
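For reference, the truncated K-L expansion takes the standard form (the notation here is generic, not copied from the slide):

```latex
a(x,\omega) \approx \bar{a}(x) + \sum_{i=1}^{N} \sqrt{\lambda_i}\, f_i(x)\, \xi_i(\omega),
```

where $(\lambda_i, f_i)$ are the eigenpairs of the covariance kernel and the $\xi_i$ are uncorrelated zero-mean random variables whose type depends on the underlying stochastic process.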

  9. The finite-dimensional noise assumption • By using the Doob-Dynkin lemma, the solution of the problem can be described by the same set of random variables, i.e. $u(x, \omega) = u(x, Y_1(\omega), \ldots, Y_N(\omega))$. • So the original problem can be restated as: find the stochastic function $u(x, Y)$ such that $\mathcal{L}(x, Y; u) = f(x, Y)$ in $D$ and $\mathcal{B}(x, Y; u) = g(x, Y)$ on $\partial D$. • In this work, we assume that the $Y_i$ are independent random variables with probability density functions $\rho_i$. Let $\Gamma_i \equiv Y_i(\Omega)$ be the image of $Y_i$. Then $\rho(Y) = \prod_{i=1}^{N} \rho_i(Y_i)$ is the joint probability density of $Y = (Y_1, \ldots, Y_N)$ with support $\Gamma \equiv \prod_{i=1}^{N} \Gamma_i \subset \mathbb{R}^N$.

  10. High dimensional model representation (HDMR)1 • Let $f(Y)$ be a real-valued multivariate stochastic function $f: \Gamma \to \mathbb{R}$, which depends on an N-dimensional random vector $Y = (Y_1, \ldots, Y_N)$. An HDMR of $f$ expands it into component functions $f_{i_1 \cdots i_s}(Y_{i_1}, \ldots, Y_{i_s})$ of increasing order (see the expansion below), where the interior sum is over all sets of $s$ integers $i_1, \ldots, i_s$ that satisfy $1 \le i_1 < \cdots < i_s \le N$. This relation means that $f$ can be viewed as a finite hierarchical correlated function expansion in terms of the input random variables with increasing dimensions. • The basic conjecture underlying HDMR is that the component functions arising in typical “real problems” will not likely exhibit high-order cooperativity among the input variables, so the significant terms in the HDMR expansion are expected to be those of low order, $s \ll N$. 1. O. F. Alis, H. Rabitz, General foundations of high dimensional representations, Journal of Mathematical Chemistry, 25 (1999) 127-142.
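The expansion referred to above is the standard HDMR form of Alis and Rabitz (written here in generic notation consistent with the rest of this transcript):

```latex
f(Y) = f_0 + \sum_{i=1}^{N} f_i(Y_i)
     + \sum_{1 \le i_1 < i_2 \le N} f_{i_1 i_2}(Y_{i_1}, Y_{i_2})
     + \cdots + f_{12\cdots N}(Y_1, Y_2, \ldots, Y_N).
```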

  11. High dimensional model representation (HDMR) In this expansion: • $f_0$ denotes the zeroth-order effect, which is a constant. • The component function $f_i(Y_i)$ gives the effect of the variable $Y_i$ acting independently of the other input variables. • The component function $f_{ij}(Y_i, Y_j)$ describes the interactive effects of the variables $Y_i$ and $Y_j$. Higher-order terms reflect the cooperative effects of increasing numbers of variables acting together to impact upon $f$. • The last term, $f_{12\cdots N}(Y_1, \ldots, Y_N)$, gives any residual dependence of all the variables locked together in a cooperative way to influence the output $f$. • Depending on the method used to represent the component functions, there are two types of HDMR: ANOVA-HDMR and Cut-HDMR.

  12. ANOVA - HDMR • ANOVA-HDMR is the same as the analysis of variance (ANOVA) decomposition used in statistics. Its component functions are defined through integrals of $f$ over the remaining variables (see the expressions below). • This expansion is useful for measuring the contributions of the variance of individual component functions to the overall variance of the output. • A significant drawback of this method is the need to compute the above integrals to extract the component functions, which in general are high-dimensional integrals and thus difficult to carry out. • To circumvent this difficulty, a computationally more efficient cut-HDMR expansion, which can be combined with ASGC, is introduced in the next few slides.
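The ANOVA-HDMR component functions mentioned above take the standard form (generic notation: $d\mu$ is the product probability measure and $d\mu^{\sim i}$ denotes integration over all variables except $Y_i$):

```latex
f_0 = \int_{\Gamma} f(Y)\, d\mu(Y), \qquad
f_i(Y_i) = \int f(Y)\, d\mu^{\sim i} - f_0, \qquad
f_{ij}(Y_i, Y_j) = \int f(Y)\, d\mu^{\sim ij} - f_i(Y_i) - f_j(Y_j) - f_0,
```

and so on for the higher-order terms.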

  13. Cut - HDMR • In this work, Cut-HDMR is used for the multivariate interpolation problem. For Cut-HDMR, we fix a reference point $\bar{Y} = (\bar{Y}_1, \ldots, \bar{Y}_N)$; in this work, we fix it at the mean of the random input. The component functions of Cut-HDMR are given below, where the notation $f(Y_i, \bar{Y}^{\sim i})$ stands for the function with all the remaining variables of the input random vector set to their reference values, e.g. $f(Y_i, \bar{Y}^{\sim i})$ stands for $f(\bar{Y}_1, \ldots, \bar{Y}_{i-1}, Y_i, \bar{Y}_{i+1}, \ldots, \bar{Y}_N)$, which is a univariate function of $Y_i$.
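With the reference point $\bar{Y}$, the first few cut-HDMR component functions take the standard form (generic notation, consistent with the expansion given earlier):

```latex
f_0 = f(\bar{Y}), \qquad
f_i(Y_i) = f(Y_i, \bar{Y}^{\sim i}) - f_0, \qquad
f_{ij}(Y_i, Y_j) = f(Y_i, Y_j, \bar{Y}^{\sim ij}) - f_i(Y_i) - f_j(Y_j) - f_0.
```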

  14. Cut - HDMR • The cut-HDMR expansion up to the pth order minimizes an error functional subject to vanishing (“null”) constraints on the component functions. • The null property introduced above serves to assure that two component functions are orthogonal with respect to the inner product induced by the cut measure whenever their index sets differ in at least one index (the remaining indices may be the same). This orthogonality condition implies that the cut-HDMR expansion to pth order is exact along lines, planes and hyperplanes of dimension up to p which pass through the reference point $\bar{Y}$.

  15. Cut – HDMR VS Taylor expansion • Suppose $f(Y)$, defined in a unit hypercube, can be expanded as a convergent Taylor series at the reference point $\bar{Y}$. • It can be shown that the first-order component function $f_i(Y_i)$ is the sum of all the Taylor series terms which contain, and only contain, the variable $Y_i$. Similarly, the second-order component function $f_{ij}(Y_i, Y_j)$ is the sum of all the Taylor series terms which contain, and only contain, the variables $Y_i$ and $Y_j$. • Thus, the infinite number of terms in the Taylor series is partitioned into a finite number of groups, and each group corresponds to one Cut-HDMR component function. • Therefore, any truncated Cut-HDMR expansion gives a better approximation of $f$ than any truncated Taylor series, because the latter contains only a finite number of the Taylor series terms.

  16. Truncation error of Cut - HDMR • Assume that all mixed derivatives that include no more than one differentiation with respect to each variable are piecewise continuous, i.e. the derivatives $\partial^s f / \partial Y_{i_1} \cdots \partial Y_{i_s}$ with $i_1 < \cdots < i_s$. Denoting a uniform bound on these derivatives, the truncation error1 of the pth-order expansion can then be bounded accordingly. 1. I. M. Sobol, Theorems and examples on high dimensional model representation, Reliability Engineering and System Safety 79 (2003) 187-193.

  17. A general formulation of Cut-HDMR • In its current form, each component function is defined through lower-order component functions which are unknown a priori. Therefore, it is not convenient for computer implementation. We develop here a general form of Cut-HDMR. • The lower-order component functions can easily be rewritten so that each component function is expressed directly in terms of the function output at specific inputs. However, the procedure to derive these formulas becomes tedious to carry out along these lines for higher orders.

  18. A general formulation of Cut-HDMR • Instead, since HDMR has an analogous structure to the many-body expansion used in molecular physics to represent a potential energy surface, we generalize its idea to the current work and obtain a general formulation of the pth-order expansion of Cut-HDMR, in which each component function can be obtained via the Möbius inversion approach from number theory (see below), where $f(Y_v, \bar{Y}^{\sim v})$ follows the notation defined before, with all the remaining variables of the input random vector set to their reference values.
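Written in set notation (a paraphrase of the formulation described above, with $u \subseteq \{1, \ldots, N\}$ an index set and $f(Y_v, \bar{Y}^{\sim v})$ the cut of $f$ through the reference point along the dimensions in $v$), the truncated expansion and its component functions read:

```latex
f(Y) \approx \sum_{\substack{u \subseteq \{1,\ldots,N\} \\ |u| \le p}} f_u(Y_u),
\qquad
f_u(Y_u) = \sum_{v \subseteq u} (-1)^{\,|u|-|v|}\, f\bigl(Y_v, \bar{Y}^{\sim v}\bigr),
```

so that each component function is a signed sum of cuts of $f$, evaluable directly from function (i.e. deterministic solver) outputs.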

  19. A general formulation of Cut-HDMR • Now we have a general formulation of the truncated Cut-HDMR to pth order. • Previously, the component functions (i.e. $f_i$, $f_{ij}$, …) of Cut-HDMR were typically provided numerically, at discrete values (often a tensor product) of the input variables, to construct numerical data tables. The value of the function for an arbitrary input was then determined by performing only low-dimensional interpolation over $Y_i$, $(Y_i, Y_j)$, … • This method is computationally expensive and not accurate, since the construction of each component function involves lower-order, inaccurate component functions. • Now, through the general formulation, the problem is transformed to several small sub-problems of interpolating the low-dimensional functions $f(Y_v, \bar{Y}^{\sim v})$.

  20. Cut – HDMR and ASGC • Each low-dimensional function $f(Y_v, \bar{Y}^{\sim v})$ is interpolated with ASGC. • Now, instead of interpolating one N-dimensional function using ASGC, the problem is transformed to several at most p-dimensional interpolation problems, where, in general, $p \ll N$ for real problems. • By using ASGC for the approximation, there is no need to search numerical data tables. Interpolation is done quickly through a simple weighted sum of the interpolation functions and the corresponding hierarchical surpluses. In addition, the mean and variance can be deduced easily.

  21. Stochastic Collocation based framework • We need to represent a stochastic function, e.g. a stochastic function in 2 dimensions. • Sample the function at a finite set of points and use polynomials (Lagrange polynomials) to get an approximate representation; the function value at any point is then a simple weighted sum of the sampled values (see below). • The spatial domain is approximated using a FE, FD, or FV discretization. The stochastic domain is approximated using multidimensional interpolating functions.
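In generic notation, with collocation points $\{Y_k\}_{k=1}^{M}$ and multidimensional Lagrange polynomials $L_k$ satisfying $L_k(Y_j) = \delta_{kj}$, the interpolated representation is:

```latex
\hat{f}(Y) = \sum_{k=1}^{M} f(Y_k)\, L_k(Y).
```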

  22. Conventional sparse grid collocation (CSGC) • Denote the one-dimensional interpolation formula (our basic 1D interpolation scheme) as $\mathcal{U}^i$. • In higher dimensions, the simplest extension is the tensor product formula. • Using the 1D formula, the sparse interpolant $A_{q,N}$, where $q$ is the depth of the sparse grid interpolation and $N$ is the number of stochastic dimensions, is given by the Smolyak algorithm (see below). • The hierarchical surplus is defined as the difference between the function value at a newly added collocation point and the value of the interpolant from the previous level at that point.
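The Smolyak construction referred to above has the standard form (with $\mathcal{U}^{i}$ the 1D interpolation formula at level $i$ and $|\mathbf{i}| = i_1 + \cdots + i_N$):

```latex
A_{q,N}(f) = \sum_{q-N+1 \le |\mathbf{i}| \le q} (-1)^{q-|\mathbf{i}|} \binom{N-1}{q-|\mathbf{i}|}
             \bigl( \mathcal{U}^{i_1} \otimes \cdots \otimes \mathcal{U}^{i_N} \bigr)(f),
```

and the hierarchical surplus at a newly added point $Y_{\mathbf{j}}^{\mathbf{i}}$ is $w_{\mathbf{j}}^{\mathbf{i}} = f(Y_{\mathbf{j}}^{\mathbf{i}}) - A_{|\mathbf{i}|-1,\,N}(f)(Y_{\mathbf{j}}^{\mathbf{i}})$.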

  23. Choice of collocation points and nodal basis functions • In the context of incorporating adaptivity, we use the Newton-Cotes grid with equidistant support nodes and the linear hat function as the univariate nodal basis. • By using the linear hat function as the univariate nodal basis function, one ensures local support, in contrast to the global support of Lagrange polynomials. This ensures that discontinuities in the stochastic space can be resolved. The piecewise linear basis functions are given below.
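On the equidistant Newton-Cotes grid with $m_i$ nodes at level $i$ ($m_1 = 1$, $m_i = 2^{i-1}+1$ for $i > 1$), the univariate hat functions take the standard form:

```latex
a_1^1(Y) \equiv 1, \qquad
a_j^i(Y) =
\begin{cases}
1 - (m_i - 1)\,\lvert Y - Y_j^i \rvert, & \lvert Y - Y_j^i \rvert < \dfrac{1}{m_i - 1},\\[6pt]
0, & \text{otherwise},
\end{cases}
\quad i > 1,
```

with the multidimensional basis functions obtained as tensor products of these univariate hats.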

  24. Adaptive sparse grid collocation (ASGC)1 Let us first revisit the 1D hierarchical interpolation. • For smooth functions, the hierarchical surpluses tend to zero as the interpolation level increases. • On the other hand, for non-smooth functions, steep gradients/finite discontinuities are indicated by the magnitude of the hierarchical surplus. • The bigger the magnitude, the stronger the underlying discontinuity. • Therefore, the hierarchical surplus is a natural candidate for error control and the implementation of adaptivity. If the hierarchical surplus is larger than a pre-defined value (threshold), we simply add the 2N neighbor points of the current point (see the sketch below). 1. X. Ma, N. Zabaras, An hierarchical adaptive sparse grid collocation method for the solution of stochastic differential equations, JCP, in press, 2009.
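A minimal C++ sketch of the refinement rule just described (the `Node` structure and `refine` function are illustrative names, not the authors' code): a point whose hierarchical surplus exceeds the threshold spawns its 2N neighbors, two per dimension, on the equidistant Newton-Cotes grid.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

struct Node {
    std::vector<double> y;        // collocation point in [0,1]^N
    std::vector<int>    level;    // per-dimension interpolation level (>= 1)
    double              surplus;  // hierarchical surplus w = f(y) - I_prev(y)
};

std::vector<Node> refine(const Node& node, double eps) {
    std::vector<Node> children;
    if (std::fabs(node.surplus) < eps) return children;    // below threshold: no refinement
    const std::size_t N = node.y.size();
    for (std::size_t d = 0; d < N; ++d) {
        // distance to the two children on the next (finer) equidistant level
        const double h = 1.0 / std::pow(2.0, node.level[d]);
        for (int s = -1; s <= 1; s += 2) {                  // left (-1) and right (+1) child
            Node child = node;
            child.level[d] += 1;
            child.y[d] = node.y[d] + s * h;
            // child.surplus is computed later, once f(child.y) is available
            if (child.y[d] >= 0.0 && child.y[d] <= 1.0)     // keep points inside the domain
                children.push_back(child);
        }
    }
    return children;                                        // at most 2N new points
}
```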

  25. Integrating HDMR and ASGC • Unlike when using numerical data tables, there is no need to search a table. Interpolation is done quickly through a simple weighted sum of the basis functions and the corresponding hierarchical surpluses.

  26. Integrating HDMR and ASGC • The mean of the random solution can be evaluated by integrating each component function against its joint density (see below). • Denoting these low-dimensional integrals as precomputed weights, we rewrite the mean as a sum over the component functions. • To obtain the variance of the solution, we first need an approximate expression for the square of the solution, from which the second moment follows.
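In generic notation (a sketch, not the slide's exact expressions), the mean of the HDMR surrogate is the sum of the means of its component functions; since each component is an ASGC interpolant, each low-dimensional integral reduces to a weighted sum of hierarchical surpluses multiplied by the analytically known integrals of the linear hat basis functions:

```latex
\mathbb{E}[\hat{f}] = \sum_{u} \int_{\Gamma_u} f_u(Y_u)\, \rho_u(Y_u)\, dY_u,
\qquad
\int_{\Gamma_u} f_u(Y_u)\, \rho_u(Y_u)\, dY_u
  = \sum_{\mathbf{i}, \mathbf{j}} w_{\mathbf{j}}^{\mathbf{i}}
    \int_{\Gamma_u} a_{\mathbf{j}}^{\mathbf{i}}(Y_u)\, \rho_u(Y_u)\, dY_u.
```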

  27. Implementation • The first step is to find the different ordered index sets from the set $\{1, \ldots, N\}$. • The number of component functions in a pth-order expansion is given by the sum of binomial coefficients shown below. Although this number still grows quickly with an increasing number of dimensions, we can solve each of the sub-problems more efficiently than solving the N-dimensional problem directly. • We use a recursive loop to generate all the index sets. After solving each sub-problem, the surpluses for each interpolant are stored. • Each sub-problem is uniquely determined by its corresponding index set. Therefore, it is important to transform the index set into a unique key in order to store the surpluses and perform interpolation.
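For concreteness, the number of sub-problems in a pth-order expansion of an N-dimensional function is $\sum_{s=0}^{p} \binom{N}{s}$. A small helper (a hypothetical utility, not the authors' code) that evaluates this count:

```cpp
#include <cstdint>
#include <iostream>

// Exact binomial coefficient C(n, k) for the modest values used here.
std::uint64_t binomial(std::uint64_t n, std::uint64_t k) {
    if (k > n) return 0;
    std::uint64_t c = 1;
    for (std::uint64_t i = 1; i <= k; ++i) c = c * (n - k + i) / i;
    return c;
}

// Number of cut-HDMR component functions retained up to order p in N dimensions.
std::uint64_t num_components(unsigned N, unsigned p) {
    std::uint64_t total = 0;
    for (unsigned s = 0; s <= p; ++s) total += binomial(N, s);
    return total;
}

int main() {
    // e.g. N = 21 random dimensions, 2nd-order expansion: 1 + 21 + 210 = 232 sub-problems
    std::cout << num_components(21, 2) << "\n";
    return 0;
}
```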

  28. Implementation • We use the following convention to generate the unique key as a string: “d(dimension of sub-problem)_(index set)”. • Using the index set alone is not enough. For example, for a 25-dimensional problem, the index string “12” could be read either as a one-dimensional problem in the 12th dimension or as a two-dimensional problem in the 1st and 2nd dimensions. Therefore, we prepend the dimension of the sub-problem to the index set, e.g. “1_12” and “2_12”. • To compute a value using the HDMR, it is simple to perform interpolation for the different sub-problems according to the index set. Therefore, we use <map> from the C++ standard template library to store the different interpolants, each stored according to its unique key value defined above. We can then perform the interpolation by calling map[key]->interpolate() (see the sketch below).
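A hedged C++ sketch of this storage scheme (the `SparseGridInterpolant` class and `interpolate()` signature are placeholders, not the authors' actual implementation); the key follows the slide's convention of prepending the sub-problem dimension to the index set.

```cpp
#include <map>
#include <memory>
#include <sstream>
#include <string>
#include <vector>

struct SparseGridInterpolant {                 // stands in for the ASGC interpolant of one sub-problem
    double interpolate(const std::vector<double>& y) const { return 0.0; /* placeholder */ }
};

std::string make_key(const std::vector<int>& index_set) {
    std::ostringstream key;
    key << index_set.size() << "_";            // prepend the sub-problem dimension
    for (int d : index_set) key << d;          // e.g. {12} -> "1_12", {1,2} -> "2_12"
    // Note: for problems with 10 or more dimensions a separator between indices
    // would also be needed to keep the keys unambiguous.
    return key.str();
}

int main() {
    std::map<std::string, std::unique_ptr<SparseGridInterpolant>> interpolants;
    interpolants[make_key({12})]   = std::make_unique<SparseGridInterpolant>();   // key "1_12"
    interpolants[make_key({1, 2})] = std::make_unique<SparseGridInterpolant>();   // key "2_12"
    // Evaluate one component function at a point of its low-dimensional sub-space.
    double value = interpolants.at(make_key({1, 2}))->interpolate({0.3, 0.7});
    (void)value;
    return 0;
}
```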

  29. Numerical example: Mathematical functions • It is noted that, according to the right-hand side of the HDMR, if the function has an additive structure, the HDMR is exact: when $f$ can be additively decomposed into functions of single variables, the first-order expansion is enough to exactly represent the original function. • Let us consider a simple additive test function. It is expected that, no matter what the dimensionality is, it is enough to use just the 1st-order expansion of HDMR. • A 3rd-order expansion is used although only 1st order is needed. After constructing the HDMR, we generate 100 random points in the domain and compute the normalized L2 interpolation error with respect to the exact values. The result is shown in the following table.

  30. Numerical example: Mathematical functions As expected, the 1st-order HDMR is enough to capture the function. • Next, we compare the 1st-order HDMR against ASGC (with the same threshold ε) and CSGC. • When the target function has an additive structure, the converged HDMR is therefore better than both ASGC and CSGC. • The number of points for CSGC at q = 6 is 171,425; to go to the next level, the number of points is 652,065.

  31. Numerical example: Mathematical functions • Next, we investigate the convergence with respect to the error threshold ε while fixing the expansion order to 1. • As expected, the error threshold ε also controls the error of a converged HDMR. This can be partially explained by noting that each component function has an error of order ε; since the HDMR is a linear combination of components, the total error is of order Mε, where M is a constant. • If, however, the multiplicative nature of the function is dominant, then all the components on the right-hand side of the HDMR are needed to obtain the best result. However, if the HDMR requires all 2^N components to obtain a desired accuracy, the method becomes very expensive. This is demonstrated through the following example.

  32. Numerical example: Mathematical functions • Next we consider the “product peak” function from the Genz test package (see below), with N = 10. • The normalized interpolation error is defined in the same way as before. The integration error is defined as the relative error with respect to the exact value. • Due to the structure of this function, we need at least a 5th-order HDMR to obtain a converged expansion. • It is also interesting to note that for integration, the HDMR converges faster than for interpolation.
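For reference, the Genz “product peak” family has the standard form (with $c_i$ and $w_i$ the usual difficulty and shift parameters; the specific values used on the slide are not reproduced here):

```latex
f(Y) = \prod_{i=1}^{N} \Bigl( c_i^{-2} + (Y_i - w_i)^2 \Bigr)^{-1}.
```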

  33. Numerical example: Mathematical functions • Thus the HDMR method is indeed not optimal in this case, which requires about two billion points. Therefore, HDMR is not equally useful for the interpolation of all types of mathematical functions. • We will investigate these aspects of HDMR further for the solution of stochastic PDEs.

  34. Numerical example: Mathematical functions • We consider three different reference points in the above plots. The results are shown to be more accurate when the reference point is near the center of the domain. Therefore, we always take our reference point to be the mean vector.

  35. Numerical example: Stochastic elliptic problem • Here, we adopt the model problem below on the physical domain D. To avoid introducing errors of any significance from the domain discretization, we take a deterministic smooth load with homogeneous boundary conditions. • The deterministic problem is solved using the finite element method with 900 bilinear quadrilateral elements. Furthermore, in order to eliminate the errors associated with a numerical K-L expansion solver and to keep the random diffusivity strictly positive, we construct the random diffusion coefficient with 1D spatial dependence, where the $Y_n$ are independent uniformly distributed random variables.
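The model problem referred to above is assumed here to be the standard stochastic elliptic equation (a generic statement of the setup described in the text, since the slide's exact expressions are not reproduced):

```latex
-\nabla \cdot \bigl( a_N(x, \omega)\, \nabla u(x, \omega) \bigr) = f(x) \quad \text{in } D,
\qquad
u(x, \omega) = 0 \quad \text{on } \partial D.
```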

  36. Numerical example: Stochastic elliptic problem • In this expansion, the coefficients and the parameter L are determined by the correlation length Lc of the underlying random field.

  37. Numerical example: Stochastic elliptic problem • This expansion is similar to a K-L expansion of a 1D random field with a stationary covariance kernel. • Small values of the correlation length correspond to a slow decay of the eigenvalues, i.e. each stochastic dimension weighs almost equally. On the other hand, large values of the correlation length result in fast decay rates, i.e. the first several stochastic dimensions, corresponding to large eigenvalues, weigh relatively more. • By using this expansion, it is assumed that we are given an analytic stochastic input; thus, there is no truncation error. This is different from the discretization of a random field using the K-L expansion, where for different correlation lengths we keep a different number of terms accordingly. • In this example, we fix N and change the correlation length Lc to adjust the importance of each stochastic dimension. In this way, we investigate the effects of Lc. • The normalized error is the same as before, where the exact solution is taken as the statistics from 10^6 Monte Carlo samples.

  38. Numerical example: Stochastic elliptic problem N=7 Convergence with orders of expansion

  39. Numerical example: Stochastic elliptic problem N=7 PDF at (0.5,0.5)

  40. Numerical example: Stochastic elliptic problem N=7

  41. Numerical example: Stochastic elliptic problem N=7 • The convergence of the mean is faster than that of the std, since the mean is a first-order statistic and a first-order expansion is enough to give good results. • For a large correlation length, in order to achieve the same accuracy as the other methods, a higher-order expansion is needed, since the problem is highly anisotropic and higher-order effects on the std cannot be neglected. However, as seen from the plots, as the correlation length decreases, the problem becomes smoother in the random space, and even a first-order expansion gives sufficient accuracy for the std, e.g. when Lc = 1/64, 1st order gives an error of O(10^-3). • For a large correlation length, ASGC is still the most effective method. When the correlation length is small, the effect of ASGC becomes the same as CSGC. In this case, a lower-order expansion of HDMR is enough and thus becomes more favorable than the other methods. • From the PDF figures, we can draw the same conclusions. For a small correlation length, the PDF of the first-order expansion is nearly the same as that of the MC result.

  42. Numerical example: Stochastic elliptic problem N=21 Std along y = 0.5

  43. Numerical example: Stochastic elliptic problem N=21 • For such a small correlation length, the effect of ASGC is the same as CSGC, so we only use HDMR + CSGC. • In such a high-dimensional problem, CSGC or ASGC is not optimal, in the sense that a large number of collocation points is needed to get good accuracy. The level-4, dimension-21 CSGC has 145,377 points, and its convergence rate for the std is even worse than MC. • On the other hand, the 2nd-order HDMR + level-4 CSGC gives much higher accuracy for both mean and std, and its convergence rate is much better than the MC method. • To improve the accuracy of the CSGC method, we would need to go to the next interpolation level, where the number of points is 1,285,761, which is clearly not competitive with the MC method. Therefore, it is clear that the sparse grid collocation method still depends, albeit weakly, on the dimensionality of the problem.

  44. Numerical example: Flow through random media • Basic equations for the pressure and velocity in a domain containing an injection well and a production well, where q denotes the source/sink term (see below). • To impose the non-negativity of the permeability, we treat the permeability as a log-normal random field obtained from the K-L expansion of a zero-mean Gaussian random field with a prescribed covariance function.
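The pressure/velocity equations referred to above are assumed here to be the standard Darcy form (generic notation; $q$ denotes the source/sink term and $K = \exp(G)$ the log-normal permeability):

```latex
\mathbf{u}(x, \omega) = -K(x, \omega)\, \nabla p(x, \omega), \qquad
\nabla \cdot \mathbf{u}(x, \omega) = q(x) \quad \text{in } D,
```

with $G(x, \omega)$ a zero-mean Gaussian random field represented by its truncated K-L expansion.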

  45. Numerical example: Flow through random media Std along y = 0.5 • We first fix the variance of the random field and vary the correlation length, where the KLE is truncated such that the retained eigenvalues correspond to 99.7% of the energy. • It is noted that in all three cases, even the first-order expansion gives accuracy as good as O(10^-3). Higher-order terms do not provide significant improvements. • The reason is that when the truncated KLE can accurately represent the random field, the random variables are uncorrelated. Therefore, their cooperative effects on the output are very weak, while the individual effects have a strong impact on the output.

  46. Numerical example: Flow through random media • Since only the first-order expansion is used, the convergence of HDMR + ASGC is very clear. The number of points is nearly two orders of magnitude lower compared with the MC method. • The employed covariance kernel has a fast decay of eigenvalues. Therefore, even a correlation length L = 0.1 results in a different importance for each dimension, which can be effectively detected by the ASGC method. • The MC results are obtained with 50, 100, and 100 terms in the KLE, respectively, which also shows that the truncation error of the KLE is negligible.

  47. Numerical example: Flow through random media • Next we fix the correlation length and vary the variance of the covariance to account for large variability in the input uncertainty. • The coefficient of variation is 53%, 253%, and 732%, respectively. • Although 10 KLE terms are used for the calculation, 100 terms are kept for the MC simulation. • Again, the first-order expansion successfully captures the input uncertainty.

  48. Numerical example: Flow through random media Std along y = 0.5. Error convergence of the relative error with respect to the reference solution.

  49. Conclusions • A general framework for the stochastic high-dimensional model representation technique is developed. It applies the deterministic HDMR in the stochastic space together with ASGC to represent the stochastic solution. • Various statistics can be easily extracted from the final reduced representation. Numerical examples show the efficiency and accuracy of the method for both stochastic interpolation and integration. • Numerical examples show that for problems with unequally weighted random dimensions, a higher-order expansion is needed to get accurate results, so that ASGC remains favorable in this case. On the other hand, for small correlation lengths, when all dimensions weigh equally, the lower-order HDMR combined with ASGC or CSGC has a better convergence rate than both MC and the direct CSGC method. • It is interesting that when the random input was obtained from a truncated KLE, only the 1st-order expansion was needed to capture the input uncertainty. • For high-dimensional problems with equally weighted random dimensions, the cost of the sparse grid collocation method still grows strongly with the dimensionality, and HDMR is the better choice.
