Computer methods for evaluation of the confidence intervals of molecular modeling prediction
V.A. Dementiev
Vernadsky Institute of Geo & Analytical Chemistry, Moscow
dementiev@geokhi.ru
General Problem
Given experimental data as a complicated set of points in a multidimensional space, and the set of corresponding points predicted with an adequate model and the relevant theory, there is … the problem: we have to compare the two in terms of some confidence intervals.
• Providing confidence intervals for complicated data is the task of chemometrics
• Providing theoretical intervals is a methodological task of the theory
• The latter has no systematic solution
• We hope that our attempt at solving our specific problem illustrates the general one
No-etalon (standard-free) spectroscopic analysis
• We have the IR spectrum of a mixture of organic compounds
• and a theoretical prediction of the spectral curve for the assumed mixture
• We obtain the best match between the two curves by varying the assumed concentrations of the components in the predicted spectrum
• We then know the concentrations in the sample without preparing expensive reference standards (etalons)
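The matching step can be sketched in a few lines. This is only an illustration, not the procedure used in the work: it assumes that the mixture spectrum is a linear combination of the predicted component spectra, that "best match" means least squares, and that there are just two components. The spectra and the function name `fit_two_concentrations` are hypothetical.

```python
def fit_two_concentrations(mixture, comp_a, comp_b):
    """Least-squares (ca, cb) such that ca*comp_a + cb*comp_b approximates mixture."""
    # Sums entering the normal equations of the two-parameter linear problem.
    aa = sum(a * a for a in comp_a)
    bb = sum(b * b for b in comp_b)
    ab = sum(a * b for a, b in zip(comp_a, comp_b))
    am = sum(a * m for a, m in zip(comp_a, mixture))
    bm = sum(b * m for b, m in zip(comp_b, mixture))
    det = aa * bb - ab * ab
    if det == 0:
        raise ValueError("component spectra are linearly dependent")
    # Cramer's rule for the 2x2 system.
    ca = (am * bb - ab * bm) / det
    cb = (aa * bm - ab * am) / det
    return ca, cb


# Hypothetical predicted component spectra on a common 5-point frequency grid.
spectrum_a = [0.0, 1.0, 0.2, 0.0, 0.1]
spectrum_b = [0.3, 0.0, 0.1, 1.0, 0.0]
# Synthetic "experimental" mixture built from known concentrations 0.7 and 0.3.
observed = [0.7 * a + 0.3 * b for a, b in zip(spectrum_a, spectrum_b)]

ca, cb = fit_two_concentrations(observed, spectrum_a, spectrum_b)
print(ca, cb)
```

With noise-free synthetic data the fit recovers the concentrations used to build the mixture; with a real spectrum the open question is how far the fitted values can be trusted.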
But what is the precision of the result?
• At present, the theory of molecular vibrations (by L. Gribov & Co Ltd.) has no means of assessing the errors in the predicted frequencies or integral intensities of a calculated spectrum that arise from errors in the model parameterization
• This is our specific problem
• To be accurate, it was
Let us consider spectrum prediction as one computing step
X is the vector of molecular parameters, of dimension n;
Y is the matrix of spectral characteristics, of dimension (m, 2);
Y(:,1) is the vector of frequencies;
Y(:,2) is the vector of absolute intensities.
Let us add one loop and a random error generator to the program
for i = 1:nb
    Xi = X0 + random_error(i);   % random_error(i) is generated at step i according to a given law
    % Yi is obtained from Xi by the spectrum prediction step
    Yb(:,:,i) = Yi;              % the nb results Yi are collected into the array Yb for statistical processing
end % i
b stands for bootstrap.
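The loop can be tried out with a toy model. The sketch below is in Python rather than MATLAB and is only an illustration: `toy_spectrum` is a hypothetical stand-in for the real spectrum prediction step (one frequency per parameter, ν = sqrt(f)), and the parameter values are invented. Strictly speaking, the procedure is a Monte Carlo perturbation of the input parameters; the slides call it bootstrap.

```python
import math
import random


def toy_spectrum(force_constants):
    """Hypothetical prediction step: one 'frequency' per parameter, nu = sqrt(f)."""
    return [math.sqrt(f) for f in force_constants]


def uniform_error(rng, x, rel_halfwidth=0.05):
    """Random error drawn from a uniform law with half-width rel_halfwidth * abs(x)."""
    h = rel_halfwidth * abs(x)
    return rng.uniform(-h, h)


def bootstrap_predictions(x0, error_law, nb, seed=0):
    """Perturb x0 nb times and collect the nb predicted spectra."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    yb = []
    for _ in range(nb):
        xi = [x + error_law(rng, x) for x in x0]  # Xi = X0 + random_error(i)
        yb.append(toy_spectrum(xi))               # Yi collected into Yb
    return yb


x0 = [4.0, 9.0, 16.0]  # invented nominal parameters
yb = bootstrap_predictions(x0, uniform_error, nb=1000)
print(len(yb), yb[0])
```

Passing the error law in as a function makes it easy to rerun the same loop with a different distribution.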
Then we can build histograms of the Yb set and its subsets
• From the histograms of frequencies one can get a succinct representation of the confidence intervals for an individual frequency or for an overlapping group
• Taking into account the set of bootstrap intensities, one can build a fuzzy spectral curve that can be compared with experiment immediately
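One way to read an interval off such a histogram is sketched below. It assumes an empirical central percentile interval; the slides do not say which interval definition was used. The sample values are synthetic draws, not results of the work.

```python
import random


def percentile_interval(samples, level=0.95):
    """Empirical central interval holding at least `level` of the samples."""
    ordered = sorted(samples)
    n = len(ordered)
    k = int((1.0 - level) / 2.0 * (n - 1))  # number of samples cut from each tail
    return ordered[k], ordered[n - 1 - k]


def histogram(samples, lo, hi, bins):
    """Counts of samples in `bins` equal-width bins spanning [lo, hi]."""
    counts = [0] * bins
    width = (hi - lo) / bins
    for s in samples:
        if lo <= s <= hi:
            index = min(int((s - lo) / width), bins - 1)  # s == hi goes to the last bin
            counts[index] += 1
    return counts


rng = random.Random(0)
# Synthetic stand-in for one column of Yb: 1000 predictions of a single frequency.
freq_samples = [rng.gauss(1000.0, 20.0) for _ in range(1000)]

low, high = percentile_interval(freq_samples, level=0.95)
counts = histogram(freq_samples, min(freq_samples), max(freq_samples), bins=20)
inside = sum(1 for s in freq_samples if low <= s <= high)
print(low, high, inside)
```

The same two functions can be applied to a group of overlapping frequencies by pooling their samples before calling them.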
Results on Ethylene
• n = 14: all nonzero types of force parameters fc were varied
• m = 12: the frequencies of all normal vibrations were calculated
• nb = 1000
• The calculations, including histogram output, took 0.5 s on a PC with a program written in MATLAB
Spectral distribution of frequency density in the Yb set of ethylene
fc errors were distributed according to the restricted normal law; σ = 0.05 abs(fc); restriction value = 2σ.
Statistical properties of predicted frequencies
Mean(ν)   Std(ν)   Min(ν)   Max(ν)
 824.2     22.2     772.5    876.0
 937.1     18.7     892.5    979.9
 957.0     24.2     899.0   1011.0
1020.9     21.4     969.1   1069.7
1246.3     24.2    1189.8   1304.7
1338.1     15.6    1295.0   1381.9
1443.6     24.2    1377.1   1507.3
1621.3     27.5    1544.9   1692.2
3002.7     65.1    2848.8   3146.2
3019.2     64.5    2865.9   3162.6
3100.8     68.2    2937.0   3250.5
3104.0     68.0    2942.2   3253.7
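A rough plausibility check of the table can be made by hand. The sketch rests on assumptions that are not stated in the slides: that the table belongs to the run with the restricted normal law (σ = 0.05 abs(fc), restriction 2σ), and that each of the four highest frequencies is dominated by a single force parameter with ν proportional to sqrt(fc), so that the relative error of ν is about half the relative error of fc.

```python
import math


def truncated_std_factor(s):
    """Std of a standard normal restricted to [-s, s], in units of sigma."""
    pdf = math.exp(-0.5 * s * s) / math.sqrt(2.0 * math.pi)
    mass = math.erf(s / math.sqrt(2.0))  # probability of falling inside [-s, s]
    return math.sqrt(1.0 - 2.0 * s * pdf / mass)


# (Mean, Std) of the four highest frequencies, copied from the table.
high_modes = [(3002.7, 65.1), (3019.2, 64.5), (3100.8, 68.2), (3104.0, 68.0)]

rel_sigma_fc = 0.05  # sigma = 0.05 abs(fc)
restriction = 2.0    # restriction value = 2 sigma

# Assumption: nu ~ sqrt(fc), hence the factor 0.5.
estimate = 0.5 * rel_sigma_fc * truncated_std_factor(restriction)
observed = [std / mean for mean, std in high_modes]

print(round(estimate, 4), [round(r, 4) for r in observed])
```

Under these assumptions the estimate and the tabulated Std/Mean ratios both come out near 2.2 %. The lower frequencies depend on several force parameters at once, so this single-parameter estimate does not apply to them.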
Spectral distribution of frequency density in the Yb set of ethylene
fc errors were distributed according to the uniform law; half-width = 0.05 abs(fc).
What is the restricted normal law?
Restriction = sσ
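A generator for such errors can be sketched as follows. It assumes that "restricted" means a normal law truncated at ±sσ and implemented by rejection (out-of-range draws are discarded and redrawn); clipping to the bounds would be a different law. The value of fc is invented.

```python
import random


def restricted_normal(rng, sigma, s=2.0):
    """Zero-mean normal error, redrawn until it falls within [-s*sigma, +s*sigma]."""
    bound = s * sigma
    while True:
        error = rng.gauss(0.0, sigma)
        if abs(error) <= bound:
            return error


rng = random.Random(1)  # fixed seed so the run is reproducible
fc = 10.0               # invented force parameter value
sigma = 0.05 * abs(fc)  # sigma = 0.05 abs(fc), as in the ethylene example
errors = [restricted_normal(rng, sigma, s=2.0) for _ in range(1000)]
print(min(errors), max(errors))
```

With s = 2 about 95 % of the raw normal draws are accepted, so the rejection loop costs very little.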
Not only measurement errors but also the errors of molecular modeling follow the law of restriction
Histograms of the zero eigenvalues of Ts spoiled by modeling errors, and of the true eigenvalues of Ts
The left histogram resembles the restricted normal law of errors, whereas the right one reflects the physical properties of a particular molecular model.
The ROxY Law
Given a stable physical property to be measured with an instrument free of systematic errors,
there are random errors, always restricted.
Given an adequate model and theory, proved by solving inverse problems,
and the data defined above as model parameters,
one can obtain interpolation and near-extrapolation predictions with always limited uncertainty.
ROxY stands for Rodionova Oxana Yevgenievna, for this person was the first to collect and discuss the valuable phenomena that form the basis of the law.