
Data fitting


maalik

Presentation Transcript


    1. Data fitting: correlation & line fitting

    2. What are they? Correlation tells how much two variables are related, with X and Y measured independently. Line fitting derives a best-fitting model between two variables: least squares (linear regression, a straight line) or curved lines (polynomial or spline fit). Line fitting is typically used for known X and measured Y (a function of time, etc.).

    3. correlation

    4. Correlation coefficient

    5. correlation
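The formula on the correlation slides did not survive in the transcript. As a minimal sketch, assuming the slides use the usual Pearson product-moment definition (the covariance of X and Y divided by the product of their standard deviations), the coefficient could be computed as follows. The function name `pearson_r` and the sample data are illustrative, not from the slides.

```python
import math

def pearson_r(x, y):
    """Sample correlation coefficient of two equal-length sequences.

    Assumes the standard Pearson definition; raises ZeroDivisionError
    if either sequence is constant (zero variance).
    """
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of products of deviations from the means
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    return sxy / math.sqrt(sxx * syy)

# Made-up data that is close to, but not exactly, linear
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
print(round(pearson_r(x, y), 4))
```

A value near +1 or -1 indicates a strong linear relation; a value near 0 indicates little linear relation.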

    6. Confidence interval for correlation Possible to define a variable w

    7. Use this mean and variance to set up the normal distribution. Confidence intervals can then be checked. It is often useful to check the confidence interval of the null hypothesis (rxy = 0).
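The definition of w is missing from the transcript. One common choice, assumed here and not confirmed by the slides, is the Fisher z-transform w = 0.5 ln((1 + r)/(1 - r)), which is approximately normally distributed with variance 1/(n - 3). Under that assumption, a confidence interval for the correlation might be sketched like this (the function name and the example values r = 0.6, n = 30 are hypothetical):

```python
import math
from statistics import NormalDist

def correlation_confidence_interval(r, n, level=0.95):
    """Approximate confidence interval for a correlation coefficient.

    ASSUMPTION: w is the Fisher z-transform, w = atanh(r), treated as
    normal with standard deviation 1 / sqrt(n - 3).
    """
    if not -1.0 < r < 1.0:
        raise ValueError("r must lie strictly between -1 and 1")
    if n <= 3:
        raise ValueError("need more than 3 samples")
    w = math.atanh(r)                        # 0.5 * ln((1 + r) / (1 - r))
    sigma = 1.0 / math.sqrt(n - 3)
    z = NormalDist().inv_cdf(0.5 + level / 2.0)   # two-sided critical value
    # Transform the interval for w back to the correlation scale
    return math.tanh(w - z * sigma), math.tanh(w + z * sigma)

low, high = correlation_confidence_interval(0.6, 30)
# If the interval excludes 0, the null hypothesis rxy = 0 is rejected
null_rejected = not (low <= 0.0 <= high)
print(low, high, null_rejected)
```

Checking whether 0 falls inside this interval is equivalent to checking whether the observed w falls inside the interval centered on the null hypothesis.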

    8. Least squares line fitting (linear regression): For perfect linear correlation, it is straightforward to define an equation so that y = Ax + B. We need to determine the coefficient A and the constant B so that they define a straight line that fits the data as “well” as possible. We are “estimating” the best values of A and B, assuming that the x value is known exactly and that the y value is uncertain.

    9. Least squares fit: It is common to use a least-squares fit. The error between the best-fitting line and each data point is (y - y'), where y is the data and y' is the best fit (measured as a vertical distance). We seek to minimize the sum of the squared errors. Why squared? It has some nice properties.

    10. Some details
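The details on this slide were not captured in the transcript. As a sketch, assuming the line is written y' = A x + B, minimizing the sum of squared vertical errors gives the standard closed-form solution A = Sxy / Sxx and B = mean(y) - A mean(x). The function name `fit_line` is illustrative.

```python
def fit_line(x, y):
    """Least-squares straight line y' = a * x + b through the points (x, y).

    Treats x as exact and minimizes the sum of squared vertical errors.
    """
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("all x values are identical; slope is undefined")
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    a = sxy / sxx                 # slope
    b = mean_y - a * mean_x       # intercept: line passes through the means
    return a, b

a, b = fit_line([0.0, 1.0, 2.0, 3.0], [1.0, 3.0, 5.0, 7.0])
print(a, b)
```

The fitted line always passes through the point (mean(x), mean(y)), which is one of the "nice properties" of the squared-error criterion.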

    11. More details We can think of the best fit line as a sort of mean value. The scatter is measured by the estimated standard error. This is analogous to the standard deviation.
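The slide does not give the formula for the estimated standard error. A minimal sketch, assuming the common convention of dividing the sum of squared residuals by n - 2 (two parameters, slope and intercept, are estimated from the data):

```python
import math

def standard_error_of_estimate(x, y, a, b):
    """Scatter of the data about the fitted line y' = a * x + b.

    ASSUMPTION: uses n - 2 degrees of freedom, the usual choice when
    both slope and intercept are fitted.
    """
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("need at least 3 points of equal-length x and y")
    # Sum of squared vertical distances between data and line
    ss_res = sum((yi - (a * xi + b)) ** 2 for xi, yi in zip(x, y))
    return math.sqrt(ss_res / (n - 2))

x = [0.0, 1.0, 2.0, 3.0]
y = [0.0, 1.0, 1.0, 2.0]
# 0.6 and 0.1 are the least-squares slope and intercept for this data
print(standard_error_of_estimate(x, y, 0.6, 0.1))
```

Just as the standard deviation measures scatter about a mean, this quantity measures scatter about the fitted line.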

    12. Confidence intervals: The 95% confidence interval for y (i.e., we are 95% sure that y lies between the values a and b) is defined by (a, b) = (y' - k, y' + k), where k is

    13. Some problems: Outliers tend to skew the line away from the other data, which results in a poor fit. Each point is weighted by the square of the vertical distance between the data point and the trend, so one large offset counts more than several small ones.

    14. Why square? We could use the 3rd power, or just the absolute value. These also provide a straight line, but the mathematics is more complicated and less elegant. They may be useful for some data; the absolute value handles outliers better.
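To illustrate the effect of an outlier, the sketch below compares a least-squares line with a least-absolute-value line on made-up data. The absolute-value fit here uses a brute-force search that relies on a known property: some line minimizing the sum of absolute errors passes through at least two data points. This is only practical for small data sets and is not a method taken from the slides.

```python
from itertools import combinations

def least_squares_line(x, y):
    """Slope and intercept minimizing the sum of squared vertical errors."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    a = sxy / sxx
    return a, mean_y - a * mean_x

def least_absolute_line(x, y):
    """Slope and intercept minimizing the sum of absolute vertical errors.

    Brute force: tries the line through every pair of points, O(n^3).
    """
    best = None
    for i, j in combinations(range(len(x)), 2):
        if x[i] == x[j]:
            continue  # vertical line, cannot be written as y = a*x + b
        a = (y[j] - y[i]) / (x[j] - x[i])
        b = y[i] - a * x[i]
        cost = sum(abs(yk - (a * xk + b)) for xk, yk in zip(x, y))
        if best is None or cost < best[0]:
            best = (cost, a, b)
    if best is None:
        raise ValueError("need at least two points with distinct x values")
    return best[1], best[2]

# Five points on y = x plus one outlier at x = 5
x = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
y = [0.0, 1.0, 2.0, 3.0, 4.0, 20.0]
print("least squares:  ", least_squares_line(x, y))
print("least absolute: ", least_absolute_line(x, y))
```

On this data the least-squares slope is pulled well above 1 by the single outlier, while the absolute-value fit stays on the line through the other five points.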

    15. Least-squares fit and Excel: There are (at least) three ways to make a least-squares fit to data in Excel. (1) Use linest(y,x,b,stats) and then plot; this allows calculation of statistics and is powerful but complicated. (2) Use Regression in the Analysis ToolPak add-in. (3) Make a data plot (without a line), left click on a data point, then add a trend line; this is much easier, but it is not clear how Excel computes the fit.

    16. Excel output for regression

    17. Fitting a curved line: Suppose the data are exponential, or otherwise something you expect to be curved. Options are a polynomial fit (click the box under Add Trendline), a spline fit, or nonlinear least squares.
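Outside Excel, a polynomial fit is still a linear least-squares problem, because the model is linear in its coefficients. The sketch below is one possible approach, not what Excel's trendline is known to do: it builds the normal equations and solves them by Gaussian elimination. The function name `polyfit` and the test data are illustrative.

```python
def polyfit(x, y, degree):
    """Least-squares polynomial coefficients [c0, c1, ..., c_degree]
    for the model y' = c0 + c1*x + ... + c_degree * x**degree.

    Solves the normal equations (V^T V) c = V^T y with V[i][k] = x_i**k.
    Fine for low degrees; ill-conditioned for high degrees.
    """
    m = degree + 1
    if len(x) != len(y) or len(x) < m:
        raise ValueError("need at least degree + 1 points of equal length")
    ata = [[sum(xi ** (r + c) for xi in x) for c in range(m)] for r in range(m)]
    aty = [sum(yi * xi ** r for xi, yi in zip(x, y)) for r in range(m)]
    # Augmented matrix [V^T V | V^T y]
    aug = [row + [rhs] for row, rhs in zip(ata, aty)]
    # Forward elimination with partial pivoting
    for col in range(m):
        pivot = max(range(col, m), key=lambda row: abs(aug[row][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("normal equations are singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, m):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, m + 1):
                aug[r][c] -= factor * aug[col][c]
    # Back substitution
    coeffs = [0.0] * m
    for r in range(m - 1, -1, -1):
        known = sum(aug[r][c] * coeffs[c] for c in range(r + 1, m))
        coeffs[r] = (aug[r][m] - known) / aug[r][r]
    return coeffs

# Data generated from y = 1 + 2x + 3x^2, so the fit should recover it
x = [0.0, 1.0, 2.0, 3.0, 4.0]
y = [1.0, 6.0, 17.0, 34.0, 57.0]
print(polyfit(x, y, 2))
```

Spline fits and nonlinear least squares need different machinery (piecewise polynomials and iterative optimization, respectively) and are not covered by this sketch.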
