Non-linear regression


# Non-linear regression - PowerPoint PPT Presentation

Non-linear regression. All regression analyses are for finding the relationship between a dependent variable (y) and one or more independent variables (x), by estimating the parameters that define the relationship.


## PowerPoint Slideshow about 'Non-linear regression' - schuyler

Presentation Transcript
Non-linear regression
• All regression analyses are for finding the relationship between a dependent variable (y) and one or more independent variables (x), by estimating the parameters that define the relationship.
• Non-linear relationships whose parameters can be estimated by linear regression, e.g., $y = ax^b$, $y = ab^x$, $y = ae^{bx}$
• Non-linear relationships whose parameters can be estimated by non-linear regression, e.g.,
• Non-linear relationships that cannot be represented by a function: loess
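The three models in the first group become linear after taking logarithms, which is why ordinary linear regression can estimate their parameters. A short derivation, assuming $a > 0$ and $y > 0$ (and $b > 0$, $x > 0$ where a logarithm of them is taken):

$$
\begin{aligned}
y = a x^{b} \;&\Rightarrow\; \ln y = \ln a + b \ln x \\
y = a b^{x} \;&\Rightarrow\; \ln y = \ln a + (\ln b)\, x \\
y = a e^{bx} \;&\Rightarrow\; \ln y = \ln a + b x
\end{aligned}
$$

In each case the intercept of the fitted line gives $\ln a$, and the slope gives $b$ (or $\ln b$ for the second model).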
Growth curve of E. coli
• A researcher wishes to estimate the growth curve of E. coli. He put a very small number of E. coli cells into a large flask with rich growth medium and took samples every half hour to estimate the density (n/L).
• 14 data points over 7 hours were obtained.
• What is the instantaneous rate of growth (r)? What is the initial density ($N_0$)?
• As the flask is very large, he assumed that the growth should be exponential, i.e., $y = a e^{bx}$ (Which parameter corresponds to r and which to $N_0$?)
• Three approaches
• Log-Transform to linear relationship
• Direct least-squares solution (EXCEL solver)
• Direct least-absolute-difference solution (EXCEL solver)
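The first approach can be sketched outside EXCEL as well. The following is a minimal Python illustration, assuming the model $y = a e^{bx}$ and using invented, noise-free data (14 samples, one every half hour); the function names and the values 2.0 and 0.8 are hypothetical and are not taken from the slides.

```python
import math

def linear_regression(x, y):
    """Ordinary least-squares line y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

def fit_exponential_loglinear(t, density):
    """Fit density = a * exp(b * t) by regressing ln(density) on t."""
    log_density = [math.log(d) for d in density]
    intercept, slope = linear_regression(t, log_density)
    return math.exp(intercept), slope  # a = exp(intercept), b = slope

# Hypothetical data: 14 samples, generated from a = 2.0 and b = 0.8.
t = [0.5 * i for i in range(14)]
density = [2.0 * math.exp(0.8 * ti) for ti in t]

a, b = fit_exponential_loglinear(t, density)
print(f"a = {a:.4f}, b = {b:.4f}")  # a = 2.0000, b = 0.8000
```

With real, noisy measurements the recovered values would only approximate the underlying parameters, because the regression minimises squared error on the log scale rather than on the original scale.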
Scatter plot

In EXCEL:

Log-transform D, run linear regression, obtain $D_0$ and r.

EXCEL solver

Get initial value for r:

Initial value for D0 is obtained with t = 0
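The two solver-based approaches can be imitated with any routine that adjusts $D_0$ and r to minimise a target value. The sketch below uses a simple derivative-free coordinate search as a stand-in; it is not the algorithm that EXCEL solver uses. The starting value for r (slope of ln D between the first and last sample) is an assumed recipe, and the data are invented for illustration.

```python
import math

def predict(params, t):
    """Exponential growth model D(t) = D0 * exp(r * t)."""
    d0, r = params
    return d0 * math.exp(r * t)

def sum_squared_error(params, t, d):
    return sum((di - predict(params, ti)) ** 2 for ti, di in zip(t, d))

def sum_absolute_error(params, t, d):
    return sum(abs(di - predict(params, ti)) for ti, di in zip(t, d))

def initial_values(t, d):
    """Rough starting point; assumes the first sample was taken at t = 0."""
    d0 = d[0]  # density observed at t = 0
    # Assumed recipe: slope of ln(D) between the first and last sample.
    r = (math.log(d[-1]) - math.log(d[0])) / (t[-1] - t[0])
    return [d0, r]

def coordinate_search(loss, start, rel_step=0.1, tol=1e-9, max_sweeps=5000):
    """Minimise loss(params) by nudging one parameter at a time.

    A trial step is kept only if it strictly lowers the loss. The step for
    a parameter doubles after a success and halves when neither direction helps.
    """
    params = list(start)
    steps = [rel_step * abs(p) if p != 0 else rel_step for p in params]
    best = loss(params)
    for _ in range(max_sweeps):
        for i in range(len(params)):
            for direction in (1.0, -1.0):
                trial = list(params)
                trial[i] += direction * steps[i]
                value = loss(trial)
                if value < best:
                    params, best = trial, value
                    steps[i] *= 2.0
                    break
            else:
                steps[i] /= 2.0
        if max(steps) < tol:
            break
    return params, best

# Hypothetical data: exponential growth with alternating +/-5% deviations.
t = [0.5 * i for i in range(14)]
d = [2.0 * math.exp(0.8 * ti) * (1.05 if i % 2 == 0 else 0.95)
     for i, ti in enumerate(t)]

start = initial_values(t, d)
ls_params, ls_loss = coordinate_search(lambda p: sum_squared_error(p, t, d), start)
lad_params, lad_loss = coordinate_search(lambda p: sum_absolute_error(p, t, d), start)
print("start:", start)
print("least squares:", ls_params, ls_loss)
print("least absolute difference:", lad_params, lad_loss)
```

Because only improving steps are accepted, the final loss is never worse than the loss at the starting values. How close the result gets to the true minimum depends on the step settings, so the output is best treated as a check against the solver result rather than a replacement for it.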

Body weight of wild elephant
• A researcher wishes to estimate the body weight of wild elephants.
• He measured the body weight of 13 captured elephants of different sizes as well as a number of predictor variables, such as leg length, trunk length, etc. Through stepwise regression, he found that the inter-leg distance (shown in figure) is the best predictor of body weight.
• He learned from his former biology professor that the allometric law governing the body weight (W) and the length of a body part (L) states that $W = aL^b$
• Use the three approaches to fit the equation
Scatter plot

$W = aL^b$

In EXCEL:

Log-transform W and L, run linear regression, obtain a and b.
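The same three steps can be sketched in Python. This is a minimal illustration, assuming the allometric model $W = aL^b$; the lengths, weights, and the generating values 500 and 2.5 are invented and do not come from the elephant data.

```python
import math

def fit_power_law_loglog(length, weight):
    """Fit weight = a * length**b by regressing ln(weight) on ln(length)."""
    x = [math.log(v) for v in length]
    y = [math.log(v) for v in weight]
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    b = (sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
         / sum((xi - mean_x) ** 2 for xi in x))
    a = math.exp(mean_y - b * mean_x)  # intercept of the line is ln(a)
    return a, b

# Hypothetical measurements generated from a = 500 and b = 2.5.
lengths = [0.8, 1.0, 1.2, 1.5, 1.8, 2.0]
weights = [500.0 * L ** 2.5 for L in lengths]

a, b = fit_power_law_loglog(lengths, weights)
print(f"a = {a:.2f}, b = {b:.3f}")
```

On noisy data this fit and a direct least-squares fit of $W = aL^b$ generally give different estimates, since the log-log regression minimises squared error in ln W rather than in W.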

EXCEL solver

$W = aL^b$

Initial values:

DNA and protein gel electrophoresis
• How to estimate the molecular mass of a protein?
• A ladder: proteins with known molecular mass
• Deriving a calibration curve relating molecular mass (M) to migration distance (D): D = F(M)
• Measure D and obtain M
• The calibration curve is obtained by fitting a regression model
Protein molecular mass
• The equation $D = a e^{bM}$ appears to describe the relationship between D and M quite well. This relationship is better than some published relationships, e.g., $D = a - b \ln(M)$
• The data are my measurements of D and M for a subset of secreted proteins from the gastric pathogen Helicobacter pylori (Bumann et al., 2002).
• Homework: use the data and the three approaches to estimate parameters a and b (You don’t need to submit)
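Once a and b have been estimated, obtaining M from a measured D means inverting the calibration curve: $M = \ln(D/a)/b$. A small sketch, assuming the model $D = a e^{bM}$; the parameter values below are hypothetical placeholders, not estimates from the H. pylori data.

```python
import math

def distance_from_mass(mass, a, b):
    """Calibration curve D = a * exp(b * M)."""
    return a * math.exp(b * mass)

def mass_from_distance(distance, a, b):
    """Invert the calibration curve: M = ln(D / a) / b."""
    return math.log(distance / a) / b

# Hypothetical parameters, for illustration only.
a, b = 10.0, -0.02

d_observed = distance_from_mass(50.0, a, b)   # distance predicted for M = 50
m_recovered = mass_from_distance(d_observed, a, b)
print(d_observed, m_recovered)
```

The inversion requires $D/a > 0$ and $b \neq 0$, so a measured distance outside the range covered by the ladder should be treated with caution.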

Bumann, D., Aksu, S., Wendland, M., Janek, K., Zimny-Arndt, U., Sabarth, N., Meyer, T.F., and Jungblut, P.R., 2002, Proteome analysis of secreted proteins of the gastric pathogen Helicobacter pylori. Infect. Immun. 70: 3396-3403.

What is the functional relationship between the area and the radius?

Homework (you do not need to submit): Measure the area A (by counting the squares) and radius r for each circle, and estimate the parameters c and d in the equation $A = c r^d$ by using the three approaches.

Toxicity study: pesticide

What transformation to use?

Probit and probit transformation
• Probit has two names/definitions, both associated with the standard normal distribution:
• the inverse cumulative distribution function (CDF)
• the quantile function
• The CDF is denoted by $\Phi(z)$, a continuous, monotonically increasing sigmoid function with range (0, 1): $\Phi(z) = p$, e.g., $\Phi(-1.96) = 0.025 = 1 - \Phi(1.96)$
• The probit function gives the 'inverse' computation, formally denoted $\Phi^{-1}(p)$, i.e., $\mathrm{probit}(p) = \Phi^{-1}(p)$, e.g., $\mathrm{probit}(0.025) = -1.96 = -\mathrm{probit}(0.975)$
• $\Phi[\mathrm{probit}(p)] = p$, and $\mathrm{probit}[\Phi(z)] = z$.
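These identities can be checked numerically. A minimal sketch using `NormalDist` from Python's standard `statistics` module (Python 3.8 or later); the helper names `phi` and `probit` are chosen here for illustration.

```python
from statistics import NormalDist

STANDARD_NORMAL = NormalDist(mu=0.0, sigma=1.0)

def phi(z):
    """Standard normal CDF."""
    return STANDARD_NORMAL.cdf(z)

def probit(p):
    """Inverse of the standard normal CDF; p must lie strictly between 0 and 1."""
    return STANDARD_NORMAL.inv_cdf(p)

print(round(phi(-1.96), 3))       # 0.025
print(round(probit(0.025), 2))    # -1.96
print(round(probit(0.975), 2))    # 1.96
print(round(phi(probit(0.3)), 6)) # 0.3
```

Applying `probit` to observed proportions (for example, the fraction of individuals affected at each dose) maps them from (0, 1) onto the whole real line, which is what makes a straight-line fit possible.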
Non-linear regression
• In rapidly replicating unicellular eukaryotes such as yeast, highly expressed intron-containing genes require more efficient splicing sites than lowly expressed genes.
• Natural selection will operate on the mutations at the splicing sites to optimize splicing efficiency.
• Designate splicing efficiency as SE and gene expression as GE.
• Certain biochemical reasoning suggests that SE and GE will follow this relationship:
Scatter plot

Initial values:

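Writing the relationship as $SE = (\alpha + \beta\,GE)/(1 + \gamma\,GE)$: $\alpha \approx 0.4$ (inferred when GE = 0); $\beta/\gamma \approx 1$, or $\beta \approx \gamma$ (inferred when GE is very large). When GE = 8, we have $(0.4 + 8\beta)/(1 + 8\beta) = 0.78$.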
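The relation at GE = 8 has a single unknown. Calling it $\beta$ (a label assumed here for illustration), the initial value follows from a few lines of algebra:

$$
\begin{aligned}
\frac{0.4 + 8\beta}{1 + 8\beta} &= 0.78 \\
0.4 + 8\beta &= 0.78 + 6.24\,\beta \\
1.76\,\beta &= 0.38 \\
\beta &\approx 0.216
\end{aligned}
$$

This value is only a starting point for the iterative fit, not the final estimate.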