Methods for Cost Estimation in CEA: the GLM Approach Henry Glick University of Pennsylvania

AcademyHealth, Issues in Cost-Effectiveness Analysis, Washington, DC, 06/10/2008

Methods for Cost Estimation in CEA: the GLM Approach

Henry Glick

University of Pennsylvania


Issues in Cost-Effectiveness Analysis

Washington, DC


Outline
  • Policy-relevant parameter for cost-effectiveness
  • Problems posed by nonnormality of cost data
  • Generalized linear models as a response to the problems
    • Identifying links and families (gets a little technical)
  • General comments

My objective is to provide practical advice on ways to implement GLMs. Slides available at:

Policy Relevant Parameter for CEA
  • Policy relevant parameter: differences in the arithmetic, or sample, mean
    • In welfare economics, a project is cost-beneficial if the winners from any policy gain enough to be able to compensate the losers and still be better off themselves
      • Thus, we need a parameter that allows us to determine how much the losers lose, or cost, and how much the winners win, or benefit
    • From a budgetary perspective, decision makers can use the arithmetic mean to determine how much they will spend on a program
Policy Relevant Parameter for CEA (2)
  • In both cases, substitution of some other parameter for the sample mean can be justified only if it provides a better estimate of gains and losses or spending
Are Sample Means Always the Best Estimator?
  • In simulation, when cost data are sufficiently nonnormal, the relative bias, (truth - observed)^2, for other parameters such as the median or adjusted geometric mean can sometimes be lower than the relative bias observed for the arithmetic mean
    • The distribution must be sufficiently nonnormal that ln(cost) is also substantially nonnormally distributed
    • In actual data, since we never know the truth, it is difficult to determine whether other parameters will have lower relative bias than the sample mean
The Problem
  • Common feature of cost data is right-skewness (i.e., long, heavy, right tails)
The Problem (cont.)
  • Distributions with long, heavy, right tails will have a mean that differs from the median, independent of “outliers”
  • Cost data also can’t be negative, and can have large fractions of observations with 0 cost
  • Nonnormality of cost data can pose problems for common parametric tests such as t-test, ANOVA, and OLS regression
Common (Relatively Bad) Responses To Violation Of Normality
  • Adopt nonparametric tests of other characteristics of the distribution that are not as affected by the nonnormality of the distribution (“biostatistical” approach)
  • Transform the data so they approximate a normal distribution (“classic econometric” approach)
Recommended Response: Adopt More Flexible Models
  • Generalized Linear Models (GLM)
    • Have the advantages of the log models, but
      • don’t require normality or homoscedasticity
      • and evaluate a direct transformation of the difference in cost, so they don’t raise problems related to retransformation from the scale of estimation to the raw scale
    • To build them, one must identify a "link function" and a "family" (based on the data)
Stata and SAS Code
  • Stata code:

glm y x, link(linkname) family(familyname)

  • General SAS code (not appropriate for gamma family / log link):

proc genmod;

model y = x / link=linkname dist=familyname;

run;


SAS Code for a Gamma Family / Log Link
  • When running gamma/log models, the general SAS code drops observations with an outcome of 0
  • If you want to maintain these observations and are predicting y as a function of x (M Buntin):

proc genmod;

a = _mean_;

b = _resp_;

d = b/a + log(a);

variance var = a**2;

deviance dev = d;

model y = x / link = log;

run;


The Link Function
  • Link function directly characterizes how the linear combination of the predictors is related to the prediction on the original scale
    • e.g., predictions from the identity link, which is used in OLS, equal: $\hat{y} = \sum_i \beta_i X_i$
The Log Link
  • Log link is most commonly used in literature
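  • When we adopt the log link, we are assuming: $\ln(E(y \mid x)) = \sum_i \beta_i X_i$

  • GLM with a log link differs from log OLS in part because in log OLS, one is assuming: $E(\ln(y) \mid x) = \sum_i \beta_i X_i$

  • $\ln(E(y \mid x)) \neq E(\ln(y) \mid x)$

i.e., the log of the mean ≠ the mean of the log costs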

ln(E(y|x)) ≠ E(ln(y)|x)

[Table not recovered in this transcript; its footnotes read: * Difference = 0; † Difference = 0.179047]
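A small numeric check may make the inequality concrete. The sketch below uses hypothetical costs for two groups (the values and the names `group_a` and `group_b` are assumptions for illustration, not data from the slides) chosen so that both groups share the same arithmetic mean. The log of the mean is then identical across groups, while the mean of the logs is not.

```python
import math

# Hypothetical costs; both groups have the same arithmetic mean (45).
group_a = [15.0, 45.0, 75.0]
group_b = [35.0, 45.0, 55.0]


def log_of_mean(costs):
    """ln(E(y)): the quantity modeled by a GLM with a log link."""
    return math.log(sum(costs) / len(costs))


def mean_of_log(costs):
    """E(ln(y)): the quantity modeled by OLS on ln(cost)."""
    return sum(math.log(c) for c in costs) / len(costs)


diff_log_of_mean = log_of_mean(group_b) - log_of_mean(group_a)
diff_mean_of_log = mean_of_log(group_b) - mean_of_log(group_a)

print(f"difference in log of mean:  {diff_log_of_mean:.6f}")
print(f"difference in mean of logs: {diff_mean_of_log:.6f}")
```

In this toy case, log OLS would report a between-group difference even though the arithmetic mean costs are equal.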

The Power Link Function
  • Stata’s power link provides a flexible link function
  • It allows generation of a wide variety of named and unnamed links, e.g.,
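    • power 1 = identity link; $\hat{y} = \sum_i \beta_i X_i$
    • power .5 = square root link; $\hat{y} = (\sum_i \beta_i X_i)^2$
    • power .25: $\hat{y} = (\sum_i \beta_i X_i)^4$
    • power 0 = log link; $\hat{y} = \exp(\sum_i \beta_i X_i)$
    • power -1 = reciprocal link; $\hat{y} = 1/(\sum_i \beta_i X_i)$
    • power -2 = inverse quadratic; $\hat{y} = 1/(\sum_i \beta_i X_i)^{0.5}$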
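The retransformations listed above all follow one rule: raise the linear predictor to the power 1/p, with the log link as the p = 0 case. The following is a minimal sketch of that rule; the function name `inverse_power_link` is hypothetical and is not part of Stata or SAS.

```python
import math


def inverse_power_link(eta, power):
    """Map the linear predictor eta (the sum of B_i * X_i) to the raw scale.

    `power` is the exponent p of the link g(mu) = mu**p; p = 0 denotes the log link.
    """
    if power == 0:
        return math.exp(eta)
    if eta <= 0 and power != 1:
        # Non-identity power links are only defined here for positive eta.
        raise ValueError("non-identity power links need a positive linear predictor")
    return eta ** (1.0 / power)


# The same linear predictor gives very different raw-scale predictions.
for p in (1, 0.5, 0.25, 0, -1, -2):
    print(p, inverse_power_link(4.0, p))
```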
Negative Power Links
  • Retranslation of negative power links to the raw scale: for a link of power $-k$ ($k > 0$), $\hat{y} = 1/(\sum_i \beta_i X_i)^{1/k}$
  • When using a negative power link, negative coefficients yield larger estimates on the raw scale; positive coefficients yield smaller estimates
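The sign reversal can be seen from a short derivation. This is a sketch that assumes a power link with exponent $p \neq 0$ and a positive linear predictor $\eta = \sum_i \beta_i X_i$; the symbols $p$ and $\eta$ are introduced here for illustration.

```latex
\hat{y} = \eta^{1/p}
\quad\Longrightarrow\quad
\frac{\partial \hat{y}}{\partial X_j} = \frac{\beta_j}{p}\,\eta^{\frac{1}{p}-1}
```

Because $\eta^{1/p-1} > 0$, the effect of $X_j$ on the raw scale has the sign of $\beta_j / p$. With $p < 0$, a positive $\beta_j$ therefore lowers the predicted cost and a negative $\beta_j$ raises it.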
Selecting a Link Function
  • There is no single test that identifies the appropriate link
  • Instead, one can employ multiple tests of fit
    • Pregibon link test checks linearity of response on scale of estimation
    • Modified Hosmer and Lemeshow test checks for systematic bias in fit on raw scale
    • Pearson’s correlation test checks for systematic bias in fit on raw scale
    • Ideally, all 3 tests will yield nonsignificant p-values
  • Others (e.g., Hardin and Hilbe) have proposed use of (larger) log likelihood, (smaller) deviance, AIC and BIC statistics
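As an illustration of one of these checks, the sketch below computes a Pearson correlation between raw-scale residuals and raw-scale predictions. It assumes that this correlation is what the Pearson test examines; the function name and the data are hypothetical. A correlation near 0 would be consistent with no systematic bias on the raw scale.

```python
import math


def pearson_correlation_test(observed, predicted):
    """Correlate raw-scale residuals with raw-scale predictions.

    Returns (r, t); t would be referred to a t distribution with n - 2
    degrees of freedom. Assumes |r| < 1 and non-constant inputs.
    """
    n = len(observed)
    residuals = [y - yhat for y, yhat in zip(observed, predicted)]
    mean_res = sum(residuals) / n
    mean_pred = sum(predicted) / n
    sxy = sum((e - mean_res) * (p - mean_pred) for e, p in zip(residuals, predicted))
    sxx = sum((p - mean_pred) ** 2 for p in predicted)
    syy = sum((e - mean_res) ** 2 for e in residuals)
    r = sxy / math.sqrt(sxx * syy)
    t = r * math.sqrt((n - 2) / (1.0 - r * r))
    return r, t


# Hypothetical fit whose residuals grow with the prediction (systematic bias).
predicted = [1.0, 2.0, 3.0, 4.0, 5.0]
observed = [1.0, 3.0, 4.0, 6.0, 7.0]
r_biased, t_biased = pearson_correlation_test(observed, predicted)
print(r_biased, t_biased)
```

The p-value is left out on purpose, since the standard library has no t distribution; a statistics package would supply it.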
The Family
  • Specifies the distribution that reflects the mean-variance relationship
    • Gaussian: Constant variance
    • Poisson: Variance is proportional to mean
    • Gamma: Variance is proportional to square of mean
    • Inverse Gaussian or Wald: Variance is proportional to cube of mean
  • Use of the Poisson, gamma, and inverse Gaussian families relaxes the assumption of homoscedasticity
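The four families can be written as one mean-variance relationship. In this sketch, $\mu = E(y \mid x)$, $\phi$ is a scale constant, and $\lambda$ is the variance power; the notation is an assumption and does not come from the slides.

```latex
\operatorname{Var}(y \mid x) = \phi\,\mu^{\lambda},
\qquad
\lambda =
\begin{cases}
0 & \text{Gaussian} \\
1 & \text{Poisson} \\
2 & \text{gamma} \\
3 & \text{inverse Gaussian (Wald)}
\end{cases}
```

Taking logs gives $\ln \operatorname{Var}(y \mid x) = \ln\phi + \lambda \ln\mu$, so $\lambda$ is the slope of log variance on log mean.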
Modified Park Test
  • A “constructive” test that recommends a family given a particular link function
  • Implemented after GLM regression that uses the particular link
  • The test predicts the square of the residuals (res2) as a function of the log of the predictions (lnyhat) by use of a GLM with a log link and gamma family
    • Stata code

glm res2 lnyhat, link(log) family(gamma) robust

  • If weights or clustering are used in the original GLM, same weights and clustering should be used for modified Park test
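To show what the regression above estimates, here is a minimal pure-Python sketch of a gamma-family, log-link fit of res2 on lnyhat. It relies on the fact that the iteratively reweighted least squares weights are constant for this family/link pair. The function name and data are hypothetical, robust standard errors are not computed, and every squared residual is assumed to be strictly positive.

```python
import math


def modified_park_slope(residuals_sq, predictions, iterations=25):
    """Fit res2 on ln(prediction) with a gamma family and log link.

    With gamma/log the IRLS weights are constant, so each iteration is an
    ordinary least-squares fit of the working response on lnyhat.
    Returns (intercept, slope); slope is the coefficient on lnyhat.
    """
    x = [math.log(p) for p in predictions]
    n = len(x)
    x_bar = sum(x) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    mu = list(residuals_sq)  # conventional starting values: mu = y
    intercept = slope = 0.0
    for _ in range(iterations):
        eta = [math.log(m) for m in mu]
        # Working response: z = eta + (y - mu) / mu
        z = [e + (y - m) / m for e, y, m in zip(eta, residuals_sq, mu)]
        z_bar = sum(z) / n
        slope = sum((xi - x_bar) * (zi - z_bar) for xi, zi in zip(x, z)) / sxx
        intercept = z_bar - slope * x_bar
        mu = [math.exp(intercept + slope * xi) for xi in x]
    return intercept, slope


# Hypothetical data in which the variance is proportional to the squared mean.
predictions = [10.0, 20.0, 40.0, 80.0, 160.0]
residuals_sq = [3.0 * p ** 2 for p in predictions]
intercept, slope = modified_park_slope(residuals_sq, predictions)
print(round(slope, 6))
```

On this constructed data the slope is close to 2, the value associated with the gamma family.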
Recommended Family, Modified Park Test
  • Recommended family derived from the coefficient for lnyhat:
    • If coefficient ~=0, Gaussian
    • If coefficient ~=1, Poisson
    • If coefficient ~=2, Gamma
    • If coefficient ~=3, Inverse Gaussian or Wald
  • Given the absence of families for negative coefficients:
    • If coefficient < -0.5, consider subtracting all observations from maximum-valued observation and rerunning analysis
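The decision rule above can be sketched as a small lookup. Rounding to the nearest integer is an assumption made here to give "~=" a concrete meaning; in practice one would also test whether the coefficient differs significantly from each candidate value. The function name `recommend_family` is hypothetical.

```python
def recommend_family(coefficient):
    """Map the modified Park test coefficient on lnyhat to the nearest named family."""
    if coefficient < -0.5:
        return "no family: consider subtracting all observations from the maximum and rerunning"
    families = {0: "gaussian", 1: "poisson", 2: "gamma", 3: "inverse gaussian"}
    # Pick the family whose variance power is closest to the coefficient.
    nearest = min(families, key=lambda power: abs(coefficient - power))
    return families[nearest]


for coef in (0.2, 1.1, 1.9, 3.4, -0.7):
    print(coef, "->", recommend_family(coef))
```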
Example, GLM gamma/log

glm cost treat dis* bl*, link(log) family(gamma)


Passes Tests, But Can We Improve the Link?
  • Iteratively evaluate power links (in 0.1 intervals) between -2 and 2
    • Use the modified Park test to select a family
    • Evaluate the fit statistics
    • Not shown here: I then fine-tune the power link in 0.01 intervals within candidate regions of the power link
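The search grid described above can be generated as follows. This sketch only builds the candidate powers; at each one, the GLM would be refit, the modified Park test rerun, and the fit statistics recorded in a statistics package. The helper name `power_grid` and the 0.2 to 0.4 fine-tuning region are hypothetical.

```python
def power_grid(low, high, step):
    """Evenly spaced candidate link powers from low to high, inclusive."""
    count = round((high - low) / step)
    # Rounding removes floating-point noise such as 0.30000000000000004.
    return [round(low + i * step, 10) for i in range(count + 1)]


coarse = power_grid(-2.0, 2.0, 0.1)   # first pass: 0.1 intervals
fine = power_grid(0.2, 0.4, 0.01)     # second pass around a candidate region

print(len(coarse), coarse[:3], coarse[-3:])
print(len(fine), fine[:3], fine[-3:])
```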
Why Not Simply Use AIC and BIC?
  • In the current example:
    • AIC, BIC, log likelihood, and deviance all agreed
    • They yielded an answer that was similar to that from the Pearson correlation test, Pregibon link test, and modified Hosmer and Lemeshow test
      • Power link 0.3
  • AIC, BIC, log likelihood, (and deviance?) already commonly used for decisions about model fit
  • Why do we need the new tests?
AIC / BIC
  • There are at least 3 reasons why, in the long run, log likelihood, AIC, BIC, and deviance are unlikely to be the recommended tests for identifying the appropriate link function
  • First, when there is a large fraction of observations with zero cost:
    • The recommendations from log likelihood / AIC agree with each other
    • The recommendations from BIC / deviance agree with each other
    • But the log likelihood / AIC recommendations differ from the BIC / deviance recommendations
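One possible reason for this pairing lies in how the statistics are defined. The slides do not say which definitions were used, so the following assumes the versions printed by Stata's glm command, where $\ln L$ is the log likelihood, $D$ the deviance, $k$ the number of estimated parameters, and $N$ the number of observations.

```latex
\mathrm{AIC} = \frac{-2\ln L + 2k}{N},
\qquad
\mathrm{BIC} = D - (N - k)\ln N
```

Under these definitions, for a fixed $k$ and $N$, AIC is a function of the log likelihood alone and BIC is a function of the deviance alone, so each pair ranks candidate links identically.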
AIC / BIC (2)
  • Second, the 4 statistics aren’t stable across families, and the shifts in their magnitude across families do not provide information about which family/link is best
  • For example, in a dataset where the modified Park test recommends a gamma family for power links < 0.4, but recommends a Poisson family for power links > 0.5, the magnitude of the AIC statistic shifts from ~18 for the gamma family to ~454 for the Poisson family
    • Although the smaller AIC values associated with power links < 0.4 suggest that these links have the better fit, the Pearson, Pregibon, and H&L tests all suggest that the power links > 0.5 are actually superior
AIC / BIC (3)
  • Third, while this instability across families is less of a problem when our statistical packages offer 4 continuous families only, it will eliminate comparability across links when statistical packages begin to offer more flexible GLM power families
    • i.e., when we don’t have to choose between Gaussian (0) and Poisson (1) families, but instead can use a family of 0.7
      • In this case, given that each power will be associated with a slightly different family, it will be impossible to compare the resulting AIC/BIC statistics
Extended Estimating Equations
  • Basu and Rathouz (2005) have proposed use of extended estimating equations (EEE) which estimate the link function and family along with the coefficients and standard errors
  • Tends to need a large number of observations (thousands not hundreds) to converge
  • Currently can’t take the results and use them with a simple GLM command (makes bootstrapping of resulting models cumbersome)
GLM Disadvantages
  • Disadvantages
    • Can suffer substantial precision losses
      • If heavy-tailed (log) error term, i.e., log-scale residuals have high kurtosis (>3)
      • If family is misspecified
  • The distribution of cost can pose problems for common parametric tests of cost
  • Responses in the literature that suggest that we should evaluate something other than the difference in the sample mean (or a direct transformation of this difference) – e.g., nonparametric tests of other characteristics of the distribution or transformations of cost – generally create more problems than they solve
  • Use of more flexible models that evaluate a direct transformation of the difference in cost generally poses fewer problems
    • It does require that we identify functional forms for the relationship between the predictors and the mean and for the variance structure
Fine Tuning (1)


Fine Tuning (2)


AIC / BIC
  • (When there is a large fraction of 0s) log likelihood / AIC recommendations differ from BIC / deviance recommendations
  • The 4 statistics aren’t stable across families, and shifts in their magnitude across families do not provide information about which family/link is best
Limitations If Power Family Becomes Available
  • Generally not a big problem when we are limited to the 4 named continuous families
    • Because, as in the example, within families we can look for the power at which the statistics reach a maximum (ll) or minimum (AIC, BIC, deviance)
  • When a more flexible GLM family is added to our statistical packages that allows a family of 0.7 or 1.3, rather than forcing us to round to a Poisson family (1.0), the change in the scale of the AIC, etc., will make these statistics difficult, if not impossible, to use