
I. Qualitative (or Dummy) Independent Variables


“Binary” vs. “Dummy”

  • I was taught to call these variables “dummy variables”

  • I now believe that “binary variables” is a better (more descriptive) name

  • So, let’s call them “binary variables” from now on




III. Introduction

  • A. New type of variable

    • 1. Past: used quantitative variables (numerically measurable); continuous

    • 2. Now: variables that take a small number of values; discrete

      • a) Gender

      • b) Market size

      • c) Region of country

      • d) Marital status (married vs. not), etc.


Introduction (cont.)

  • B. Used as independent variables (IVs) in this section

  • C. Used as dependent variables (DVs) later in course


Introduction (cont.)

  • Institute of Management Accountants (IMA) publishes an annual Salary Guide

    • In Strategic Finance magazine

      • sfmag@imanet.org

    • Annual survey of members

    • “…based on a regression equation derived from survey results.”


IMA Salary Guide (cont.)

SALARY = 35,491 + 18,393 TOP + 8,392 SENIOR – 10,615 ENTRY + 914 YEARS + 10,975 ADVDEGREE – 8,684 NODEGREE + 9,195 PROFCERT + 8,417 MALE

  • TOP=1 if top level mgmt, 0 if not

  • SENIOR=1 if senior level mgmt , 0 if not

  • ENTRY=1 if entry level , 0 if not

  • ADVDEGREE=1 if advanced degree , 0 if not

  • NODEGREE=1 if no degree , 0 if not

  • PROFCERT=1 if hold professional certification , 0 if not

  • MALE=1 if male , 0 if not

  • YEARS=years of experience


IMA Salary Guide (cont.)

  • Average IMA member (1999)

    • Male

    • 14.5 years experience

    • Professional certification

    • Salary = $66,356

      • Figure obtained from substituting values into regression equation
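The substitution behind the $66,356 figure can be checked directly. The sketch below is illustrative only: the function name `predicted_salary` and its keyword arguments are made up here, the coefficients are the ones quoted in the Salary Guide equation, and it assumes the average member falls in the omitted (baseline) management-level and degree categories, so those binary variables are all 0.

```python
# Coefficients copied from the IMA Salary Guide equation in the slides.
INTERCEPT = 35491
COEFFICIENTS = {
    "TOP": 18393,
    "SENIOR": 8392,
    "ENTRY": -10615,
    "YEARS": 914,
    "ADVDEGREE": 10975,
    "NODEGREE": -8684,
    "PROFCERT": 9195,
    "MALE": 8417,
}


def predicted_salary(**values):
    """Plug variable values into the equation; variables not passed count as 0."""
    return INTERCEPT + sum(COEFFICIENTS[name] * value for name, value in values.items())


# Average member: male, 14.5 years of experience, professional certification.
# Assumption: all management-level and degree binary variables are 0.
average_member = predicted_salary(MALE=1, YEARS=14.5, PROFCERT=1)
print(average_member)  # 66356.0
```

Each binary variable simply switches its coefficient on (value 1) or off (value 0), while YEARS multiplies its coefficient by the number of years.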


Are Wins Worth More in a Large Market?

See regression output for binary variables as IVs. (note)


Introduction (cont.)

  • D.Example #1

    • 1. $Y = \alpha + \beta X_2 + \varepsilon$

    • 2. Y: social program expenditures per state

    • 3. X2: state’s total revenue

    • 4. Suppose states’ legislatures controlled by Democrats spend more from same revenue than those controlled by Republicans

    • 5. How account for this in model?

    • 6. What’s the categorical variable?


Introduction (cont.)

  • E.Example #2

    • 1. $Y = \alpha + \beta X_2 + \varepsilon$

    • 2. Y: coach’s earnings

    • 3. X2: coach’s experience

    • 4. Suppose women earn less than men with equal experience (& other characteristics)

    • 5. How account for this in model?

    • 6. What’s the categorical variable?


Introduction (cont.)

  • F.Example #3

    • 1. $Y = \alpha + \beta X_2 + \varepsilon$

    • 2. Y: sales of swimsuits in Minnesota

    • 3. X2: Minnesota’s population

    • 4. Suppose sales peak in warm months

    • 5. How account for this in model?

    • 6. What’s the categorical variable?


Introduction (cont.)

  • G.Example #4

    • 1. $Y = \alpha + \beta X_2 + \varepsilon$

    • 2. Y: profits of NBA teams

    • 3. X2: wins

    • 4. Suppose teams in large markets make more profit on their wins than teams in other markets

    • 5. How account for this in model?

    • 6. What’s the categorical variable?


Introduction (cont.)

  • H. Will use Binary (or Dummy) Independent Variables

    • 1. Create a special variable that takes a value of

      • a) 1 if the unit of observation falls into one category

      • b) 0 if the unit falls into the other category


Why Called “Dummy” Variables? (Multiple Choice Question)

A. A MAN NAMED “ALFRED DUMMY” INVENTED THEM

B. ANYONE WHO USES THEM IS. . . A DUMMY

C. THEY REPRESENT CATEGORICAL VARIABLES


Introduction (cont.)

  • 2. Example

    • a)GENDER = 1 for all females in the sample

    • b)GENDER = 0 for all males


Introduction (cont.)

  • c. You pick which category gives a value of 1 and which category gives a value of 0

  • EXAMPLE: the variable GENDER = 1 for all females in the sample and GENDER = 0 for all males

    OBS #   Gender   GENDER
    1       male     0
    2       male     0
    3       female   1
    4       male     0
    5       female   1
    6       female   1
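As a minimal sketch of this coding step (the variable names `sample` and `gender` are chosen here for illustration, not taken from the course materials), the six observations can be turned into a 0/1 column like this:

```python
# The six observations from the slide, in order.
sample = ["male", "male", "female", "male", "female", "female"]

# GENDER = 1 for females, 0 for males (the coding chosen in the slides).
gender = [1 if g == "female" else 0 for g in sample]

for obs, (label, value) in enumerate(zip(sample, gender), start=1):
    print(obs, label, value)
```

Swapping the 1 and the 0 in the list comprehension would give the opposite coding, which is equally valid as long as the interpretation follows the choice.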


Only Use 0 & 1 Values

  • Only use 0 & 1 values

  • Never use 1, 2, 3,… (for example)

  • Why not?

    • 2 is how many times bigger than 1? (2/1)

    • 3 is how many times bigger than 2? (3/2)

    • 1 is how many times bigger than 0? (1/0)


Sample With Binary Variable


IV.Binary Variables Change Intercept-Two Categories

  • A.Intercept term changes according to the two values of the one binary variable

    • intercept is one value when D = 0

    • intercept is different value when D = 1

  • B.Use only ONE binary variable per variable with two categories


Binary Variables Change Intercept-Two Categories (cont.)

  • C. In model $Y = \alpha + \beta X_2 + \varepsilon$

    • 1. for the same value of X,

      Y (in group #1) ≠ Y (in group #2)

      Male: when $X_2 = 16$, Y = 34

      Female: when $X_2 = 16$, Y = 29

    • 2. Since X is the same value for both groups,

      • a) either $\alpha$ or $\beta$ must be different to cause

        Y (in group #1) ≠ Y (in group #2)


Binary Variables Change Intercept-Two Categories (cont.)

  • D.Different cases

    • 1. $\alpha$ differs between groups OR

    • 2. $\beta$ differs between groups OR

    • 3. both $\alpha$ and $\beta$ differ between groups

      $Y = \alpha + \beta X_2 + \varepsilon$


Binary Variables Change Intercept-Two Categories (cont.)

  • E. In the model $Y = \alpha + \beta X_2 + \varepsilon$

    • 1. Y: profits per NBA team ($1,000,000s)

    • 2. X2: wins per season

    • 3. Suppose teams in large markets make more profit on same number of wins than teams in other markets


Binary Variables Change Intercept-Two Categories (cont.)

  • 4. How account for this in model?

  • 5. D is the binary variable

    • a)D = 1 if the team is in a large market

    • b)D = 0 if the team is in a mid-sized or small market


Binary Variables Change Intercept-Two Categories (cont.)

  • D = 1 if the team is in a large market

  • D = 0 if the team is not

  • $\alpha = \alpha_0 + \alpha_1 D$

  • $Y = \alpha + \beta X_2 + \varepsilon$

  • $Y = \alpha_0 + \alpha_1 D + \beta X_2 + \varepsilon$


  • Students

    • Write $Y = \alpha_0 + \alpha_1 D + \beta X_2 + \varepsilon$

      for two cases:

      • mid or small market (D = 0)

      • large market (D = 1)


    Binary Variables Change Intercept-Two Categories (cont.)

    • 7. $Y = \alpha_0 + \alpha_1 D + \beta X_2 + \varepsilon$

    • 8. mid/small: $Y = \alpha_0 + \beta X_2 + \varepsilon$ (D=0)

    • 9. large: $Y = (\alpha_0 + \alpha_1) + \beta X_2 + \varepsilon$ (D=1)

    • 10. What differs between 2 models?


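    Changing Intercept

    [Figure: profits per team (vertical axis) against wins per team (horizontal axis). Two parallel lines: large market, $Y = (\alpha_0 + \alpha_1) + \beta X_2 + \varepsilon$, and mid/small market, $Y = \alpha_0 + \beta X_2 + \varepsilon$. The mid/small line meets the vertical axis at $\alpha_0$ and the large-market line lies $\alpha_1$ above it (assuming $\alpha_1 > 0$).]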


    1 : 3 Equivalent Meanings

    • 13. 1 shows change in intercept relative to control group

    • 14. 1 shows change in intercept due to change in market size

    • 15. 1 measures difference in profits for same number of wins between teams in large markets vs. those in other markets


    Changing Intercept

    [Figure: profits per team (vertical axis) against wins per team (horizontal axis), with two parallel lines. The large-market line is labeled "LA Clippers" and the mid/small-market line "SD Clippers".]

    [Axis marks: $16.5M and $0.5M on the profit axis, 50 on the wins axis.]

    What’s the value of $\alpha_1$?

    $\alpha_1$ measures difference in profits for same number of wins between teams in large markets vs. those in other markets


    Binary Variables Change Intercept-Two Categories (cont.)

    • 16. Comparison group or control group

      • a)Group for which binary variable = 0

    • 17. Who decides which group is control group?

      • a)You do

      • b)It doesn’t matter statistically

      • c) Remember which group is the control group when you interpret results


    Binary Variables Change Intercept-Two Categories (cont.)

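    • 18. Hypothesis Test ($Y = \alpha_0 + \alpha_1 D + \beta X_2 + \varepsilon$)

      • a) H0: no difference in Y (for same X) between markets OR

      • b) H0: $\alpha_1 = 0$

        Both: $Y = \alpha_0 + \beta X_2 + \varepsilon$

      • c) HA: there is a difference in Y (for same X) between markets OR

      • d) HA: $\alpha_1 \neq 0$

        Large: $Y = (\alpha_0 + \alpha_1) + \beta X_2 + \varepsilon$

        Mid/small: $Y = \alpha_0 + \beta X_2 + \varepsilon$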

      • e)What test statistic use?
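One standard answer, offered as a sketch rather than as the course's own notes: because the null hypothesis restricts a single coefficient, the usual t statistic applies. The symbol $\alpha_1$ for the coefficient on D, the notation $\operatorname{se}(\hat{\alpha}_1)$ for its estimated standard error, and $n$ for the number of teams are assumptions of this sketch.

```latex
t = \frac{\hat{\alpha}_1 - 0}{\operatorname{se}(\hat{\alpha}_1)},
\qquad \text{degrees of freedom} = n - 3
```

The 3 counts the estimated coefficients in this model (intercept, coefficient on D, coefficient on $X_2$). Rejecting H0 when $|t|$ exceeds the critical value is equivalent to rejecting when the p-value reported for D falls below the chosen significance level.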


    Binary Variables Change Intercept-Two Categories (cont.)

    • F.Example

    See regression output for binary variables as IVs – case #1A. (note)


    Binary Variables Change Intercept-Two Categories (cont.)

    • How interpret p-value on LARGE in model?

    • Interpret coefficient on LARGE. (see note p.)

      • a) "Between 2 teams with same number of wins, the one in the large market is expected to earn $ ??? more (or less?) ”

    PROFIT = -8.339 + 0.282 WINS + 16.524 LARGE
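To make the coefficient on LARGE concrete, the fitted equation can be evaluated for two teams with the same number of wins. This is a small sketch: the function name `predicted_profit` and the choice of 50 wins are assumptions made for illustration, and profit is taken to be in $ millions, as the variable is defined earlier in this section.

```python
def predicted_profit(wins, large):
    """Fitted model from the slide; profit in $ millions, large is 0 or 1."""
    return -8.339 + 0.282 * wins + 16.524 * large


wins = 50  # hypothetical number of wins, the same for both teams
mid_small = predicted_profit(wins, large=0)
large = predicted_profit(wins, large=1)

print(round(mid_small, 3))          # 5.761
print(round(large, 3))              # 22.285
print(round(large - mid_small, 3))  # 16.524
```

Whatever value of `wins` is used, the gap between the two predictions stays at 16.524, which is what a shift in the intercept means.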


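    Changing Intercept

    [Figure: PROFIT (vertical axis) against WINS (horizontal axis) for PROFIT = -8.339 + 0.282 WINS + 16.524 LARGE. Two parallel lines with slope 0.282: the MID/SMALL market line has intercept -8.339 and the LARGE market line lies 16.524 above it.]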


    Coefficient Interpretation Exercise

    • SEE DRAWING

    • Questions repeated on next three slides

    • Q1: Interpret the number –8.339

    • Q2: Interpret the number 0.282

    • Q3: Interpret the number 16.524


    Changing Intercept

    [The same figure is repeated on three slides, each posing one question about PROFIT = -8.339 + 0.282 WINS + 16.524 LARGE.]

    Q1: Interpret the number -8.339

    Q2: Interpret the number 0.282

    Q3: Interpret the number 16.524


    Review

    • F.Example

      • A. $\mathrm{PRICE} = \beta_1 + \beta_2\,\mathrm{SQFT} + \varepsilon$

      • IGNORE OTHER IVs for this example

      • 2. Add POOL to model

        • a) POOL = 1 if house has pool

        • b) POOL = 0 otherwise


    Review (cont.)

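    A. $\mathrm{PRICE} = \beta_1 + \beta_2\,\mathrm{SQFT} + \varepsilon$

    B. $\mathrm{PRICE} = \beta_1 + \beta_2\,\mathrm{SQFT} + \beta_5\,\mathrm{POOL} + \varepsilon$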

    • ESTIMATE MODEL B


    Review (cont.)

    Variable    Model B
    CONSTANT    22.673   (0.09)
    SQFT        0.1444   (0.001)
    POOL        52.790   (0.03)
    Adj. R2     0.890


    Review (cont.)

    • How interpret p-value on POOL in Model B?

    • Interpret coefficient on POOL. (see note p.)

    • a) "Between 2 houses of same size, the one with a pool is expected to sell for $ _____ more (or less?)"

      Answer: $52,790 more


    Review (cont.)

    • Estimated model:

      PRICE = 22.673 + 0.1444SQFT +52.79POOL

      No Pool: (POOL=0)

      PRICE = 22.673 + 0.1444SQFT

      With Pool: (POOL=1)

      PRICE = 22.673 + 0.1444SQFT +52.79*1

      = (22.673+ 52.79 ) + 0.1444SQFT


    Review (cont.)

    • What’s price for 1000 sq. ft house . . .

    • And NO pool?

      (notes page)

    • WITH pool?
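The two prices follow from plugging SQFT = 1000 into the estimated Model B. In this sketch the function name `predicted_price` is made up, and PRICE is assumed to be measured in $1,000s (an inference from the slides reading the 52.790 coefficient as $52,790).

```python
def predicted_price(sqft, pool):
    """Estimated Model B; price assumed to be in $1,000s, pool is 0 or 1."""
    return 22.673 + 0.1444 * sqft + 52.79 * pool


no_pool = predicted_price(1000, pool=0)
with_pool = predicted_price(1000, pool=1)

print(round(no_pool, 3))    # 167.073
print(round(with_pool, 3))  # 219.863
```

Under that units assumption, the 1000 sq. ft. house is predicted to sell for about $167,073 without a pool and about $219,863 with one.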


    Review (cont.)

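    [Figure: PRICE (vertical axis) against SQFT (horizontal axis). Two parallel lines labeled "Model F: with POOL" and "Model F: no POOL", each with slope 0.1444; the no-POOL line has intercept 22.673 and the with-POOL line lies 52.790 above it.]

    Q2: Interpret the number 52.790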


    Exercise

    Binary Variables #3


    Suppose reverse 0 & 1 cases for POOL:

    POOL = 1 if NO pool

    POOL = 0 if HAVE pool

    $\mathrm{PRICE} = \beta_1 + \beta_2\,\mathrm{SQFT} + \beta_5\,\mathrm{POOL} + \varepsilon$

    ESTIMATE THIS MODEL


    One-Minute Essay Response

    • Estimated model:

      PRICE = 75.463 + 0.1444 SQFT - 52.79 POOL

      HAVE Pool: (POOL=0)

      PRICE = 75.463 + 0.1444 SQFT

      NO Pool: (POOL=1)

      PRICE = 75.463 + 0.1444 SQFT - 52.79*1

      = (75.463 - 52.79) + 0.1444 SQFT

      = 22.673 + 0.1444 SQFT
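A quick way to see that reversing the 0 and 1 cases changes nothing of substance is to compare the two estimated equations on the same house. The function names and the 1500 sq. ft. size below are hypothetical, chosen only for this sketch; the coefficients are those shown on the slides.

```python
def price_original_coding(sqft, has_pool):
    """Original coding: POOL = 1 if the house has a pool."""
    pool = 1 if has_pool else 0
    return 22.673 + 0.1444 * sqft + 52.79 * pool


def price_reversed_coding(sqft, has_pool):
    """Reversed coding: POOL = 1 if the house has NO pool."""
    pool = 0 if has_pool else 1
    return 75.463 + 0.1444 * sqft - 52.79 * pool


for has_pool in (False, True):
    a = price_original_coding(1500, has_pool)  # 1500 sq. ft. is a made-up size
    b = price_reversed_coding(1500, has_pool)
    print(has_pool, round(a, 3), round(b, 3))
```

Both codings give the same predicted price for each house. Only the intercept (now the price level of the new control group) and the sign of the POOL coefficient change.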


    V. Binary Variables Change Intercept-Many Categories

    • A.Intercept term changes according to the two values of each binary variable

    • B.Use MULTIPLE binary variables for variable with many categories

    • C.Contrast with “two categories” case above


    Binary Variables Change Intercept-Many Categories (cont.)

    • D. In model $Y = \alpha + \beta X_2 + \varepsilon$

      • 1. Y: profits per NBA team ($1,000,000s)

      • 2. X2: wins per season


    Binary Variables Change Intercept-Many Categories (cont.)

    • 3. Expect profit level per team to differ (for same wins) across different size markets

      • Probably expect profits to rise as market size increases (for same number of wins)

    • 4. How account for this in model?


    Binary Variables Change Intercept-Many Categories (cont.)

    • 6. L & M are the binary variables

      • a)L = 1 if team in large market

      • L = 0 otherwise

      • b)M = 1 if team in mid-sized market

      • M = 0 otherwise

      • c)Which market size is the control

        group?

    • 7. $\alpha = \alpha_0 + \alpha_1 L + \alpha_2 M$


    Binary Variables Change Intercept-Many Categories (cont.) EXAMPLE

    Team   L   M   Type
    1      0   0   small
    2      1   0   large
    3      0   1   mid
    4      1   0   large
    5      0   1   mid
    6      0   0   small
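The example table can be rebuilt from the market type alone. In this sketch the names `market` and `rows` are illustrative; the team numbers and types are the six from the slide, and small markets serve as the control group because both L and M are 0 for them.

```python
# Market type for teams 1-6, as in the slide's example table.
market = {1: "small", 2: "large", 3: "mid", 4: "large", 5: "mid", 6: "small"}

# One binary variable per non-control category; "small" is the control group.
rows = []
for team, size in market.items():
    L = 1 if size == "large" else 0
    M = 1 if size == "mid" else 0
    rows.append((team, L, M, size))

for row in rows:
    print(*row)
```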


    Students

    Write $Y = \alpha_0 + \alpha_1 L + \alpha_2 M + \beta X_2 + \varepsilon$

    for three cases:

    • Small (L= 0, M = 0)

    • Medium (L = 0, M = 1)

    • Large (L = 1, M = 0)


    Binary Variables Change Intercept-Many Categories (cont.)

    • 8. $Y = \alpha_0 + \alpha_1 L + \alpha_2 M + \beta X_2 + \varepsilon$

    • 9. Small

      • a) L = 0; M = 0

      • b) $Y = \alpha_0 + \beta X_2 + \varepsilon$

    • 10. Large

      • a) L = 1; M = 0

      • b) $Y = (\alpha_0 + \alpha_1) + \beta X_2 + \varepsilon$


    Binary Variables Change Intercept-Many Categories (cont.)

    • 11. Mid-sized

      • a)L = 0; M = 1

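      • b) $Y = (\alpha_0 + \alpha_2) + \beta X_2 + \varepsilon$

        $Y = \alpha_0 + \beta X_2 + \varepsilon$ (small)

        $Y = (\alpha_0 + \alpha_1) + \beta X_2 + \varepsilon$ (large)

        $Y = (\alpha_0 + \alpha_2) + \beta X_2 + \varepsilon$ (mid)

    • 12. What differs among the three models?

    • 13. $\alpha_1$ & $\alpha_2$ show changes in intercept relative to control group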


    1 & 2 : Equivalent Meanings

    • 14. 1 & 2 show changes in intercept due to change in market size

    • 15. 1 measures difference in Y between large and small market teams

    • 16. 2 measures difference in Y between mid-sized and small market teams


    Are Wins Worth More in a Large Market?

    See regression output for binary variables as IVs – case #1B.


    Binary Variables Change Intercept-Many Categories (cont.)

    • E.Dummy Variable Trap

    • 1. Notice:

      • a)One variable with two categories

        • (1)Use one binary variable

      • b)One variable with three categories

        • (1)Use two binary variables

          (note: despite the fact that we’re using “binary”, this still uses the word “dummy”)


    Binary Variables Change Intercept-Many Categories (cont.)

    • 2. What's the rule for how many binary variables to create?

    • 3. Use one fewer binary variable than the number of categories
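The reason behind the rule can be sketched in a few lines. If every category gets its own binary variable, the binary columns add up to 1 in every row, which duplicates the column of ones that the intercept already supplies; that perfect collinearity is the dummy variable trap. The helper `make_dummies` and the sample list `sizes` are hypothetical names used only for this illustration.

```python
def make_dummies(values, control):
    """Return {category: list of 0/1} for every category except the control group."""
    categories = sorted(set(values) - {control})
    return {c: [1 if v == c else 0 for v in values] for c in categories}


sizes = ["small", "large", "mid", "large", "mid", "small"]  # 3 categories

dummies = make_dummies(sizes, control="small")
print(len(dummies))  # 2 binary variables for 3 categories

# The trap: with a binary variable for EVERY category, the columns sum to 1
# in each row, exactly matching the intercept's column of ones.
all_three = {c: [1 if v == c else 0 for v in sizes] for c in set(sizes)}
row_sums = [sum(col[i] for col in all_three.values()) for i in range(len(sizes))]
print(row_sums)  # [1, 1, 1, 1, 1, 1]
```

Dropping one category breaks that exact dependence, and the dropped category becomes the control group against which the other coefficients are read.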



    Binary Variables Change Intercept-Many Categories (cont.)

    SALARY = 35,491 + 18,393 TOP + 8,392 SENIOR – 10,615 ENTRY + 914 YEARS + 10,975 ADVDEGREE – 8,684 NODEGREE + 9,195 PROFCERT + 8,417 MALE

    • Male workers earn, on average, ?? more (or less?) than females.

    • An advanced degree is worth ??% more (or less?) than what education level?

    • A professional certification is worth ??% more (or less?) than what?


    VI. Binary Variables Change Slope-Two Categories

    • A.Slope changes according to the two values of the one binary variable

    • B.Use only ONE binary variable per variable with two categories


    Binary Variables Change Slope-Two Categories (cont.)

    • C. In the model $Y = \alpha + \beta X_2 + \varepsilon$

      • 1. Y: profits per NBA team ($1,000,000s)

      • 2. X2: wins per season

      • 3. Suppose teams in large markets make more profit on each additional win than teams in other markets

        • a)How does this differ from intercept shifting case?


    Binary Variables Change Slope-Two Categories (cont.)

    • C. In the model $Y = \alpha + \beta X_2 + \varepsilon$

      • Suppose teams in large markets make more profit on each additional win than teams in other markets

        Change Intercept                                   Change Slope
        Different total profit from total no. of wins      Different extra profit from one MORE win


    Binary Variables Change Slope-Two Categories (cont.)

    • 4. How account for this in model?

    • 5. D is the binary variable

      • a)D = 1 if the team is in a large market

      • b)D = 0 if the team is in a mid-sized or small market

    • 6. $\beta = \beta_0 + \beta_1 D$


    Binary Variables Change Slope-Two Categories (cont.)

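    • Recall: $Y = \alpha + \beta X_2 + \varepsilon$

      $\beta = \beta_0 + \beta_1 D$

    • 7. $Y = \alpha + (\beta_0 + \beta_1 D) X_2 + \varepsilon$

    • 8. $Y = \alpha + \beta_0 X_2 + \beta_1 D X_2 + \varepsilon$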

      ESTIMATE MODEL 8. ABOVE


    Binary Variables Change Slope (cont.)

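    • recall: $Y = \alpha + \beta_0 X_2 + \beta_1 D X_2 + \varepsilon$

    • mid/small: $Y = \alpha + \beta_0 X_2 + \varepsilon$ (D=0)

    • large: $Y = \alpha + (\beta_0 + \beta_1) X_2 + \varepsilon$ (D=1)

    • large: $Y = \alpha + \beta^* X_2 + \varepsilon$ (D=1), where $\beta^* = \beta_0 + \beta_1$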

    • 13. What differs between 2 models?


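    Changing Slope

    [Figure: profits per team (vertical axis) against wins per team (horizontal axis). Two lines: large market, $Y = \alpha + (\beta_0 + \beta_1) X_2 + \varepsilon$, and mid/small market, $Y = \alpha + \beta_0 X_2 + \varepsilon$. The large-market line is steeper (assuming $\beta_1 > 0$).]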


    1 : 3 Equivalent Meanings

    • 15. 1 shows change in slope relative to control group

    • 16. 1 shows change in slope due to change in market size

    • 17. 1 measures difference in profits from each extra win between teams in different size markets, on average


    Changing Slope

    [Figure, built up over three slides: profits per team (vertical axis) against wins per team (horizontal axis), with the large-market line $Y = \alpha + (\beta_0 + \beta_1) X_2 + \varepsilon$ and the mid/small-market line $Y = \alpha + \beta_0 X_2 + \varepsilon$. For one extra win, profit rises by $\beta_0$ on the mid/small line and by $\beta_0 + \beta_1$ on the large-market line. The last slide replaces these labels with the numbers 246 and 556.]

    $\beta_1$ measures difference in profits from each extra win between teams in different size markets, on average


    Are Wins Worth More in a Large Market?

    See regression output for binary variables as IVs – case #2.


    Binary Variables Change Slope-Two Categories (cont.)

    • 18. Hypothesis Test

      • a) H0: no difference in extra profit (per extra win) between markets

      • b) H0: $\beta_1 = 0$

        $Y = \alpha + \beta_0 X_2 + \varepsilon$

      • c) HA: there is a difference in extra profit (per extra win) between markets

      • d) HA: $\beta_1 \neq 0$

        $Y = \alpha + \beta_0 X_2 + \beta_1 D X_2 + \varepsilon$

      • e)What test statistic use?


    Binary Variables Change Slope-Two Categories (cont.)

    • D.Exercise (notes page )

      • 1. Suppose you estimate

        $Y = \alpha + \beta_0 X_2 + \beta_1 D X_2 + \varepsilon$ and . . .

      • 2. $\hat{\beta}_1 = .310$

      • 3. $\hat{\beta}_0 = .246$

      • 4. What is value of slope for large market teams?

      • 5. What is interpretation of each number?
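One way to check an answer to this exercise is to compute the slope for each value of D. The names `b0_hat`, `b1_hat` and `slope` are illustrative, and the dollar reading in the comments assumes profit is measured in $ millions, as Y is defined earlier in this section.

```python
b0_hat = 0.246  # estimated slope for mid/small-market teams (D = 0)
b1_hat = 0.310  # estimated extra slope for large-market teams (D = 1)


def slope(d):
    """Estimated profit per extra win for a team whose binary variable D equals d."""
    return b0_hat + b1_hat * d


print(round(slope(0), 3))  # 0.246 -> about $0.246 million per extra win
print(round(slope(1), 3))  # 0.556 -> about $0.556 million per extra win
```

The difference between the two slopes is the estimated coefficient on the interaction term $D X_2$.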


    VII. Using Binary Variables: Intercept or Slope Coefficient Vary?

    • A.Intercept Different Across Qualitative Variable's (Political Party, Gender,...) Categories

      • 1. This happens whenever the relationship between Y and X (in $\beta$) is unchanged for different categories of the qualitative variable, yet Y differs, even for the same values of X.


    Intercept or Slope Vary? (cont.)

    • 2. Examples

      • Profit for same total number of wins differs across market size

      • Earnings for females less than earnings for males even with same years of schooling


    Intercept or Slope Vary? (cont.)

    • B.Slope Coefficient Different Across Qualitative Variable's Categories

      • 1. This happens whenever the relationship between Y and X (in $\beta$) differs for different categories of the qualitative variable and causes Y to differ, even for the same values of X.


    Intercept or Slope Vary? (cont.)

    • 2. Example

      • Profit from each further win differs across market size

      • It’s plausible (likely?) that an additional year of experience yields less extra earnings for females than for males


    Intercept or Slope Vary? (cont.)

    • C.How know which is correct?

      • 1. Possibilities

        • a)intercept binary variable correct

        • b)slope binary variable correct

        • c)both

        • d)neither

      • 2. Try and let tests tell you
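One common way to "let the tests tell you", sketched here as an assumption about what the slide has in mind, is to estimate a single model that contains both an intercept binary variable and a slope binary variable. The Greek symbols below are this sketch's notation: $\alpha_1$ is the intercept shift and $\beta_1$ the slope shift for the group with D = 1.

```latex
Y = \alpha_0 + \alpha_1 D + \beta_0 X_2 + \beta_1 D X_2 + \varepsilon
```

A t-test of $\alpha_1 = 0$ then speaks to possibility (a), a t-test of $\beta_1 = 0$ to possibility (b); rejecting both points to (c), and rejecting neither points to (d).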


    Exercise

    • For each case below, state whether you would use a slope or intercept dummy. (NOTES)

      • 1. Southern workers earn less than those in other regions of the country even with same years of schooling & experience

      • 2. An additional year of education yields less extra earnings for single males than for married males

      • 3. More swimsuits are bought in Minnesota during the summer than during the winter

      • 4. Each extra year of experience yields less additional earnings for female coaches than for males


    Exercise

    Binary Variables #2


    Exercise

    • “The Advertising Experiment”


    Test Next Class

    • 8-10 questions - most have several parts

    • some: calculations

    • some: interpret regression results

    • some: show you understand a concept

    • bring calculator – NO CELL PHONES!

    • bring 5 x 8 card

