I. Qualitative (or Dummy) Independent Variables

PowerPoint Slideshow about ' I. Qualitative (or Dummy) Independent Variables' - lamar-james


“Binary” vs. “Dummy”
  • I was taught to call these variables “dummy variables”
  • I now believe that “binary variables” is a better (more descriptive) name
  • So, let’s call them “binary variables” from now on
III. Introduction
  • A. New type of variable
    • 1. Past: used quantitative variables (numerically measurable); continuous
    • 2. Now: variables that take a small number of values; discrete
      • a) Gender
      • b) Market size
      • c) Region of country
      • d) Marital status (married vs. not), etc
Introduction (cont.)
  • B. Used as independent variable (IV) in this section
  • C. Used as dependent variable (DV) later in course
Introduction (cont.)
  • Institute of Management Accountants (IMA) publishes an annual Salary Guide
    • In Strategic Finance magazine
    • Annual survey of members
    • “…based on a regression equation derived from survey results.”
IMA Salary Guide (cont.)

SALARY = 35,491 + 18,393 TOP + 8,392 SENIOR - 10,615 ENTRY + 914 YEARS + 10,975 ADVDEGREE - 8,684 NODEGREE + 9,195 PROFCERT + 8,417 MALE

  • TOP=1 if top level mgmt, 0 if not
  • SENIOR=1 if senior level mgmt , 0 if not
  • ENTRY=1 if entry level , 0 if not
  • ADVDEGREE=1 if advanced degree , 0 if not
  • NODEGREE=1 if no degree , 0 if not
  • PROFCERT=1 if hold professional certification , 0 if not
  • MALE=1 if male , 0 if not
  • YEARS=years of experience
IMA Salary Guide (cont.)
  • Average IMA member (1999)
    • Male
    • 14.5 years experience
    • Professional certification
    • Salary = $66,356
      • Figure obtained from substituting values into regression equation
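
The substitution behind the $66,356 figure can be sketched in a few lines of Python. The coefficients are copied from the Salary Guide equation; the function name `predicted_salary` is hypothetical, and the sketch assumes that every binary variable not listed for the average member (TOP, SENIOR, ENTRY, ADVDEGREE, NODEGREE) equals 0.

```python
# Coefficients copied from the Salary Guide regression equation.
INTERCEPT = 35491
COEFFICIENTS = {
    "TOP": 18393, "SENIOR": 8392, "ENTRY": -10615, "YEARS": 914,
    "ADVDEGREE": 10975, "NODEGREE": -8684, "PROFCERT": 9195, "MALE": 8417,
}

def predicted_salary(**values):
    """Substitute values into the equation; variables left out count as 0."""
    return INTERCEPT + sum(COEFFICIENTS[name] * value for name, value in values.items())

# Average IMA member (1999): male, 14.5 years of experience, certified.
average_member = predicted_salary(MALE=1, YEARS=14.5, PROFCERT=1)
print(average_member)  # 66356.0
```

Setting a binary variable to 1 simply adds its coefficient to the prediction, which is the idea the rest of the section builds on.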
Are Wins Worth More in a Large Market?

See regression output for binary variables as IVs. (note)

Introduction (cont.)
  • D. Example #1
    • 1. Y = α + βX2 + ε
    • 2. Y: social program expenditures per state
    • 3. X2: state’s total revenue
    • 4. Suppose state legislatures controlled by Democrats spend more from the same revenue than those controlled by Republicans
    • 5. How do we account for this in the model?
    • 6. What’s the categorical variable?
Introduction (cont.)
  • E. Example #2
    • 1. Y = α + βX2 + ε
    • 2. Y: coach’s earnings
    • 3. X2: coach’s experience
    • 4. Suppose women earn less than men with equal experience (& other characteristics)
    • 5. How do we account for this in the model?
    • 6. What’s the categorical variable?
Introduction (cont.)
  • F. Example #3
    • 1. Y = α + βX2 + ε
    • 2. Y: sales of swimsuits in Minnesota
    • 3. X2: Minnesota’s population
    • 4. Suppose sales peak in warm months
    • 5. How do we account for this in the model?
    • 6. What’s the categorical variable?
Introduction (cont.)
  • G. Example #4
    • 1. Y = α + βX2 + ε
    • 2. Y: profits of NBA teams
    • 3. X2: wins
    • 4. Suppose teams in large markets make more profit on their wins than teams in other markets
    • 5. How do we account for this in the model?
    • 6. What’s the categorical variable?
Introduction (cont.)
  • H. Will use Binary (or Dummy) Independent Variables
    • 1. Create a special variable that takes a value of
      • a) 1 if the unit of observation falls into one category
      • b) 0 if the unit falls into the other category

Why Called “Dummy” Variables? (Multiple Choice Question)

A. A MAN NAMED “ALFRED DUMMY” INVENTED THEM

B. ANYONE WHO USES THEM IS. . . A DUMMY

C. THEY REPRESENT CATEGORICAL VARIABLES

Introduction (cont.)
  • 2. Example
    • a) GENDER = 1 for all females in the sample
    • b) GENDER = 0 for all males
Introduction (cont.)
  • c. You pick which category gives a value of 1 and which category gives a value of 0
  • EXAMPLE: the variable GENDER = 1 for all females in the sample and GENDER = 0 for all males

OBS #   Sex      GENDER
1       male     0
2       male     0
3       female   1
4       male     0
5       female   1
6       female   1
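
As a minimal sketch, the coding in this example can be produced mechanically. The helper name `encode_binary` is hypothetical; the six observations are the ones listed in the example.

```python
def encode_binary(values, one_category):
    """Return 1 for observations in one_category and 0 for all others."""
    return [1 if value == one_category else 0 for value in values]

# Observations 1 to 6 from the example.
sexes = ["male", "male", "female", "male", "female", "female"]

# GENDER = 1 for females, 0 for males (the choice of which group gets 1 is yours).
gender = encode_binary(sexes, "female")
print(gender)  # [0, 0, 1, 0, 1, 1]
```

Calling `encode_binary(sexes, "male")` instead would flip every value, which is the "you pick" point made on the slide.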

Only Use 0 & 1 Values
  • Never use 1, 2, 3,… (for example)
  • Why not?
    • 2 is how many times bigger than 1? (2/1)
    • 3 is how many times bigger than 2? (3/2)
    • 1 is how many times bigger than 0? (1/0)
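
One way to see the problem, sketched under assumptions: suppose a three-category variable were coded 1, 2, 3 and entered the model as a single regressor with one coefficient `b`. The value of `b` below is hypothetical. The single coefficient forces the step from category 1 to 2 to equal the step from 2 to 3, whether or not the data agree.

```python
def shift_from_numeric_code(code, b):
    """Contribution to Y when a category is entered as the number `code`."""
    return b * code

b = 5.0  # hypothetical coefficient, for illustration only

# Change in predicted Y when moving from category 1 to 2, then from 2 to 3.
steps = [
    shift_from_numeric_code(code + 1, b) - shift_from_numeric_code(code, b)
    for code in (1, 2)
]
print(steps)  # [5.0, 5.0]
```

Separate 0/1 variables avoid this restriction because each category gets its own coefficient.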
IV. Binary Variables Change Intercept-Two Categories
  • A. Intercept term changes according to the two values of the one binary variable
    • intercept is one value when D = 0
    • intercept is different value when D = 1
  • B. Use only ONE binary variable per variable with two categories
Binary Variables Change Intercept-Two Categories (cont.)
  • C. In model Y = α + βX2 + ε
    • 1. For the same value of X2, Y (in group #1) ≠ Y (in group #2)
      • Male: when X2 = 16, Y = 34
      • Female: when X2 = 16, Y = 29
    • 2. Since X2 is the same value for both groups,
      • a) either α or β must be different to cause Y (in group #1) ≠ Y (in group #2)

Binary Variables Change Intercept-Two Categories (cont.)
  • D. Different cases
    • 1. α differs between groups OR
    • 2. β differs between groups OR
    • 3. both α and β differ between groups

Y = α + βX2 + ε

Binary Variables Change Intercept-Two Categories (cont.)
  • E. In the model Y = α + βX2 + ε
    • 1. Y: profits per NBA team ($1,000,000s)
    • 2. X2: wins per season
    • 3. Suppose teams in large markets make more profit on the same number of wins than teams in other markets
Binary Variables Change Intercept-Two Categories (cont.)
  • 4. How do we account for this in the model?
  • 5. D is the binary variable
    • a) D = 1 if the team is in a large market
    • b) D = 0 if the team is in a mid-sized or small market
Binary Variables Change Intercept-Two Categories (cont.)
  • Let the intercept depend on D: α = α0 + α1D
  • Substitute into Y = α + βX2 + ε
  • Y = α0 + α1D + βX2 + ε
Students
  • Write Y = α0 + α1D + βX2 + ε for two cases:
    • mid or small market (D = 0)
    • large market (D = 1)
Binary Variables Change Intercept-Two Categories (cont.)
  • 7. Y = α0 + α1D + βX2 + ε
  • 8. mid/small: Y = α0 + βX2 + ε (D = 0)
  • 9. large: Y = (α0 + α1) + βX2 + ε (D = 1)
  • 10. What differs between the 2 models?
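Changing Intercept

[Figure: profits per team (vertical axis) vs. wins per team (horizontal axis). Two parallel lines. Large market: Y = (α0 + α1) + βX2 + ε. Mid/small market: Y = α0 + βX2 + ε. α0 marks the mid/small-market intercept and α1 the vertical gap between the lines. Drawn assuming α1 > 0.]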

α1: 3 Equivalent Meanings
  • 13. α1 shows change in intercept relative to control group
  • 14. α1 shows change in intercept due to change in market size
  • 15. α1 measures difference in profits for same number of wins between teams in large markets vs. those in other markets
Changing Intercept

[Figure: profits per team (vertical axis) vs. wins per team (horizontal axis). Large market (LA Clippers): Y = (α0 + α1) + βX2 + ε. Mid/small market (SD Clippers): Y = α0 + βX2 + ε. Labeled values: $16.5M and $0.5M on the profit axis, 50 on the wins axis.]

  • What’s the value of α1?
  • α1 measures difference in profits for same number of wins between teams in large markets vs. those in other markets

Binary Variables Change Intercept-Two Categories (cont.)
  • 16. Comparison group or control group
    • a) Group for which binary variable = 0
  • 17. Who decides which group is control group?
    • a) You do
    • b) It doesn’t matter statistically
    • c) Remember which group is the control when you interpret results
Binary Variables Change Intercept-Two Categories (cont.)
  • 18. Hypothesis Test (Y = α0 + α1D + βX2 + ε)
    • a) H0: no difference in Y (for same X2) between markets OR
    • b) H0: α1 = 0
      • Both markets: Y = α0 + βX2 + ε
    • c) HA: there is a difference in Y (for same X2) between markets OR
    • d) HA: α1 ≠ 0
      • Large: Y = (α0 + α1) + βX2 + ε
      • Mid/small: Y = α0 + βX2 + ε
    • e) What test statistic do we use?
Binary Variables Change Intercept-Two Categories (cont.)
  • F. Example

See regression output for binary variables as IVs – case #1A. (note)

Binary Variables Change Intercept-Two Categories (cont.)
  • How do we interpret the p-value on LARGE in the model?
  • Interpret the coefficient on LARGE. (see note p.)
    • a) "Between 2 teams with the same number of wins, the one in the large market is expected to earn $ ??? more (or less?)"

PROFIT = -8.339 + 0.282 WINS + 16.524 LARGE
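
A quick numerical check of the interpretation, as a sketch: the function below just evaluates the estimated equation, with profit in $1,000,000s as defined earlier. The function name `predicted_profit` and the choice of 41 wins are hypothetical; any number of wins gives the same gap.

```python
def predicted_profit(wins, large):
    """Evaluate PROFIT = -8.339 + 0.282 WINS + 16.524 LARGE (in $1,000,000s)."""
    return -8.339 + 0.282 * wins + 16.524 * large

wins = 41  # hypothetical; the gap below does not depend on this value

# Same number of wins, different market size.
gap = predicted_profit(wins, 1) - predicted_profit(wins, 0)
print(round(gap, 3))  # 16.524
```

The WINS term cancels in the subtraction, so the gap is exactly the coefficient on LARGE: the intercept shifts while the slope stays the same.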

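Changing Intercept

[Figure: PROFIT (vertical axis) vs. WINS (horizontal axis) for PROFIT = -8.339 + 0.282 WINS + 16.524 LARGE. Two parallel lines with slope 0.282. The LARGE market line lies 16.524 above the MID/SMALL market line, whose intercept is -8.339.]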

Coefficient Interpretation Exercise
  • SEE DRAWING (PROFIT = -8.339 + 0.282 WINS + 16.524 LARGE)
  • Q1: Interpret the number -8.339
  • Q2: Interpret the number 0.282
  • Q3: Interpret the number 16.524

Review
  • F. Example
    • 1. Model A: PRICE = β1 + β2SQFT + ε
      • IGNORE OTHER IVs for this example
    • 2. Add POOL to model
      • a) POOL = 1 if house has pool
      • b) POOL = 0 otherwise
Review (cont.)

A. PRICE = β1 + β2SQFT + ε

B. PRICE = β1 + β2SQFT + β5POOL + ε

  • ESTIMATE MODEL B
Review (cont.)

Variable    Model B
CONSTANT    22.673
            (0.09)
SQFT        0.1444
            (0.001)
POOL        52.790
            (0.03)
Adj. R2     0.890

Review (cont.)
  • How do we interpret the p-value on POOL in Model B?
  • Interpret the coefficient on POOL. (see note p.)
    • a) "Between 2 houses of the same size, the one with a pool is expected to sell for $52,790 more (or less?)"

Review (cont.)
  • Estimated model:

PRICE = 22.673 + 0.1444 SQFT + 52.79 POOL

No Pool (POOL = 0):

PRICE = 22.673 + 0.1444 SQFT

With Pool (POOL = 1):

PRICE = 22.673 + 0.1444 SQFT + 52.79 × 1
      = (22.673 + 52.79) + 0.1444 SQFT

Review (cont.)
  • What’s the price for a 1000 sq. ft house . . .
    • With NO pool? (notes page)
    • WITH pool?
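
The two prices can be worked out by substituting into the estimated model PRICE = 22.673 + 0.1444 SQFT + 52.79 POOL. This is a sketch with a hypothetical function name, `predicted_price`; it assumes PRICE is measured in $1,000s, which is what the $52,790 reading of the POOL coefficient implies.

```python
def predicted_price(sqft, pool):
    """Evaluate PRICE = 22.673 + 0.1444 SQFT + 52.79 POOL (assumed $1,000s)."""
    return 22.673 + 0.1444 * sqft + 52.79 * pool

no_pool = predicted_price(1000, 0)
with_pool = predicted_price(1000, 1)
print(round(no_pool, 3), round(with_pool, 3))  # 167.073 219.863
```

Under that units assumption, the 1000 sq. ft house is predicted at about $167,073 without a pool and $219,863 with one; the difference is the POOL coefficient.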
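Review (cont.)

[Figure: PRICE (vertical axis) vs. SQFT (horizontal axis). Two parallel lines with slope 0.1444: Model F with POOL and Model F no POOL. The lines are 52.790 apart vertically, and the no-POOL line has intercept 22.673.]

Q2: Interpret the number 52.790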

Exercise

Binary Variables #3

Suppose we reverse the 0 & 1 cases for POOL:

POOL = 1 if NO pool
     = 0 if HAVE pool

PRICE = β1 + β2SQFT + β5POOL + ε

ESTIMATE THIS MODEL

One-Minute Essay Response
  • Estimated model:

PRICE = 75.463 + 0.1444 SQFT - 52.79 POOL

HAVE Pool (POOL = 0):

PRICE = 75.463 + 0.1444 SQFT

NO Pool (POOL = 1):

PRICE = 75.463 + 0.1444 SQFT - 52.79 × 1
      = (75.463 - 52.79) + 0.1444 SQFT
      = 22.673 + 0.1444 SQFT
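
A small check that the choice of control group does not change the predictions, sketched with hypothetical function names and a hypothetical 1500 sq. ft house. The first function uses the original coding (POOL = 1 if the house has a pool), the second the reversed coding (POOL = 1 if it does not).

```python
import math

def price_pool_is_one(sqft, has_pool):
    """Original coding: POOL = 1 if the house HAS a pool."""
    pool = 1 if has_pool else 0
    return 22.673 + 0.1444 * sqft + 52.79 * pool

def price_no_pool_is_one(sqft, has_pool):
    """Reversed coding: POOL = 1 if the house has NO pool."""
    pool = 0 if has_pool else 1
    return 75.463 + 0.1444 * sqft - 52.79 * pool

# Compare the two codings for a house without and with a pool.
for has_pool in (False, True):
    original = price_pool_is_one(1500, has_pool)
    reversed_coding = price_no_pool_is_one(1500, has_pool)
    print(has_pool, math.isclose(original, reversed_coding))  # second value is True both times
```

Only the intercept and the sign of the POOL coefficient change; the fitted price for any given house is the same up to floating-point rounding.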

V. Binary Variables Change Intercept-Many Categories
  • A. Intercept term changes according to the two values of each binary variable
  • B. Use MULTIPLE binary variables for variable with many categories
  • C. Contrast with “two categories” case above
Binary Variables Change Intercept-Many Categories (cont.)
  • D. In model Y = α + βX2 + ε
    • 1. Y: profits per NBA team ($1,000,000s)
    • 2. X2: wins per season
Binary Variables Change Intercept-Many Categories (cont.)
  • 3. Expect profit level per team to differ (for same wins) across different size markets
    • Probably expect profits to rise as market size increases (for same number of wins)
  • 4. How do we account for this in the model?
Binary Variables Change Intercept-Many Categories (cont.)
  • 6. L & M are the binary variables
    • a) L = 1 if team in large market; L = 0 otherwise
    • b) M = 1 if team in mid-sized market; M = 0 otherwise
    • c) Which market size is the control group?
  • 7. α = α0 + α1L + α2M
Binary Variables Change Intercept-Many Categories (cont.) EXAMPLE

Team   L   M   Type
1      0   0   small
2      1   0   large
3      0   1   mid
4      1   0   large
5      0   1   mid
6      0   0   small
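
The L and M columns in this example can be generated from the market type directly. A minimal sketch, with the six teams taken from the table and variable names chosen for illustration:

```python
# Market type for teams 1 to 6, as in the example table.
market_types = ["small", "large", "mid", "large", "mid", "small"]

# L = 1 only for large markets, M = 1 only for mid-sized markets.
# Small-market teams get (0, 0), which makes them the control group.
rows = [
    (1 if market == "large" else 0, 1 if market == "mid" else 0)
    for market in market_types
]
print(rows)  # [(0, 0), (1, 0), (0, 1), (1, 0), (0, 1), (0, 0)]
```

No team ever has L = 1 and M = 1 at the same time, since a team belongs to exactly one market size.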

Students

Write Y = α0 + α1L + α2M + βX2 + ε for three cases:

  • Small (L = 0, M = 0)
  • Medium (L = 0, M = 1)
  • Large (L = 1, M = 0)
Binary Variables Change Intercept-Many Categories (cont.)
  • 8. Y = α0 + α1L + α2M + βX2 + ε
  • 9. Small
    • a) L = 0; M = 0
    • b) Y = α0 + βX2 + ε
  • 10. Large
    • a) L = 1; M = 0
    • b) Y = (α0 + α1) + βX2 + ε
Binary Variables Change Intercept-Many Categories (cont.)
  • 11. Mid-sized
    • a) L = 0; M = 1
    • b) Y = (α0 + α2) + βX2 + ε

Y = α0 + βX2 + ε           small
Y = (α0 + α1) + βX2 + ε    large
Y = (α0 + α2) + βX2 + ε    mid

  • 12. What differs among the three models?
  • 13. α1 & α2 show changes in intercept relative to control group
α1 & α2: Equivalent Meanings
  • 14. α1 & α2 show changes in intercept due to change in market size
  • 15. α1 measures difference in Y between large and small market teams
  • 16. α2 measures difference in Y between mid-sized and small market teams
Are Wins Worth More in a Large Market?

See regression output for binary variables as IVs – case #1B.

Binary Variables Change Intercept-Many Categories (cont.)
  • E. Dummy Variable Trap
  • 1. Notice:
      • a) One variable with two categories
        • (1) Use one binary variable
      • b) One variable with three categories
        • (1) Use two binary variables

(note: despite the fact that we’re using “binary”, this still uses the word “dummy”)

Binary Variables Change Intercept-Many Categories (cont.)
  • 2. What’s the rule for how many binary variables to create?
  • 3. Use one fewer binary variable than the number of categories
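
The rule, and the trap it avoids, can be sketched as follows. The helper `make_dummies` is hypothetical; it builds one 0/1 column for every category except the control group. The second half shows what goes wrong if every category gets its own column: the columns add up to 1 in every row, which duplicates the intercept's column of ones (perfect collinearity), so the separate coefficients cannot be estimated.

```python
def make_dummies(values, control):
    """One 0/1 column per category, leaving out the control category."""
    categories = sorted(set(values) - {control})
    return {c: [1 if v == c else 0 for v in values] for c in categories}

values = ["small", "large", "mid", "large", "mid", "small"]

# Three categories, so two binary variables.
dummies = make_dummies(values, control="small")
print(sorted(dummies))  # ['large', 'mid']

# The trap: a column for EVERY category sums to 1 in each row.
all_three = {c: [1 if v == c else 0 for v in values] for c in set(values)}
row_sums = [sum(column[i] for column in all_three.values()) for i in range(len(values))]
print(row_sums)  # [1, 1, 1, 1, 1, 1]
```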

Binary Variables Change Intercept-Many Categories (cont.)

SALARY = 35,491 + 18,393 TOP + 8,392 SENIOR - 10,615 ENTRY + 914 YEARS + 10,975 ADVDEGREE - 8,684 NODEGREE + 9,195 PROFCERT + 8,417 MALE

  • TOP=1 if top level mgmt, 0 if not
  • SENIOR=1 if senior level mgmt , 0 if not
  • ENTRY=1 if entry level , 0 if not
  • ADVDEGREE=1 if advanced degree , 0 if not
  • NODEGREE=1 if no degree , 0 if not
  • PROFCERT=1 if hold professional certification , 0 if not
  • MALE=1 if male , 0 if not
  • YEARS=years of experience
Binary Variables Change Intercept-Many Categories (cont.)

SALARY = 35,491 + 18,393 TOP + 8,392 SENIOR - 10,615 ENTRY + 914 YEARS + 10,975 ADVDEGREE - 8,684 NODEGREE + 9,195 PROFCERT + 8,417 MALE

  • Male workers earn, on average, ?? more (or less?) than females.
  • An advanced degree is worth ??% more (or less?) than what education level?
  • A professional certification is worth ??% more (or less?) than what?
VI. Binary Variables Change Slope-Two Categories
  • A. Slope changes according to the two values of the one binary variable
  • B. Use only ONE binary variable per variable with two categories
Binary Variables Change Slope-Two Categories (cont.)
  • C. In the model Y = α + βX2 + ε
    • 1. Y: profits per NBA team ($1,000,000s)
    • 2. X2: wins per season
    • 3. Suppose teams in large markets make more profit on each additional win than teams in other markets
      • a) How does this differ from the intercept-shifting case?
Binary Variables Change Slope-Two Categories (cont.)
  • C. In the model Y = α + βX2 + ε
    • Suppose teams in large markets make more profit on each additional win than teams in other markets

Change Intercept: different TOTAL profit from the same total no. of wins
Change Slope: different EXTRA profit from one MORE win

Binary Variables Change Slope-Two Categories (cont.)
  • 4. How do we account for this in the model?
  • 5. D is the binary variable
    • a) D = 1 if the team is in a large market
    • b) D = 0 if the team is in a mid-sized or small market
  • 6. β = β0 + β1D
Binary Variables Change Slope-Two Categories (cont.)
  • Recall: Y = α + βX2 + ε and β = β0 + β1D
  • 7. Y = α + (β0 + β1D)X2 + ε
  • 8. Y = α + β0X2 + β1DX2 + ε

ESTIMATE MODEL 8. ABOVE

Binary Variables Change Slope (cont.)
  • recall: Y = α + β0X2 + β1DX2 + ε
  • mid/small: Y = α + β0X2 + ε (D = 0)
  • large: Y = α + (β0 + β1)X2 + ε (D = 1)
  • large: Y = α + β*X2 + ε, where β* = β0 + β1 (D = 1)
  • 13. What differs between the 2 models?
Changing Slope

[Figure: profits per team (vertical axis) vs. wins per team (horizontal axis). Two lines with a common intercept and different slopes. Large market: Y = α + (β0 + β1)X2 + ε. Mid/small market: Y = α + β0X2 + ε. Drawn assuming β1 > 0.]

β1: 3 Equivalent Meanings
  • 15. β1 shows change in slope relative to control group
  • 16. β1 shows change in slope due to change in market size
  • 17. β1 measures difference in profits from each extra win between teams in different size markets, on average
Changing Slope

[Figure: profits per team (vertical axis) vs. wins per team (horizontal axis). Large market: Y = α + (β0 + β1)X2 + ε, slope β0 + β1. Mid/small market: Y = α + β0X2 + ε, slope β0. For one extra win, the final version of the figure labels the two profit increments 246 (mid/small market) and 556 (large market).]

  • β1 measures difference in profits from each extra win between teams in different size markets, on average

Are Wins Worth More in a Large Market?

See regression output for binary variables as IVs – case #2.

Binary Variables Change Slope-Two Categories (cont.)

Model under HA: Y = α + β0X2 + β1DX2 + ε

Model under H0: Y = α + β0X2 + ε

  • 18. Hypothesis Test
    • a) H0: no difference in extra profit (per extra win) between markets
    • b) H0: β1 = 0
    • c) HA: there is a difference in extra profit (per extra win) between markets
    • d) HA: β1 ≠ 0
    • e) What test statistic do we use?
Binary Variables Change Slope-Two Categories (cont.)
  • D. Exercise (notes page)
    • 1. Suppose you estimate Y = α + β0X2 + β1DX2 + ε and . . .
    • 2. β1-hat = .310
    • 3. β0-hat = .246
    • 4. What is the value of the slope for large market teams?
    • 5. What is the interpretation of each number?
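
A sketch of the arithmetic for this exercise, using the two estimates given. The variable names are chosen for illustration; the slope for each group follows from setting D = 0 or D = 1 in Y = α + β0X2 + β1DX2 + ε.

```python
beta0_hat = 0.246  # slope for mid/small market teams (D = 0)
beta1_hat = 0.310  # extra slope for large market teams

# With D = 1 the coefficient on X2 becomes beta0 + beta1.
slope_mid_small = beta0_hat
slope_large = beta0_hat + beta1_hat
print(round(slope_large, 3))  # 0.556
```

If Y is in $1,000,000s as defined earlier in this section, each extra win would be worth about $246,000 to a mid/small market team and about $556,000 to a large market team, a gap of $310,000 per win.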
VII. Using Binary Variables: Intercept or Slope Coefficient Vary?
  • A. Intercept Different Across Qualitative Variable’s (Political Party, Gender,...) Categories
    • 1. This happens whenever the relationship between Y and X (in β) is unchanged across categories of the qualitative variable, yet Y differs, even for the same values of X.
Intercept or Slope Vary? (cont.)
  • 2. Examples
    • Profit for same total number of wins differs across market size
    • Earnings for females less than earnings for males even with same years of schooling
Intercept or Slope Vary? (cont.)
  • B. Slope Coefficient Different Across Qualitative Variable’s Categories
    • 1. This happens whenever the relationship between Y and X (in β) differs across categories of the qualitative variable and causes Y to differ, even for the same values of X.
Intercept or Slope Vary? (cont.)
  • 2. Example
    • Profit from each further win differs across market size
    • It’s plausible (likely?) that an additional year of experience yields less extra earnings for females than for males
Intercept or Slope Vary? (cont.)
  • C. How do we know which is correct?
    • 1. Possibilities
      • a) intercept binary variable correct
      • b) slope binary variable correct
      • c) both
      • d) neither
    • 2. Try them and let the tests tell you
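
One way to "try" is to include both binary terms at once, Y = α0 + α1D + β0X2 + β1DX2 + ε, and then test α1 and β1 separately. The sketch below only shows how the regressors for that combined model would be built and how a prediction is formed; the function names and all four coefficient values are hypothetical, not estimates from the course data.

```python
def regressors(wins, large):
    """Return [constant, D, X2, D*X2] for one team."""
    return [1, large, wins, large * wins]

def predict(coefs, wins, large):
    """Sum of coefficient times regressor for alpha0, alpha1, beta0, beta1."""
    return sum(c * x for c, x in zip(coefs, regressors(wins, large)))

coefs = [2.0, 3.0, 0.25, 0.5]  # hypothetical alpha0, alpha1, beta0, beta1

print(predict(coefs, 40, 0))  # 12.0
print(predict(coefs, 40, 1))  # 35.0
```

If the test on α1 fails to reject, the intercept term can be dropped; if the test on β1 fails to reject, the slope term can be dropped. That covers the four possibilities listed above.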
Exercise
  • For each case below, state whether you would use a slope or intercept dummy. (NOTES)
    • 1. Southern workers earn less than those in other regions of the country even with same years of schooling & experience
    • 2. An additional year of education yields less extra earnings for single males than for married males
    • 3. More swimsuits are bought in Minnesota during the summer than during the winter
    • 4. Each extra year of experience yields less additional earnings for female coaches than for males
Exercise

Binary Variables #2

Exercise
  • “The Advertising Experiment”
Test Next Class

  • 8-10 questions - most have several parts
  • some: calculations
  • some: interpret regression results
  • some: show you understand a concept
  • bring calculator – NO CELL PHONES!
  • bring 5 x 8 card