
IHEID - The Graduate Institute

Academic year 2010-2011

Statistics for International Relations Research I

Dr. NAI Alessandro, visiting professor

Dec 03, 2010

Lecture 8:

Regression analysis III

slide2

Lecture content

  • Feedback on Assignment VII
  • Logistic regression models: binary
  • Predicted probabilities
  • Logistic regression models: multinomial
slide3

Binary logistic regression models [i / xxxviii]

Correlation

Statistical relationship between two scale variables

(see lecture 5)

Regression

Method for modeling the effect of one or more independent scale variables on a dependent scale variable

slide4

Binary logistic regression models [ii / xxxviii]

Two major uses for regression models

Prediction analysis:

Develop a formula for making predictions about the dependent variable based on observed values

Ex: predict GNP for next year

Causal analysis:

Independent variables are regarded as causes of the dependent variable

Ex: uncover the causes for a higher criminality rate

slide5

Binary logistic regression models [iii / xxxviii]

Two main types of regression

OLS (Ordinary Least Squares): linear relationship between variables, scale dependent variable

(see lectures 6 and 7)

Logistic regression: curvilinear relationship between variables, dummy (binomial logistic regression) or nominal dependent variable (multinomial logistic regression)

All regression models may be bi- or multivariate

slide6

Binary logistic regression models [iv / xxxviii]

Independent variables in (all) regression models may take the following form:

- Scale (optimal measurement level in regressions)

- Ordinal (metrical, or close)

- Binary (0,1)

Nominal variables are allowed (almost) only in logistic regressions

slide7

Binary logistic regression models [v / xxxviii]

Why is a regression not efficient with qualitative variables?

slide8

Binary logistic regression models [vi / xxxviii]

For scale (and, sometimes, ordinal) dependent variables, OLS estimations are applied

(see lectures 6 and 7)

What if the dependent variable is qualitative?

- For binary DVs: binary logistic regressions

- For nominal DVs: multinomial logistic regressions

slide9

Binary logistic regression models [vii / xxxviii]

Let’s take a (bivariate) example:

Does the likelihood of participating in illegal protest activities depend on citizens’ positioning on the left-right scale?

Working hypothesis: citizens on the left side of the scale are more likely to participate in illegal protest activities

slide12

Binary logistic regression models [x / xxxviii]

In our example, the dependent variable is binary, and the independent variable is ordinal (almost scale)

How to answer the working hypothesis?

Through a crosstab?

OLS Regression?

slide14

Binary logistic regression models [xii / xxxviii]

The crosstab shows a significant and strong relationship

However:

- The dispersion of individual observations is too high

- The dependent variable is highly skewed

Therefore, the results of the bivariate analysis (and especially the gamma score) are not robust enough

slide17

Binary logistic regression models [xv / xxxviii]

The OLS regression shows a significant but quite weak relationship

Furthermore:

- The relationship is clearly not linear

- The postulates of OLS regressions are not met

Therefore, the results of the bivariate analysis are not robust enough
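One way to see why a straight line suits a binary outcome poorly is to compare what each model is able to predict. The following is a minimal Python sketch: the coefficients `a` and `b` are hypothetical values chosen for illustration only (they are not the lecture's OLS output), and the scale is assumed to run from 0 to 10.

```python
import math

def linear_prediction(x, a=0.70, b=0.04):
    """Straight-line prediction a + b*x (a and b are hypothetical values)."""
    return a + b * x

def logistic_prediction(z):
    """Logistic curve e^z / (1 + e^z); always strictly between 0 and 1."""
    return math.exp(z) / (1 + math.exp(z))

# Assumption: the left-right scale runs from 0 to 10.
for x in range(0, 11):
    lin = linear_prediction(x)
    log = logistic_prediction(lin)  # same linear term, passed through the logistic curve
    flag = "" if 0 <= lin <= 1 else "  <- not a valid probability"
    print(f"x={x:2d}  linear={lin:.2f}  logistic={log:.2f}{flag}")
```

With these made-up coefficients the straight line leaves the 0-1 range at the right end of the scale, whereas the logistic curve cannot.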

slide18

Binary logistic regression models [xvi / xxxviii]

Solution?

Binary logistic regression

Regression model that estimates the likelihood that a situation occurs (binary dependent variable) from one or more independent variables

Independent variables may take every level of measurement (nominal, ordinal, scale)

slide20

Binary logistic regression models [xviii / xxxviii]

A logistic transformation is applied

The (transformed) regression equation may be written as:

y = f(z) = e^z / (1 + e^z)

Where:

z = exposure to a set of explanatory variables

f(z) = probability of a given outcome (y), given that set of explanatory variables

e = constant, base of the natural logarithm = 2.71828...
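The logistic transformation can be tried out numerically. This is a small illustrative sketch in Python; the function name `logistic` and the sample z values are assumptions made for the example.

```python
import math

def logistic(z):
    """Return e^z / (1 + e^z), the probability associated with z."""
    return math.exp(z) / (1 + math.exp(z))

# Arbitrary z values, chosen to show the S-shaped curve.
for z in (-4, -2, 0, 2, 4):
    print(f"z = {z:+d}  ->  f(z) = {logistic(z):.3f}")
```

Whatever the value of z, the result stays between 0 and 1, which is what allows it to be read as a probability.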

slide21

Binary logistic regression models [xix / xxxviii]

The regression equation becomes:

z = β0 + β1*x1 + β2*x2 + … + βk*xk

Where:

β0 = constant (intercept)

β1 = regression coefficient (“slope”) for the variable x1

β2 = regression coefficient (“slope”) for the variable x2

βk = regression coefficient (“slope”) for the variable xk
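As a rough illustration of how z is assembled from the constant and the slopes, here is a short Python sketch. The function name and all the numbers are hypothetical; they do not come from the lecture's models.

```python
def linear_predictor(intercept, slopes, values):
    """Compute z = b0 + b1*x1 + ... + bk*xk."""
    if len(slopes) != len(values):
        raise ValueError("need exactly one value per slope")
    return intercept + sum(b * x for b, x in zip(slopes, values))

# Hypothetical model with two independent variables (made-up numbers).
z = linear_predictor(intercept=0.5, slopes=[0.3, -1.2], values=[4, 1])
print(z)
```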

slide23

Binary logistic regression models [xxi / xxxviii]

SPSS procedure: Analyze / Regression / Binary logistic

slide24

Binary logistic regression models [xxii / xxxviii]

Please note!

Binary logistic regressions always estimate the likelihood of the presence of a phenomenon (coded 1) as opposed to its absence (coded 0)

Here, the model will estimate the presence of a “No” (i.e., not likely to take part in illegal protest activities)

slide25

Binary logistic regression models [xxiii / xxxviii]

As for OLS regressions, SPSS provides scores on the overall quality of the model for logistic regressions

Nagelkerke’s R square: a pseudo R2, similar in interpretation to the R2 of OLS regressions

Here, the model explains about 11% of the DV variance, which is quite good with only one independent variable
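For readers who want to see where such a score comes from, Nagelkerke's R square is usually defined as the Cox and Snell R square divided by its maximum attainable value, both computed from model log-likelihoods. The Python sketch below follows that standard definition; the log-likelihood values and the sample size are hypothetical, since the slides only report the final score.

```python
import math

def nagelkerke_r2(ll_null, ll_model, n):
    """Nagelkerke's pseudo R-square from two log-likelihoods.

    ll_null  -- log-likelihood of the intercept-only model
    ll_model -- log-likelihood of the fitted model
    n        -- number of observations
    """
    cox_snell = 1 - math.exp(2 * (ll_null - ll_model) / n)
    max_cox_snell = 1 - math.exp(2 * ll_null / n)
    return cox_snell / max_cox_snell

# Hypothetical values, for illustration only.
print(round(nagelkerke_r2(ll_null=-600.0, ll_model=-560.0, n=1000), 3))
```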

slide26

Binary logistic regression models [xxiv / xxxviii]

Unstandardized coefficients (Bs)

Needed to build the regression equation

Here:

y = f(z) = e^z / (1 + e^z)

z = 1.918 + .54*lrscale
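The equation above can be evaluated for each position on the scale. The Python sketch below uses the two coefficients reported on the slide; the assumption that lrscale runs from 0 (left) to 10 (right) is mine, and the function name is illustrative. Recall that the outcome coded 1 is "No", that is, not taking part in illegal protest activities.

```python
import math

B0 = 1.918   # constant, as reported on the slide
B1 = 0.54    # coefficient for lrscale, as reported on the slide

def prob_no_protest(lrscale):
    """P(y = 1), where 1 = not taking part in illegal protest activities."""
    z = B0 + B1 * lrscale
    return math.exp(z) / (1 + math.exp(z))

# Assumption: lrscale runs from 0 (left) to 10 (right).
for position in range(0, 11):
    print(f"lrscale = {position:2d}  ->  P(no protest) = {prob_no_protest(position):.3f}")
```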

slide27

Binary logistic regression models [xxv / xxxviii]

Significance test

Based on Wald scores

Interpretation similar to Chi-square and Fisher tests

If p<.05, the effect is statistically significant
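As a sketch of what lies behind the Wald score for a single coefficient: the statistic is commonly computed as the squared ratio of the coefficient to its standard error and compared with a chi-square distribution with 1 degree of freedom. In the Python example below the standard error is hypothetical (the slides do not report it); only the coefficient .54 comes from the lecture.

```python
import math

def wald_test(b, se):
    """Return (Wald statistic, p-value) for one coefficient.

    The statistic (b / se)^2 is compared with a chi-square distribution
    with 1 degree of freedom; for 1 df the upper-tail probability
    equals erfc(sqrt(W / 2)).
    """
    wald = (b / se) ** 2
    p_value = math.erfc(math.sqrt(wald / 2))
    return wald, p_value

# b is the slide's coefficient for lrscale; the standard error is hypothetical.
wald, p = wald_test(b=0.54, se=0.10)
print(f"Wald = {wald:.2f}, p = {p:.2g}")
print("significant at .05" if p < 0.05 else "not significant at .05")
```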

slide28

Binary logistic regression models [xxvi / xxxviii]

Logistic regression coefficients (“Log Odds”)

Provide information on the direction and the strength of the IV effect on the probability that the outcome y occurs

slide29

Binary logistic regression models [xxvii / xxxviii]

Interpretation of Exp(B), the odds ratio (the exponentiated log odds coefficient)

Based on the closeness to 1.0

If Exp(B) = 1.0, no relationship

If 0.0<Exp(B)<1.0, negative relationship

If 1.0<Exp(B), positive relationship

Rule: the closer to 1.0, the weaker the relationship
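The reading rules above can be expressed as a small helper. This is an illustrative Python sketch; the function name and the sample coefficients are assumptions made for the example.

```python
import math

def describe_effect(b):
    """Turn a logistic coefficient B into Exp(B) and a direction label."""
    exp_b = math.exp(b)
    if math.isclose(exp_b, 1.0):
        direction = "no relationship"
    elif exp_b < 1.0:
        direction = "negative relationship"
    else:
        direction = "positive relationship"
    return exp_b, direction

for b in (-0.54, 0.0, 0.54):   # illustrative coefficients
    exp_b, direction = describe_effect(b)
    print(f"B = {b:+.2f}  ->  Exp(B) = {exp_b:.2f}  ({direction})")
```

One caveat when applying the "closeness to 1.0" rule: values below 1 are squeezed between 0 and 1, so an Exp(B) of about 0.58 and one of about 1.72 correspond to coefficients of the same size (-.54 and +.54) with opposite signs.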

slide30

Binary logistic regression models [xxviii / xxxviii]

[Figure: relationship strength (min to max) as a function of the Exp(B) value (axis marks at 0 and 1)]

slide31

Binary logistic regression models [xxix / xxxviii]

In our example

Exp(B) = 1.72*** [*p<.05, **p<.01, ***p<.001]

Therefore, the relationship is significant, positive, and quite strong

Being further to the right on the left-right scale increases the likelihood of not taking part in illegal protest activities

Working hypothesis confirmed statistically
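To make the value of Exp(B) more concrete: it is the factor by which the odds of the outcome coded 1 are multiplied for each one-unit step on the independent variable. The Python sketch below checks this with the coefficients reported earlier in the lecture (1.918 and .54); the scale positions used are arbitrary.

```python
import math

B0, B1 = 1.918, 0.54   # values reported earlier in the lecture

def odds(lrscale):
    """Odds of y = 1 (no illegal protest) versus y = 0: p / (1 - p) = e^z."""
    return math.exp(B0 + B1 * lrscale)

for position in (2, 5, 8):   # arbitrary positions on the scale
    ratio = odds(position + 1) / odds(position)
    print(f"from {position} to {position + 1}: odds multiplied by {ratio:.2f}")
```

The multiplier is the same wherever one starts on the scale, which is why a single Exp(B) summarizes the effect.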

slide32

Binary logistic regression models [xxx / xxxviii]

A multivariate example:

Does the likelihood of participating in illegal protest activities depend on citizens’ positioning on the left-right scale and on the party voted for in the last national election?

slide33

Binary logistic regression models [xxxi / xxxviii]

Qualitative (categorical) variable!

slide38

Binary logistic regression models [xxxvi / xxxviii]

Overall quality of the model

The model explains about 17% of the DV variance (the previous binary model explained about 11%)

slide39

Binary logistic regression models [xxxvii / xxxviii]

Variables’ effects

- “lrscale” has a significant, positive, and strong effect

- “partyvoted” has a significant effect (but only at p<.1)

Strength and direction of the effect of party voted?

slide40

Binary logistic regression models [xxxviii / xxxviii]

Qualitative variables’ effects are decomposed into modalities

Only “partyvoted(3)” (social-democrats!) has a significant effect

Interpretation always compared to the reference category (here: “other parties”)

Having voted SD (instead of “other”) strongly decreases the likelihood of participating in illegal protest activities
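The decomposition into modalities amounts to dummy coding: each category except the reference gets its own 0/1 indicator, and every coefficient is read against the reference. The Python sketch below illustrates the idea; the function name and the party labels are hypothetical, and only the use of "other" as reference category comes from the slide.

```python
def dummy_code(value, categories, reference):
    """Return one 0/1 indicator per non-reference category."""
    if value not in categories:
        raise ValueError(f"unknown category: {value!r}")
    return {
        f"is_{cat}": int(value == cat)
        for cat in categories
        if cat != reference
    }

# Hypothetical category labels; only "other" as reference comes from the slide.
parties = ["party_a", "party_b", "soc_dem", "other"]

print(dummy_code("soc_dem", parties, reference="other"))
print(dummy_code("other", parties, reference="other"))
```

A respondent in the reference category has all indicators at 0, so the category effects drop out of z for that respondent.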

slide41

Predicted probabilities [i / vii]

Interpreting Exp(B) in logistic regression is easy in principle (distance from 1.0), but not always straightforward

A complement: predicted probabilities

Since logistic models are based on the likelihood estimation that an event occurs, thinking about “probabilities” makes sense

slide42

Predicted probabilities [ii / vii]

Predicted probabilities are calculated through the following formula:

P(Y=1) = e^(β0 + β1*x1 + β2*x2 + … + βk*xk) / (1 + e^(β0 + β1*x1 + β2*x2 + … + βk*xk))

But may also be computed through SPSS

slide43

Predicted probabilities [iii / vii]

SPSS procedure to compute predicted probabilities

slide45

Predicted probabilities [v / vii]

A new variable (PRE_1) is created in the SPSS database

The variable shows, for each individual, the probability (0-100%) of scoring 1 on the dependent variable (in our example, not participating in illegal protest activities), given the IVs of the model (lrscale and partyvoted)
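What such a saved variable contains can be imitated by hand. In the Python sketch below every coefficient is hypothetical, because the slides do not report the estimates of the multivariate model; the respondents are made up as well, and the key `pre_1` merely mimics the name of the saved column.

```python
import math

# Hypothetical coefficients: the slides do not report the multivariate estimates.
INTERCEPT = 1.5
B_LRSCALE = 0.5
B_PARTY = {"soc_dem": 0.8, "other": 0.0}   # "other" is the reference category

def predicted_probability(lrscale, party):
    """P(y = 1) for one respondent, as a value between 0 and 1."""
    z = INTERCEPT + B_LRSCALE * lrscale + B_PARTY[party]
    return math.exp(z) / (1 + math.exp(z))

# Made-up respondents; the added key mimics a saved probability column.
respondents = [
    {"id": 1, "lrscale": 2, "party": "soc_dem"},
    {"id": 2, "lrscale": 5, "party": "other"},
    {"id": 3, "lrscale": 9, "party": "other"},
]
for row in respondents:
    row["pre_1"] = predicted_probability(row["lrscale"], row["party"])
    print(row["id"], f"{100 * row['pre_1']:.1f}%")
```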

slide48

Multinomial logistic regression models [i / x]

If the dependent variable is binary, regression models will assume the binary logistic form

What if the dependent variable is qualitative but not binary?

What about nominal dependent variables?

Multinomial logistic regressions

slide49

Multinomial logistic regression models [ii / x]

Main logic:

A binary logistic regression model is computed for each modality of the dependent variable (one modality being the reference category)

Results represent the individuals’ likelihood of being in that modality rather than in the reference modality

Coefficients are to be interpreted as in binary logistic regressions
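The logic above can be sketched with the standard multinomial logit formulas, in which every non-reference modality has its own z and the reference modality's z is fixed at 0. The Python example below is a minimal illustration; the modality labels and the z values are hypothetical.

```python
import math

def multinomial_probabilities(z_by_modality, reference):
    """Turn one linear predictor per non-reference modality into probabilities.

    The reference modality has z = 0, so e^0 = 1 appears in the denominator.
    """
    denominator = 1 + sum(math.exp(z) for z in z_by_modality.values())
    probabilities = {m: math.exp(z) / denominator for m, z in z_by_modality.items()}
    probabilities[reference] = 1 / denominator
    return probabilities

# Hypothetical linear predictors for one respondent.
probs = multinomial_probabilities(
    {"modality_b": 0.4, "modality_c": -1.1}, reference="modality_a"
)
for modality, p in probs.items():
    print(f"{modality}: {p:.3f}")

# Odds of modality_b versus the reference equal e^z for modality_b.
print(probs["modality_b"] / probs["modality_a"])
```

The last line shows why each set of coefficients reads like a binary logistic regression against the reference modality.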

slide50

Multinomial logistic regression models [iii / x]

Example:

Explain the party voted in last national elections

Four independent variables:

- Education level

- Trust in institutions

- Positioning on left-right scale

- Domicile (big city, town, village, …)

slide52

Multinomial logistic regression models [v / x]

SPSS procedure: Analyze / Regression / Multinomial logistic

Categorical (qualitative) independent variables: nominal, binary, ordinal non-metrical

Quantitative independent variables: scale or ordinal metrical

slide53

Multinomial logistic regression models [vi / x]

Choose a reference category

Here we choose the first modality (“Swiss People’s Party”) as reference category.

All results will be interpreted in opposition to this category

Note: always choose a reference category that makes sense conceptually/theoretically

slide54

Multinomial logistic regression models [vii / x]

Overall quality of the model

As for binary logistic regression, SPSS computes a score measuring the overall quality of the model in terms of explained variance

Here, 42.3% of variance explained; very good model

slide55

Multinomial logistic regression models [viii / x]

Significance levels: Wald test, check if p<.05

Regression coefficients: Exp(B) (odds ratios), as in binary logistic regression

Here, having a higher trust in institutions has a significant, positive but quite weak effect on the Christian-democrat vote (Swiss People’s Party being the reference category)

slide56

Multinomial logistic regression models [ix / x]

Here, all components of the model have a significant effect

Being more educated has a strong effect on the Soc-dem vote (SPP reference!), whereas being more on the right of the left-right scale has a negative effect

Concerning the domicile, living in a big city (domicil=1) has a very strong effect on the Soc-dem vote

slide58

Any questions?

Thank you for your attention!