# Welcome to BUAD 310



Instructor: Kam Hamidieh

Lecture 4, Monday January 27, 2014

Agenda & Announcement
• Today:
• Chapter 6 (Only Sections 6.1 & 6.2, will say more when we get to linear regression.) Read pages 105-110.
• Chapter 7. Read all of it!
• Note:
• Make sure you are getting my email messages!
• I changed the syllabus slightly.
• HW 1 is due this Wednesday by 5 PM. No extensions will be given.
• Recall my office hours?

Answer to the In Class Exercise 2 from Last Time

Data: {271, 275, 285, 288, 288, 289, 292, 294, 295, 305, 313, 332}

Boxplot labels, top to bottom: Max = 332; 313; Q3 = 300; Q1 = 287; Min = 271

Very Interesting Article!

See:

http://www.wired.com/wiredscience/2014/01/how-to-hack-okcupid/

(Many thanks to Sonny N.)

From Last Time
• Five Number Summary: Min, Q1, Q2 (Median), Q3, Max
• The five number summaries are used to create boxplots:
• Another way to visually summarize the distribution of data.
• Tool for detecting outliers.
• Can hide certain things: bimodality for example.
• Range: largest - smallest
• IQR: Q3 – Q1
• Standard Deviation: roughly the average distance of your points from the overall mean. (SD² = Variance)
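As a rough illustration, the summaries above can be computed for the data set from the In Class Exercise 2 answer. This is a sketch, not the course's official method: quartile conventions differ between textbooks and software, and the hypothetical helper `five_number_summary` assumes Q1 and Q3 are the medians of the lower and upper halves of the sorted data. Under that assumption Q3 comes out as 300 and Q1 as 286.5, which is consistent with the slide's rounded Q1 = 287.

```python
import statistics

data = [271, 275, 285, 288, 288, 289, 292, 294, 295, 305, 313, 332]

def five_number_summary(values):
    """Min, Q1, median, Q3, max, with quartiles taken as the
    medians of the lower and upper halves (an assumed convention)."""
    xs = sorted(values)
    n = len(xs)
    half = n // 2
    lower = xs[:half]
    upper = xs[half:] if n % 2 == 0 else xs[half + 1:]
    return (xs[0], statistics.median(lower), statistics.median(xs),
            statistics.median(upper), xs[-1])

minimum, q1, q2, q3, maximum = five_number_summary(data)
data_range = maximum - minimum          # largest - smallest
iqr = q3 - q1                           # Q3 - Q1
sd = statistics.stdev(data)             # sample SD (divides by n - 1)
variance = statistics.variance(data)    # equals sd squared

print("five-number summary:", minimum, q1, q2, q3, maximum)
print("range:", data_range, "IQR:", iqr)
print("SD:", round(sd, 2), "variance:", round(variance, 2))
```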

A Realistic Example

• Investors are often interested in the volatility of a stock (or an asset).
• Volatility of a stock is roughly the standard deviation of the stock’s returns over a certain time period.
• For example, a volatility of 13% per year means that on average, the stock value could go up or down by 13% in a year.

A Realistic Example

Nassim Taleb

Check “Market Volatility” and “Market Fear” on

A Realistic Example

• Which company, Apple or ExxonMobil, has more volatility?
• You can download ExxonMobil and Apple data and compare the distributions of their returns.
• HOW? (Go to finance.yahoo.com!)
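One way to get from downloaded prices to a volatility figure, sketched under assumptions: the `prices` list is made up for illustration (these are not real ExxonMobil or Apple quotes), and `daily_returns` is a hypothetical helper name. Returns are taken as day-to-day percentage changes, and volatility as the standard deviation of those returns.

```python
import statistics

# Hypothetical daily closing prices (made up for illustration, not real quotes).
prices = [100.0, 101.0, 99.5, 100.2, 102.0, 101.1]

def daily_returns(closes):
    """Percentage change from each day's close to the next."""
    return [100.0 * (today - yesterday) / yesterday
            for yesterday, today in zip(closes, closes[1:])]

returns = daily_returns(prices)
volatility = statistics.stdev(returns)   # SD of the daily returns, in percent

print([round(r, 2) for r in returns])
print("daily volatility (%):", round(volatility, 2))
```

Running the same steps on two stocks over the same period gives two standard deviations that can be compared directly.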

What can you say?

ExxonMobil: s ≈ 1%

Apple: s ≈ 1.8%

FYI….

> round(summary(xom.returns), 1)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
   -3.4    -0.5     0.0     0.0     0.6     3.3
> round(sd(xom.returns), 1)
[1] 0.9

> round(summary(apple.returns), 1)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
  -13.2    -0.9     0.0     0.1     1.1     8.5
> round(sd(apple.returns), 1)
[1] 1.8

Scatterplots

• Scatterplots are two dimensional plots of data.
• They tell us about the strength, direction, and the nature of the relationship between two variables.
• Direction:
• Two variables have a positive association when, as the values of one variable go up, the values of the other variable go up on average.
• Two variables have a negative association when, as the values of one variable go up, the values of the other variable go down on average.
• Nature of the relationship: linear or curved? (Curvature)
• Strength: how tightly the points are clustered around some straight line or a curve.

Generic Example

Panels: Linear, Nonlinear, No relationship

Strength of Relationship

The strength of the relationship between the two variables can be seen by the amount of variation:

With a strong relationship, you can get a pretty good estimate of y if you know x.

With a weak relationship, for any x you might get a wide range of y values.

Example of a Scatterplot

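| Row | State | Verbal | Math | PctTook |
|-----|-------|--------|------|---------|
| 1 | AL | 562 | 558 | 8 |
| 2 | AK | 521 | 520 | 52 |
| 3 | AZ | 525 | 528 | 32 |
| 4 | AR | 568 | 555 | 6 |
| 5 | CA | 497 | 516 | 47 |
| 6 | CO | 537 | 542 | 31 |
| 7 | CT | 510 | 509 | 80 |
| 8 | DE | 501 | 493 | 70 |
| 9 | DC | 488 | 476 | 83 |
| 10 | FL | 500 | 501 | 52 |
| 11 | GA | 486 | 482 | 64 |
| 12 | HI | 483 | 513 | 55 |
| 13 | ID | 545 | 544 | 16 |
| 14 | IL | 564 | 581 | 13 |
| 15 | IN | 497 | 500 | 59 |
| 16 | IA | 593 | 601 | 5 |
| 17 | KS | 582 | 585 | 9 |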

• The data consists of the average math and verbal SAT scores in 1998 for the 50 states and the District of Columbia.
• The PctTook variable is the percent of graduating seniors who took the test that year.
• Is there a relationship between the average math and the average verbal scores?

Example of a Scatterplot

The nature of the relationship: There seems to be a linear relationship. See the line!

Direction: The relationship is positive. As the math scores go up, the verbal scores go up on the average.

Strength: The relationship seems to be strong. The points are bunched very close to the possible underlying line.

WARNING: You can NOT conclude that high math scores cause high verbal scores!

In Class Exercise 1

The plot shows the daily percentage changes (returns) of Chevron versus Crude oil prices from Jan 2, 2013 to Aug 19, 2013.

• Comment on the relationship: strength, nature, direction, unusual points, etc.
• Comment on the following statements:
• “The plot proves that crude oil returns cause changes in Chevron returns.”
• “As oil price goes up, so does Chevron’s stock.”

Random Phenomenon

• A random phenomenon is one in which the outcome of some situation is uncertain.
• The word random in statistics is not synonymous with “haphazard” or “chaotic” but a description of a kind of order that emerges in the long run.
• Examples:
• Toss of a fair coin: heads or tails.
• Toss of a fair die: any whole number from 1 to 6.

What is probability?

• A probability is a number between 0 and 1 that is assigned to a possible outcome of a random phenomenon.
• Probability theory is the major tool in statistical inference.

Interpretation

• We interpret the probability of a random phenomenon as the proportion (or relative frequency) of times it would occur over the long run.
• Show coin tossing applet.

Aside…

• Around 1900, English statistician Karl Pearson tossed a coin 24,000 times. He got 12,012 heads or 50.05%.
• While imprisoned by the Germans during WW II, the South African statistician John Kerrich tossed a coin 10,000 times and got heads 50.67%.


The Law of Large Numbers (LLN)

• LLN: The relative frequency of an outcome converges to a number, the probability of the outcome, as the number of observed outcomes increases.
• Notes:
• There are technical but intuitive conditions for the law to apply: roughly, the conditions under which the “experiment” is done do not change too much and the attempts don’t influence each other too much.
• LLN only applies in the long run.
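A small simulation can make the long-run behavior visible. This is a sketch under assumptions: the coin is modeled as fair using Python's pseudo-random generator, the seed is fixed only so the run is repeatable, and `running_head_fraction` is a hypothetical helper name. The 24,000 tosses echo Pearson's experiment.

```python
import random

def running_head_fraction(n_tosses, seed=0):
    """Simulate fair coin tosses; return the fraction of heads after each toss."""
    rng = random.Random(seed)          # fixed seed so the run is repeatable
    heads = 0
    fractions = []
    for toss in range(1, n_tosses + 1):
        heads += rng.random() < 0.5    # True counts as 1
        fractions.append(heads / toss)
    return fractions

fractions = running_head_fraction(24_000)
for n in (10, 100, 1_000, 10_000, 24_000):
    print(n, round(fractions[n - 1], 4))
```

The early fractions can sit far from 0.5, while the later ones settle near it, which is the sense in which the law only applies in the long run.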

Determining Probabilities in Practice

• Relative frequency probabilities can be determined by:
• Making some assumptions about the physical world
• Observing the results of a random phenomenon over the long run
• Measuring a representative sample and observing the relative frequencies of the sample that fall into various categories.
• ….
• Too morbid?

“In the long run we are all dead.”

?

Probability Models

• We are interested in a random phenomenon and we want to “model” it.
• In the broadest sense, probability modeling means being able to assign probability values (between 0 and 1) to the possible outcomes of a random phenomenon.
• Two parts to it:
• list of outcomes, and
• their probabilities.

Some Terms

• Sample space, S or Ω: the collection of unique, non-overlapping possible outcomes of a random phenomenon. Generally our question of interest determines the sample space.
• Simple event: one outcome in the sample space
• Event: a collection of one or more simple events in the sample space; often written as A, B, C, and so on.

Example – Tossing a Fair Die Once

• Throw a die once.
• The sample space is S = {1,2,3,4,5,6}.
• A simple event is what we observe when we throw the die once: we could get a {1} or {2} or … {6}.
• Possible events:
• Event A = { all even numbers } = {2,4,6}.
• Event B = {all numbers divisible by 3} = {3,6}.
• Note when we say “Event A happened” we mean that we threw a 2 or 4 or 6.
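The die example maps naturally onto sets. The sketch below uses Python's built-in `set` type as one possible representation (an illustrative choice, not something the slides prescribe), and checks whether an event "happened" by testing whether the observed outcome belongs to it.

```python
import random

S = {1, 2, 3, 4, 5, 6}                  # sample space
A = {x for x in S if x % 2 == 0}        # all even numbers
B = {x for x in S if x % 3 == 0}        # all numbers divisible by 3

outcome = random.choice(sorted(S))      # one throw: a simple event
print("outcome:", outcome)
print("A happened:", outcome in A)
print("B happened:", outcome in B)
```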

Example – Tossing Coins

• We’ll toss a fair coin 2 times.
• The sample space is S = {HH, HT, TH, TT}.
• A simple event is what we observe when we throw the coin twice: we could get a {HH} or {HT} or {TH} or {TT}.
• Possible events:
• Event A = { all outcomes containing at least one head } = {HH, HT, TH}
• Event B = {all outcomes containing no heads } = {TT}

More Examples of Sample Spaces

1. A basketball player shoots three free throws. What are the possible sequences of hits (H) and misses (M)?

[Tree diagram: each throw branches into H or M, giving paths such as HHH, HHM, HMH, HMM, …]

S = { HHH, HHM, HMH, HMM, MHH, MHM, MMH, MMM }

Note: 8 elements, 2³

S = { 0, 1, 2, 3 }

3. A loan is made to a company. What is the time to default?

S = [0, ∞) = (all numbers ≥ 0)
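For the free-throw example, the tree of hits and misses can be enumerated mechanically. This sketch uses `itertools.product` from the Python standard library, which lists every way of choosing H or M three times.

```python
from itertools import product

# Each of the three throws is a hit (H) or a miss (M).
sequences = ["".join(seq) for seq in product("HM", repeat=3)]

print(sequences)
print(len(sequences), "outcomes; 2**3 =", 2 ** 3)
```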

Equally Likely Events

Sometimes, e.g. by symmetry, we assume all possible outcomes are equally likely. In this case:
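The counting rule usually stated for this case, consistent with the 1/6, 1/36 and ¼ examples in these slides, is that each outcome gets probability 1 / (number of outcomes in S), so an event A gets:

```latex
P(A) = \frac{\text{number of outcomes in } A}{\text{number of outcomes in } S}
```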

More Examples of the Equally Likely Case

Tossing a die twice: S = {1-1, 1-2, 1-3, 1-4, 1-5, 1-6, …, 6-1, 6-2, 6-3, 6-4, 6-5, 6-6}. Then P(1-1) = P(1-2) = … = P(6-6) = 1/36.

Tossing a die once: S = {1,2,3,4,5,6}. Then P(1) = P(2) = … = P(6) = 1/6.

Tossing a coin twice: S = {HH, HT, TH, TT}. So P(HH) = … = P(TT) = ¼.
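These values can be checked by counting. A minimal sketch, assuming every outcome in the sample space is equally likely; `probability` is a hypothetical helper that divides the number of outcomes in the event by the number of outcomes in the sample space, using exact fractions.

```python
from fractions import Fraction
from itertools import product

def probability(event, sample_space):
    """P(event) when every outcome in sample_space is equally likely."""
    favorable = [o for o in sample_space if o in event]
    return Fraction(len(favorable), len(sample_space))

two_dice = list(product(range(1, 7), repeat=2))    # (1, 1), (1, 2), ..., (6, 6)
one_die = list(range(1, 7))
two_coins = ["".join(t) for t in product("HT", repeat=2)]

print(probability({(1, 1)}, two_dice))     # one outcome of two throws
print(probability({1}, one_die))
print(probability({"HH"}, two_coins))
print(probability({2, 4, 6}, one_die))     # an event with several outcomes
```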

Complementary Events

• One event is the complement of another event if the two events do not contain any of the same simple events and together they cover the entire sample space.
• To emphasize, complementary events can not happen at the same time.

Complementary Events

• Toss a die: S = {1,2,3,4,5,6}.
• Possible complementary events are:
• A = {1,2,3} and Aᶜ = {4,5,6}
• B = {Odd outcome} = {1,3,5} and Bᶜ = {Even outcome} = {2,4,6}.
• F = {outcome divisible by 3} = {3,6} and Fᶜ = {outcome not divisible by 3} = {1,2,4,5}.

In Class Exercise 2

Suppose you toss a die twice and record the sum of the numbers you see. Here: S = {2, 3, 4, 5, 6, 7, …, 10, 11, 12}

• What is the probability that the sum is 7?
• What is the probability the sum is not 7?
• What is the probability that the sum is even?
• What is the probability that the sum is odd?

[Figure: grid of outcomes, with axes First Throw and Second Throw]
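One way to approach this exercise is to count over the 36 equally likely (first throw, second throw) pairs, since the sums 2 through 12 are not themselves equally likely. The sketch below assumes fair dice; `prob_sum` is a hypothetical helper name.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

throws = list(product(range(1, 7), repeat=2))      # 36 equally likely pairs
sum_counts = Counter(first + second for first, second in throws)

def prob_sum(condition):
    """Probability that the sum satisfies condition, by counting pairs."""
    favorable = sum(count for total, count in sum_counts.items()
                    if condition(total))
    return Fraction(favorable, len(throws))

p_seven = prob_sum(lambda s: s == 7)
p_not_seven = prob_sum(lambda s: s != 7)
p_even = prob_sum(lambda s: s % 2 == 0)
p_odd = prob_sum(lambda s: s % 2 == 1)

for total in sorted(sum_counts):
    print(total, sum_counts[total])     # how many pairs give each sum
print(p_seven, p_not_seven, p_even, p_odd)
```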

Mutually Exclusive Events

• Two or more events are mutually exclusive if they do not contain any of the same simple events or outcomes.
• The term disjoint is synonymous with mutually exclusive.
• When two events are mutually exclusive, they can not happen at the same time.

Example

• Throw a fair die once. S = {1,2,3,4,5,6}.
• Possible mutually exclusive events are:
• B = {Odd outcome} = {1,3,5} and C = {6}
• F = {outcome divisible by 3} = {3,6}, D={1}, and E={5}
• A = {1,2,3} and Aᶜ = {4,5,6}
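The definitions of mutually exclusive and complementary events can be turned into two small checks. This is an illustrative sketch using Python sets; `mutually_exclusive` and `complementary` are hypothetical helper names, and the events are the ones listed in the example.

```python
S = {1, 2, 3, 4, 5, 6}

def mutually_exclusive(*events):
    """True if no outcome belongs to more than one of the events."""
    seen = set()
    for event in events:
        if seen & event:
            return False
        seen |= event
    return True

def complementary(a, b, sample_space):
    """True if a and b share no outcomes and together cover the sample space."""
    return mutually_exclusive(a, b) and (a | b) == sample_space

A, A_c = {1, 2, 3}, {4, 5, 6}
B, C = {1, 3, 5}, {6}
F, D, E = {3, 6}, {1}, {5}

print(mutually_exclusive(B, C), complementary(B, C, S))
print(mutually_exclusive(F, D, E))
print(mutually_exclusive(A, A_c), complementary(A, A_c, S))
print(mutually_exclusive(B, F))     # B and F share the outcome 3
```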

Union of Events

• Union of two events A and B is defined as the event consisting of all outcomes that are in A or B or both.
• It is represented in two ways:
• A or B = C
• A ∪ B = C
• When we say event C happened, we mean that either event A or B or both happened.

Example

• Throw a die once. S = {1,2,3,4,5,6}.
• Consider the following events:
• A = {1,2,3} and Aᶜ = {4,5,6}
• B = {Odd outcome} = {1,3,5}
• F = {Outcome divisible by 3} = {3,6}
• Some unions:
• A ∪ Aᶜ = S (always true!)
• A ∪ A = A (always true!)
• A ∪ B = {1,2,3,5}
• A ∪ B ∪ F = {1,2,3,5,6}
• A ∪ Ω = Ω (always true!)

Intersection of Events

• Intersection of two events A and B is defined as the event consisting of all outcomes that are both in A AND in B.
• It is represented in two ways:
• A and B = C
• A ∩ B = C
• When we say event C happened, then that means both A and B happened.

Examples

• Our experiment is to toss a fair die once. The sample space is {1,2,3,4,5,6}.
• Consider the following events:
• A = {1,2,3} and Aᶜ = {4,5,6}
• B = {Odd outcome} = {1,3,5}
• F = {Outcome divisible by 3} = {3,6}
• Some intersections:
• A ∩ Aᶜ = {} (always true!)
• A ∩ A = A (always true!)
• A ∩ B = {1,3}
• A ∩ B ∩ F = {3}
• A ∩ S = A (always true!)
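The union and intersection examples for the die can be reproduced with set operations. A minimal sketch, using Python's `|` for union, `&` for intersection, and set difference from S for the complement; the variable name `A_c` is just a stand-in for the complement of A.

```python
S = {1, 2, 3, 4, 5, 6}
A = {1, 2, 3}
A_c = S - A            # complement of A
B = {1, 3, 5}          # odd outcomes
F = {3, 6}             # outcomes divisible by 3

# Unions: outcomes in at least one of the events.
print(A | A_c == S, A | A == A, A | S == S)
print(sorted(A | B), sorted(A | B | F))

# Intersections: outcomes in all of the events.
print(A & A_c == set(), A & A == A, A & S == A)
print(sorted(A & B), sorted(A & B & F))
```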

Venn Diagrams

• Venn diagrams are a visual way of representing how events are combined.
• They are very helpful when you are starting with probability.
• However, more complicated combinations of events are very difficult to show using Venn diagrams.

Venn Diagrams - Basics

The sample space, S, is represented by a large box.

Events, such as A, are represented by circles inside the box. Generally the size of A is proportional to P(A).

Venn Diagrams - Intersection

The intersection of A and B is the shaded (blue) area where A and B overlap, that is, where they have points in common.

Venn Diagrams - Union

The union of A and B is the shaded (blue) area of A and B put together.

Venn Diagrams - Complement

The complement of A, Aᶜ, is the shaded (blue) area: everything in S that is not contained in A.

What would two mutually exclusive (but not complementary) events look like?

In Class Exercise 3

• T or F: Union of two mutually exclusive events will give you S.
• T or F: Union of two complementary events will always give you S.
• T or F: If two events are mutually exclusive, then they are complementary as well.
• T or F: If two events are complementary, then they are mutually exclusive as well.