
Dependence and Measuring of Its Magnitudes




  1. Dependence and Measuring of Its Magnitudes Boyan Dimitrov, Math Dept., Kettering University, Flint, Michigan 48504, USA

  2. Outline What I intend to tell you here cannot be found in any contemporary textbook. I am not sure you can find it even in the older textbooks on Probability & Statistics. I read it (1963) in the Bulgarian textbook on Probability written by the Bulgarian mathematician Nikola Obreshkov. Later I never encountered it in other textbooks or monographs. There is not a single word about these basic measures in the Encyclopedia of Statistical Sciences published more than 20 years later.

  3. Introduction • All the prerequisites are: • What the probability P(A) of a random event A is, • When we have dependence between random events, • What the conditional probability P(A|B) of a random event A is if another event B occurs, and • Several basic probability rules related to pairs of events. • Some interpretations of the facts related to probability content. • Those who are familiar with Probability and Statistics will find some known things; let us not blame the textbooks for the gaps we fill in here. • For beginners, let what is written here be a challenge to get deeper into the essence of the concept of dependence.

  4. 2. Dependent events. Connection between random events • Let A and B be two arbitrary random events. • A and B are independent only when the probability of their joint occurrence equals the product of the probabilities of their individual occurrence, i.e. when it is fulfilled that (1) P(A ∩ B) = P(A)·P(B).

  5. 2. Connection between random events (continued) • The independence is equivalent to the fact that the conditional probability of one of the events, given that the other event occurs, is not changed and remains equal to its original unconditional probability, i.e. (2) P(A|B) = P(A). • The inconvenience of (2) as a definition of independence is that it requires P(B) > 0, i.e. B has to be a possible event. • Otherwise, when P(B) = 0, the conditional probability (3) P(A|B) = P(A ∩ B)/P(B) is not defined.

  6. 2. Connection between random events (continued) • A zero event, as well as a sure event, is independent of any other event, including itself. • The most important fact is that when the equality in (1) does not hold, the events A and B are dependent. • Dependence in the world of uncertainty is a complex concept. • The textbooks avoid any discussion in this regard. • In the classical approach equation (3) is used to determine conditional probabilities and as a base for further rules in operations with probability. • We establish a concept of dependence, the ways of interpreting it, and how to measure it when A and B are dependent events.

  7. 2. Connection between random events (continued) • Definition 1. The number δ(A,B) = P(A ∩ B) − P(A)·P(B) is called the connection between the events A and B. Properties • δ1) The connection between two random events equals zero if and only if these events are independent. This includes the cases when some of the events are zero or sure events. • δ2) The connection between events A and B is symmetric, i.e. δ(A,B) = δ(B,A).
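Definition 1 is easy to illustrate numerically. Below is a minimal Python sketch; the function name `connection` and the sample probabilities are my own, not from the slides:

```python
def connection(p_a: float, p_b: float, p_ab: float) -> float:
    """Connection delta(A, B) = P(A and B) - P(A) * P(B) (Definition 1)."""
    return p_ab - p_a * p_b

# Independent events have zero connection: P(A and B) = P(A) * P(B).
print(connection(0.5, 0.4, 0.2))   # 0.0

# Mutually exclusive events (P(A and B) = 0) have a negative connection.
print(connection(0.3, 0.2, 0.0))   # approximately -0.06
```

The sign of the returned value already tells positive from negative association, which is exactly how the later slides use it.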

  8. 2. Connection between random events (continued) • δ3) • δ4) • δ5) • δ6) These properties show that the connection function between events has properties analogous to those of a probability function (additivity, continuity).

  9. 2. Connection between random events (continued) • δ7) The connection between the complementary events Ā and B̄ is the same as the one between A and B: δ(Ā,B̄) = δ(A,B); • δ8) If the occurrence of A implies the occurrence of B, i.e. if A ⊆ B, then δ(A,B) = P(A)(1 − P(B)) ≥ 0, and the connection between A and B is positive. The two events are called positively associated. • δ9) When A and B are mutually exclusive, then δ(A,B) = −P(A)·P(B), and the connection between A and B is negative. • δ10) When δ(A,B) > 0, the occurrence of one of the two events increases the conditional probability of the occurrence of the other event. The following is true: P(A|B) = P(A) + δ(A,B)/P(B).

  10. 2. Connection between random events (continued) • δ11) The connection between any two events A and B satisfies the inequalities −¼ ≤ δ(A,B) ≤ ¼. • We call these the Fréchet–Hoeffding inequalities. They also indicate that the values of the connection as a measure of dependence lie between −¼ and +¼.

  11. 2. Connection between random events (continued) • One more representation of the connection between the two events A and B: P(A|B) = P(A) + δ(A,B)/P(B). If the connection is negative, the occurrence of one event decreases the chances for the other one to occur. Knowledge of the connection can be used for calculation of posterior probabilities, similar to the Bayes rule! • We call A and B positively associated when δ(A,B) > 0, and negatively associated when δ(A,B) < 0.

  12. 2. Connection between random events (continued) • Example 1. There are 1000 observations on the stock market. In 80 cases there was a significant increase in the oil prices (event A). Simultaneously, a significant increase at the Money Market (event B) is registered in 50 cases. A significant increase in both investments (event A ∩ B) is observed on 20 occasions. The frequency-estimated probabilities produce: P(A) = .08, P(B) = .05, P(A ∩ B) = .02. • Definition 1 gives δ(A,B) = .02 − (.08)(.05) = .016. • If it is known that there is a significant increase in the investments in the money market, then the probability to see also a significant increase in the oil price is P(A|B) = .08 + (.016)/(.05) = .4.

  13. 2. Connection between random events (continued) • Analogously, if we have information that there is a significant increase in the oil prices on the market, then the chances to get also significant gains in the money market on the same day will be P(B|A) = .05 + (.016)/(.08) = .25. • Here we assume knowledge of the connection δ(A,B) and of the individual prior probabilities P(A) and P(B) only. This seems much more natural in real life than what the Bayes rule requires.
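Both posterior updates above follow the representation P(A|B) = P(A) + δ(A,B)/P(B). A small hedged sketch, with an illustrative function name:

```python
def posterior(p_target: float, p_given: float, delta: float) -> float:
    """P(target | given) = P(target) + delta / P(given); requires P(given) > 0."""
    return p_target + delta / p_given

delta = 0.02 - 0.08 * 0.05           # connection from Example 1, 0.016
print(posterior(0.08, 0.05, delta))  # P(A|B), about 0.4
print(posterior(0.05, 0.08, delta))  # P(B|A), about 0.25
```

Note the asymmetry: the same connection value produces different posterior jumps because it is divided by different prior probabilities.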

  14. 2. Connection between random events (continued) • Remark 1. If we introduce the indicators of the random events, i.e. I_A = 1 when event A occurs and I_A = 0 when the complementary event Ā occurs, then E[I_A] = P(A) and Cov(I_A, I_B) = P(A ∩ B) − P(A)·P(B) = δ(A,B). • Therefore, the connection between two random events equals the covariance between their indicators.
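Remark 1 can be checked numerically on a small discrete sample space. The four cell probabilities below are illustrative, chosen to match the marginals of Example 1:

```python
# Joint distribution over the four cells (A occurs or not) x (B occurs or not).
p = {
    (1, 1): 0.02,  # A and B both occur
    (1, 0): 0.06,  # A only
    (0, 1): 0.03,  # B only
    (0, 0): 0.89,  # neither
}

p_a  = sum(v for (ia, ib), v in p.items() if ia == 1)   # P(A) = 0.08
p_b  = sum(v for (ia, ib), v in p.items() if ib == 1)   # P(B) = 0.05
p_ab = p[(1, 1)]                                        # P(A and B) = 0.02

delta = p_ab - p_a * p_b                                # the connection

# Covariance of the indicators: E[I_A * I_B] - E[I_A] * E[I_B].
e_ab = sum(v * ia * ib for (ia, ib), v in p.items())
cov = e_ab - p_a * p_b

print(delta, cov)   # the two numbers agree
```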

  15. 2. Connection between random events (continued) • Comment: The numerical value of the connection alone does not speak about the magnitude of dependence between A and B. • The strongest connection must hold when A = B. In such cases we have δ(A,A) = P(A)(1 − P(A)). • Let us see the numbers. Assume A = B and P(A) = .05. Then the event A with itself has a very low connection value, (.05)(.95) = .0475. Moreover, the value of the connection varies together with the probability of the event A. • Let P(A) = .3, P(B) = .4, but A may occur with B as well as with B̄, and P(A|B) = .6. Then δ(A,B) = (.6)(.4) − (.3)(.4) = .24 − .12 = .12. • The value of this connection is about 2.5 times stronger than the previously considered one, despite the fact that in the first case the occurrence of B guarantees the occurrence of A.

  16. 3. Regression coefficients as measures of dependence between random events • The conditional probability P(A|B) is the conditional measure of the chances for the event A to occur when it is already known that the other event B has occurred. • When B is a zero event the conditional probability cannot be defined. It is convenient in such cases to set P(A|B) = P(A). • Definition 2. The regression coefficient of the event A with respect to the event B is r_B(A) = P(A|B) − P(A|B̄). • The regression coefficient is always defined, for any pair of events A and B (zero, sure, arbitrary).
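For 0 < P(B) < 1, the difference P(A|B) − P(A|B̄) works out to δ(A,B)/[P(B)(1 − P(B))], so Definition 2 can be sketched as follows (the function name is illustrative; the zero/sure-event convention follows the slide):

```python
def regression_coeff(p_a: float, p_b: float, p_ab: float) -> float:
    """r_B(A): regression coefficient of the event A with respect to B."""
    if p_b in (0.0, 1.0):
        # B is a zero or sure event: P(A|B) is set to P(A), so r_B(A) = 0.
        return 0.0
    delta = p_ab - p_a * p_b
    return delta / (p_b * (1.0 - p_b))

print(regression_coeff(0.08, 0.05, 0.02))  # Example 1: about 0.3368
```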

  17. 3. Regression coefficients (continued) • Properties (r1) The equality to zero r_B(A) = r_A(B) = 0 takes place if and only if the two events are independent. • (r2) ; . • (r3) • (r4) • (r5) The regression coefficients r_B(A) and r_A(B) are numbers with equal signs, the sign of their connection δ(A,B). However, their numerical values are not always equal. For r_B(A) = r_A(B) to be valid it is necessary and sufficient to have P(A)(1 − P(A)) = P(B)(1 − P(B)).

  18. 3. Regression coefficients (continued) (r6) The regression coefficients r_B(A) and r_A(B) are numbers between −1 and 1, i.e. they satisfy the inequalities −1 ≤ r_B(A) ≤ 1; −1 ≤ r_A(B) ≤ 1. (r6.1) The equality r_B(A) = 1 holds only when the random event A coincides (is equivalent) with the event B. Then the equality r_A(B) = 1 is also valid. (r6.2) The equality r_B(A) = −1 holds only when the random event A coincides (is equivalent) with the event B̄, the complement of the event B. Then r_A(B) = −1 is also valid.

  19. 3. Regression coefficients (continued) (r7) It is fulfilled that r_B̄(A) = −r_B(A), r_B(Ā) = −r_B(A), r_B̄(Ā) = r_B(A). (r8) Particular relationships , (r9)

  20. 3. Regression coefficients (continued) • For the case where we have = = =

  21. 3. Regression coefficients (continued) • For the case where we have = = =

  22. 3. Regression coefficients (continued) • In the general case the measures of dependence may be positive or negative. • If P(A) = P(B) = .5 and P(A ∩ B) = .3, then the connection and both regression coefficients are positive; if P(A ∩ B) = .1, all these measures are negative. • The sign and magnitude of the dependence measured by the regression coefficients could be interpreted as a trend in the dependence toward one of the extreme situations A = B, A = B̄, or δ(A,B) = 0, where the two events are independent.

  23. 3. Regression coefficients (continued) • Example 1 (continued): We calculate here the two regression coefficients r_B(A) and r_A(B) according to the data of Example 1. • The regression coefficient of the significant increase of the oil prices on the market (event A) with regard to the significant increase in the Money Market return (event B) has the numerical value r_B(A) = (.016)/[(.05)(.95)] = .3368. • At the same time we have r_A(B) = (.016)/[(.08)(.92)] = .2174.
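These two values can be reproduced directly from the connection of Example 1, a quick sketch:

```python
delta = 0.02 - 0.08 * 0.05               # connection of Example 1: 0.016

r_b_of_a = delta / (0.05 * (1 - 0.05))   # r_B(A), about 0.3368
r_a_of_b = delta / (0.08 * (1 - 0.08))   # r_A(B), about 0.2174

print(round(r_b_of_a, 4), round(r_a_of_b, 4))
```

The two coefficients share the sign of the connection but differ in size, which is the asymmetry discussed on the next slide.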

  24. 3. Regression coefficients (continued) • There exists some asymmetry in the dependence between random events: it is possible for one event to have a stronger dependence on the other than the reverse. • The true meaning of the specific numerical values of these regression coefficients is still to be clarified. • We guess that it is possible to use them for measuring the magnitude of dependence between events. • In accordance with the distance of the regression coefficient from zero (where the independence stays), values within a .05 distance could be classified as "one is almost independent of the other"; • Distances between .05 and .2 from zero may be classified as the weakly dependent case; • Distances between .2 and .45 could be classified as moderately dependent; • Cases from .45 to .8 may be called dependent in average; • Above .8 may be classified as strongly dependent. • This classification is pretty conditional, made up by the author.

  25. 3. Regression coefficients (continued) • The regression coefficients satisfy the inequalities • These are also called Fréchet–Hoeffding inequalities.

  26. 4. Correlation between two random events • Definition 3. The correlation coefficient between two events A and B is the number R(A,B) = ±√(r_B(A)·r_A(B)). Its sign, plus or minus, is the common sign of the two regression coefficients. • An equivalent representation: R(A,B) = δ(A,B)/√(P(A)(1 − P(A))·P(B)(1 − P(B))).
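In code, the equivalent representation through the connection is the more convenient one, because it carries the sign automatically. A hedged sketch for non-degenerate events:

```python
import math

def correlation(p_a: float, p_b: float, p_ab: float) -> float:
    """R(A,B) = delta(A,B) / sqrt(P(A)(1-P(A)) P(B)(1-P(B))),
    assuming 0 < P(A) < 1 and 0 < P(B) < 1."""
    delta = p_ab - p_a * p_b
    return delta / math.sqrt(p_a * (1 - p_a) * p_b * (1 - p_b))

print(round(correlation(0.08, 0.05, 0.02), 4))  # Example 1: 0.2706
```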

  27. 4. Correlation (continued) • Remark 3. The correlation coefficient R(A,B) between the events A and B equals the correlation coefficient between the indicators of the two random events A and B. • Properties • R1. R(A,B) = 0 if and only if the two events A and B are independent. • R2. The correlation coefficient is always a number between −1 and +1, i.e. −1 ≤ R(A,B) ≤ 1. • R2.1. The equality R(A,B) = 1 holds if and only if the events A and B are equivalent, i.e. when A = B. • R2.2. The equality R(A,B) = −1 holds if and only if the events A and B̄ are equivalent.

  28. 4. Correlation (continued) • R3. The correlation coefficient has the same sign as the other measures of the dependence between two random events A and B, and this is the sign of the connection. • R4. The knowledge of R(A,B) allows calculating the posterior probability of one of the events under the condition that the other one has occurred. For instance, P(B|A) will be determined by the rule P(B|A) = P(B) + R(A,B)·√(P(B)(1 − P(B))(1 − P(A))/P(A)). • The net increase or decrease in the posterior probability compared to the prior probability equals the quantity added to P(B), and depends only on the value of the mutual correlation.

  29. 4. Correlation (continued) • R5. It is fulfilled that R(Ā,B) = R(A,B̄) = −R(A,B); R(Ā,B̄) = R(A,B). • R6. • R7. Particular Cases. When , then ; If then

  30. 4. Correlation (continued) • The use of the numerical values of the correlation coefficient is similar to the use of the two regression coefficients. • The closer R(A,B) is to zero, the "closer" the two events A and B are to independence. • Let us note once again that R(A,B) = 0 if and only if the two events are independent.

  31. 4. Correlation (continued) • The closer R(A,B) is to 1, the "denser one within the other" the events A and B are, and when R(A,B) = 1 the two events coincide (are equivalent). • The closer R(A,B) is to −1, the "denser one within the other" the events A and B̄ are, and when R(A,B) = −1 the two events A and B̄ coincide (are equivalent). • These interpretations seem convenient when conducting research and investigations associated with qualitative (non-numeric) factors and characteristics. • Such studies are common in sociology, ecology, jurisprudence, medicine, criminology, design of experiments, and other similar areas.

  32. 4. Correlation (continued) • Fréchet–Hoeffding inequalities for the Correlation Coefficient

  33. 4. Correlation (continued) • Example 1 (continued): We have the numerical values of the two regression coefficients r_B(A) = .3368 and r_A(B) = .2174 from the previous section. In this way we get R(A,B) = √((.3368)(.2174)) = .2706. • Analogously to the cases with the use of the regression coefficients, the numerical value of the correlation coefficient could be used for classifications of the degree of the mutual dependence. • Any practical implementation will give a clear indication about the rules of such classifications. • The correlation coefficient is a number in between the two regression coefficients. It is symmetric, absorbs the imbalance (the asymmetry) between the two regression coefficients, and is a balanced measure of dependence between the two events.
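The claim that R(A,B) lies between the two regression coefficients is easy to verify on the Example 1 numbers, a quick sketch:

```python
import math

delta = 0.016                        # connection from Example 1
r_b_of_a = delta / (0.05 * 0.95)     # about 0.3368
r_a_of_b = delta / (0.08 * 0.92)     # about 0.2174

# Geometric mean of the two regression coefficients (both positive here).
R = math.sqrt(r_b_of_a * r_a_of_b)   # about 0.2706
print(round(R, 4))
```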

  34. 4. Correlation (continued) • Examples can be given in a variety of areas of our life. For instance: • Consider the possible degree of dependence between tornado touchdowns in Kansas (event A) and in Alabama (event B). • In sociology, a family with 3 or more children (event A) and an income above the average (event B); • In medicine, someone gets an infarct (event A) and a stroke (event B). • More examples, far better and more meaningful, are expected when the revenue of this approach is assessed.

  35. 5. Empirical estimations • The measures of dependence between random events are built from the probabilities of those events. This makes them very attractive and at the same time easy to estimate statistically and to use in practice.

  36. 5. Empirical Estimations (contd) • Let in N independent experiments (observations) the random event A occur k_A times, the random event B occur k_B times, and the event A ∩ B occur k_AB times. Then a statistical estimator of the connection will be δ̂(A,B) = k_AB/N − (k_A/N)(k_B/N).

  37. 5. Empirical Estimations (contd) • With k_A, k_B, k_AB denoting the occurrence counts of A, B, and A ∩ B in the N observations, the estimators of the two regression coefficients are r̂_B(A) = δ̂(A,B)/[(k_B/N)(1 − k_B/N)]; r̂_A(B) = δ̂(A,B)/[(k_A/N)(1 − k_A/N)]. • The correlation coefficient has the estimator R̂(A,B) = δ̂(A,B)/√((k_A/N)(1 − k_A/N)(k_B/N)(1 − k_B/N)).
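All the plug-in estimators can be computed at once from the raw counts. A sketch using the counts of Example 1 (N = 1000, k_A = 80, k_B = 50, k_AB = 20; the function name is my own):

```python
import math

def estimate_measures(k_a: int, k_b: int, k_ab: int, n: int):
    """Plug-in estimates of the connection, both regression coefficients,
    and the correlation coefficient from raw occurrence counts."""
    p_a, p_b, p_ab = k_a / n, k_b / n, k_ab / n
    delta = p_ab - p_a * p_b
    r_b_of_a = delta / (p_b * (1 - p_b))
    r_a_of_b = delta / (p_a * (1 - p_a))
    corr = delta / math.sqrt(p_a * (1 - p_a) * p_b * (1 - p_b))
    return delta, r_b_of_a, r_a_of_b, corr

print(estimate_measures(80, 50, 20, 1000))  # reproduces Example 1's values
```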

  38. 5. Empirical Estimations (contd) • Each of the three estimators may be simplified when the fractions in the numerator and denominator are multiplied by N²; we will not get into detail. • The estimators are all consistent; the estimator of the connection δ(A,B) is also unbiased, i.e. there is no systematic error in this estimate. • The proposed estimators can be used for practical purposes with reasonable interpretations and explanations, as shown in our discussion and in our example.

  39. 6. Some warnings and some recommendations • The introduced measures of dependence between random events are not transitive. • It is possible for A to be positively associated with B, and for this event B to be positively associated with a third event C, while the event A is negatively associated with C. • To see this, imagine A and B compatible (non-empty intersection), as well as B and C compatible, while A and C are mutually exclusive and therefore have a negative connection. • Mutually exclusive events have a negative connection; • For non-exclusive pairs (A, B) and (B, C) every kind of dependence is possible.

  40. 6. Some warnings and some recommendations (contd) • One can use the measures of dependence between random events to compare degrees of dependence. • We recommend the use of the regression coefficient for measuring degrees of dependence. • For instance, let r_C(A) > r_B(A); then we say that the event A has a stronger association with C compared to its association with B. • In a similar way, an entire ranking of the associations of any fixed event with any collection of other events can be given.
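Such a ranking is straightforward to compute. Below is a hedged sketch where the event names and probabilities are invented for illustration (event D is constructed to be independent of A):

```python
# Hypothetical data: for a fixed event A with P(A) = 0.08, each entry maps
# an event name to (P(X), P(A and X)).
others = {
    "B": (0.05, 0.02),
    "C": (0.30, 0.05),
    "D": (0.50, 0.04),   # independent of A: 0.04 = 0.08 * 0.50
}
p_a = 0.08

def r_x_of_a(p_x: float, p_ax: float) -> float:
    """Regression coefficient r_X(A) = delta(A, X) / (P(X)(1 - P(X)))."""
    return (p_ax - p_a * p_x) / (p_x * (1 - p_x))

# Rank the events by the strength of their association with A.
ranking = sorted(others, key=lambda name: r_x_of_a(*others[name]), reverse=True)
print(ranking)   # strongest association first
```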

  41. 7. An illustration of possible applications • Data source: Alan Agresti, Categorical Data Analysis, 2006. • Table 1: Observed Frequencies of Income and Job Satisfaction

  42. 7. An illustration … (contd) • Table 2: Empirical Estimations of the probabilities for each particular case

  43. 7. An illustration … (contd) • Table 3: Empirical Estimations of the connection function for each particular category of Income and Job Satisfaction

  44. 7. An illustration … (contd) • Surface of the Connection Function (Z variable) between Income Level (Y variable) and Job Satisfaction Level (X variable), according to Table 3.

  45. 7. An illustration … (contd) • Table 4: Empirical Estimations of the regression coefficient between each particular category of income with respect to the job satisfaction

  46. 7. An illustration … (contd) • Surface of the Regression Coefficient Function

  47. 7. An illustration … (contd) • Table 5: Empirical Estimations of the regression coefficient between each particular category of the job satisfaction with respect to the income groups

  48. 7. An illustration … (contd) • Surface of the Regression Coefficient Function, according to Table 5.

  49. 7. An illustration … (contd) • Table 6: Empirical Estimations of the correlation coefficient between each particular income group and the categories of the job satisfaction

  50. 7. An illustration … (contd) • Surface of the Correlation Coefficient Function according to Table 6.
