Computational Intelligence in Biomedical and Health Care Informatics HCA 590 (Topics in Health Sciences) Rohit Kate Probabilistic Reasoning
Reading • Chapter 10, Main Textbook.
What is Probability? • In ordinary language, probability is the degree of certainty of an event. For example, “It is very probable that it will rain today.” • Probability theory gives a formal mathematical framework for working with numerical estimates of the certainty of events, with which one can: • Predict the likelihood of events and their combinations • Predict the most likely outcome • Predict some event given that some other events have happened
Why is Probability Theory Important in Medicine? • Clinical data are imperfect: results of diagnostic tests, histories given by patients, and outcomes of treatment are often uncertain • Probabilistic reasoning can help health care providers deal with the uncertainties inherent in medical decisions • Instead of using ad hoc arithmetic to encode estimates of certainty, probability theory is preferable because it has sound mathematics behind it, for example, rules for combining probabilities
Definitions • Sample Space (Ω): Space of all possible outcomes • Outcomes of rolling a die: {1,2,3,4,5,6} • Event: Subset of a sample space • An even number will show up: {2,4,6} • Number 2 will show up: {2}
Definitions • Probability function (or probability distribution), denoted by P(event): Mapping from events to a real number, such that: 0 <= P(Any event) <= 1 Probability of any event is between 0 and 1, both inclusive. P(Sure event) = 1 Probability of a sure event is 1. P(Impossible event) = 0 Probability of an impossible event is 0.
Definitions • P(A∩B), or simply P(A,B), indicates that both events A and B occur (intersection of events) • P(A∪B) indicates that event A or event B occurs (union of events) • P(A∪B) = P(A) + P(B) – P(A,B) [Venn diagram of A ∪ B and A ∩ B]
Disjoint Events • If A and B are disjoint, i.e. A∩B = ∅, then P(A∪B) = P(A) + P(B) Corollary: Probabilities of all possible disjoint outcomes must add to 1. For example, either A or not A occurs, hence P(A ∪ not A) = 1, and as they are disjoint, P(A ∪ not A) = P(A) + P(not A) = 1
Examples • For rolling a die P({1,2,3,4,5,6}) = 1 (some outcome shows up, a sure event) • Suppose each basic outcome is equally likely; since they are all disjoint and add up to 1, we have P({1}) = 1/6, P({2}) = 1/6, P({3}) = 1/6, … • P(an even number shows up) = P({2,4,6}) = P({2}) + P({4}) + P({6}) = 1/6 + 1/6 + 1/6 = 1/2 • P(an even number or 3 shows up) = 1/2 + 1/6 = 2/3 • P(an even number or 6 shows up) ≠ 1/2 + 1/6 Why? • Because the two events are not disjoint • Even number or 6 = {2,4,6} • Hence P({2,4,6}) = 1/2 (as calculated above)
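The die calculations above can be checked with a short Python sketch using exact fractions (the variable and function names here are illustrative, not from the slides):

```python
from fractions import Fraction

# Sample space of a fair die: each of the 6 outcomes is equally likely
omega = {1, 2, 3, 4, 5, 6}

def P(event):
    """Probability of an event, i.e. a subset of the sample space."""
    return Fraction(len(event & omega), len(omega))

even = {2, 4, 6}
print(P(even))                            # 1/2
print(P(even) + P({3}))                   # disjoint events, so add: 2/3
# "Even or 6" is NOT disjoint, so simple addition would overcount:
print(P(even | {6}))                      # still 1/2
print(P(even) + P({6}) - P(even & {6}))   # inclusion-exclusion also gives 1/2
```

Using exact `Fraction` arithmetic avoids floating-point rounding and matches the slide's 1/6-style answers directly.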
Definitions • Conditional Probability: Updated probability of an event given that some other event happened P({2}) = 1/6 P(Even number) = P({2,4,6}) P({2} given an even number showed up) = ? Represented as: P({2}|{2,4,6}), in general P(A|B) Definition: P(A|B) = P(A,B)/P(B) (for P(B) > 0) P({2}|{2,4,6}) = P({2})/P({2,4,6}) = (1/6)/(1/2) = 1/3
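Continuing the die example, the definition of conditional probability can be computed directly (the helper names `P` and `P_given` are mine):

```python
from fractions import Fraction

# Sample space of a fair die
omega = {1, 2, 3, 4, 5, 6}

def P(event):
    """Probability of an event (subset of the sample space)."""
    return Fraction(len(event & omega), len(omega))

def P_given(a, b):
    """Conditional probability P(A|B) = P(A,B)/P(B), requires P(B) > 0."""
    return P(a & b) / P(b)

# P({2} | an even number showed up)
print(P_given({2}, {2, 4, 6}))  # 1/3
```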
Multiplicative and Chain Rules • Multiplicative rule: P(A,B) = P(B)*P(A|B) = P(A)*P(B|A) Probability that A and B happened is equal to the probability that B happened times the probability that A happened given B has happened, or vice versa • Generalization of the rule, the chain rule: P(A1,A2,…,An) = P(A1)*P(A2|A1)*P(A3|A1,A2)*…*P(An|A1,…,An-1) The As may be taken in any order
Interpretation of Probability • Frequentist interpretation: P({3}) = 1/6: If a die is rolled many times, then in the long run 3 will show up 1/6 of the time P(It will rain tomorrow) = 1/2 ?? • Subjectivist interpretation: One’s degree of belief that the event will happen The mathematical rules should hold for both interpretations.
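The frequentist reading can be illustrated by simulation: roll a simulated fair die many times and watch the relative frequency of a 3 approach 1/6 (the seed and sample size are arbitrary choices for reproducibility):

```python
import random

# Simulate many rolls of a fair die
random.seed(0)
n = 100_000
rolls = [random.randint(1, 6) for _ in range(n)]

# Relative frequency of rolling a 3; in the long run this
# converges to the probability 1/6 ≈ 0.1667
freq = rolls.count(3) / n
print(freq)
```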
Estimating Probabilities • For well-defined sample spaces and events, probabilities can be estimated analytically: P({3}) = 1/6 (assuming a fair die) • For many other sample spaces analytical estimation is not possible, for example P(A teenager will drink and drive) • In these cases probabilities can be estimated empirically from a good sample: P(A teenager will drink and drive) = (# of teenagers who drink and drive)/(# of teenagers)
Estimating Probability in Medicine • Physicians often make assessments about probability based on personal experience, “What was the likelihood of disease in similar patients I have seen?” • Physicians can avoid some of these difficulties by using published research results to estimate probabilities
Bayes’ Theorem • Lets us calculate P(B|A) in terms of P(A|B) • For example, using Bayes’ theorem we can calculate P(Hypothesis|Evidence) in terms of P(Evidence|Hypothesis) which is usually easier to estimate. • Hypothesis could be a disease and Evidence could be symptoms
Bayes’ Theorem • Simple proof from definition of conditional probability: P(B|A) = P(A,B)/P(A) (def. cond. prob.) P(A,B) = P(A|B)*P(B) (def. cond. prob.) Hence P(B|A) = P(A|B)*P(B)/P(A) QED
Bayes’ Theorem Example • P(disease|symptom) is something one often wants to compute • P(disease|symptom) = P(symptom|disease)*P(disease)/P(symptom) • The quantities on the right side of the equation are often easier to estimate from data than directly estimating the left side
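Bayes' theorem is a one-line computation; the sketch below (function name `bayes` is mine) checks it against the earlier die example, where the answer is already known from the definition of conditional probability:

```python
from fractions import Fraction

def bayes(p_e_given_h, p_h, p_e):
    """Bayes' theorem: P(H|E) = P(E|H) * P(H) / P(E)."""
    return p_e_given_h * p_h / p_e

# Die check: H = "a 2 showed up", E = "an even number showed up"
# P(E|H) = 1, P(H) = 1/6, P(E) = 1/2, so P(H|E) should be 1/3
print(bayes(Fraction(1), Fraction(1, 6), Fraction(1, 2)))  # 1/3
```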
Bayesian Inference for Diagnostic Test • Given that a test is 90% positive, what does that mean? • If you test positive then you have 90% chance of disease P(Disease|Test) • If you have the disease then there is 90% chance that you will test positive P(Test|Disease)
Bayesian Inference for Diagnostic Test • Given that a test is 90% positive, what does that mean? • If you test positive then you have 90% chance of disease P(Disease|Test) Incorrect • If you have the disease then there is 90% chance that you will test positive P(Test|Disease) Correct • Also known as sensitivity
Bayesian Inference for Diagnostic Test • Sensitivity: Probability of testing positive given that you have the disease P(Test|Disease) • Specificity: Probability of testing negative given that you do not have the disease P(¬Test|¬Disease), “¬” means “not” or “negative” • Often there is a trade-off between the sensitivity and specificity of a test • The more aggressive a test is at finding the disease (high sensitivity), the more likely it is to flag false cases (low specificity)
Bayesian Inference for Diagnostic Test • Usually sensitivity and specificity are given for a test, but you really want to know the probability that you have the disease given you tested positive P(Disease|Test) or the probability that you do not have the disease given you tested negative P(¬Disease|¬Test) • How to calculate P(Disease|Test) given P(Test|Disease)? • Bayes’ theorem helps out
Bayesian Inference for Diagnostic Test • Example (from book): • Suppose mammography has sensitivity of 0.77 and specificity of 0.95 • Overall occurrence of breast cancer in the screening population is 0.6% or 0.006 probability • What is the probability that someone has breast cancer if the test is positive?
Bayesian Inference for Diagnostic Test • Example: m = mammography was positive, bc = breast cancer is present P(m|bc) = 0.77, P(¬m|¬bc) = 0.95, P(bc) = 0.006 P(bc|m) = ? From Bayes’ theorem: P(bc|m) = P(m|bc)*P(bc)/P(m) P(m) = P(m,bc) + P(m,¬bc) (disjoint) = P(m|bc)*P(bc) + P(m|¬bc)*P(¬bc) (def. cond. prob.) = P(m|bc)*P(bc) + (1-P(¬m|¬bc))*(1-P(bc)) P(bc|m) = P(m|bc)*P(bc) / (P(m|bc)*P(bc) + (1-P(¬m|¬bc))*(1-P(bc))) = 0.77*0.006/(0.77*0.006 + (1-0.95)*(1-0.006)) ≈ 0.085 Or 8.5%, which is low, even though the test seems reasonably accurate!
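The mammography calculation above can be reproduced in a few lines of Python using the slide's numbers (variable names are mine):

```python
# Numbers from the mammography example
sens = 0.77    # sensitivity, P(m|bc)
spec = 0.95    # specificity, P(not m|not bc)
prior = 0.006  # prevalence, P(bc)

# Marginalize over disease status:
# P(m) = P(m|bc)*P(bc) + P(m|not bc)*P(not bc)
p_m = sens * prior + (1 - spec) * (1 - prior)

# Bayes' theorem: P(bc|m) = P(m|bc)*P(bc)/P(m)
p_bc_given_m = sens * prior / p_m
print(round(p_bc_given_m, 3))  # 0.085
```

The low posterior is driven by the low prevalence (0.6%): most positives come from the large healthy population, even with a 5% false-positive rate.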