
Sensitivity Analysis of Enumerated Trees of Increasing Boolean Expressions

Saket Anand, David Madigan, Richard Mammone, Fred Roberts


Presentation Transcript


  1. Sensitivity Analysis of Enumerated Trees of Increasing Boolean Expressions • Saket Anand, David Madigan, Richard Mammone, Fred Roberts

  2. Enumeration and Selection of Optimum Decision Tree • A set of decision trees is constructed for each complete and monotonic Boolean function, where the inputs represent the tests performed by each sensor. • The cost of each tree is evaluated and the optimum tree is selected. • Y = f(A, B, C), where f is complete and monotonic. [Figure: example decision trees over sensors A, B and C with leaves 0 and 1]

  3. Enumeration and Selection of Optimum Decision Tree • The decision trees are constructed using 4 sensors. • For four sensors, there are 114 monotonic and complete Boolean expressions. These can be implemented using 11,808 distinct trees. • The trees are evaluated and ranked using the cost function [1]. • The tree with the lowest cost is selected as the optimum decision tree. [1] Stroud, P. D. and Saeger, K. J., “Enumeration of Increasing Boolean Expressions and Alternative Digraph Implementations for Diagnostic Applications”, Proceedings Vol. IV, Computer, Communication and Control Technologies
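The count of complete, monotonic Boolean functions can be checked by brute force. The sketch below is not the authors' enumeration procedure; it simply tests every truth table on n inputs, assuming that "monotonic" means raising any input from 0 to 1 never lowers the output and that "complete" means the output depends on every input. The function names are illustrative.

```python
from itertools import product


def is_monotone(table, n):
    """table[i] is the output for the input whose bits spell the integer i."""
    for i in range(1 << n):
        for b in range(n):
            # Raising input b from 0 to 1 must not lower the output.
            if table[i] > table[i | (1 << b)]:
                return False
    return True


def is_complete(table, n):
    """True if flipping each input changes the output for at least one setting."""
    return all(
        any(table[i] != table[i ^ (1 << b)] for i in range(1 << n))
        for b in range(n)
    )


def count_complete_monotone(n):
    """Count truth tables on n inputs that are both monotone and complete."""
    return sum(
        1
        for table in product((0, 1), repeat=1 << n)
        if is_monotone(table, n) and is_complete(table, n)
    )


if __name__ == "__main__":
    for n in (2, 3, 4):
        print(n, count_complete_monotone(n))
```

Under these definitions the count is 9 for three inputs and 114 for four inputs. Counting the distinct trees that implement each function is a separate step and is not attempted here.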

  4. Cost Function used for evaluating the decision trees
     CTot = CFalsePositive * PFalsePositive + CFalseNegative * PFalseNegative + Cfixed
     where
     • CFalsePositive is the cost of a false positive (Type I error)
     • CFalseNegative is the cost of a false negative (Type II error)
     • PFalsePositive is the probability of a false positive occurring
     • PFalseNegative is the probability of a false negative occurring
     • Cfixed is the fixed cost of utilization of the tree.
     The error probability of the entire tree is computed from the error probabilities of the individual sensors.
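One way to compute a tree's error probabilities from those of its sensors is a recursive walk over the tree. The sketch below rests on assumptions that the slides do not state: sensor readings are treated as conditionally independent given the state of nature X, and Cfixed is taken to be the expected cost of the sensors actually consulted. The tree encoding and all names are hypothetical. The expansion PFalsePositive = P(X=0) * P(Y=1|X=0) and PFalseNegative = P(X=1) * P(Y=0|X=1) follows the form of the cost equation shown on later slides.

```python
def walk(tree, p_fire, cost):
    """Return (P(tree outputs 1), expected sensor cost) for one state of nature.

    tree   : 0 or 1 for a leaf, else (sensor, subtree_if_reading_0, subtree_if_reading_1)
    p_fire : dict mapping sensor name -> P(sensor reads 1) in this state of nature
    cost   : dict mapping sensor name -> cost of using that sensor once
    """
    if isinstance(tree, int):
        return float(tree), 0.0
    sensor, if_zero, if_one = tree
    p1 = p_fire[sensor]
    out0, cost0 = walk(if_zero, p_fire, cost)
    out1, cost1 = walk(if_one, p_fire, cost)
    # Assumption: sensor readings are independent given the state of nature.
    p_out = (1 - p1) * out0 + p1 * out1
    expected_cost = cost[sensor] + (1 - p1) * cost0 + p1 * cost1
    return p_out, expected_cost


def total_cost(tree, p_x1, c_fp, c_fn, p_fire_x0, p_fire_x1, cost):
    """CTot = CFalsePositive*PFalsePositive + CFalseNegative*PFalseNegative + Cfixed."""
    p_out1_x0, cost_x0 = walk(tree, p_fire_x0, cost)
    p_out1_x1, cost_x1 = walk(tree, p_fire_x1, cost)
    p_false_positive = (1 - p_x1) * p_out1_x0
    p_false_negative = p_x1 * (1 - p_out1_x1)
    # Assumption: Cfixed is the expected cost of the sensors that get used.
    c_fixed = (1 - p_x1) * cost_x0 + p_x1 * cost_x1
    return c_fp * p_false_positive + c_fn * p_false_negative + c_fixed


if __name__ == "__main__":
    # Sensor values for A, C and D as quoted on the "Cost Sensitivity" slide.
    cost = {"A": 0.25, "C": 10.0, "D": 30.0}
    p_fire_x1 = {"A": 0.9856, "C": 0.9265, "D": 0.9893}
    p_fire_x0 = {"A": 0.0144, "C": 0.0735, "D": 0.0107}
    # Hypothetical tree: alarm if A fires, otherwise only if both C and D fire.
    tree = ("A", ("C", 0, ("D", 0, 1)), 1)
    print(total_cost(tree, 1e-5, 180.0, 25e6, p_fire_x0, p_fire_x1, cost))
```

Representing a tree as nested tuples keeps the sketch short; ranking a set of trees then amounts to sorting them by `total_cost`.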

  5. Probability of Error for Individual Sensors
     • For the ith sensor, the type 1 error P(Yi=1|X=0) and the type 2 error P(Yi=0|X=1) are modeled using Gaussian distributions.
     • State of nature X=0 represents absence of a bomb.
     • State of nature X=1 represents presence of a bomb.
     • Yi represents the outcome of sensor i.
     • Sensor i is characterized by:
       • Ki, discrimination coefficient
       • Ti, decision threshold
       • Σi, variance of the distributions
     [Figure: characteristics of a typical sensor, showing the distributions P(Yi|X=0) and P(Yi|X=1), their separation Ki and the threshold Ti]

  6. Receiver Operating Characteristic (ROC) Curve
     • The ROC curve is the plot of the probability of correct detection (PD) vs. the probability of false positive (PF).
     • The ROC curve is used to select an operating point, which sets the trade-off between PD and PF.
     • Each sensor has a ROC curve, and the combination of the sensors into a decision tree has a composite ROC curve.
     • Different operating points on the ROC curve are obtained by varying the threshold of an individual sensor or, for a decision tree, the combination of its sensors' thresholds.
     • The Equal Error Rate (EER) is the operating point on the ROC curve where PF = 1 - PD.
     [Figure: ROC curve with PF and PD axes from 0 to 1, marking the operating point and the EER; inset of the sensor distributions P(Yi|X=0) and P(Yi|X=1) with Ki and Ti]
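The EER condition PF = 1 - PD can be solved numerically for a single sensor. The sketch below uses the erfc expressions given on the later "Sensitivity to Sensor Performance" slide and locates the threshold by bisection; the function names, the search interval and the tolerance are illustrative choices rather than anything taken from the presentation.

```python
from math import erfc, sqrt


def p_false_positive(t):
    """P(Yi=1|X=0) = 0.5 erfc[Ti / sqrt(2)], as written on the slide."""
    return 0.5 * erfc(t / sqrt(2))


def p_detection(t, k, sigma=1.0):
    """P(Yi=1|X=1) = 0.5 erfc[(Ti - Ki) / (sigma * sqrt(2))]."""
    return 0.5 * erfc((t - k) / (sigma * sqrt(2)))


def equal_error_threshold(k, sigma=1.0, lo=-10.0, hi=20.0, tol=1e-10):
    """Bisect for the threshold where PF equals 1 - PD.

    PF falls and 1 - PD rises as the threshold increases, so their
    difference changes sign exactly once on a wide enough interval.
    """
    def gap(t):
        return p_false_positive(t) - (1.0 - p_detection(t, k, sigma))

    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if gap(mid) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    t_eer = equal_error_threshold(4.37)  # K for sensor A
    print(t_eer, p_false_positive(t_eer), p_detection(t_eer, 4.37))
```

With a spread factor of 1 the two distributions are mirror images, so the bisection should land on half the discrimination coefficient.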

  7. Stroud-Saeger Experiments
     • Stroud and Saeger ranked all trees formed from four given sensors A, B, C and D in order of increasing tree cost. The cost function used was the one shown on the earlier slides.
     • Values used in their experiment:
       CA = .25; KA = 4.37; ΣA = 1
       CB = .25; KB = 1.53; ΣB = 1
       CC = 10; KC = 2.9; ΣC = 1
       CD = 30; KD = 4.6; ΣD = 1
       where Ci is the individual cost of utilization of sensor i, Ki is the sensor discrimination power and Σi is the relative spread factor for sensor i.
     • Values of other variables are not known.

  8. Cost Sensitivity to Global Parameters
     • Values used in the experiment:
       CA = .25; P(YA=1|X=1) = .9856; P(YA=1|X=0) = .0144
       CB = 1; P(YB=1|X=1) = .7779; P(YB=1|X=0) = .2221
       CC = 10; P(YC=1|X=1) = .9265; P(YC=1|X=0) = .0735
       CD = 30; P(YD=1|X=1) = .9893; P(YD=1|X=0) = .0107
       where Ci is the individual cost of utilization of sensor i. The probabilities were computed for a threshold corresponding to the equal error rate.
     • CFalseNegative is varied between 25 million and 500 billion dollars (low and high estimates of the direct and indirect costs incurred due to a false negative).
     • CFalsePositive is varied between 180 and 720 dollars (cost incurred due to a false positive: 4 men * (3 to 6 hrs) * (15 to 30 $/hr)).
     • P(X=1) is varied between 3/10^9 and 1/100,000.

  9. Structure of the trees that ranked first with 3 sensors (A, C and D) • Tree number 37, Boolean Expr: 00011111 • Tree number 49, Boolean Expr: 01010111 • Tree number 55, Boolean Expr: 01111111 [Figure: diagrams of the three trees, with nodes labeled a, b, c and leaves 0 and 1]
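The Boolean expression strings can be read as truth tables. The transcript does not say how the bits are ordered, so the sketch below assumes that the character at position i (counting from 0 on the left) is the output for the input combination whose binary representation is i. Under that assumption, a monotonic function is fully described by its minimal "true" inputs, which the hypothetical helper `minimal_true_points` extracts.

```python
def minimal_true_points(expr):
    """List the minimal input patterns for which the function outputs 1.

    Assumed encoding: expr[i] is the output for the input whose n-bit binary
    representation is i. A true point is minimal if no other true point can
    be obtained from it by turning some of its 1-inputs off.
    """
    n = len(expr).bit_length() - 1
    if len(expr) != 1 << n:
        raise ValueError("length of expr must be a power of two")
    true_points = [i for i, ch in enumerate(expr) if ch == "1"]
    minimal = [
        i
        for i in true_points
        if not any(j != i and (j & i) == j for j in true_points)
    ]
    return [format(i, "0{}b".format(n)) for i in minimal]


if __name__ == "__main__":
    for expr in ("00011111", "01010111", "01111111"):
        print(expr, minimal_true_points(expr))
```

Each returned pattern corresponds to one AND term, and the function is the OR of those terms. For example, the patterns 011 and 100 describe "first input, or both of the other two". Which sensor plays which role depends on the bit ordering, which remains an assumption here.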

  10. Frequency of optimal trees with 3 sensors (A, C and D) when one parameter was varied • Randomly selected fixed parameter values

  11. Variation of CTot vs. CFalseNegative • P(X=1) and CFalsePositive were kept constant at the specified values, and CTot was computed for 10,000 randomly selected values of CFalseNegative in the specified range. • Randomly selected fixed parameter values

  12. Variation of CTot vs. CFalsePositive • P(X=1) and CFalseNegative were kept constant at the specified values, and CTot was computed for 10,000 randomly selected values of CFalsePositive in the specified range. • Randomly selected fixed parameter values

  13. Variation of CTot vs. P(X=1) • CFalsePositive and CFalseNegative were kept constant at the specified values, and CTot was computed for 10,000 randomly selected values of P(X=1) in the specified range. • Randomly selected fixed parameter values
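The one-parameter experiments above can be sketched as a small Monte Carlo loop: hold two parameters fixed, draw 10,000 values of the third, and record which candidate has the lowest CTot for each draw. The sketch below makes several assumptions: the draws are uniform over the stated range (the slides only say "randomly selected"), the fixed values are picked arbitrarily from within the stated ranges, and the two candidates are single-sensor stand-ins built from the sensor A and sensor D values quoted on slide 8, not trees from the study.

```python
import random
from collections import Counter

P_X1 = 1e-5                 # fixed P(X=1), upper end of the stated range
C_FP = 180.0                # fixed CFalsePositive, lower end of the stated range
C_FN_RANGE = (25e6, 500e9)  # range over which CFalseNegative is sampled

# Stand-in candidates: (P(Y=1|X=0), P(Y=0|X=1), Cfixed), from the slide-8 values.
CANDIDATES = {
    "A only": (0.0144, 0.0144, 0.25),
    "D only": (0.0107, 0.0107, 30.0),
}


def total_cost(c_fp, c_fn, p_x1, p_fp_given_x0, p_fn_given_x1, c_fixed):
    """CTot = CFP*P(X=0)*P(Y=1|X=0) + CFN*P(X=1)*P(Y=0|X=1) + Cfixed."""
    return (c_fp * (1 - p_x1) * p_fp_given_x0
            + c_fn * p_x1 * p_fn_given_x1
            + c_fixed)


def rank_one_frequency(n_samples=10_000, seed=0):
    """Count how often each candidate is cheapest over sampled CFalseNegative."""
    rng = random.Random(seed)
    wins = Counter()
    for _ in range(n_samples):
        c_fn = rng.uniform(*C_FN_RANGE)  # assumption: uniform sampling
        best = min(
            CANDIDATES,
            key=lambda name: total_cost(C_FP, c_fn, P_X1, *CANDIDATES[name]),
        )
        wins[best] += 1
    return wins


if __name__ == "__main__":
    print(rank_one_frequency())
```

Replacing `CANDIDATES` with the full set of enumerated trees would give a frequency table of the kind the slides describe; the sampling loop itself would not change.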

  14. Frequency of optimal trees with 3 sensors (A, C and D) when one parameter was varied • Fixed parameters set to the Stroud and Saeger values

  15. Variation of CTot vs. CFalseNegative • P(X=1) and CFalsePositive were kept constant at the specified values, and CTot was computed for 10,000 randomly selected values of CFalseNegative in the specified range. • Fixed parameters set to the Stroud and Saeger values

  16. Variation of CTot vs. CFalsePositive • P(X=1) and CFalseNegative were kept constant at the specified values, and CTot was computed for 10,000 randomly selected values of CFalsePositive in the specified range. • Fixed parameters set to the Stroud and Saeger values

  17. Variation of CTot vs. P(X=1) • CFalsePositive and CFalseNegative were kept constant at the specified values, and CTot was computed for 10,000 randomly selected values of P(X=1) in the specified range. • Fixed parameters set to the Stroud and Saeger values

  18. Variation of CTot with respect to CFalseNegative and CFalsePositive • Randomly selected fixed parameter values • CTot = CFalsePositive * P(X=0) * P(Y=1|X=0) + CFalseNegative * P(X=1) * P(Y=0|X=1) + Cfixed

  19. Variation of CTot with respect to CFalseNegative and P(X=1) • Randomly selected fixed parameter values • CTot = CFalsePositive * P(X=0) * P(Y=1|X=0) + CFalseNegative * P(X=1) * P(Y=0|X=1) + Cfixed

  20. Variation of CTot with respect to CFalsePositive and P(X=1) • Randomly selected fixed parameter values • CTot = CFalsePositive * P(X=0) * P(Y=1|X=0) + CFalseNegative * P(X=1) * P(Y=0|X=1) + Cfixed
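Because CTot is linear in CFalsePositive and CFalseNegative, the boundary along which two trees swap rank can be written in closed form. The derivation below is a sketch under assumed notation that does not appear on the slides: π = P(X=1), α_k = P(Y=1|X=0) and β_k = P(Y=0|X=1) for tree k, and F_k for its Cfixed, which is treated here as independent of the varied parameters.

```latex
\begin{align*}
C_{\mathrm{Tot}}^{(k)} &= C_{FP}\,(1-\pi)\,\alpha_k + C_{FN}\,\pi\,\beta_k + F_k \\
C_{\mathrm{Tot}}^{(i)} = C_{\mathrm{Tot}}^{(j)}
  &\iff C_{FP}\,(1-\pi)\,(\alpha_i-\alpha_j) + C_{FN}\,\pi\,(\beta_i-\beta_j) + (F_i - F_j) = 0 \\
  &\iff C_{FN}^{*} = \frac{C_{FP}\,(1-\pi)\,(\alpha_j-\alpha_i) + (F_j - F_i)}{\pi\,(\beta_i-\beta_j)},
  \qquad \beta_i \neq \beta_j
\end{align*}
```

For fixed π the boundary is a straight line in the (CFalsePositive, CFalseNegative) plane. When π is small and CFalsePositive is held fixed, the boundary is approximately a curve of constant CFalseNegative * P(X=1).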

  21. Variation of CTot with respect to CFalseNegative and CFalsePositive • Fixed parameters set to the Stroud and Saeger values • CTot = CFalsePositive * P(X=0) * P(Y=1|X=0) + CFalseNegative * P(X=1) * P(Y=0|X=1) + Cfixed

  22. Variation of CTot with respect to CFalseNegative and P(X=1) • Fixed parameters set to the Stroud and Saeger values • CTot = CFalsePositive * P(X=0) * P(Y=1|X=0) + CFalseNegative * P(X=1) * P(Y=0|X=1) + Cfixed

  23. Variation of CTot with respect to CFalsePositive and P(X=1) • Fixed parameters set to the Stroud and Saeger values • CTot = CFalsePositive * P(X=0) * P(Y=1|X=0) + CFalseNegative * P(X=1) * P(Y=0|X=1) + Cfixed

  24. Tree Structure and corresponding Boolean Expressions • Tree number 11785, Boolean Expr: 0111111111111111 • Tree number 11605, Boolean Expr: 0101011111111111 [Figure: diagrams of the two trees over sensors a, b, c and d]

  25. Tree Structure and corresponding Boolean Expressions • Tree number 9133, Boolean Expr: 0001010111111111 • Tree number 8965, Boolean Expr: 0001010101111111 [Figure: diagrams of the two trees over sensors a, b, c and d]

  26. Tree Structure and corresponding Boolean Expressions • Tree number 6797, Boolean Expr: 0001000101111111 • Tree number 2473, Boolean Expr: 0000000101111111 [Figure: diagrams of the two trees over sensors a, b, c and d]

  27. Tree Structure and corresponding Boolean Expressions • Tree number 11305, Boolean Expr: 0101010101111111 [Figure: diagram of the tree over sensors a, b, c and d]

  28. Variation of CTot vs. CFalseNegative • P(X=1) and CFalsePositive were kept constant at the specified values, and CTot was computed for 10,000 randomly selected values of CFalseNegative in the specified range. • Randomly selected fixed parameter values

  29. Variation of CTot vs. CFalsePositive • P(X=1) and CFalseNegative were kept constant at the specified values, and CTot was computed for 10,000 randomly selected values of CFalsePositive in the specified range. • Randomly selected fixed parameter values

  30. Variation of CTot vs. P(X=1) • CFalsePositive and CFalseNegative were kept constant at the specified values, and CTot was computed for 10,000 randomly selected values of P(X=1) in the specified range. • Randomly selected fixed parameter values

  31. Variation of CTot vs. CFalseNegative • P(X=1) and CFalsePositive were kept constant at the specified values, and CTot was computed for 10,000 randomly selected values of CFalseNegative in the specified range. • Fixed parameters set to the Stroud and Saeger values

  32. Variation of CTot vs. CFalsePositive • P(X=1) and CFalseNegative were kept constant at the specified values, and CTot was computed for 10,000 randomly selected values of CFalsePositive in the specified range. • Fixed parameters set to the Stroud and Saeger values

  33. Variation of CTot vs. P(X=1) • CFalsePositive and CFalseNegative were kept constant at the specified values, and CTot was computed for 10,000 randomly selected values of P(X=1) in the specified range. • Fixed parameters set to the Stroud and Saeger values

  34. Frequency of optimal trees with 4 sensors when two parameters were varied. The fixed parameters were randomly selected.

  35. Randomly selected fixed parameter values

  36. Variation of CTot with respect to CFalseNegative and CFalsePositive • Randomly selected fixed parameter values • CTot = CFalsePositive * P(X=0) * P(Y=1|X=0) + CFalseNegative * P(X=1) * P(Y=0|X=1) + Cfixed

  37. Variation of CTot with respect to CFalseNegative and P(X=1) • Randomly selected fixed parameter values • CTot = CFalsePositive * P(X=0) * P(Y=1|X=0) + CFalseNegative * P(X=1) * P(Y=0|X=1) + Cfixed

  38. Variation of CTot with respect to CFalsePositive and P(X=1) • Randomly selected fixed parameter values • CTot = CFalsePositive * P(X=0) * P(Y=1|X=0) + CFalseNegative * P(X=1) * P(Y=0|X=1) + Cfixed

  39. Frequency of optimal trees with 4 sensors when two parameters were varied. The fixed parameters were selected at the Stroud and Saeger values.

  40. Variation of CTot with respect to CFalseNegative and CFalsePositive • Fixed parameters set to the Stroud and Saeger values • CTot = CFalsePositive * P(X=0) * P(Y=1|X=0) + CFalseNegative * P(X=1) * P(Y=0|X=1) + Cfixed

  41. Variation of CTot with respect to CFalseNegative and P(X=1) • Fixed parameters set to the Stroud and Saeger values • CTot = CFalsePositive * P(X=0) * P(Y=1|X=0) + CFalseNegative * P(X=1) * P(Y=0|X=1) + Cfixed

  42. Variation of CTot with respect to CFalsePositive and P(X=1) • Fixed parameters set to the Stroud and Saeger values • CTot = CFalsePositive * P(X=0) * P(Y=1|X=0) + CFalseNegative * P(X=1) * P(Y=0|X=1) + Cfixed

  43. Sensitivity to Sensor Performance
     • The following experiments were done using sensors A, B, C and D as described below, by varying the individual sensor thresholds TA, TB and TC from -4.0 to +4.0 in steps of 0.4. These values were chosen because they give a ROC curve for the individual sensors over the complete range of P(Yi=1|X=0) and P(Yi=1|X=1).
       CA = .25; KA = 4.37; ΣA = 1
       CB = .25; KB = 1.53; ΣB = 1
       CC = 15; KC = 2.9; ΣC = 1
       CD = 30; KD = 4.6; ΣD = 1
       where Ci is the individual cost of utilization of sensor i, Ki is the discrimination power of the sensor and Σi is the spread factor for the sensor.
     • The probability of false positive for the ith sensor is computed as: P(Yi=1|X=0) = 0.5 erfc[Ti/√2]
     • The probability of detection for the ith sensor is computed as: P(Yi=1|X=1) = 0.5 erfc[(Ti-Ki)/(Σi√2)]
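The two erfc expressions and the threshold grid translate directly into code. The sketch below evaluates each sensor on the 21 thresholds from -4.0 to +4.0 and, as a cross-check, evaluates the operating point T = K/2, which for a spread factor of 1 is where the false positive rate equals the miss rate. Variable and function names are illustrative.

```python
from math import erfc, sqrt

# Discrimination coefficients K_i from the slide; every spread factor is 1.
SENSORS = {"A": 4.37, "B": 1.53, "C": 2.9, "D": 4.6}


def p_false_positive(t):
    """P(Yi=1|X=0) = 0.5 erfc[Ti / sqrt(2)]."""
    return 0.5 * erfc(t / sqrt(2))


def p_detection(t, k, sigma=1.0):
    """P(Yi=1|X=1) = 0.5 erfc[(Ti - Ki) / (sigma * sqrt(2))]."""
    return 0.5 * erfc((t - k) / (sigma * sqrt(2)))


# Threshold grid: -4.0 to +4.0 in steps of 0.4, i.e. 21 values.
THRESHOLDS = [round(-4.0 + 0.4 * i, 1) for i in range(21)]

# One (PF, PD) point per threshold gives a sampled ROC curve for each sensor.
ROC = {
    name: [(p_false_positive(t), p_detection(t, k)) for t in THRESHOLDS]
    for name, k in SENSORS.items()
}

# Equal-error operating point for sigma = 1: the threshold sits at K/2.
EER = {
    name: (p_detection(k / 2, k), p_false_positive(k / 2))
    for name, k in SENSORS.items()
}

if __name__ == "__main__":
    for name, (pd, pf) in EER.items():
        print(name, round(pd, 4), round(pf, 4))
```

The equal-error values computed this way should agree, to about four decimal places, with the detection and false positive probabilities listed on the "Cost Sensitivity to Global Parameters" slide.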

  44. Frequency of optimal trees with 3 sensors when the thresholds were varied. The fixed parameters (CFalsePositive, CFalseNegative, P(X=1)) were selected randomly. Fifteen trees attained rank one, of which tree number 37 was the most frequent.

  45. Performance (ROC) of Best Decision Tree for Tree number 37

  46. Performance (ROC) of Best Decision Tree for Tree number 37

  47. Frequency of optimal trees with 4 sensors when the thresholds were varied. The fixed parameters (CFalsePositive, CFalseNegative, P(X=1)) were selected randomly. 244 trees attained rank one, of which tree number 445 was the most frequent. Only the 15 most frequently occurring optimal trees out of the 241 are tabulated below.
