Paper study - Application Of Variable Precision Rough Set Approach To Car Driver Assessment

Presented by: Lichun (Jack) Zhu • Course: 60-539, Winter 2006 • Instructor: Dr. Christie Ezeife • University of Windsor

Agenda • Introduction • Rough Set Theory • Variable Precision Rough Set Theory • Linear Hierarchy of Decision Table (HDTL) Algorithm • How the data is prepared • Result interpretation • Summary and Conclusion • Q & A

Introduction • Problem Statement: • Need to identify unsafe car drivers from historical driving records. • Driving records in the database are incomplete and inaccurate. • Solution: a new approach to analyzing data that contains inaccurate information • Variable Precision Rough Set Theory • Linear Hierarchy of Decision Table algorithm • Classification

Introduction to Rough Set Theory • Background • First introduced by Pawlak (1982) • A mathematical method for describing uncertainty and incompleteness • Basic concepts • Terms: Information System S, Universe U, Attributes A (condition attributes, decision attributes) • S = (U, A)

Introduction to Rough Set Theory • Domain Va: with every attribute a of A we associate a set Va, the domain of a, such as Vs = {Male, Female}. • Indiscernibility relation I(B): for B ⊂ A, I(B) is defined on U by (x, y) ∈ I(B) if and only if a(x) = a(y) for every a ∈ B, where a(x) is the value of attribute a for tuple x. I(B) is an equivalence relation. • B-elementary sets {B1, …, Bi, …}: the blocks of the partition U/I(B), or simply U/B. We also define B(x) = Bi, where x ∈ Bi.

Introduction to Rough Set Theory • An example of an Information System: Table 1. U = {1,2,3,4,5,6}, A = {S, G, N, R}. Let B = {S, G, N}. Then I(B) = {(1,1), (1,6), (6,1), (2,2), (3,3), (4,4), (5,5), (6,6)} and U/B = {{1,6}, {2}, {3}, {4}, {5}} = {B1, B2, B3, B4, B5}.
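The step from an information system to I(B) and U/B can be sketched in Python. The cell values of Table 1 are not reproduced in this transcript, so the values in `table` below are hypothetical placeholders, chosen only so that tuples 1 and 6 agree on S, G and N. The function names are also assumptions made for this illustration.

```python
from collections import defaultdict

# HYPOTHETICAL attribute values: Table 1's real cells are not in the transcript.
# They are picked only so that tuples 1 and 6 are indiscernible on B = {S, G, N}.
table = {
    1: {"S": "M", "G": "g1", "N": "n1"},
    2: {"S": "M", "G": "g1", "N": "n2"},
    3: {"S": "F", "G": "g1", "N": "n1"},
    4: {"S": "F", "G": "g2", "N": "n1"},
    5: {"S": "F", "G": "g2", "N": "n2"},
    6: {"S": "M", "G": "g1", "N": "n1"},
}


def indiscernibility(table, B):
    """Return I(B): all pairs (x, y) that agree on every attribute in B."""
    return {
        (x, y)
        for x in table
        for y in table
        if all(table[x][a] == table[y][a] for a in B)
    }


def partition(table, B):
    """Return U/B: group tuples by their vector of values on B."""
    classes = defaultdict(set)
    for x, row in table.items():
        classes[tuple(row[a] for a in B)].add(x)
    return list(classes.values())


B = ("S", "G", "N")
relation = indiscernibility(table, B)
blocks = partition(table, B)
```

Because I(B) is symmetric, the relation built this way contains both (1,6) and (6,1) alongside the six reflexive pairs.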

Introduction to Rough Set Theory • Approximation: for a set of interest X ⊂ U we define • B-lower(X) = ∪x∈U {B(x): B(x) ⊆ X} • B-upper(X) = ∪x∈U {B(x): B(x) ∩ X ≠ ∅} • BNR_B(X) = B-upper(X) − B-lower(X) • For example: if X contains all tuples with high risk, X = {2,3,4,6}, then B-lower(X) = {2,3,4}, B-upper(X) = {1,2,3,4,6} and BNR_B(X) = {1,6}.
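The approximations can be checked with a few lines of Python. This is a minimal sketch that takes the partition U/B and the set X exactly as stated on the slides; the function names `lower`, `upper` and `boundary` are assumptions made for this illustration.

```python
# Partition U/B and target set X as given on the slides.
partition = [{1, 6}, {2}, {3}, {4}, {5}]
X = {2, 3, 4, 6}


def lower(partition, X):
    """Union of the elementary sets fully contained in X."""
    return set().union(*(b for b in partition if b <= X))


def upper(partition, X):
    """Union of the elementary sets that intersect X."""
    return set().union(*(b for b in partition if b & X))


def boundary(partition, X):
    """Upper approximation minus lower approximation."""
    return upper(partition, X) - lower(partition, X)


low = lower(partition, X)
up = upper(partition, X)
bnd = boundary(partition, X)
```

The block {1,6} lands in the boundary because tuple 6 is in X while tuple 1 is not, so the block is neither inside X nor disjoint from it.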

Introduction to Rough Set Theory Figure 1. Rough Set Concept, U= ∪{B1…B14}, B-lower(X) – Yellow Region, B-upper(X) – Yellow and Green Region BNR B (X) - Green Region

Variable Precision Rough Set Theory • Background information • Problem with rough sets: the B-lower approximation tends to be empty when uncertainty is widespread. • Solution: use a probability-based approach • Presented by Ziarko (1993), Yao and Wong (1992), Slezak and Ziarko (2002), etc. • Definitions (with P(X|Bi) = |X ∩ Bi| / |Bi| estimated from the data) • Lower limit l: satisfies 0 ≤ l < P(X) < 1 • l-negative region of X: NEG_l(X) = ∪{Bi: P(X|Bi) ≤ l} • Upper limit u: satisfies 0 < P(X) < u ≤ 1 • u-positive region of X: POS_u(X) = ∪{Bi: P(X|Bi) ≥ u} • (l,u)-boundary region of X: BNR_l,u(X) = ∪{Bi: l < P(X|Bi) < u}

Variable Precision Rough Set Theory • For example: for the data in Table 1, P(X) = 4/6 = 2/3 ≈ 0.67. If l = 0.25 and u = 0.75 then NEG_0.25(X) = {5}, POS_0.75(X) = {2,3,4}, BNR_0.25,0.75(X) = {1,6}. Table 2. Sample Decision Table DT_B,X(U) with P(X) = 0.67, l = 0.25, u = 0.75
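The three regions of this example can be reproduced with a short Python sketch. It assumes the conditional probability is estimated as the fraction of a block's tuples that fall in X, and it reuses the partition and X from the rough set example. The function name `vprs_regions` is an assumption made for this illustration.

```python
from fractions import Fraction

# Partition U/B and target set X as given on the slides.
partition = [{1, 6}, {2}, {3}, {4}, {5}]
X = {2, 3, 4, 6}


def vprs_regions(partition, X, l, u):
    """Assign each elementary set to POS, NEG or BNR by P(X | block)."""
    pos, neg, bnd = set(), set(), set()
    for block in partition:
        # Assumed estimate: share of the block's tuples that belong to X.
        p = Fraction(len(block & X), len(block))
        if p >= u:
            pos |= block
        elif p <= l:
            neg |= block
        else:
            bnd |= block
    return pos, neg, bnd


prior = Fraction(len(X), sum(len(b) for b in partition))  # P(X)
pos, neg, bnd = vprs_regions(partition, X, Fraction(1, 4), Fraction(3, 4))
```

Exact fractions are used instead of floats so that comparisons against l and u are not affected by rounding.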

Variable Precision Rough Set Theory Figure 2. VPRS Concept, U= ∪{B1…B17}, NEG(X) – White Region, POS(X) – Yellow Region BNR B (X) – Green Region

Linear Hierarchy of Decision Table Algorithm (Ziarko,2002) • Corresponds to Tree-structured Hierarchy of Decision Table Algorithm

Linear Hierarchy of Decision Table (HDTL) Algorithm • Advantage: the Linear Hierarchy of Decision Table algorithm effectively eliminates the exponential growth of the decision hierarchy size

Linear Hierarchy of Decision Table (HDTL) Algorithm (supervised approach)
Initialization
1. U ← U', C ← C', D ← D'
2. Compute POS_u(X) and NEG_l(X)
Iteration
3. repeat {
4.   while (POS_u(X) = EMPTY and NEG_l(X) = EMPTY) {
5.     C ← new(C, U); define new condition attributes
6.     Compute POS_u(X) and NEG_l(X) }
7.   Output DT_C,X(U); output decision table based on the union of the positive and negative regions
8.   if POS_u(X) ∪ NEG_l(X) = U then exit
9.   U ← U − (POS_u(X) ∪ NEG_l(X))
10.  C ← new(C, U); define new condition attributes
11.  D ← D|U; restrict decision attributes to the current set of data U
12.  Compute POS_u(X) and NEG_l(X) }
Notes • There is a problem where new condition attributes are defined: if defining them fails, the procedure should terminate, which the algorithm as written does not handle. • Removing the classified regions from U (step 9) embodies the linear approach: only the remaining boundary data is passed on to generate the subsequent layer.
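The loop can be sketched in Python under several assumptions. The slides do not say how new(C, U) chooses attributes, so here it is replaced by `attr_supply`, a caller-provided sequence of candidate attribute lists. The sketch also stops when that supply runs out, which is one possible way to address the termination problem noted above, not the paper's method. The names `regions`, `hdtl` and `in_x` and the sample rows are hypothetical.

```python
from collections import defaultdict


def regions(rows, attrs, in_x, l, u):
    """Return (POS, NEG, BNR) as sets of row ids for the given attributes."""
    classes = defaultdict(list)
    for i, row in rows.items():
        classes[tuple(row[a] for a in attrs)].append(i)
    pos, neg, bnd = set(), set(), set()
    for members in classes.values():
        p = sum(in_x(rows[i]) for i in members) / len(members)
        (pos if p >= u else neg if p <= l else bnd).update(members)
    return pos, neg, bnd


def hdtl(rows, first_attrs, attr_supply, in_x, l, u):
    """Build layers (attrs, POS, NEG); return them plus any unresolved ids."""
    universe, attrs, supply = dict(rows), first_attrs, iter(attr_supply)
    layers = []
    while universe:
        pos, neg, bnd = regions(universe, attrs, in_x, l, u)
        while not pos and not neg:          # steps 4-6
            attrs = next(supply, None)      # stand-in for new(C, U)
            if attrs is None:               # assumed stop rule, not in slides
                return layers, set(universe)
            pos, neg, bnd = regions(universe, attrs, in_x, l, u)
        layers.append((attrs, pos, neg))    # step 7: one decision table
        # Steps 8-9 and 11: keep only boundary rows; D is restricted with them.
        universe = {i: universe[i] for i in bnd}
        if not universe:
            break
        attrs = next(supply, None)          # step 10
        if attrs is None:
            return layers, set(universe)
    return layers, set()


# HYPOTHETICAL rows: "a" and "b" are condition attributes, "x" marks X.
rows = {
    1: {"a": 0, "b": 0, "x": True},
    2: {"a": 0, "b": 1, "x": False},
    3: {"a": 1, "b": 0, "x": True},
    4: {"a": 1, "b": 1, "x": True},
    5: {"a": 2, "b": 0, "x": False},
}
layers, unresolved = hdtl(rows, ["a"], [["b"]], lambda r: r["x"], 0.25, 0.75)
```

In this toy run the first layer classifies rows 3, 4 and 5 on attribute a, and only the boundary rows 1 and 2 are passed to the second layer, which separates them on attribute b. Each layer therefore works on a strictly smaller data set.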

How the data is prepared • Attributes • Sex, Date-of-birth, City-population, Number-of-convictions, Number-of-past-accidents and Has-accident-in-last-year • Data scale: about 29,000 records • Data normalization

Result interpretation • 5 test cycles, generating 5 first-layer decision tables and 3 second-layer decision tables. • A problem with the test results: in all the presented test cycles, the first-layer boundary sets each contain only one combination of attribute values. The generated decision table hierarchy is therefore no different from that of the Tree-structured Hierarchy of Decision Table algorithm in the first two layers. The author does not report any further investigation of boundary sets that contain more than one combination of attribute values.

Summary and Conclusion • Strong points • Provides a valuable alternative solution for rule finding and classification based on inaccurate data. • The HDTL algorithm also avoids the exponential growth of hierarchical data structures. • Weak point • The test results provided are incomplete. They are not strong enough to demonstrate the effectiveness and accuracy of the Linear Hierarchy of Decision Table algorithm.

References • Pawlak, Z., Decision Rules, Bayes Rule and Rough Sets, New Directions in Rough Sets, Data Mining, and Granular-Soft Computing, 7th International Workshop, RSFDGrC'99, Yamaguchi, Japan, November 1999, Proceedings, p. 1-9. • Ziarko, W., Incremental Learning with Hierarchies of Rough Decision Tables, Proceedings of North American Fuzzy Information Processing Society Conf. (NAFIPS04), Banff, Alberta (2004), p. 802-808.

Q & A • Thank you
