Focused Reducts

Presentation Transcript


  1. Focused Reducts - Janusz A. Starzyk and Dale Nelson

  2. What Do We Know? Major Assumption
     [Diagram labels: Real World, Sampled Data, Model]
     ASSUMPTION: This is ALL we know

  3. Problem Size Dilemma [figure labels: … 1024 … 1602]

  4. Rough Set Tutorial
     • Difference between rough sets and fuzzy sets
     • Labeling data
     • Remove duplicates/ambiguities
     • What is a core?
     • What is a reduct?

  5. Rough Sets vs Fuzzy Sets
     • Fuzzy sets - how gray is the pixel
     • Rough sets - how big is the pixel

  6. Example: Sample HRR Data

  7. Example: Label Data
     • Labeling can be different for different columns/attributes
     • Ranges can be different for different columns/attributes
     • Label 1: value < .25
     • Label 2: .25 <= value <= .45
     • Label 3: value > .45
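
The labeling rule on this slide can be sketched in a few lines of Python. This is only an illustration: the function names `label_value` and `label_signal` and the per-column `cuts` list are assumptions introduced here, and the 0.25 and 0.45 cut points are simply the example values from the slide.

```python
def label_value(value, low=0.25, high=0.45):
    """Map one range-bin value to label 1, 2, or 3 using two cut points."""
    if value < low:
        return 1
    if value <= high:
        return 2
    return 3


def label_signal(signal, cuts):
    """Label every bin of a signal; `cuts` holds one (low, high) pair per column,
    since the slide notes that ranges can differ between columns."""
    return [label_value(v, low, high) for v, (low, high) in zip(signal, cuts)]


# Hypothetical three-bin signal, using the slide's cut points for every column.
print(label_signal([0.10, 0.30, 0.90], [(0.25, 0.45)] * 3))
```

Keeping the cut points per column leaves room for a different labeling of each attribute.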

  8. Remove Ambiguities & Duplicates

  9. Equivalence Classes: E1 = {1, 2, 3}, E2 = {4, 5}, E3 = {6}, E4 = {7}, E5 = {8}

  10. Definitions
      • Reduct - A reduction of an information system, obtained by removing attributes (range bins), that results in no loss of information (classification ability). There may be one or many reducts for a given information system.
      • Core - The set of attributes (range bins) that are common to all reducts.

  11. Compute Core
      • Signals 6 and 8 are ambiguous upon removal of Range Bin 1; therefore, Range Bin 1 is part of the core.
      • Core - the range bins common to ALL reducts; the most essential range bins, without which signals cannot be classified.

  12. Compute Core: No ambiguous signals; therefore, Range Bin 2 is NOT part of the core.

  13. Compute Core: No ambiguous signals; therefore, Range Bin 3 is NOT part of the core.

  14. Compute Core: No ambiguous signals; therefore, Range Bin 4 is NOT part of the core.

  15. Compute Reducts: Range Bin 1 + Range Bin 2. Range Bins 1 and 2 classify all signals; therefore, they form a reduct.

  16. Compute Reducts: Range Bin 1 + Range Bin 3. Range Bins 1 and 3 do not classify all signals; therefore, they do NOT form a reduct.

  17. Compute Reducts: Range Bin 1 + Range Bin 4. Range Bins 1 and 4 classify all signals; therefore, they form a reduct.

  18. Reduct Summary
      • Range bins 1 and 2 are a reduct
        • Sufficient to classify all signals
      • Range bins 1 and 4 are a reduct
        • Sufficient to classify all signals
      • Range bins 1 and 3 are NOT a reduct
        • Cannot distinguish target classes 2 and 3
      • No need to try
        • Range bins 1, 2, 3
        • Range bins 1, 2, 4
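
The check behind this summary (add one bin to the core and see whether every signal can still be told apart) can be sketched as follows. The table is a hypothetical three-signal toy, not the HRR example from the slides; bins are numbered from 0, and `classifies` is a helper name introduced here.

```python
def classifies(rows, classes, bins):
    """True if no two signals from different classes share labels on `bins`."""
    seen = {}
    for row, cls in zip(rows, classes):
        key = tuple(row[b] for b in bins)
        # setdefault stores the first class seen for this label pattern.
        if seen.setdefault(key, cls) != cls:
            return False
    return True


# Hypothetical labeled table: 3 signals x 4 range bins, one class per signal.
rows = [(1, 1, 1, 1), (2, 1, 1, 1), (1, 2, 1, 2)]
classes = ["A", "B", "C"]
core = [0]  # assumed already computed for this toy table

core_alone_ok = classifies(rows, classes, core)
results = {b: classifies(rows, classes, core + [b])
           for b in range(4) if b not in core}
print(core_alone_ok, results)
```

In this toy table the core alone is not enough, bins {0, 1} and {0, 3} each classify every signal, and {0, 2} does not, which mirrors the pattern of the slide's summary. Supersets of a working pair need not be tried, since they would not be minimal.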

  19. Did You Notice?
      • Calculating a reduct is time consuming!
      • n = 29 gives a value of 536,870,911 (2^29 - 1)
      • We are interested in n ≥ 50
      • This is a BIG NUMBER, requiring a lot of time to compute a reduct, and that time is also a function of the number of signals
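
The value quoted for n = 29 equals 2^29 - 1, the number of non-empty subsets of 29 attributes. Reading it as the size of an exhaustive subset search is an interpretation made here, and `subset_count` is a hypothetical helper name; the short sketch below shows how quickly that count grows.

```python
def subset_count(n):
    """Number of non-empty subsets of n attributes."""
    return 2 ** n - 1


for n in (29, 50):
    # Thousands separators make the growth easier to read.
    print(n, f"{subset_count(n):,}")
```

Going from 29 to 50 attributes multiplies the count by roughly two million.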

  20. Why Haven’t Rough Sets Been Used Before?

  21. The Procedure
      • Normalize signal
      • Partition signal
        • Block
        • Interleave
      • Wavelet transform
      • Binary multi-class entropy labeling
      • Entropy based range bin selection
      • Determine minimal reducts
      • Fuse marginal reducts for classification

  22. Data
      • Synthetic, generated by XPATCH
      • Six targets
      • 1071 signals per target
      • 128 range bins per signal
      • Azimuth -25° to +25°
      • Elevation -20° to 0°

  23. Normalize the Data
      • Ensures all data is range normalized
      • Use the 2-norm
      • Divide each signal bin value by N
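
A minimal sketch of this step, assuming that N on the slide denotes the 2-norm (Euclidean length) of the signal. The function name `normalize` and the handling of an all-zero signal are choices made here, not taken from the slides.

```python
import math


def normalize(signal):
    """Divide every bin by the signal's 2-norm so the result has unit length."""
    norm = math.sqrt(sum(v * v for v in signal))
    if norm == 0.0:
        # An all-zero signal has no direction; return it unchanged.
        return list(signal)
    return [v / norm for v in signal]


print(normalize([3.0, 4.0]))
```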

  24. Partition the Signal: Block Partitioning [figure: a 128-bin signal split into contiguous blocks. 1 piece: 1-128. 2 pieces: 1-64, 65-128. 4 pieces: 1-32, 33-64, 65-96, 97-128. 8 pieces: 1-16, 17-32, 33-48, 49-64, 65-80, 81-96, 97-112, 113-128]
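
Block partitioning as drawn on this slide can be sketched with plain list slicing. `block_partition` is a name introduced here, and the sketch assumes the signal length is divisible by the number of pieces, as it is for 128 bins and 1, 2, 4, or 8 pieces.

```python
def block_partition(signal, pieces):
    """Split `signal` into `pieces` contiguous blocks of equal length."""
    if len(signal) % pieces != 0:
        raise ValueError("signal length must be divisible by the number of pieces")
    size = len(signal) // pieces
    return [signal[i * size:(i + 1) * size] for i in range(pieces)]


bins = list(range(1, 129))  # range bins numbered 1..128, as on the slide
for pieces in (1, 2, 4, 8):
    blocks = block_partition(bins, pieces)
    # Show the first and last bin number of every block.
    print(pieces, [(b[0], b[-1]) for b in blocks])
```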

  25. Partition the Signal: Interleave Partitioning [figure: a 128-bin signal (1-128) split into 1, 2, 4, and 8 interleaved pieces, labeled 1st through 8th]
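
One plausible reading of interleave partitioning is that piece k takes every p-th bin starting at bin k, where p is the number of pieces. That reading is an assumption, since the figure itself is not preserved; `interleave_partition` is a name introduced here.

```python
def interleave_partition(signal, pieces):
    """Piece k holds bins k, k + pieces, k + 2 * pieces, ... (0-based k)."""
    return [signal[k::pieces] for k in range(pieces)]


bins = list(range(1, 9))  # a short 8-bin stand-in for a 128-bin signal
print(interleave_partition(bins, 2))
print(interleave_partition(bins, 4))
```

Unlike block partitioning, every interleaved piece spans the full extent of the signal at a coarser sampling.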

  26. Why Use a Wavelet Transform?
      • Best wavelet: 50/60 signals classified!!
      • Original signal, best: 20/60 signals classified
      • Many features are better than the best from the original signal

  27. HRR Signal and Its Haar Transform
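
For readers unfamiliar with the Haar transform, here is a minimal sketch of a full multi-level transform on a signal whose length is a power of two. The orthonormal 1/sqrt(2) scaling and the full decomposition depth are assumptions made here; the slides do not say which variant was used.

```python
import math


def haar_transform(signal):
    """Full multi-level orthonormal Haar transform of a power-of-two-length signal."""
    out = [float(v) for v in signal]
    n = len(out)
    if n == 0 or n & (n - 1):
        raise ValueError("signal length must be a power of two")
    length = n
    while length > 1:
        half = length // 2
        # Pairwise scaled sums (approximation) and differences (detail).
        approx = [(out[2 * i] + out[2 * i + 1]) / math.sqrt(2) for i in range(half)]
        detail = [(out[2 * i] - out[2 * i + 1]) / math.sqrt(2) for i in range(half)]
        out[:length] = approx + detail
        length = half  # recurse on the approximation half only
    return out


print(haar_transform([4.0, 2.0, 5.0, 5.0]))
```

With this scaling the transform preserves the signal's energy, so a signal normalized to unit 2-norm keeps unit norm afterwards.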

  28. Multi-Class Information Entropy
      • Let xi be the range bin values across all signals for a target class
      • Define [equation not preserved] where [equation not preserved]
      • Without assuming any particular distribution, we can define the probability as: [equation not preserved]
      • Using this definition, we define two other probabilities
      • Then multi-class entropy is defined as: [equation not preserved]

  29. Binary Multi-Class Labeling

  30. Range Bin Selection
      • Total range bins available depends on partition size
      • We chose 50 bins per reduct
      • Time considerations
      • Implications
      • Based on maximum relative entropy

  31. Compute Core
      • Computation of the core is easy and fast
      • Eliminate one range bin at a time and check whether the training set becomes ambiguous; if it does, only that range bin can discriminate between the ambiguous signals
      • Accumulate the bins whose removal results in ambiguous data - that is the core
      • These range bins MUST be in every reduct
      • O(n) process
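
The core computation described on this slide can be sketched as a single pass over the range bins. The table below is a hypothetical three-signal toy with bins numbered from 0, and `is_ambiguous` and `compute_core` are names introduced here.

```python
def is_ambiguous(rows, classes, bins):
    """True if two signals from different classes agree on every bin in `bins`."""
    seen = {}
    for row, cls in zip(rows, classes):
        key = tuple(row[b] for b in bins)
        if seen.setdefault(key, cls) != cls:
            return True
    return False


def compute_core(rows, classes):
    """Collect every bin whose removal makes the training set ambiguous."""
    n_bins = len(rows[0])
    core = []
    for b in range(n_bins):
        remaining = [i for i in range(n_bins) if i != b]
        if is_ambiguous(rows, classes, remaining):
            core.append(b)
    return core


# Hypothetical labeled table: 3 signals x 4 range bins, one class per signal.
rows = [(1, 1, 1, 1), (2, 1, 1, 1), (1, 2, 1, 2)]
classes = ["A", "B", "C"]
print(compute_core(rows, classes))
```

The loop visits each bin once, which matches the O(n) count of ambiguity checks stated on the slide; each check itself still scans all signals.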

  32. Compute Minimal Reducts
      • To the core, add one range bin at a time and compute the number of ambiguities
      • Select the range bin(s) with the fewest ambiguities; there may be several, so save them all, as they will be used to compute the reducts
      • Add that range bin to the core and repeat the previous step until there are no ambiguities - this is a reduct
      • Calculate reducts for all bins with an equivalent number of ambiguities; this yields multiple reducts
      • O(n²) process
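
A minimal sketch of the greedy step on this slide, under two stated simplifications: the "number of ambiguities" is counted here as the number of signals whose label pattern also occurs with a different class, and ties are broken by taking the first best bin instead of branching to produce several reducts. The table is a hypothetical toy and all function names are introduced here.

```python
def count_ambiguities(rows, classes, bins):
    """Count signals whose labels on `bins` also occur with a different class."""
    groups = {}
    for row, cls in zip(rows, classes):
        groups.setdefault(tuple(row[b] for b in bins), []).append(cls)
    return sum(len(g) for g in groups.values() if len(set(g)) > 1)


def greedy_reduct(rows, classes, core):
    """Grow the core one bin at a time, always adding the bin that leaves
    the fewest ambiguities, until none remain."""
    chosen = list(core)
    candidates = [b for b in range(len(rows[0])) if b not in chosen]
    while candidates and count_ambiguities(rows, classes, chosen) > 0:
        best = min(candidates,
                   key=lambda b: count_ambiguities(rows, classes, chosen + [b]))
        chosen.append(best)
        candidates.remove(best)
    return chosen


# Hypothetical labeled table: 3 signals x 4 range bins (0-based), core = [0].
rows = [(1, 1, 1, 1), (2, 1, 1, 1), (1, 2, 1, 2)]
classes = ["A", "B", "C"]
print(greedy_reduct(rows, classes, [0]))
```

Each round evaluates at most n candidate bins and there are at most n rounds, which gives the quadratic number of ambiguity counts quoted on the slide.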

  33. Time Complexity [figure]. Need 50. Training set size: 50 to 400 attributes (range bins), 1602 signals. Test set size: 4823 signals.

  34. Fuzzy Rough Set Classification
      • Test signals may have a range bin value very close to a labeling division point
      • If this happens, we define a distance around the division point within which the value is considered to be in a "don't care" region
      • The classification process proceeds without the "don't care" range bin
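
The "don't care" idea can be sketched as follows, assuming binary labels with one division point per range bin. The name `eps` for the distance, the rule table, and both function names are hypothetical and introduced only for illustration.

```python
def fuzzy_labels(values, thresholds, eps):
    """Label each bin 0 or 1, or None when the value lies within `eps`
    of its division point (the "don't care" region)."""
    labels = []
    for v, t in zip(values, thresholds):
        if abs(v - t) <= eps:
            labels.append(None)
        else:
            labels.append(1 if v > t else 0)
    return labels


def matching_classes(labels, rules):
    """Return the classes of all rules that agree on every bin that is not None."""
    return {cls for pattern, cls in rules.items()
            if all(l is None or l == p for l, p in zip(labels, pattern))}


# Hypothetical rules from one reduct: label pattern -> target class.
rules = {(0, 0, 1): "A", (0, 1, 1): "A", (1, 0, 0): "B"}
labels = fuzzy_labels([0.10, 0.31, 0.80], [0.3, 0.3, 0.3], eps=0.02)
print(labels, matching_classes(labels, rules))
```

Here the middle bin sits too close to its division point to be trusted, so it is skipped and the remaining bins still single out class "A".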

  35. Weighting Formula Requirements
      • We desire the following for combining classifications:
        • All Pcc(s) = 0 → weight = 0
        • All Pcc(s) = 1 → weight = 1
        • Several low Pcc(s) → weight higher than any of the Pcc(s)
        • One high Pcc and several low Pcc(s) → weight higher than the highest Pcc

  36. Weighting Formula

  37. Fusing Marginal Reducts
      • Each signal is marked with the classification by each reduct, along with the reduct's performance (Pcc) on the training set
      • A weight is computed for each target class for each signal
      • A signal is assigned the target class with the highest weight
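
The fusion step can be sketched as below. The authors' weighting formula is not preserved in this transcript, so the sketch substitutes a different, well-known combination, w = 1 - Π(1 - Pcc_i), chosen only because it meets the four requirements listed on slide 35. It should not be read as the formula from the presentation, and all names are hypothetical.

```python
from collections import defaultdict


def combine_weight(pccs):
    """Stand-in weight (NOT the slide's formula): 1 minus the product of (1 - Pcc)."""
    remaining = 1.0
    for p in pccs:
        remaining *= (1.0 - p)
    return 1.0 - remaining


def fuse(votes):
    """`votes` holds one (predicted_class, reduct_pcc) pair per reduct.
    Returns the class with the highest combined weight and all weights."""
    by_class = defaultdict(list)
    for cls, pcc in votes:
        by_class[cls].append(pcc)
    weights = {cls: combine_weight(p) for cls, p in by_class.items()}
    return max(weights, key=weights.get), weights


# Hypothetical votes from three marginal reducts for one test signal.
winner, weights = fuse([("T1", 0.6), ("T2", 0.4), ("T2", 0.5)])
print(winner, weights)
```

In this example two weak reducts that agree on "T2" outweigh a single stronger reduct voting "T1", which is the behavior the requirements ask for.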

  38. Results - Training

  39. Results - Testing

  40. Conjectures
      • Robust in the presence of noise
        • Due to binary labeling
        • Due to fuzzification
      • Robust to signal registration
        • Due to binary labeling
        • Due to averaging effect of wavelets on interleaved partitions
        • Due to fuzzification

  41. Rough Set Theoretic HRR ATR - Summary
      APPLICATIONS
      • 1-D signals
      • HRR
      • LADAR vibration
      • Sonar
      • Medical
      • Stock market
      • Data mining
      BREAKTHROUGHS
      • Reduct (classifier) generation time from exponential to quadratic!
      • Fusion of marginal (poor performing) reducts
      • Wavelet transform aiding
      • Multi partition to increase number of range bins considered
      • Use of binary multi-class entropy labeling
      • Entropy based range bin selection
      • Performance within 1% of theoretic best
      • Max problem size increased by 2 orders of magnitude
      METHOD
      • Normalize signal
      • Partition signal
        • Block
        • Interleave
      • Wavelet transform
      • Binary multi-class entropy labeling
      • Entropy based range bin selection
      • Determine minimal reducts
      • Fuse marginal reducts for classification
      [Figure: TIME axis from 0 to 1, with curves labeled Exponential and Quadratic]

  42. Future Directions
      • Fuzz factor sensitivity study
      • Sensitivity to signal alignment
      • Sensitivity to noise
      • Iterated wavelet transform performance study
      • Effectiveness on air to ground targets
      • Other application areas
