Fuzzy-rough data mining

Presentation Transcript


Fuzzy-rough data mining

Richard Jensen

Advanced Reasoning Group

University of Aberystwyth

[email protected]

http://users.aber.ac.uk/rkj


Outline

  • Knowledge discovery process

  • Fuzzy-rough methods

    • Feature selection and extensions

    • Instance selection

    • Classification/prediction

    • Semi-supervised learning


Knowledge discovery

  • The process

  • The problem of too much data

    • Requires storage

    • Intractable for data mining algorithms

    • Noisy or irrelevant data is misleading/confounding


Feature Selection


Feature selection

  • Why dimensionality reduction/feature selection?

  • Growth of information - need to manage this effectively

  • Curse of dimensionality - a problem for machine learning and data mining

  • Data visualisation - graphing data

[Diagram: high-dimensional data is intractable for the processing system; dimensionality reduction yields low-dimensional data that the system can process.]


Why do it?

  • Case 1: We’re interested in features

    • We want to know which are relevant

    • If we fit a model, it should be interpretable

  • Case 2: We’re interested in prediction

    • Features are not interesting in themselves

    • We just want to build a good classifier (or other kind of predictor)


Feature selection process

  • Feature selection (FS) preserves data semantics by selecting rather than transforming

  • Subset generation: forwards, backwards, random…

  • Evaluation function: determines ‘goodness’ of subsets

  • Stopping criterion: decide when to stop subset search

[Diagram: feature set → subset generation → subset evaluation (subset suitability) → stopping criterion (continue or stop) → validation.]


Fuzzy-rough feature selection


Fuzzy-rough set theory

  • Problems:

    • Rough set methods (usually) require data discretization beforehand

    • Extensions, e.g. tolerance rough sets, require thresholds

    • Also no flexibility in approximations

      • E.g. objects either belong fully to the lower (or upper) approximation, or not at all


Fuzzy-rough sets

[Diagram: a rough set is generalised to a fuzzy-rough set, with the upper approximation defined via a t-norm and the lower approximation via an implicator.]

Fuzzy-rough feature selection

  • Based on fuzzy similarity

  • Lower/upper approximations (e.g. the standard definitions sketched below)
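The formulas on this slide are images in the original. As a hedged reconstruction, the standard definitions from the fuzzy-rough literature give the similarity along an attribute $a$, and the lower/upper approximations via an implicator $I$ and a t-norm $T$:

$$\mu_{R_a}(x,y) = \max\left(0,\; 1 - \frac{|a(x)-a(y)|}{a_{\max}-a_{\min}}\right)$$

$$\mu_{R_P \downarrow X}(x) = \inf_{y \in \mathbb{U}} I\big(\mu_{R_P}(x,y),\, \mu_X(y)\big) \qquad \mu_{R_P \uparrow X}(x) = \sup_{y \in \mathbb{U}} T\big(\mu_{R_P}(x,y),\, \mu_X(y)\big)$$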


FRFS: evaluation function

  • Fuzzy positive region #1

  • Fuzzy positive region #2 (weak)

  • Dependency function (see the sketch below)
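The positive-region and dependency formulas here are images in the original. Below is a minimal NumPy sketch of the definitions as commonly given in the fuzzy-rough literature, assuming a precomputed n×n fuzzy similarity matrix sim, crisp decision labels, and the Łukasiewicz implicator; all names are illustrative.

```python
import numpy as np

def fuzzy_lower(sim, class_mask):
    """Lower approximation: mu(x) = inf_y I(sim(x, y), X(y)),
    with the Lukasiewicz implicator I(a, b) = min(1, 1 - a + b)."""
    return np.min(np.minimum(1.0, 1.0 - sim + class_mask[None, :]), axis=1)

def dependency(sim, labels):
    """Dependency degree: mean membership in the fuzzy positive region,
    where mu_POS(x) is the max lower-approximation membership over classes."""
    pos = np.max([fuzzy_lower(sim, (labels == c).astype(float))
                  for c in np.unique(labels)], axis=0)
    return pos.mean()
```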


FRFS: finding reducts

  • Fuzzy-rough QuickReduct

    • Evaluation: use the dependency function (or other fuzzy-rough measure)

    • Generation: greedy hill-climbing

    • Stopping criterion: when maximal evaluation function is reached (or to degree α)
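A greedy hill-climbing sketch of fuzzy-rough QuickReduct, reusing dependency() from the sketch above; sim_for(subset), which would build the fuzzy similarity relation induced by a feature subset, is an assumed helper:

```python
def quickreduct(features, sim_for, labels, alpha=1.0):
    """Add the feature giving the largest rise in dependency until the
    full-set dependency is reached (to degree alpha) or nothing improves."""
    target = dependency(sim_for(features), labels) * alpha
    reduct, gamma = [], 0.0
    while gamma < target:
        candidates = {f: dependency(sim_for(reduct + [f]), labels)
                      for f in features if f not in reduct}
        f, score = max(candidates.items(), key=lambda kv: kv[1])
        if score <= gamma:      # no feature improves the measure: stop early
            break
        reduct.append(f)
        gamma = score
    return reduct
```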


FRFS

  • Other search methods

    • GAs, PSO, EDAs, Harmony Search, etc.

    • Backward elimination, plus-L minus-R, floating search, SAT, etc.

  • Other subset evaluations

    • Fuzzy boundary region

    • Fuzzy entropy

    • Fuzzy discernibility function


Ant-based FS


Boundary region

[Diagram: a set X approximated by equivalence classes [x]_B; the lower approximation lies inside X, the upper approximation covers everything possibly in X, and the boundary region is the difference between the two.]


FRFS: boundary region

  • Fuzzy lower and upper approximation define fuzzy boundary region

  • For each concept, minimise the boundary region

    • (also applicable to crisp RSFS)

  • Results seem to show this is a more informed heuristic (but more computationally complex)
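The slide's formulas are not in the transcript. As a hedged reconstruction, the fuzzy boundary region is usually taken to be the difference between upper and lower approximation memberships, and the heuristic then minimises its total size over all decision concepts:

$$\mu_{\mathrm{BND}_P(X)}(x) = \mu_{R_P \uparrow X}(x) - \mu_{R_P \downarrow X}(x), \qquad \text{minimise } \sum_{X \in \mathbb{U}/Q}\; \sum_{x \in \mathbb{U}} \mu_{\mathrm{BND}_P(X)}(x)$$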


Finding smallest reducts

  • Usually too expensive to search exhaustively for reducts with minimal cardinality

  • Reducts can be found via discernibility matrices through, e.g.:

    • Converting from CNF to DNF (expensive)

    • Hill-climbing search using clauses (non-optimal)

    • Other search methods - GAs etc (non-optimal)

  • SAT approach

    • Solve directly in SAT formulation

    • DPLL approach ensures optimal reducts


Fuzzy discernibility matrices

  • Extension of crisp approach

    • Previously, attributes had {0,1} membership to clauses

    • Now have membership in [0,1]

  • Fuzzy DMs can be used to find fuzzy-rough reducts


Formulation

  • Fuzzy satisfiability

  • In crisp SAT, a clause is fully satisfied if at least one variable in the clause has been set to true

  • For the fuzzy case, clauses may be satisfied to a certain degree depending on which variables have been assigned the value true
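One common formulation (a hedged reconstruction; the slide's own formula is an image): the degree to which an assignment $v$ satisfies a fuzzy clause $C$ is an s-norm over the variables set to true,

$$\mathrm{SAT}_v(C) = \underset{a:\; v(a)=\mathrm{true}}{\mathcal{S}}\; \mu_C(a)$$

For example, with clause memberships {a: 0.8, b: 0.3} and $\mathcal{S} = \max$, setting only b true satisfies the clause to degree 0.3, while setting a true satisfies it to degree 0.8.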


Example


DPLL algorithm


Experimentation: results


FRFS: issues

  • Problem – noise tolerance!


Vaguely quantified rough sets

Pawlak rough set:

  • y belongs to the lower approximation of A iff all elements of Ry belong to A

  • y belongs to the upper approximation of A iff at least one element of Ry belongs to A

VQRS:

  • y belongs to the lower approximation of A iff most elements of Ry belong to A

  • y belongs to the upper approximation of A iff at least some elements of Ry belong to A
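The vague quantifiers "most" and "some" are usually modelled as smooth fuzzy quantifiers. A sketch of the piecewise-quadratic quantifier Q_(alpha, beta) commonly paired with VQRS; the parameter pairs are typical choices, not taken from the slides:

```python
def quantifier(p, alpha, beta):
    """Smooth fuzzy quantifier Q_(alpha, beta): 0 below alpha, 1 above beta,
    quadratic S-shaped ramp in between. Q_(0.2, 1.0) models 'most' (lower
    approximation); Q_(0.1, 0.6) models 'some' (upper approximation)."""
    if p <= alpha:
        return 0.0
    if p >= beta:
        return 1.0
    if p <= (alpha + beta) / 2:
        return 2.0 * ((p - alpha) / (beta - alpha)) ** 2
    return 1.0 - 2.0 * ((p - beta) / (beta - alpha)) ** 2

# VQRS lower approximation of A at y: quantifier(|Ry ∩ A| / |Ry|, 0.2, 1.0)
```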


VQRS-based feature selection

  • Use the quantified lower approximation, positive region and dependency degree

    • Evaluation: the quantified dependency (can be crisp or fuzzy)

    • Generation: greedy hill-climbing

    • Stopping criterion: when the quantified positive region is maximal (or to degree α)

  • Should be more noise-tolerant, but is non-monotonic


Progress

[Diagram: qualitative data → rough set theory; quantitative data → fuzzy-rough set theory; noisy data → VQRS and fuzzy VPRS; noise tolerance with monotonicity → OWA-FRFS.]


More issues...

  • Problem #1: how to choose fuzzy similarity?

  • Problem #2: how to handle missing values?


Interval-valued FRFS

  • Answer #1: Model uncertainty in fuzzy similarity by interval-valued similarity

[Formulas: interval-valued fuzzy similarity and the resulting interval-valued fuzzy-rough set.]


Interval-valued FRFS

  • When comparing two object values for a given attribute – what to do if at least one is missing?

  • Answer #2: Model missing values via the unit interval
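A minimal sketch of this idea, with illustrative names: similarity involving a missing value is maximally uncertain, i.e. the whole unit interval, while known values get a degenerate (point) interval:

```python
def iv_similarity(x, y, value_range):
    """Interval-valued similarity of two attribute values; a missing value
    (None) yields the maximally uncertain interval [0, 1]."""
    if x is None or y is None:
        return (0.0, 1.0)
    s = max(0.0, 1.0 - abs(x - y) / value_range)  # crisp similarity
    return (s, s)                                  # degenerate interval
```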


Other measures

  • Boundary region

  • Discernibility function


Initial experimentation

[Diagram: the original dataset is split into cross-validation folds and corrupted with missing values; Type-1 FRFS and the IV-FRFS methods each produce reduced folds, which are then evaluated with JRip.]


Initial experimentation


Initial results: lower approx


Instance Selection


Instance selection: basic ideas

Remove objects that are not needed, i.e. those whose removal keeps the underlying approximations unchanged.


Instance selection: basic ideas

Remove noisy objects, i.e. those whose positive region membership is < 1 (a sketch of this rule follows below).
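A minimal sketch of this selection rule, reusing fuzzy_lower() and the numpy import from the earlier FRFS sketch; the threshold tau generalises the strict "membership < 1" test:

```python
def fris_select(sim, labels, tau=1.0):
    """Keep instances whose fuzzy positive-region membership is at least tau
    (tau = 1.0 removes exactly the objects with membership < 1)."""
    pos = np.max([fuzzy_lower(sim, (labels == c).astype(float))
                  for c in np.unique(labels)], axis=0)
    return np.where(pos >= tau)[0]   # indices of retained instances
```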


FRIS-I


FRIS-II


FRIS-III


Fuzzy-rough instance selection

  • Time complexity is a problem for FRIS-II and FRIS-III

  • Less complex: Fuzzy rough prototype selection

    • More on this later...


Fuzzy-rough classification and prediction


FRNN/VQNN


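The algorithm listings on these slides are images in the original. As a rough, illustrative sketch of the FRNN idea (classify a test object to the class whose lower and upper approximation memberships over its nearest neighbours are jointly highest; not the authors' exact code):

```python
def frnn_classify(x, train, labels, similarity, k=10):
    """FRNN sketch: choose the class C maximising
    (lower(x, C) + upper(x, C)) / 2 over the k nearest neighbours."""
    sims = np.array([similarity(x, t) for t in train])
    nn = np.argsort(-sims)[:k]                 # k most similar training objects
    best, best_score = None, -1.0
    for c in np.unique(labels):
        member = (labels[nn] == c).astype(float)
        lower = np.min(np.minimum(1.0, 1.0 - sims[nn] + member))  # inf I(., .)
        upper = np.max(np.minimum(sims[nn], member))              # sup T(., .)
        if (lower + upper) / 2 > best_score:
            best, best_score = c, (lower + upper) / 2
    return best
```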


Further developments

  • FRNN and VQNN have limitations (for classification problems)

    • FRNN only uses one neighbour

    • VQNN equivalent to FNN if the same similarity relation is used

  • POSNN uses the positive region to also consider the quality of neighbours

    • E.g. instances in overlapping class regions are less interesting

    • More on this later...


Discovering rules via RST

  • Equivalence classes

    • Form the antecedent part of a rule

    • The lower approximation tells us if this is predictive of a given concept (certain rules)

  • Typically done in one of two ways:

    • Overlaying reducts

    • Building rules by considering individual equivalence classes (e.g. LEM2)


QuickRules framework

  • The fuzzy tolerance classes used during this process can be used to create fuzzy rules

  • When a reduct is found the resulting rules cover all instances

[Diagram: the feature selection loop from before, with rule induction folded in: feature set → subset generation → subset evaluation and rule induction → stopping criterion (continue or stop) → validation.]


Harmony search approach

  • R. Diao and Q. Shen. A harmony search based approach to hybrid fuzzy-rough rule induction, Proceedings of the 21st International Conference on Fuzzy Systems, 2012.


Harmony search approach

[Diagram: musicians improvise notes; candidate harmonies are stored in the harmony memory and scored for fitness.]

Minimise (a − 2)^2 + (b − 3)^4 + (c − 1)^2 + 3
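A compact, illustrative harmony search for this toy objective; the parameter values are common defaults, not taken from the slides:

```python
import random

def harmony_search(f, dim=3, bounds=(-5.0, 5.0), hms=10,
                   hmcr=0.9, par=0.3, iters=2000):
    """Basic harmony search: hms = harmony memory size, hmcr = memory
    consideration rate, par = pitch adjustment rate."""
    lo, hi = bounds
    memory = [[random.uniform(lo, hi) for _ in range(dim)] for _ in range(hms)]
    for _ in range(iters):
        new = []
        for d in range(dim):
            if random.random() < hmcr:             # reuse a note from memory
                v = random.choice(memory)[d]
                if random.random() < par:          # small pitch adjustment
                    v = min(hi, max(lo, v + random.uniform(-0.1, 0.1)))
            else:                                  # improvise a new note
                v = random.uniform(lo, hi)
            new.append(v)
        worst = max(range(hms), key=lambda i: f(memory[i]))
        if f(new) < f(memory[worst]):              # replace the worst harmony
            memory[worst] = new
    return min(memory, key=f)

# The slide's toy objective; the optimum is (2, 3, 1) with value 3.
print(harmony_search(lambda v: (v[0]-2)**2 + (v[1]-3)**4 + (v[2]-1)**2 + 3))
```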


Key notion mapping

  Harmony Search | Hybrid Rule Induction | Numerical Optimisation
  Musician       | Fuzzy rule r_x        | Variable
  Note           | Feature subset        | Value
  Harmony        | Rule set              | Solution
  Fitness        | Combined evaluation   | Evaluation


Comparison vs. QuickRules

  HarmonyRules: 56.33 ± 10.00

  QuickRules: 63.1 ± 11.89

[Figure: rule cardinality distribution for the 'web' dataset (2556 features).]


Fuzzy-rough semi-supervised learning


Semi-supervised learning (SSL)

  • Lies somewhere between supervised and unsupervised learning

  • Why use it?

    • Data is expensive to label/classify

    • Labels can also be difficult to obtain

    • Large amounts of unlabelled data available

  • When is SSL useful?

    • Small number of labelled objects but large number of unlabelled objects


Semi-supervised learning

  • A number of methods for SSL – self-learning, generative models, etc.

    • Labelled data objects – usually small in number

    • Unlabelled data objects – usually large in number

    • A set of features describe the objects

    • Class labels tell us only which classes the labelled objects belong to

  • SSL therefore attempts to learn labels (or structure) for data which has no labels

    • Labelled data provides ‘clues’ for the unlabelled data


Co-training

[Diagram: co-training. The labelled dataset is split into two feature subsets; learner 1 and learner 2 are trained on them, and each learner's predictions on the unlabelled data augment the other's training set.]


Self-learning

[Diagram: self-learning. A learner is trained on the labelled dataset, makes predictions on the unlabelled data, and the predicted objects are added to the labelled data.]


Fuzzy-rough self-learning (FRSL)

  • Basic idea is to propagate labels using the upper and lower approximations

    • Label only those objects which belong to the lower approximation of a class to a high degree

    • Can use upper approximation to decide on ties

  • Attempts to minimise mis-labelling and subsequent reinforcement

  • Paper: N. Mac Parthalain and R. Jensen. Fuzzy-Rough Set based Semi-Supervised Learning. Proceedings of the 20th International Conference on Fuzzy Systems (FUZZ-IEEE’11), pp. 2465-2471, 2011.
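A schematic sketch of the labelling rule described above, not the authors' implementation: lower-approximation memberships are computed against the currently labelled objects (reusing the numpy import and Łukasiewicz implicator from the earlier sketches), and only confidently placed objects are labelled; the threshold is illustrative.

```python
def frsl_step(sim, y, labelled, unlabelled, threshold=0.9):
    """One self-labelling pass: an unlabelled object receives the class in
    whose lower approximation it sits to a high degree; ties could be
    broken using the upper approximation."""
    new = {}
    for i in unlabelled:
        scores = {}
        for c in np.unique(y[labelled]):
            member = (y[labelled] == c).astype(float)
            # lower-approximation membership of object i in class c
            scores[c] = np.min(np.minimum(1.0, 1.0 - sim[i, labelled] + member))
        c, s = max(scores.items(), key=lambda kv: kv[1])
        if s >= threshold:         # label only confidently placed objects
            new[i] = c
    return new
```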


FRSL

[Diagram: FRSL loop. The fuzzy-rough learner predicts labels for the unlabelled data; an object is moved into the labelled dataset only if its lower approximation membership is 1, otherwise it stays unlabelled for the next pass.]


Experimentation (Problem 1)


[Figures: results for SS-FCM, FNN, and FRSL on problem 1.]


Experimentation (Problem 2)


[Figures: results for SS-FCM, FNN, and FRSL on problem 2.]


Conclusion

  • Looked at fuzzy-rough methods for data mining

    • Feature selection, finding optimal reducts

    • Handling missing values and other problems

    • Classification/prediction

    • Instance selection

    • Semi-supervised learning

  • Future work

    • Imputation, better rule induction and instance selection methods, more semi-supervised methods, optimizations, instance/feature weighting


FR methods in Weka

  • Weka implementations of all fuzzy-rough methods can be downloaded from: http://users.aber.ac.uk/rkj/book/wekafull.jar

  • KEEL version available soon (hopefully!)

