Soft Computing Methods

J.A. Johnson

Dept. of Math and Computer Science Seminar Series

February 8, 2013


Outline

  • Fuzzy Sets

  • Neural Nets

  • Rough Sets

  • Bayesian Nets

  • Genetic Algorithms


Fuzzy sets

  • Fuzzy set theory is a means of specifying how well an object satisfies a vague description.

  • A fuzzy set can be defined as a set with fuzzy boundaries.

  • Fuzzy sets were first introduced by Zadeh (1965).


How do we represent a fuzzy set in a computer?

First, the membership function must be determined.


Example

  • Consider the proposition "Nate is tall."

  • Is the proposition true if Nate is 5' 10"?

  • The linguistic term "tall" does not refer to a sharp demarcation of objects into two classes—there are degrees of tallness.


Fuzzy set theory treats Tall as a fuzzy predicate and says that the truth value of Tall(Nate) is a number between 0 and 1, rather than being either true or false.


Let A denote the fuzzy set of all tall employees and x be a member of the universe X of all employees. What would the function μA(x) look like?


  • μA(x) = 1 if x is definitely tall

  • μA(x) = 0 if x is definitely not tall

  • 0 < μA(x) < 1 for borderline cases
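
For example, a minimal Python sketch of such a membership function for "tall"; the 60-inch and 76-inch breakpoints are illustrative assumptions, not values from the slides:

    def mu_tall(height_inches):
        # Piecewise-linear membership in the fuzzy set "tall":
        # definitely not tall at or below 60 in, definitely tall at or above 76 in
        # (illustrative breakpoints).
        if height_inches <= 60:
            return 0.0
        if height_inches >= 76:
            return 1.0
        return (height_inches - 60) / 16.0

    # Nate at 5' 10": mu_tall(70) == 0.625, a borderline-to-tall degree.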


[Figure: membership in a classical (crisp) set vs. a fuzzy set.]


Standard Fuzzy set operations

  • Complement: cA(x) = 1 − A(x)

  • Intersection: (A ∩ B)(x) = min[A(x), B(x)]

  • Union: (A ∪ B)(x) = max[A(x), B(x)]
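
Evaluated pointwise at a given x, each standard operation is a one-liner; a minimal Python sketch (function names are illustrative):

    def complement(a):
        # cA(x) = 1 - A(x)
        return 1.0 - a

    def intersection(a, b):
        # (A ∩ B)(x) = min[A(x), B(x)]
        return min(a, b)

    def union(a, b):
        # (A ∪ B)(x) = max[A(x), B(x)]
        return max(a, b)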


Linguistic variables and hedges

  • The range of possible values of a linguistic variable represents the universe of discourse of that variable.

  • A linguistic variable carries with it the concept of fuzzy set qualifiers, called hedges. Hedges are terms that modify the shape of fuzzy sets.


  • For instance, the qualifier "very" performs concentration and creates a new subset (very, extremely).

  • An operation opposite to concentration is dilation, which expands the set (more or less, somewhat).
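
A sketch of the two hedges using the common textbook definitions (squaring for concentration, square root for dilation):

    def very(a):
        # Concentration: squaring pushes intermediate degrees toward 0.
        return a ** 2

    def more_or_less(a):
        # Dilation: the square root pulls intermediate degrees toward 1.
        return a ** 0.5

    # With mu_tall(70) == 0.625: very tall -> 0.391, more or less tall -> 0.791.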


Representation of hedges

[Table: each hedge (very, extremely, more or less, somewhat) with its mathematical expression and graphical representation.]


  • Fuzzy logic is not logic that is fuzzy, but logic that is used to describe fuzziness.

  • Fuzzy logic deals with degrees of truth.


Building a Fuzzy Expert System

  • Specify the problem and define linguistic variables.

  • Determine fuzzy sets.

  • Elicit and construct fuzzy rules.

  • Perform fuzzy inference.

  • Evaluate and tune the system.


References

[1] Michael Negnevitsky, Artificial Intelligence: A Guide to Intelligent Systems, 2nd Edition.

[2] Witold Pedrycz and Fernando Gomide, An Introduction to Fuzzy Sets.

[3] George J. Klir and Bo Yuan, Fuzzy Sets and Fuzzy Logic: Theory and Applications.

[4] W. B. Vasantha Kandasamy, Elementary Fuzzy Matrix Theory and Fuzzy Models for Social Scientists.

[5] Wikipedia: http://en.wikipedia.org/wiki/Fuzzy_logic

[6] Wikipedia: http://en.wikipedia.org/wiki/Fuzzy


References

  • http://www.softcomputing.net/fuzzy_chapter.pdf

  • http://www.cs.cmu.edu/Groups/AI/html/faqs/ai/fuzzy/part1/faq-doc-18.html

  • http://www.mv.helsinki.fi/home/niskanen/zimmermann_review.pdf

  • http://sawaal.ibibo.com/computers-and-technology/what-limits-fuzzy-logic-241157.html

  • http://my.safaribooksonline.com/book/software-engineering-and-development/9780763776473/fuzzy-logic/limitations_of_fuzzy_systems#X2ludGVybmFsX0ZsYXNoUmVhZGVyP3htbGlkPTk3ODA3NjM3NzY0NzMvMTUy


Thanks to

  • Ding Xu

  • Edwige Nounang Ngnadjo

    For help with researching content and preparation of overheads on Fuzzy Sets


Artificial Neural Networks

Neuron: the basic information-processing unit


Single neural network

basic information-processing units


Single neural network


Activation Functions

  • The step and sign activation functions, also called hard-limit functions, are mostly used in decision-making neurons.

  • The sigmoid function transforms the input, which can have any value between plus and minus infinity, into a reasonable value in the range between 0 and 1. Neurons with this function are used in back-propagation networks.

  • The linear activation function provides an output equal to the neuron's weighted input. Neurons with the linear function are often used for linear approximation.
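
A Python sketch of the four activation functions described above (thresholding at zero is an illustrative choice):

    import math

    def step(x):
        # Hard limiter: output 1 once the input reaches the threshold.
        return 1 if x >= 0 else 0

    def sign(x):
        # Sign hard limiter: output +1 or -1.
        return 1 if x >= 0 else -1

    def sigmoid(x):
        # Maps any input in (-inf, +inf) into the range (0, 1).
        return 1.0 / (1.0 + math.exp(-x))

    def linear(x):
        # Output equals the weighted input itself.
        return x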


How the machine learns: Perceptron (Neuron + Weight training)


The Algorithm of a Single Neural Network

  • Step 1: Initialization

    Set initial weights w1, w2, ..., wn and threshold θ to random numbers in the range [−0.5, 0.5].

  • Step 2: Activation

  • Step 3: Weight training

  • Step 4: Iteration

    Increase iteration p by one, go back to Step 2 and repeat the process until convergence.
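
A runnable sketch of the four steps for a single perceptron; the weight-update details are not spelled out above, so the standard perceptron rule wi ← wi + α·xi·e is assumed, and all names and defaults are illustrative:

    import random

    def train_perceptron(examples, n_inputs, rate=0.1, max_epochs=100):
        # Step 1: initialization - weights and threshold in [-0.5, 0.5].
        w = [random.uniform(-0.5, 0.5) for _ in range(n_inputs)]
        theta = random.uniform(-0.5, 0.5)
        for _ in range(max_epochs):                  # Step 4: iteration
            converged = True
            for inputs, desired in examples:
                # Step 2: activation - hard-limited weighted sum minus threshold.
                total = sum(wi * xi for wi, xi in zip(w, inputs)) - theta
                actual = 1 if total >= 0 else 0
                error = desired - actual
                if error != 0:
                    converged = False
                    # Step 3: weight training - nudge each weight toward the target.
                    w = [wi + rate * xi * error for wi, xi in zip(w, inputs)]
            if converged:
                break
        return w, theta

    # e.g. learning AND: train_perceptron([((0,0),0), ((0,1),0), ((1,0),0), ((1,1),1)], 2)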


How the machine learns

Weight training


The design of my program


The design of my program


The result


Problem


Multilayer neural network


References

1. http://pages.cs.wisc.edu/~bolo/shipyard/neural/local.html

2. Stuart J. Russell and Peter Norvig. Artificial Intelligence: A Modern Approach. Prentice Hall, 2009.

3. http://www.roguewave.com/Portals/0/products/imsl-numerical-libraries/c-library/docs/6.0/stat/default.htm?turl=multilayerfeedforwardneuralnetworks.htm

4. Lynne E. Parker. Notes on Multilayer, Feedforward Neural Networks.

5. http://www.doc.ic.ac.uk/~nd/surprise_96/journal/vol4/cs11/report.html#Why use neural networks


Thanks to

  • Hongming (Homer) Zuo

  • Danni Ren

    For help with researching content and preparation of overheads on Neural Nets


Rough Sets

  • Introduced by Zdzislaw Pawlak in the early 1980s.

  • A formal framework for the automated transformation of data into knowledge.

  • Simplifies the search for dominant attributes in an inconsistent information table, leading to the derivation of shorter if-then rules.


Inconsistent Information Table


Certain rules for examples are:

(Temperature, normal) → (Flu, no),

(Headache, yes) and (Temperature, high) → (Flu, yes),

(Headache, yes) and (Temperature, very_high) → (Flu, yes).

Uncertain (or possible) rules are:

(Headache, no) → (Flu, no),

(Temperature, high) → (Flu, yes),

(Temperature, very_high) → (Flu, yes).


Strength of a Rule

  • Weights

    • Coverage = (# elements covered by rule) / (# elements in universe)

    • Support = (# positive elements covered by rule) / (# elements in universe)

    • Degree of certainty = (support × 100) / coverage
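
These weights are straightforward to compute from counts; a minimal sketch (names are illustrative):

    def rule_strength(n_covered, n_positive, n_universe):
        # n_covered: elements matched by the rule's condition;
        # n_positive: matched elements whose decision agrees with the rule.
        coverage = n_covered / n_universe
        support = n_positive / n_universe
        certainty = support * 100 / coverage  # percent of covered elements classified correctly
        return coverage, support, certainty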


Attribute Reduction

  • Which are the dominant attributes?

  • How do we determine redundant attributes?


Indiscernibility Classes

  • An indiscernibility class, with respect to a set of attributes X, is defined as a set of examples all of whose values for the attributes x ∈ X agree.

  • For example, the indiscernibility classes with respect to attributes X = {Headache, Temperature} are {e1}, {e2}, {e3}, {e4}, {e5, e7} and {e6, e8}
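
A minimal sketch of computing these classes, assuming (for illustration) that each example is stored as a dict of attribute values keyed by example name:

    from collections import defaultdict

    def indiscernibility_classes(examples, attrs):
        # Group example names whose values agree on every attribute in attrs.
        groups = defaultdict(set)
        for name, values in examples.items():
            groups[tuple(values[a] for a in attrs)].add(name)
        return list(groups.values())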


A rough set is defined by a lower approximation and an upper approximation.

The lower approximation of X is the union of the indiscernibility classes entirely contained in X:

    lower(X) = ∪ { [x] : [x] ⊆ X }

The upper approximation of X is the union of the indiscernibility classes that intersect X:

    upper(X) = ∪ { [x] : [x] ∩ X ≠ Ø }


[Figure: lower and upper approximations of set X. The lower approximation lies entirely inside X and the upper approximation contains X, with indiscernibility classes e1-e8 positioned relative to both.]


If the indiscernibility classes with and without attribute A are identical, then attribute A is redundant.


Inconsistent Information Table


Inconsistent Information Table


Set X


Example: Identifying Edible Mushrooms with the ILA Algorithm


Mushrooms


Mushroom Dataset

The dataset contains 8124 entries of different mushrooms.

Each entry (mushroom) has 22 different attributes.


22 different attributes

Cap-shape

Cap-surface

Cap-color

Bruises

Odor

Gill-attachment

Gill-spacing

Gill-size

Gill-color

Stalk-shape

Stalk-root

Stalk-surface-above-ring

Stalk-surface-below-ring

Stalk-color-above-ring

Stalk-color-below-ring

Veil-type

Veil-color

Ring-number

Ring-type

Spore-print-color

Population

Habitat


Soft Values for Attributes

One of the attributes chosen is odor.

All the possible values are:

almond

anise

creosote

fishy

foul

musty

none

pungent

spicy


Example of the dataset

  • p,x,s,n,t,p,f,c,n,k,e,e,s,s,w,w,p,w,o,p,k,s,u

  • e,x,s,y,t,a,f,c,b,k,e,c,s,s,w,w,p,w,o,p,n,n,g

  • e,b,s,w,t,l,f,c,b,n,e,c,s,s,w,w,p,w,o,p,n,n,m

  • p,x,y,w,t,p,f,c,n,n,e,e,s,s,w,w,p,w,o,p,k,s,u

  • e,x,s,g,f,n,f,w,b,k,t,e,s,s,w,w,p,w,o,e,n,a,g


ILA Algorithm

  • Invented by Mehmet R. Tolun and Saleh M. Abu-Soud

  • Used for data mining

  • Runs in a stepwise forward iteration

  • Searches for a description that covers a relatively large number of examples

  • Outputs IF-THEN rules


General Requirements:

  • Examples are listed in tabular form, where each row corresponds to an example and each column contains attribute values.

  • A set of m training examples, each composed of k attributes and a class attribute with n possible decisions.

  • A rule set R with an initial value of Ø.

  • All rows in the table are initially unmarked.


ILA Algorithm Steps

  • Step 1: Partition the table containing m examples into n sub-tables, one for each possible value of the class attribute.

    (Steps 2 through 8 are repeated for each sub-table.)

  • Step 2: Initialize the attribute combination count j to 1.

  • Step 3: For the sub-table under consideration, divide the attribute list into distinct combinations, each with j distinct attributes.


ILA Algorithm Steps

  • Step 4: For each combination of attributes, count the number of occurrences of attribute values that appear under the same combination in unmarked rows of the sub-table under consideration, but that do not appear under the same combination of attributes in any other sub-table. Call the first combination with the maximum number of occurrences the max-combination.


ILA Algorithm Steps

  • Step 5: If max-combination = Ø, increase j by 1 and go to Step 3.

  • Step 6: Mark all rows of the sub-table under consideration in which the values of max-combination appear as classified.

  • Step 7: Add a rule to R whose left-hand side comprises the attribute names of max-combination with their values, separated by AND operator(s), and whose right-hand side contains the decision attribute value associated with the sub-table.

  • Step 8: If all rows are marked as classified, move on to the next sub-table and go to Step 3. Otherwise (i.e., if there are still unmarked rows), go to Step 4. If no sub-tables remain, exit with the set of rules obtained so far.
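
A condensed Python sketch of Steps 1-8; the row/attribute representation is an assumption, and tie-breaking and ordering details of the published algorithm are glossed over:

    from itertools import combinations

    def ila(rows, attrs, class_attr):
        rules = []
        for c in {r[class_attr] for r in rows}:            # Step 1: one sub-table per class
            sub = [r for r in rows if r[class_attr] == c]
            others = [r for r in rows if r[class_attr] != c]
            unmarked, j = list(range(len(sub))), 1         # Step 2
            while unmarked and j <= len(attrs):
                best, best_n = None, 0
                for combo in combinations(attrs, j):       # Step 3: j-attribute combinations
                    counts = {}
                    for i in unmarked:                     # Step 4: count value combinations
                        key = tuple(sub[i][a] for a in combo)    # unique to this sub-table
                        if any(tuple(o[a] for a in combo) == key for o in others):
                            continue
                        counts[key] = counts.get(key, 0) + 1
                    for key, n in counts.items():
                        if n > best_n:
                            best, best_n = (combo, key), n
                if best is None:
                    j += 1                                 # Step 5: no max-combination, enlarge j
                    continue
                combo, key = best
                unmarked = [i for i in unmarked            # Step 6: mark covered rows
                            if tuple(sub[i][a] for a in combo) != key]
                rules.append((dict(zip(combo, key)), c))   # Step 7: emit an IF-THEN rule
        return rules                                       # Step 8: loop until all rows marked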


Output of the ILA Algorithm

25 Rules (first 12 rules):

If stalk-color-above-ring=gray then edible.

If odor=almond then edible.

If odor=anise then edible.

If population=abundant then edible.

If stalk-color-below-ring=gray then edible.

If habitat=waste then edible.

If stalk-color-above-ring=orange then edible.

If population=numerous then edible.

If ring-type=flaring then edible.

If cap-shape=sunken then edible.

If spore-print-color=black and odor=none then edible.

If spore-print-color=brown and odor=none then edible.

Rule   TP    FN   Error
1      576   0    0.0
2      400   0    0.0
3      400   0    0.0
4      384   0    0.0
5      384   0    0.0
6      192   0    0.0
7      192   0    0.0
8      144   0    0.0
9      48    0    0.0
10     32    0    0.0
11     608   0    0.0
12     608   0    0.0



Output of the ILA Algorithm (continued)

25 Rules (remaining 13 rules):

If stalk-color-below-ring=brown and gill-spacing=crowded then edible.

If spore-print-color=white and ring-number=two then edible.

If odor=foul then poisonous.

If gill-color=buff then poisonous.

If odor=pungent then poisonous.

If odor=creosote then poisonous.

If spore-print-color=green then poisonous.

If odor=musty then poisonous.

If stalk-color-below-ring=yellow then poisonous.

If cap-surface=grooves then poisonous.

If cap-shape=conical then poisonous.

If stalk-surface-above-ring=silky and gill-spacing=close then poisonous.

If population=clustered and cap-color=white then poisonous.

Rule   TP     FN   Error
13     48     0    0.0
14     192    0    0.0
15     2160   0    0.0
16     1152   0    0.0
17     256    0    0.0
18     192    0    0.0
19     72     0    0.0
20     36     0    0.0
21     24     0    0.0
22     4      0    0.0
23     1      0    0.0
24     16     0    0.0
25     3      0    0.0



Introduction to Bayesian Networks

  • A probabilistic graphical model that represents a set of random variables and their conditional dependencies via a directed acyclic graph (DAG).

  • Nodes that are not connected represent variables that are conditionally independent of each other.


Introduction to Bayesian Networks

  • Each node is associated with a probability function that takes as input a particular set of values for the node's parent variables and gives the probability of the variable represented by the node.

  • If the parents are m Boolean variables, then the probability function could be represented by a table of 2^m entries, one entry for each of the 2^m possible combinations of its parents being true or false.


Example

  • Suppose there are two events which could cause grass to be wet: either the sprinkler is on or it's raining. Also, suppose that rain has a direct effect on the use of the sprinkler (namely, that when it rains, the sprinkler is usually not turned on). Then the situation can be modeled with a Bayesian network. All three variables have two possible values, T (for true) and F (for false).


The joint probability function is:

P(G,S,R) = P(G | S,R)P(S | R)P(R)

where the names of the variables have been abbreviated to G = Grass wet, S = Sprinkler, and R = Rain.


  • The model can answer questions like "What is the probability that it is raining, given the grass is wet?"

  • By using the conditional probability formula and summing over all nuisance variables:
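
A minimal enumeration sketch of this query. The CPT values below are the ones commonly used with this textbook example; they are an assumption here, since the slides' figure is not reproduced:

    from itertools import product

    P_R = {True: 0.2, False: 0.8}                           # P(Rain)  (assumed CPT values)
    P_S = {True: {True: 0.01, False: 0.99},                 # P(Sprinkler | Rain=T)
           False: {True: 0.4, False: 0.6}}                  # P(Sprinkler | Rain=F)
    P_G = {(True, True): 0.99, (True, False): 0.9,          # P(Grass wet = T | S, R)
           (False, True): 0.8, (False, False): 0.0}

    def joint(g, s, r):
        # P(G, S, R) = P(G | S, R) P(S | R) P(R)
        pg = P_G[(s, r)] if g else 1.0 - P_G[(s, r)]
        return pg * P_S[r][s] * P_R[r]

    # P(R=T | G=T): sum out the nuisance variable S, then normalize over R.
    num = sum(joint(True, s, True) for s in (True, False))
    den = sum(joint(True, s, r) for s, r in product((True, False), repeat=2))
    print(num / den)   # ≈ 0.3577 with these CPTs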


Example (continued)


Applications

  • Biology and bioinformatics (gene regulatory networks, protein structure, gene expression analysis).

  • Medicine.

  • Document classification.

  • Information retrieval.

  • Image processing.

  • Data fusion.

  • Decision support systems.

  • Engineering.

  • Gaming.

  • Law.


Reference

[1] "Bayesian Probability Theory" in George F. Luger, William A. Stubbleeld, "Artificial Intelligence: Structures and Strategies for Complex Problem Solving", Second Edition, The Benjamin/Cummings Publishing Company, Inc., ISBN 0-8053-4780-1.

[2] "Bayesian Reasoning" in Michael Negnevitsky, "Artificial Intelligence: A Guide to Intelligent Systems", Third Edition, Pearson Education Limited, ISBN 978-1-4082-2574-5.

[3] "Bayesian Network" in http://en.wikipedia.org/wiki/Bayesian_network.

[4] "Probabilistic Graphical Model" in http://en.wikipedia.org/wiki/Graphical_model.

[5] "Random Variables" in http://en.wikipedia.org/wiki/Random_variables.

[6] "Conditional Independence" in http://en.wikipedia.org/wiki/Conditional_independence.


Reference

[7] "Directed Acyclic Graph" in http://en.wikipedia.org/wiki/Directed_acyclic_graph.

[8] "Inference" in http://en.wikipedia.org/wiki/Inference.

[9] "Machine Learning" in http://en.wikipedia.org/wiki/Machine_learning.

[10] "History" in http://en.wikipedia.org/wiki/Bayesian_network.

[11] "Example" in http://en.wikipedia.org/wiki/Bayesian_network.

[12] "Applications" in http://en.wikipedia.org/wiki/Bayesian_network.

[13] "A simple Bayesian Network" figure in http://en.wikipedia.org/wiki/File:SimpleBayesNet.svg.


Reference

[14] "Representation" in http://www.cs.ubc.ca/ murphyk/Bayes/bnintro.html#repr.

[15] "Conditional Independence in Bayes Nets" in http://www.cs.ubc.ca/ murphyk/Bayes/bnintro.html#repr.

[16] "Representation Example" figure in http://www.cs.ubc.ca/ murphyk/Bayes/bnintro.html#repr.

[17] "Conditional Independence" figure in http://www.cs.ubc.ca/ murphyk/Bayes/bnintro.html#repr.

[18] "Inference and Learning" in http://en.wikipedia.org/wiki/Bayesian_network.

[19] "Decision Theory" in http://www.cs.ubc.ca/ murphyk/Bayes/bnintro.html#repr.


Thanks to

  • Sheikh Shushmita Jahan

    For help with researching content and preparation of overheads on Bayesian Nets


Genetic Algorithms

  • Use random numbers to search for near-optimal solutions.

  • Use a process similar to the theory of evolution by natural selection proposed by Charles Darwin in his book On the Origin of Species.

  • Apply the same rules as Natural Selection in order to find near-optimal solutions.


  • An initial population of candidate solutions is generated.

  • The fitness of each solution is evaluated.

  • The most-fit solutions are chosen to reproduce.


Candidate Solutions

  • An array of bytes:

    • 00010101 00111010 11110000

  • May be converted to a string representation.


Fitness Function

  • May be an integer representation (or score)

  • There should be a preset maximum or minimum score (to help with termination)

  • One of the bigger challenges of designing a genetic algorithm


Crossover

  • An operation analogous to biological reproduction, in which parts of parent solutions are combined to produce offspring solutions.

  • Typically, a single crossover point is chosen and the data beyond it are swapped between the two children.
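
A single-point crossover sketch for two equal-length parent solutions (names are illustrative):

    import random

    def crossover(parent_a, parent_b):
        # Pick one crossover point and swap the tails between the parents.
        point = random.randint(1, len(parent_a) - 1)
        return (parent_a[:point] + parent_b[point:],
                parent_b[:point] + parent_a[point:])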


Crossover


Mutation

  • An operation aimed at introducing diversity into successive generations of solutions.

  • A mutation takes an existing solution to a problem and alters it in some way before including it in the next generation.
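
A bit-flip mutation sketch for bit-string solutions (the per-bit rate is an illustrative choice):

    import random

    def mutate(bits, rate=0.01):
        # Flip each bit independently with a small probability.
        return [1 - b if random.random() < rate else b for b in bits]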


  • Using crossover points and mutation factors, offspring solutions are produced and added to the population.

  • This procedure is repeated until a termination condition is reached (e.g., sufficient fitness reached, time limit exceeded).


Initialization

  • The creation of an initial population of solutions

  • Random bytes or strings are generated:

    import random

    def initialize(size, length):
        # Initial population: random bit strings whose fitness is not yet scored.
        return [{"value": [random.randint(0, 1) for _ in range(length)],
                 "fitness": 0}
                for _ in range(size)]


Selection

Individual solutions are measured against the fitness function and marked for either reproduction or removal.


Selection (continued)

    def select(solutions, max_per_generation, fitness_function):
        # Score every solution against the fitness function,
        # then keep only the fittest for the next generation.
        for s in solutions:
            s["fitness"] = fitness_function(s)
        ranked = sorted(solutions, key=lambda s: s["fitness"], reverse=True)
        return ranked[:max_per_generation]


Overall Algorithm

    create initial population
    evaluate fitness of each solution in the initial population
    compute average fitness of all solutions
    loop (until terminating condition)
        select x solutions for reproduction
        combine pairs randomly
        mutate
        evaluate fitness
        determine average fitness
    end loop
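
Tying the earlier sketches together, a minimal end-to-end loop; the one-max fitness (count of 1-bits) and all parameter defaults are illustrative assumptions:

    import random

    def evolve(size=20, length=16, generations=50):
        fitness = lambda s: sum(s["value"])          # one-max toy fitness (assumed)
        population = initialize(size, length)
        for _ in range(generations):
            parents = select(population, size // 2, fitness)
            children = []
            while len(children) < size:              # combine pairs randomly
                a, b = random.sample(parents, 2)
                for child in crossover(a["value"], b["value"]):
                    children.append({"value": mutate(child), "fitness": 0})
            population = children
            if any(fitness(s) == length for s in population):
                break                                # terminating condition reached
        return max(population, key=fitness)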


    References


Thanks to

  • Devon Noel de Tilly

  • Tyler Chamberland

    For help with researching content and preparation of overheads on Genetic Algorithms


Hybridization (FS/NN)

Fuzzy systems lack the capability of machine learning, as well as neural-network-style memory and pattern recognition; therefore, hybrid systems (e.g., neuro-fuzzy systems) are becoming more popular for specific applications.


Hybridization (RS/NN)

The rough sets paradigm permits reducing the number of inputs to a neural network, and assists with assigning initial weights that are likely to make the network converge more quickly.

