
Soft Computing Methods

J.A. Johnson

Dept. of Math and Computer Science Seminar Series

February 8, 2013

- Fuzzy Sets
- Neural Nets
- Rough Sets
- Bayesian Nets
- Genetic Algorithms

- Fuzzy set theory is a means of specifying how well an object satisfies a vague description.
- A fuzzy set can be defined as a set with fuzzy boundaries.
- Fuzzy sets were first introduced by Zadeh (1965).

First, the membership function must be determined.

- Consider the proposition "Nate is tall."
- Is the proposition true if Nate is 5' 10"?
- The linguistic term "tall" does not refer to a sharp demarcation of objects into two classes—there are degrees of tallness.

Fuzzy set theory treats Tall as a fuzzy predicate and says that the truth value of Tall(Nate) is a number between 0 and 1, rather than being either true or false.

Let A denote the fuzzy set of all tall employees and let x be a member of the universe X of all employees. What would the function μA(x) look like?

- μA(x) = 1 if x is definitely tall
- μA(x) = 0 if x is definitely not tall
- 0 < μA(x) < 1 for borderline cases
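A membership function of this kind can be sketched in a few lines of Python. The breakpoints (5'6" and 6'2", i.e. 66 and 74 inches) are illustrative assumptions, not values from the slides:

```python
# Illustrative membership function for the fuzzy set "tall" (heights in inches).
# Below 66" is definitely not tall, above 74" is definitely tall, and the
# region in between is a linear ramp of borderline cases.
def mu_tall(height):
    if height <= 66:
        return 0.0
    if height >= 74:
        return 1.0
    return (height - 66) / 8.0  # degree of tallness for borderline heights

# Nate at 5'10" (70") is a borderline case:
print(mu_tall(70))  # 0.5
```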

- Classical Set
- Fuzzy Set

- Complement: cA(x) = 1 − A(x)
- Intersection: (A ∩ B)(x) = min[A(x), B(x)]
- Union: (A ∪ B)(x) = max[A(x), B(x)]
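These three standard (Zadeh) operations can be demonstrated over a small universe; the employee names and membership degrees below are made up for illustration:

```python
# Two fuzzy sets over the same universe, as name -> membership degree.
A = {"alice": 0.9, "bob": 0.4, "carol": 0.1}
B = {"alice": 0.3, "bob": 0.8, "carol": 0.5}

complement = {x: 1 - A[x] for x in A}            # cA(x) = 1 - A(x)
intersection = {x: min(A[x], B[x]) for x in A}   # min of the two degrees
union = {x: max(A[x], B[x]) for x in A}          # max of the two degrees

print(intersection)  # alice: 0.3, bob: 0.4, carol: 0.1
```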

- The range of possible values of a linguistic variable represents the universe of discourse of that variable.
- A linguistic variable carries with it the concept of fuzzy set qualifiers, called hedges. Hedges are terms that modify the shape of fuzzy sets.

- For instance, the qualifier "very" performs concentration and creates a new subset (very, extremely).
- The operation opposite to concentration is dilation, which expands the set (more or less, somewhat).
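Concentration and dilation have standard textbook definitions as powers of the membership degree: "very" squares it, "more or less" takes its square root. A minimal sketch:

```python
import math

def very(mu):          # concentration: squaring pushes degrees toward 0
    return mu ** 2

def more_or_less(mu):  # dilation: square root pulls degrees toward 1
    return math.sqrt(mu)

# "very tall" is harder to satisfy than "tall":
print(very(0.8))           # ≈ 0.64
print(more_or_less(0.64))  # ≈ 0.8
```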

[Table: each hedge with its mathematical expression and graphical representation]

- Fuzzy logic is not logic that is fuzzy, but logic that is used to describe fuzziness.
- Fuzzy logic deals with degrees of truth.

- Specify the problem and define linguistic variables.
- Determine fuzzy sets.
- Elicit and construct fuzzy rules.
- Perform fuzzy inference.
- Evaluate and tune the system.

[1] Artificial Intelligence: A Guide to Intelligent Systems, 2nd Edition, by Michael Negnevitsky.

[2] An Introduction to Fuzzy Sets by Witold Pedrycz and Fernando Gomide.

[3] Fuzzy Sets and Fuzzy Logic: Theory and Applications by George J. Klir and Bo Yuan.

[4] Elementary Fuzzy Matrix Theory and Fuzzy Models for Social Scientists by W. B. Vasantha Kandasamy.

[5] Wikipedia: http://en.wikipedia.org/wiki/Fuzzy_logic

[6] Wikipedia: http://en.wikipedia.org/wiki/Fuzzy

- http://www.softcomputing.net/fuzzy_chapter.pdf
- http://www.cs.cmu.edu/Groups/AI/html/faqs/ai/fuzzy/part1/faq-doc-18.html
- http://www.mv.helsinki.fi/home/niskanen/zimmermann_review.pdf
- http://sawaal.ibibo.com/computers-and-technology/what-limits-fuzzy-logic-241157.html
- http://my.safaribooksonline.com/book/software-engineering-and-development/9780763776473/fuzzy-logic/limitations_of_fuzzy_systems#X2ludGVybmFsX0ZsYXNoUmVhZGVyP3htbGlkPTk3ODA3NjM3NzY0NzMvMTUy

- Ding Xu
- Edwige Nounang Ngnadjo
For help with researching content and preparation of overheads on Fuzzy Sets.

Neuron: basic information-processing unit

- The step and sign activation functions, also called hard-limit functions, are mostly used in decision-making neurons.
- The Sigmoid function transforms the input, which can have any value between plus and minus infinity, into a reasonable value in the range between 0 and 1. Neurons with this function are used in the back-propagation networks.
- The Linear activation function provides an output equal to the neuron weighted input. Neurons with the linear function are often used for linear approximation.
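The four activation functions described above are one-liners; this sketch just makes their shapes concrete:

```python
import math

def step(x):      # hard limiter: 1 if the input reaches the threshold, else 0
    return 1 if x >= 0 else 0

def sign(x):      # hard limiter with outputs +1 / -1
    return 1 if x >= 0 else -1

def sigmoid(x):   # squashes any real input into the range (0, 1)
    return 1.0 / (1.0 + math.exp(-x))

def linear(x):    # output equals the neuron's weighted input
    return x
```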

- Step 1: Initialization
Set initial weights w1, w2, ..., wn and the threshold θ to random numbers in the range [−0.5, 0.5].

- Step 2: Activation
- Step 3: Weight training
- Step 4: Iteration
Increase iteration p by one, go back to Step 2 and repeat the process until convergence.
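The four steps above are the classic perceptron training rule. Here is a minimal runnable sketch; the learning rate, the AND-gate training set, and the `predict` helper are illustrative assumptions, not part of the slides:

```python
import random

def train_perceptron(samples, alpha=0.1, max_epochs=1000):
    random.seed(0)  # deterministic for the demo
    n = len(samples[0][0])
    # Step 1: initialization -- weights and threshold in [-0.5, 0.5]
    w = [random.uniform(-0.5, 0.5) for _ in range(n)]
    theta = random.uniform(-0.5, 0.5)
    for _ in range(max_epochs):            # Step 4: iteration until convergence
        errors = 0
        for x, desired in samples:
            # Step 2: activation with a step function
            y = 1 if sum(wi * xi for wi, xi in zip(w, x)) - theta >= 0 else 0
            e = desired - y
            if e != 0:
                errors += 1
                # Step 3: weight training (delta rule)
                w = [wi + alpha * xi * e for wi, xi in zip(w, x)]
                theta -= alpha * e
        if errors == 0:                    # converged: a full error-free pass
            break
    return w, theta

def predict(w, theta, x):
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) - theta >= 0 else 0

# Learning the logical AND function:
AND = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
w, theta = train_perceptron(AND)
```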


1. http://pages.cs.wisc.edu/~bolo/shipyard/neural/local.html

2. Stuart J. Russell and Peter Norvig. Artificial Intelligence: A Modern Approach. Prentice Hall, 2009.

3. http://www.roguewave.com/Portals/0/products/imsl-numerical-libraries/c-library/docs/6.0/stat/default.htm?turl=multilayerfeedforwardneuralnetworks.htm

4. Notes on Multilayer, Feedforward Neural Networks, Lynne E. Parker.

5. http://www.doc.ic.ac.uk/~nd/surprise_96/journal/vol4/cs11/report.html#Why use neural networks

- Hongming (Homer) Zuo
- Danni Ren
For help with researching content and preparation of overheads on Neural Nets.

- Introduced by Zdzislaw Pawlak in the early 1980s.
- A formal framework for the automated transformation of data into knowledge.
- Simplifies the search for dominant attributes in an inconsistent information table, leading to the derivation of shorter if-then rules.

Certain rules, for example, are:

(Temperature, normal) → (Flu, no),

(Headache, yes) and (Temperature, high) → (Flu, yes),

(Headache, yes) and (Temperature, very_high) → (Flu, yes).

Uncertain (or possible) rules are:

(Headache, no) → (Flu, no),

(Temperature, high) → (Flu, yes),

(Temperature, very_high) → (Flu, yes).

- Weights
- Coverage = (# elements covered by rule) / (# elements in universe)
- Support = (# positive elements covered by rule) / (# elements in universe)
- Degree of certainty = (support × 100) / coverage
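The three measures can be worked through on one of the uncertain rules above, (Temperature, high) → (Flu, yes). The counts below (8 examples, 3 covered, 2 positive) are hypothetical numbers for illustration:

```python
# Rule-quality measures for (Temperature, high) -> (Flu, yes)
universe = 8   # total examples in the table
covered = 3    # examples with Temperature = high
positive = 2   # of those, examples that also have Flu = yes

coverage = covered / universe
support = positive / universe
certainty = support * 100 / coverage   # equivalently: positive / covered * 100

print(coverage)   # 0.375
print(certainty)  # ≈ 66.7 -- the rule is about two-thirds certain
```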

- Which are the dominant attributes?
- How do we determine redundant attributes?

- An indiscernibility class, with respect to a set of attributes X, is defined as a set of examples all of whose values for attributes x ∈ X agree.
- For example, the indiscernibility classes with respect to attributes X = {Headache, Temperature} are {e1}, {e2}, {e3}, {e4}, {e5, e7} and {e6, e8}.

Defined by a lower approximation and an upper approximation.

The lower approximation is the union of all indiscernibility classes entirely contained in X:

X̲ = ∪ { [x] : [x] ⊆ X }

The upper approximation is the union of all indiscernibility classes that intersect X:

X̄ = ∪ { [x] : [x] ∩ X ≠ ∅ }

[Figure: lower and upper approximations of set X, drawn over the indiscernibility classes containing e1–e8; the lower approximation is contained in set X, which is contained in the upper approximation.]

If the indiscernibility classes with and without attribute A are identical then attribute A is redundant.
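The definitions above (indiscernibility classes, lower/upper approximations, and the redundancy test) fit in a short sketch. The flu table below is a hypothetical reconstruction chosen so that its classes match the {e1}, {e2}, {e3}, {e4}, {e5, e7}, {e6, e8} example given earlier:

```python
def indiscernibility_classes(table, attrs):
    # Group examples whose values agree on every attribute in attrs.
    classes = {}
    for name, row in table.items():
        key = tuple(row[a] for a in attrs)
        classes.setdefault(key, set()).add(name)
    return sorted(classes.values(), key=sorted)

def lower_approx(classes, X):
    # Union of the classes entirely contained in X.
    return set().union(*(c for c in classes if c <= X))

def upper_approx(classes, X):
    # Union of the classes that intersect X.
    return set().union(*(c for c in classes if c & X))

def is_redundant(table, attrs, a):
    # Attribute a is redundant if dropping it leaves the classes unchanged.
    rest = [x for x in attrs if x != a]
    return indiscernibility_classes(table, attrs) == indiscernibility_classes(table, rest)

table = {
    "e1": {"Headache": "yes", "Temperature": "normal"},
    "e2": {"Headache": "yes", "Temperature": "high"},
    "e3": {"Headache": "yes", "Temperature": "very_high"},
    "e4": {"Headache": "no",  "Temperature": "normal"},
    "e5": {"Headache": "no",  "Temperature": "high"},
    "e6": {"Headache": "no",  "Temperature": "very_high"},
    "e7": {"Headache": "no",  "Temperature": "high"},
    "e8": {"Headache": "no",  "Temperature": "very_high"},
}

classes = indiscernibility_classes(table, ["Headache", "Temperature"])
# classes: {e1}, {e2}, {e3}, {e4}, {e5, e7}, {e6, e8}
```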


Example: Identifying Edible Mushrooms with the ILA Algorithm

The dataset contains 8124 entries of different mushrooms.

Each entry (mushroom) has 22 different attributes:

Cap-shape

Cap-surface

Cap-color

Bruises

Odor

Gill-attachment

Gill-spacing

Gill-size

Gill-color

Stalk-shape

Stalk-root

Stalk-surface-above-ring

Stalk-surface-below-ring

Stalk-color-above-ring

Stalk-color-below-ring

Veil-type

Veil-color

Ring-number

Ring-type

Spore-print-color

Population

Habitat

One of the attributes chosen is odor. Its possible values are: almond, anise, creosote, fishy, foul, musty, none, pungent, spicy.

- p,x,s,n,t,p,f,c,n,k,e,e,s,s,w,w,p,w,o,p,k,s,u
- e,x,s,y,t,a,f,c,b,k,e,c,s,s,w,w,p,w,o,p,n,n,g
- e,b,s,w,t,l,f,c,b,n,e,c,s,s,w,w,p,w,o,p,n,n,m
- p,x,y,w,t,p,f,c,n,n,e,e,s,s,w,w,p,w,o,p,k,s,u
- e,x,s,g,f,n,f,w,b,k,t,e,s,s,w,w,p,w,o,e,n,a,g

- Invented by Mehmed R. Tolun and Saleh M. Abu-Soud.
- Used for data mining.
- Runs in a stepwise forward iteration.
- Searches for a description that covers a relatively large number of examples.
- Outputs IF-THEN rules.

- Examples are listed in tabular form, where each row corresponds to an example and each column contains attribute values.
- A set of m training examples, each composed of k attributes and a class attribute with n possible decisions.
- A rule set R with an initial value of Ø.
- All rows in the table are initially unmarked.

- Step 1: Partition the table containing m examples into n sub-tables. One table for each possible value of the class attribute.
( Steps 2 through 8 are repeated for each sub-table )

- Step 2: Initialize attribute combination count j as j = 1.
- Step 3: For the sub-table under consideration, divide the attribute list into distinct combinations, each combination with j distinct attributes.

- Step 4: For each combination of attributes, count the number of occurrences of attribute values that appear under that combination in unmarked rows of the sub-table under consideration, but that do not appear under the same combination of attributes in any other sub-table. Call the first combination with the maximum number of occurrences the max-combination.

- Step 5: If max-combination = Ø, increase j by 1 and go to Step 3.
- Step 6: Mark all rows of the sub-table under consideration in which the values of max-combination appear as classified.
- Step 7: Add a rule to R whose left-hand side comprises the attribute names of max-combination with their values, separated by AND operator(s), and whose right-hand side contains the decision attribute value associated with the sub-table.
- Step 8: If all rows are marked as classified, then move on to process another sub-table and go to Step 2. Otherwise (i.e., if there are still unmarked rows) go to Step 4. If no sub-tables are available, exit with the set of rules obtained so far.
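The loop above can be sketched compactly in Python. This is a simplified reading of ILA, not Tolun and Abu-Soud's reference implementation, and the four-row toy dataset is an assumption for illustration:

```python
from itertools import combinations

def ila(rows, attrs, class_attr):
    rules = []
    for c in sorted({r[class_attr] for r in rows}):      # Step 1: one sub-table per class
        sub = [r for r in rows if r[class_attr] == c]
        others = [r for r in rows if r[class_attr] != c]
        unmarked = list(range(len(sub)))
        j = 1                                            # Step 2: start with single attributes
        while unmarked and j <= len(attrs):
            best, best_count = None, 0
            for combo in combinations(attrs, j):         # Steps 3-4: count value combos
                counts = {}
                for i in unmarked:
                    key = tuple(sub[i][a] for a in combo)
                    # skip combos that also appear in the other sub-tables
                    if any(tuple(o[a] for a in combo) == key for o in others):
                        continue
                    counts[key] = counts.get(key, 0) + 1
                for key, n in counts.items():
                    if n > best_count:
                        best, best_count = (combo, key), n
            if best is None:                             # Step 5: nothing found, widen combos
                j += 1
                continue
            combo, key = best
            unmarked = [i for i in unmarked              # Step 6: mark classified rows
                        if tuple(sub[i][a] for a in combo) != key]
            rules.append((dict(zip(combo, key)), c))     # Step 7: emit an IF-THEN rule
    return rules

# Toy mushroom-style data (hypothetical):
rows = [
    {"odor": "almond", "color": "white", "class": "edible"},
    {"odor": "anise",  "color": "white", "class": "edible"},
    {"odor": "foul",   "color": "white", "class": "poisonous"},
    {"odor": "foul",   "color": "brown", "class": "poisonous"},
]
rules = ila(rows, ["odor", "color"], "class")
```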

25 Rules (first 12 Rules)

If stalk-color-above-ring=gray then edible.

If odor=almond then edible.

If odor=anise then edible.

If population=abundant then edible.

If stalk-color-below-ring=gray then edible.

If habitat=waste then edible.

If stalk-color-above-ring=orange then edible.

If population=numerous then edible.

If ring-type=flaring then edible.

If cap-shape=sunken then edible.

If spore-print-color=black and odor=none then edible.

If spore-print-color=brown and odor=none then edible.

Rule No   TP    FN   Error
1         576   0    0.0
2         400   0    0.0
3         400   0    0.0
4         384   0    0.0
5         384   0    0.0
6         192   0    0.0
7         192   0    0.0
8         144   0    0.0
9          48   0    0.0
10         32   0    0.0
11        608   0    0.0
12        608   0    0.0

25 Rules (Remaining 13 rules)

If stalk-color-below-ring=brown and gill-spacing=crowded then edible.

If spore-print-color=white and ring-number=two then edible.

If odor=foul then poisonous.

If gill-color=buff then poisonous.

If odor=pungent then poisonous.

If odor=creosote then poisonous.

If spore-print-color=green then poisonous.

If odor=musty then poisonous.

If stalk-color-below-ring=yellow then poisonous.

If cap-surface=grooves then poisonous.

If cap-shape=conical then poisonous.

If stalk-surface-above-ring=silky and gill-spacing=close then poisonous.

If population=clustered and cap-color=white then poisonous.

Rule No   TP     FN   Error
13          48   0    0.0
14         192   0    0.0
15        2160   0    0.0
16        1152   0    0.0
17         256   0    0.0
18         192   0    0.0
19          72   0    0.0
20          36   0    0.0
21          24   0    0.0
22           4   0    0.0
23           1   0    0.0
24          16   0    0.0
25           3   0    0.0

- A probabilistic graphical model that represents a set of random variables and their conditional dependencies via a directed acyclic graph (DAG).
- Nodes that are not connected represent variables that are conditionally independent of each other.

- Each node is associated with a probability function that takes as input a particular set of values for the node's parent variables and gives the probability of the variable represented by the node.
- If the parents are m Boolean variables, then the probability function can be represented by a table of 2^m entries, one for each of the 2^m possible combinations of its parents being true or false.

- Suppose there are two events which could cause grass to be wet: either the sprinkler is on or it's raining. Also, suppose that the rain has a direct effect on the use of the sprinkler (namely that when it rains, the sprinkler is usually not turned on). Then the situation can be modeled with a Bayesian network . All three variables have two possible values, T (for true) and F (for false).

The joint probability function is:

P(G,S,R) = P(G | S,R)P(S | R)P(R)

where the names of the variables have been abbreviated to G = Grass wet, S = Sprinkler, and R = Rain.

- The model can answer questions like "What is the probability that it is raining, given the grass is wet?"
- By the conditional probability formula, summing over the nuisance variable S:

P(R = T | G = T) = P(G = T, R = T) / P(G = T) = Σ_S P(G = T, S, R = T) / Σ_{S,R} P(G = T, S, R)
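A short script can carry out this summation directly. The CPT numbers below are the ones commonly attached to this sprinkler example; treat them as illustrative values rather than part of the original slides:

```python
# Conditional probability tables for the sprinkler network.
P_R = {True: 0.2, False: 0.8}                      # P(Rain)
P_S_given_R = {True:  {True: 0.01, False: 0.99},   # P(Sprinkler | Rain):
               False: {True: 0.4,  False: 0.6}}    # rarely on when it rains
P_G_given_SR = {(True, True): 0.99, (True, False): 0.9,
                (False, True): 0.8, (False, False): 0.0}  # P(GrassWet=T | S, R)

def joint(g, s, r):
    # P(G,S,R) = P(G | S,R) P(S | R) P(R)
    p_g = P_G_given_SR[(s, r)] if g else 1 - P_G_given_SR[(s, r)]
    return p_g * P_S_given_R[r][s] * P_R[r]

# P(R=T | G=T): sum the joint over the nuisance variable S,
# then normalize by summing over both S and R.
num = sum(joint(True, s, True) for s in (True, False))
den = sum(joint(True, s, r) for s in (True, False) for r in (True, False))
print(num / den)  # ≈ 0.3577
```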

- Biology and bioinformatics (gene regulatory networks, protein structure, gene expression analysis).
- Medicine.
- Document classification.
- Information retrieval.
- Image processing.
- Data fusion.
- Decision support systems.
- Engineering.
- Gaming.
- Law.

[1] "Bayesian Probability Theory" in George F. Luger, William A. Stubblefield, "Artificial Intelligence: Structures and Strategies for Complex Problem Solving", Second Edition, The Benjamin/Cummings Publishing Company, Inc., ISBN 0-8053-4780-1.

[2] "Bayesian Reasoning" in Michael Negnevitsky, "Artificial Intelligence: A Guide to Intelligent Systems", Third Edition, Pearson Education Limited, ISBN 978-1-4082-2574-5.

[3] "Bayesian Network" in http://en.wikipedia.org/wiki/Bayesian_network.

[4] "Probabilistic Graphical Model" in http://en.wikipedia.org/wiki/Graphical_model.

[5] "Random Variables" in http://en.wikipedia.org/wiki/Random_variables.

[6] "Conditional Independence" in http://en.wikipedia.org/wiki/Conditional_independence.

[7] "Directed Acyclic Graph" in http://en.wikipedia.org/wiki/Directed_acyclic_graph.

[8] "Inference" in http://en.wikipedia.org/wiki/Inference.

[9] "Machine Learning" in http://en.wikipedia.org/wiki/Machine_learning.

[10] "History" in http://en.wikipedia.org/wiki/Bayesian_network.

[11] "Example" in http://en.wikipedia.org/wiki/Bayesian_network.

[12] "Applications" in http://en.wikipedia.org/wiki/Bayesian_network.

[13] "A simple Bayesian Network" figure in http://en.wikipedia.org/wiki/File:SimpleBayesNet.svg.

[14] "Representation" in http://www.cs.ubc.ca/~murphyk/Bayes/bnintro.html#repr.

[15] "Conditional Independence in Bayes Nets" in http://www.cs.ubc.ca/~murphyk/Bayes/bnintro.html#repr.

[16] "Representation Example" figure in http://www.cs.ubc.ca/~murphyk/Bayes/bnintro.html#repr.

[17] "Conditional Independence" figure in http://www.cs.ubc.ca/~murphyk/Bayes/bnintro.html#repr.

[18] "Inference and Learning" in http://en.wikipedia.org/wiki/Bayesian_network.

[19] "Decision Theory" in http://www.cs.ubc.ca/~murphyk/Bayes/bnintro.html#repr.

- Sheikh Shushmita Jahan
For help with researching content and preparation of overheads on Bayesian Nets.

- Use random numbers to search for near-optimal solutions.
- Use a process similar to the Theory of Evolution by Natural Selection proposed by Charles Darwin in his book On The Origin of Species.
- Apply the same rules as Natural Selection in order to find near-optimal solutions.

- An initial population of candidate solutions is generated.
- The fitness of each solution is evaluated.
- The most-fit solutions are chosen to reproduce.

- An array of bytes:
- 00010101 00111010 11110000

- May be an integer representation (or score)
- There should be a preset maximum or minimum score (to help with termination)
- One of the bigger challenges of designing a genetic algorithm

- An operation which is analogous to biological reproduction, in which parts of parent solutions are combined in order to produce offspring solutions.
- Typically, a single crossover point is chosen and the data beyond it are swapped in the children.

- An operation aimed at introducing diversity into successive generations of solutions.
- A mutation takes an existing solution to a problem and alters it in some way before including it in the next generation.

- Using crossover points and mutation factors, offspring solutions are produced and added to the population.
- This procedure is repeated until a termination condition is reached (e.g., sufficient fitness, time limit exceeded).

- The creation of an initial population of solutions
- Random bytes or strings are generated:

solutions = new array(size)
for (i = 0; i < size; i++)
    solution = new Solution
    solution.value = random bytes or strings
    solution.fitness = 0
    solutions[i] = solution
endfor

Individual solutions are measured against the fitness function and marked for either reproduction or removal.

for (i = 0; i < size; i++)
    solutions[i].fitness = fitnessFunction(solutions[i])
endfor

next = new array(maxSolutionsPerGeneration)
for (i = 0; i < maxSolutionsPerGeneration; i++)
    fittest = null
    for (j = 0; j < size; j++)
        if (solutions[j] not in next and
            (fittest == null or fittest.fitness < solutions[j].fitness))
            fittest = solutions[j]
        endif
    endfor
    next[i] = fittest
endfor
solutions = next

(Each pass selects the best solution not yet chosen, so next ends up holding the top maxSolutionsPerGeneration solutions.)

initial population
fitness function on individual solutions of initial population
average fitness of all solutions
loop (until terminating condition)
    select x solutions for reproduction
    combine pairs randomly
    mutate
    evaluate fitness
    determine average fitness
end loop
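The flow above condenses into a short, runnable sketch. The OneMax task (maximize the number of 1 bits), the population size, mutation rate, and generation limit are all illustrative assumptions:

```python
import random

random.seed(1)
BITS, POP, GENERATIONS, MUTATION = 20, 30, 100, 0.01

def fitness(sol):
    return sum(sol)                        # OneMax: count the 1 bits

def crossover(a, b):
    point = random.randrange(1, BITS)      # single crossover point
    return a[:point] + b[point:]

def mutate(sol):
    # Flip each bit with a small probability to introduce diversity.
    return [1 - bit if random.random() < MUTATION else bit for bit in sol]

# Initial population of random bit strings.
population = [[random.randint(0, 1) for _ in range(BITS)] for _ in range(POP)]

for _ in range(GENERATIONS):
    population.sort(key=fitness, reverse=True)
    if fitness(population[0]) == BITS:     # terminating condition: perfect score
        break
    parents = population[:POP // 2]        # select the fittest half to reproduce
    children = [mutate(crossover(random.choice(parents), random.choice(parents)))
                for _ in range(POP - len(parents))]
    population = parents + children

best = max(population, key=fitness)
```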

- Devon Noel de Tilly
- Tyler Chamberland
For help with researching content and preparation of overheads on Genetic Algorithms.

Fuzzy systems lack machine-learning capabilities, as well as neural-network-style memory and pattern recognition; therefore, hybrid systems (e.g., neuro-fuzzy systems) are becoming more popular for specific applications.

The rough sets paradigm permits reducing the number of inputs to a neural network and assists with assigning initial weights that are likely to make the NN converge more quickly.