Machine Learning

Basic definitions:

concept: often described implicitly („good politician“) using examples, i.e. training data

hypothesis: an attempt to describe the concept explicitly

concepts and hypotheses are expressed in a corresponding language

a hypothesis is verified using test data

background knowledge provides information about the context (properties of the environment)

the learning algorithm searches the space of hypotheses to find a consistent and complete hypothesis; the space is restricted by introducing bias

Goal of inductive ML

- Suggest a hypothesis characterizing a concept in a given domain (= the set of objects in this domain) that is described implicitly through a limited set of classified examples E+ and E-.
- The hypothesis:
- has to cover E+ while avoiding E-
- has to be applicable to objects that belong to neither E+ nor E-.

Basic notions

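- $\Omega$: the domain of the concept $K$, i.e. $K \subseteq \Omega$.
- $E$, a set of training examples, is complemented by a classification, i.e. a function $cl: E \to \{\text{yes}, \text{no}\}$.
- $E^+$ denotes all elements of $E$ classified as yes; $E^-$ denotes all elements classified as no
- $E^+$ and $E^-$ are a disjoint cover of the set $E$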

Example 1 „computer game“: Is there a quick way to distinguish a friendly robot from the others?

[Figure: examples of friendly robots and unfriendly robots]

Concept Language and Background Knowledge

- Examples of concept language:
- A set of real or idealised examples expressed in the object language that represent each of the concepts learned (Nearest Neighbour)
- attribute-value pairs (propositional logic)
- relational concepts (first order logic)

- One can extend the concept language with user-defined concepts or background knowledge.
- BK plays an important role in Inductive Logic Programming (ILP)
- The use of certain BK predicates may be a necessary condition for learning the right hypothesis.
- Redundant or irrelevant BK slows down the learning.

Example 1: hypothesis and its testing

H1 in the form of a decision tree:

if neck(r) = bow then „friendly“
if neck(r) = nothing then
    if head_shape(r) = triangle then „friendly“
    else „unfriendly“
if neck(r) = tie then
    if body_shape(r) = square then „unfriendly“
    else if head_shape(r) = circle then „friendly“
    else „unfriendly“
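The decision tree H1 can be written directly as a function and then checked against labelled test data. The sketch below is only an illustration: the dictionary representation of a robot and the three test robots are assumptions invented for the example, not data from the slides.

```python
def classify_h1(robot):
    """Classify a robot with hypothesis H1; `robot` is a dict of attribute values."""
    neck = robot["neck"]
    if neck == "bow":
        return "friendly"
    if neck == "nothing":
        return "friendly" if robot["head_shape"] == "triangle" else "unfriendly"
    if neck == "tie":
        if robot["body_shape"] == "square":
            return "unfriendly"
        return "friendly" if robot["head_shape"] == "circle" else "unfriendly"
    raise ValueError(f"unexpected neck value: {neck!r}")


# Hypothetical test data (invented for illustration): (robot, true label).
test_data = [
    ({"neck": "bow", "head_shape": "square", "body_shape": "circle"}, "friendly"),
    ({"neck": "tie", "head_shape": "circle", "body_shape": "square"}, "unfriendly"),
    ({"neck": "nothing", "head_shape": "triangle", "body_shape": "square"}, "friendly"),
]

# Testing the hypothesis = counting disagreements with the known labels.
errors = sum(classify_h1(robot) != label for robot, label in test_data)
print(f"{errors} errors on {len(test_data)} test robots")  # 0 errors on 3 test robots
```

A hypothesis that makes errors on the test data is not necessarily useless, but it is then at best „approximately correct“.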

Hypothesis: an attempt at a formal description

- Both the examples and the hypothesis have to be specified in a language. A hypothesis has the form of a formula $\varphi(X)$ with a single free variable $X$.
- Define the extension $\mathrm{Ext}(\varphi)$ of a hypothesis $\varphi(X)$ with respect to the domain $\Omega$ as the set of all elements of $\Omega$ that satisfy $\varphi$, i.e. $\mathrm{Ext}(\varphi) = \{o \in \Omega : \varphi(o) \text{ holds}\}$
- Properties of a hypothesis:
- a hypothesis is complete iff $E^+ \subseteq \mathrm{Ext}(\varphi)$
- a hypothesis is consistent iff it covers no negative examples, i.e. $\mathrm{Ext}(\varphi) \cap E^- = \emptyset$
- a hypothesis is correct iff it is complete and consistent
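These three properties translate almost literally into set operations. The following is a minimal sketch, assuming a finite domain and a hypothesis given as a Python predicate; the toy domain (the integers 0 to 9) and the example sets are invented for illustration.

```python
def extension(phi, domain):
    """Ext(phi): all objects of the domain for which the formula phi holds."""
    return {o for o in domain if phi(o)}


def is_complete(phi, domain, e_pos):
    """Complete: every positive example lies in the extension."""
    return e_pos <= extension(phi, domain)


def is_consistent(phi, domain, e_neg):
    """Consistent: the extension contains no negative example."""
    return not (extension(phi, domain) & e_neg)


def is_correct(phi, domain, e_pos, e_neg):
    return is_complete(phi, domain, e_pos) and is_consistent(phi, domain, e_neg)


# Toy setting (an assumption for illustration): the hidden concept is "even number".
domain = set(range(10))
e_pos = {2, 4, 8}
e_neg = {1, 3, 9}

phi_even = lambda o: o % 2 == 0   # complete and consistent
phi_gt1 = lambda o: o > 1         # complete, but covers the negatives 3 and 9
phi_small = lambda o: o < 5       # misses 8 and covers 1 and 3

print(is_correct(phi_even, domain, e_pos, e_neg))   # True
print(is_complete(phi_gt1, domain, e_pos))          # True
print(is_consistent(phi_gt1, domain, e_neg))        # False
print(is_correct(phi_small, domain, e_pos, e_neg))  # False
```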

How many correct hypotheses can be designed for a fixed training set E?

- Fact: the number of possible concepts is far greater than the number of possible hypotheses (formulas)
- Consequence: most concepts cannot be characterized by a corresponding hypothesis, so we have to accept hypotheses that are only “approximately correct“.
- Uniqueness of an “approximately correct“ hypothesis cannot be ensured.
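A small count makes the gap concrete. The sketch below borrows the five robot attributes from the decision-tree example later in this deck, and assumes, purely for illustration, a hypothesis language of conjunctions in which each attribute is either fixed to one value or left unconstrained.

```python
from math import prod

# Attribute domains of the robot example (5 discrete attributes).
attributes = {
    "is_smiling": ["no", "yes"],
    "holding": ["sword", "balloon", "flag"],
    "has_tie": ["no", "yes"],
    "head_shape": ["round", "square", "octagon"],
    "body_shape": ["round", "square", "octagon"],
}

# Number of distinct objects in the domain: 2 * 3 * 2 * 3 * 3.
n_objects = prod(len(values) for values in attributes.values())

# Every subset of the domain is a possible concept.
n_concepts = 2 ** n_objects

# Assumed hypothesis language: per attribute, one fixed value or "any value".
n_conjunctions = prod(len(values) + 1 for values in attributes.values())

print(n_objects)              # 108
print(n_conjunctions)         # 576
print(f"{n_concepts:.2e}")    # about 3.25e+32 possible concepts
```

Under this assumed language only 576 of the 2^108 possible concepts can be expressed exactly, which is the language bias at work.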

Choice of a hypothesis and Ockham's razor

- William of Ockham recommends how to compare hypotheses: „Entia non sunt multiplicanda praeter necessitatem“ (entities should not be multiplied beyond necessity),
- Einstein: „… the language should not be simpler than necessary.“

- The concept/hypothesis language specifies the language bias, which limits the set of all concepts/hypotheses that can be expressed/considered/learned.
- The preference bias allows us to decide between two hypotheses (even if they both classify the training data equally).
- The search bias defines the order in which hypotheses will be considered.
- Important if one does not search the whole hypothesis space.

Preference Bias, Search Bias & Version Space

Hypotheses are partially ordered (from more general to more specific)

Version space: the subset of hypotheses that have zero training error; the version space approach searches for this subset.

[Figure: the hypothesis space ordered from the most general concept to the most specific concept, with positive (+) and negative (-) examples]
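The partial order and the version space can be made concrete by brute force on a tiny example. This is only a sketch under stated assumptions: hypotheses are conjunctions with "?" meaning "any value", the three attributes and the training examples are invented, and all hypotheses are simply enumerated rather than maintained incrementally as in candidate elimination.

```python
from itertools import product

ANY = "?"
attributes = {
    "is_smiling": ["no", "yes"],
    "has_tie": ["no", "yes"],
    "head_shape": ["round", "square", "octagon"],
}
names = list(attributes)


def covers(h, x):
    """A conjunctive hypothesis covers x if every fixed attribute value matches."""
    return all(hv == ANY or hv == x[a] for a, hv in zip(names, h))


def more_general_or_equal(h1, h2):
    """Partial order: h1 is at least as general as h2."""
    return all(a == ANY or a == b for a, b in zip(h1, h2))


# Hypothetical training data: (example, is_positive).
training = [
    ({"is_smiling": "yes", "has_tie": "no", "head_shape": "round"}, True),
    ({"is_smiling": "yes", "has_tie": "no", "head_shape": "square"}, True),
    ({"is_smiling": "no", "has_tie": "yes", "head_shape": "square"}, False),
]

# Version space: all hypotheses with zero training error.
all_hypotheses = product(*[[ANY] + attributes[a] for a in names])
version_space = [h for h in all_hypotheses
                 if all(covers(h, x) == label for x, label in training)]

# Boundaries of the version space with respect to the partial order.
most_general = [h for h in version_space
                if not any(g != h and more_general_or_equal(g, h) for g in version_space)]
most_specific = [h for h in version_space
                 if not any(s != h and more_general_or_equal(h, s) for s in version_space)]

print(version_space)  # [('?', 'no', '?'), ('yes', '?', '?'), ('yes', 'no', '?')]
print(most_general)   # [('?', 'no', '?'), ('yes', '?', '?')]
print(most_specific)  # [('yes', 'no', '?')]
```

All three hypotheses classify the training data equally well; choosing among them is the job of the preference bias.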

- Skill refinement (swimming, biking, ...)
- Knowledge acquisition
- Rote learning (chess, checkers): the aim is to find an appropriate heuristic function evaluating the current state of the game, e.g. for the MIN-MAX approach
- Case-based reasoning: past experience is stored in a database. To solve a new problem, the system searches the database for „the closest (the most similar) case“, and its solution is modified for the current problem
- Advice taking: learning to interpret or operationalize abstract advice, i.e. searching for its „applicability conditions“
- Induction, difference analysis: the candidate-elimination or version space approach, decision tree induction, etc.
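As an illustration of the retrieval step in case-based reasoning, the sketch below finds the stored case most similar to a new problem. The similarity measure (number of matching attribute values) and the case base are assumptions invented for the example; adapting the retrieved solution to the new problem is not shown.

```python
def similarity(a, b):
    """Assumed similarity measure: number of attributes with equal values."""
    return sum(1 for key in a if key in b and a[key] == b[key])


def retrieve(problem, case_base):
    """Return the stored case whose problem description is closest to `problem`."""
    return max(case_base, key=lambda case: similarity(problem, case["problem"]))


# Hypothetical case base: past problems with their solutions.
case_base = [
    {"problem": {"engine_starts": "no", "lights_work": "no"},
     "solution": "charge the battery"},
    {"problem": {"engine_starts": "no", "lights_work": "yes"},
     "solution": "check the starter"},
    {"problem": {"engine_starts": "yes", "lights_work": "yes"},
     "solution": "no action"},
]

new_problem = {"engine_starts": "no", "lights_work": "yes", "fuel": "full"}
closest = retrieve(new_problem, case_base)
print(closest["solution"])  # check the starter
```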

Given: training examples described uniformly by the same set of attributes and classified into a small set of classes (most often into 2 classes: positive vs. negative examples)

Find: a decision tree that can classify new, previously unseen examples

Simple example: robots described by 5 discrete attributes and classified into 2 classes (friendly, unfriendly)

- Is_smiling ∈ {no, yes},
- Holding ∈ {sword, balloon, flag},
- Has_tie ∈ {no, yes},
- Head_shape ∈ {round, square, octagon},
- Body_shape ∈ {round, square, octagon}.

given: S ... the set of classified examples

goal: design a decision tree DT ensuring the same classification as S

1. The root is denoted by S

2. Find the "best" attribute at to be used for splitting the current set S

3. Split the set S into the subsets S1, S2, ..., Sn with respect to the value of at (all examples in the subset Si have the same value at = vi). This set denotes a node of the DT

4. For each Si do:

If all examples in Si belong to the same class or

then create a leaf with the same label,

else go to 1 with S = Si
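The procedure above can be sketched as a short recursive function. This is a sketch under assumptions: examples are (attribute dictionary, class label) pairs with string labels, the „best“ attribute is chosen by a placeholder rule that can be swapped out, and, since only the first stopping condition is legible in the text, the sketch assumes a majority-class leaf when no attribute can split the set further.

```python
from collections import Counter


def first_splitting_attribute(examples, attributes):
    """Placeholder for the 'best' attribute: the first one with more than one value."""
    for a in attributes:
        if len({x[a] for x, _ in examples}) > 1:
            return a
    return None


def build_tree(examples, attributes, choose=first_splitting_attribute):
    """Return a class label (leaf) or a pair (attribute, {value: subtree})."""
    labels = [label for _, label in examples]
    if len(set(labels)) == 1:            # all examples in the same class: leaf
        return labels[0]
    at = choose(examples, attributes)
    if at is None:
        # Assumption, not from the source: fall back to the majority class.
        return Counter(labels).most_common(1)[0][0]
    subsets = {}                          # split S into S1..Sn by the value of at
    for x, label in examples:
        subsets.setdefault(x[at], []).append((x, label))
    remaining = [a for a in attributes if a != at]
    return (at, {v: build_tree(s, remaining, choose) for v, s in subsets.items()})


def classify(tree, x):
    """Follow the branches matching x until a leaf is reached."""
    while isinstance(tree, tuple):
        at, children = tree
        tree = children[x[at]]            # raises KeyError for an unseen value
    return tree


# Hypothetical training set S.
S = [
    ({"is_smiling": "yes", "has_tie": "no"}, "friendly"),
    ({"is_smiling": "yes", "has_tie": "yes"}, "friendly"),
    ({"is_smiling": "no", "has_tie": "yes"}, "unfriendly"),
]
tree = build_tree(S, ["has_tie", "is_smiling"])
print(tree)
print(all(classify(tree, x) == label for x, label in S))  # True
```

With the placeholder rule the tree first splits on has_tie and needs two tests; a better choice of attribute gives a smaller tree for the same data.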

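Minimize the entropy (Shannon):

$H(S_i) = -p_i^+ \log p_i^+ - p_i^- \log p_i^-$

where $p_i^+$ is the probability that a random example in $S_i$ is positive, estimated by its relative frequency in $S_i$; $p_i^-$ is defined analogously for negative examples.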

Let the attribute at split S into the subsets S1, S2, ..., Sn. The entropy of this system is defined as

$H(S, \mathit{at}) = \sum_{i=1}^{n} P(S_i)\, H(S_i)$

where $P(S_i)$ is the probability of the event $S_i$, approximated by the relative size $|S_i| / |S|$.

Choose the attribute at with the minimal $H(S, \mathit{at})$.
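The attribute selection can be sketched as follows. Assumptions: logarithms are taken to base 2 (the text leaves the base open), examples are (attribute dictionary, class label) pairs, and the small training set is invented for illustration.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """H(Si) with probabilities estimated by relative frequencies (base-2 logs)."""
    n = len(labels)
    # Written as sum of p * log(1/p), which equals -sum of p * log(p).
    return sum((c / n) * log2(n / c) for c in Counter(labels).values())


def split_entropy(examples, at):
    """H(S, at): entropy of the subsets induced by at, weighted by |Si| / |S|."""
    groups = {}
    for x, label in examples:
        groups.setdefault(x[at], []).append(label)
    n = len(examples)
    return sum(len(g) / n * entropy(g) for g in groups.values())


def best_attribute(examples, attributes):
    """Choose the attribute with the minimal H(S, at)."""
    return min(attributes, key=lambda at: split_entropy(examples, at))


# Hypothetical training set S.
S = [
    ({"is_smiling": "yes", "has_tie": "no"}, "friendly"),
    ({"is_smiling": "yes", "has_tie": "yes"}, "friendly"),
    ({"is_smiling": "no", "has_tie": "yes"}, "unfriendly"),
]

print(round(split_entropy(S, "has_tie"), 3))  # 0.667
print(split_entropy(S, "is_smiling"))         # 0.0
print(best_attribute(S, ["has_tie", "is_smiling"]))  # is_smiling
```

Here is_smiling separates the two classes perfectly, so its subsets have zero entropy and it is chosen first.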

Design an automatic controller for an F-16 for the following complex task:

1. Take off and climb to an altitude of 2000 feet

2. Fly 32000 feet north

3. Turn right 330°

4. When 42000 feet from the starting point (direction N-S), turn left and head towards the starting point; the turn is finished when the course is between 140° and 180°.

5. Adjust the flight direction so that it is parallel to the landing course, with a tolerance of 5° for the flight direction and 10° for the wing twist with respect to the horizon

6. Decrease the altitude and move towards the start of the landing path

7. Land

Training data: 3 skilled pilots performed the assigned mission, each 30 times

Each flight is described by 1000 vectors (a total of 90000 training examples), each characterizing:

- Position and state of the plane
- Pilot's control action

Position and state:

- on_ground (boolean): is the plane on the ground?
- g_limit (boolean): acceleration limit exceeded?
- wing_stall (is the plane stable?), twist (integer: 0°-360°, wings with respect to the horizon)
- elevation (angle of the body with respect to the horizon), azimuth, roll_speed (wing deflection), elevation_speed, azimuth_speed, airspeed, climbspeed, E/W distance, N/S distance, fuel (weight of current supply)

Control:

- rollers and elevator: position of horizontal/vertical deflection
- thrust (integer): 0-100%, force
- flaps (integer): 0°, 10° or 20°, wing twist

Each of the 7 phases calls for a specific type of control.

The training data are divided into 7 disjoint sets, which are used to design specific decision trees (independently for each task phase and each control action). Control is thus provided by 7 × 4 = 28 decision trees.
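The organisation into one tree per (phase, control action) pair can be sketched as a dictionary of learned models. Everything about the data layout here is an assumption for illustration: the record format with `phase`, `state` and `action` keys is hypothetical, the tiny data set is invented, and `learn` stands for any decision tree learner (replaced below by a trivial majority-class stub).

```python
from collections import Counter

PHASES = range(1, 8)                                  # the 7 task phases
CONTROLS = ["rollers", "elevator", "thrust", "flaps"]  # the 4 control actions


def partition_by_phase(records):
    """Divide the training data into 7 disjoint sets, one per phase."""
    by_phase = {p: [] for p in PHASES}
    for r in records:
        by_phase[r["phase"]].append(r)
    return by_phase


def train_controllers(records, learn):
    """Learn one model per (phase, control action) pair."""
    by_phase = partition_by_phase(records)
    return {
        (p, c): learn([(r["state"], r["action"][c]) for r in by_phase[p]])
        for p in PHASES
        for c in CONTROLS
    }


def majority_learner(pairs):
    """Stub standing in for a decision tree learner: always predicts the majority label."""
    return Counter(label for _, label in pairs).most_common(1)[0][0]


# Hypothetical records: one per phase, all controls set to 0.
records = [
    {"phase": p, "state": {"airspeed": 100 * p}, "action": {c: 0 for c in CONTROLS}}
    for p in PHASES
]

controllers = train_controllers(records, majority_learner)
print(len(controllers))  # 28 models, one per phase and control action
print(3 * 30 * 1000)     # 90000 training examples: 3 pilots x 30 flights x 1000 vectors
```

At run time the controller would look up the four models for the current phase and apply each to the current state vector.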

- Classification/prediction
- diagnosis (troubleshooting motor pumps, medicine, ...; SKICAT: astronomical cataloguing)
- execution/control (GASOIL: separation of hydrocarbons)
- configuration/design (Siemens: equipment configuration, Boeing)
- language understanding
- vision and speech
- planning and scheduling

- Why? A significant speed-up of development and maintenance
- 180 man-years to develop the expert system XCON with 8000 rules; 30 man-years needed for maintenance
- 1 man-year to develop BP's GASOIL (ML-based) with 2800 rules; 0.1 man-years needed for maintenance