Learning & Data Mining
learning
Learning
  • Changes in the content and organization of a system’s knowledge that enable it to improve its performance on a task (Simon)
  • Acquire new knowledge from the environment
  • Organize its current knowledge
  • Inductive inference
    • Draw general conclusions from examples
    • Infer the association between input and output
    • with some confidence
  • Incremental vs. batch learning
general model of learning agent
General Model of Learning Agent

[Diagram: general model of a learning agent - the environment is observed through sensors; a critic compares behavior against a performance standard and sends feedback to the learning module; the learning module makes changes to the knowledge used by the performance module and passes learning goals to the problem generator; the performance module acts on the environment through effectors.]

From Artificial Intelligence: A Modern Approach by Russell and Norvig

classification of inductive learning
Classification of Inductive Learning
  • Supervised Learning
    • given training examples
      • correct input-output pairs
    • recover the unknown function from data generated by that function
    • generalization ability for unseen data
    • classification: the function’s output is discrete
    • concept learning: the output is binary
  • Unsupervised Learning
classification of inductive learning5
Classification of Inductive Learning
  • Supervised Learning
  • Unsupervised Learning
    • no correct input-output pairs are given
    • needs some other source for determining correctness
    • reinforcement learning: only a yes/no (reward) signal
      • example: chess playing
    • Clustering: group data into clusters with common characteristics
    • Map learning: explore unknown territory
    • Discovery learning: uncover new relationships
slide6
Data Mining
  • Definition of data mining
    • the task of extracting, from large volumes of real data,
    • information that was previously not well known,
    • implicit,
    • and potentially useful

Cf) KDD (Knowledge Discovery in Databases)

the entire process of extracting knowledge from data

data mining ⊂ KDD

slide7
Data Mining Technologies (I)

[Diagram: data mining / KDD at the intersection of related fields - expert systems, machine learning, databases, statistics, and visualization.]
slide8
Data Mining Technologies (II)
  • Primary tasks of data mining
    • Classification
    • Clustering
    • Characterization (summarization)
    • Trend analysis
    • Association rule discovery (market basket analysis)
    • Pattern analysis
    • Estimation
    • Prediction
slide9
Data Mining Technologies (III)
  • Application areas
    • Marketing & retail
    • Banking
    • Finance
    • Insurance
    • Medicine & health (genetics)
    • Quality control
    • Transportation
    • Geo-spatial applications
data mining tasks 1
Data Mining Tasks(1)
  • classification
    • Examples

News ⇒ [international] [domestic] [sports] [culture] …

⇒ [large] [medium] [small]

[Diagram: objects are assigned to one of a set of predefined classes.]
data mining tasks 2
Data Mining Tasks(2)
  • Classification - continued

Credit application ⇒ [high] [medium] [low]

Water sample ⇒ [grade-1 water] [grade-2 water] [polluted water]

    • Algorithm
      • Decision trees, Memory based reasoning
data mining tasks 3
Data Mining Tasks(3)
  • Estimation

cf. classification maps to discrete categories

    • Examples
      • age, sex, blood pressure… ⇒ remaining life expectancy
      • age, sex, occupation… ⇒ annual income
      • region, water volume, population ⇒ pollution level
    • Algorithm: neural networks
    • Estimating a future value is called prediction

[Diagram: data attributes (attr1, attr2, attr3) are mapped to a continuous value.]
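The slide names neural networks as the typical estimation algorithm. Below is a hedged, minimal sketch (not from the lecture) of estimating a continuous value from three attributes with scikit-learn's MLPRegressor; the data and network size are illustrative assumptions.

import numpy as np
from sklearn.neural_network import MLPRegressor

# Synthetic data: three attributes -> one continuous target value.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))                       # attr1, attr2, attr3
y = 2.0 * X[:, 0] - 1.0 * X[:, 1] + 0.5 * X[:, 2]   # the (unknown) function to recover

model = MLPRegressor(hidden_layer_sizes=(16,), max_iter=5000, random_state=0).fit(X, y)
print(model.predict(X[:3]))   # estimated continuous values for the first records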

data mining tasks 4
Data Mining Tasks(4)
  • Association (Market basket analysis)

- determine which things go together

    • Example
      • shopping list ⇒ cross-selling (supermarket (shelf, catalog, CF…), home shopping, e-shopping…)
      • Association rules
data mining tasks 5
Data Mining Tasks(5)
  • Clustering

cf. classification - predefined categories; clustering - find new categories and explain them

[Diagram: a heterogeneous population is partitioned into homogeneous subgroups (clusters) G1-G4.]

data mining tasks 6
Data Mining Tasks(6)
  • Clustering -continued
      • Examples
        • Symptom ⇒ Disease
        • Customer information ⇒ Selective sales
        • soil (water quality) data

Note: clustering depends on the

features used

e.g., cards: number, color, suit …

data mining tasks 7
Data Mining Tasks(7)
  • Clustering - continued
    • Clustering is useful for exception (outlier) finding
    • Algorithm

K-means → K clusters

Note: directed vs. non-directed KDD

Examples of exception finding:

  • calling card fraud detection
  • credit card fraud, etc.
slide17
Data Mining Technologies (IV)
  • Data mining techniques
    • association rules
    • k-nearest neighbor
    • decision trees
    • neural networks
    • genetic algorithms
    • statistical techniques
market basket analysis associations 1 10
Market Basket Analysis (Associations) (1/10)

O: Orange Juice M: Milk

S: Soda W: Window Cleaner

D: Detergent

market basket analysis associations 3 10
Market Basket Analysis (Associations) (3/10)

{ S , O} : Co-Occurrence of 2

R1 - if S Then O

R2 - if O Then S

  • Support - what percentage of all the data contains the itemset?

Confidence - of the records that match the LHS, what percentage also satisfies the rule?

e.g. support of R1 = 2 / 5 = 40%

confidence of R1 = 2 / 3

confidence of R2 = 2 / 4

together these determine how good the rule is
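A minimal Python sketch of support and confidence; the five baskets are hypothetical, chosen only to reproduce the counts above (S appears in 3 baskets, O in 4, {S, O} together in 2).

# Hypothetical transactions consistent with the slide's counts.
transactions = [
    {"S", "O", "M"},
    {"S", "O", "D"},
    {"S", "W"},
    {"O", "M"},
    {"O", "D"},
]

def support(itemset, transactions):
    """Fraction of all transactions containing every item in `itemset`."""
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)

def confidence(lhs, rhs, transactions):
    """Among transactions containing `lhs`, the fraction also containing `rhs`."""
    return support(lhs | rhs, transactions) / support(lhs, transactions)

print(support({"S", "O"}, transactions))       # 0.4   (2/5)
print(confidence({"S"}, {"O"}, transactions))  # 0.667 (2/3) - rule R1
print(confidence({"O"}, {"S"}, transactions))  # 0.5   (2/4) - rule R2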

market basket analysis associations 5 10
Market Basket Analysis (Associations) (5/10)

R1: If A ∧ B then C

R2: If A ∧ C then B

R3: If B ∧ C then A

  • Confidence

Support = 5

market basket analysis associations 6 10
Market Basket Analysis (Associations) (6/10)
  • R3 has the best confidence (0.33)

but is it GOOD?

Note: R3: If B ∧ C then A has confidence 0.33,

while A alone occurs with probability 0.45

e.g., "people with long hair are women"

  • Improvement → how good is the rule

compared to random guessing?

market basket analysis associations 7 10
Market Basket Analysis (Associations) (7/10)

improvement = P(condition and result) / ( P(condition) × P(result) )

improvement > 1 is the criterion for a rule that beats random guessing
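A minimal sketch of the improvement (lift) measure for a rule "if condition then result", using the same hypothetical baskets as in the support/confidence sketch above.

transactions = [
    {"S", "O", "M"}, {"S", "O", "D"}, {"S", "W"}, {"O", "M"}, {"O", "D"},
]

def p(itemset):
    """Fraction of transactions containing every item in `itemset`."""
    return sum(1 for t in transactions if itemset <= t) / len(transactions)

def improvement(condition, result):
    """P(condition and result) / (P(condition) * P(result)); > 1 beats random guessing."""
    return p(condition | result) / (p(condition) * p(result))

# P(S and O) = 0.4, P(S) = 0.6, P(O) = 0.8 -> improvement ≈ 0.83,
# so "if S then O" is actually slightly worse than guessing O at random.
print(improvement({"S"}, {"O"}))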

market basket analysis associations 8 10
Market Basket Analysis (Associations) (8/10)
  • Some Issues
    • overall algorithm

build co-occurrence matrix for

1 item, 2 items, 3 items, etc.

-> complex!!

    • Pruning

eg. minimum support pruning

    • Virtual Item

season, store, geographic information

combined with real items

eg. If OJ ∧ Milk ∧Friday then Beer

market basket analysis associations 9 10
Market Basket Analysis (Associations) (9/10)
  • Level of description

How specific should items be?

Drink → Soda → Coke

  • Strengths

- explainability

- undirected data mining

- handles variable-length data

- simple computation

market basket analysis associations 10 10
Market Basket Analysis (Associations) (10/10)
  • Weaknesses

- becomes complex as the data grows

- limited data types (attributes)

- difficult to determine the right number of items

- rare items tend to get pruned away

clustering algorithm 1 2
Clustering Algorithm (1/2)
  • k-means method (MacQueen ’67)

- many variations

  • Algorithm steps

1. Choose k initial points (seeds)

2. Assign each point to its closest seed

(initial clusters)

3. Compute the centroid of each cluster and use the centroids as the new seeds

4. Go to step 2;

stop when the clusters no longer change
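A minimal NumPy sketch of the steps above (illustrative, not the lecture's code); it assumes no cluster becomes empty during the updates, which holds for the tiny example data.

import numpy as np

def kmeans(points, k, max_iter=100, seed=0):
    rng = np.random.default_rng(seed)
    centroids = points[rng.choice(len(points), k, replace=False)]   # step 1: seeds
    for _ in range(max_iter):
        # step 2: assign every point to its closest centroid
        dists = np.linalg.norm(points[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # step 3: recompute each cluster's centroid
        new_centroids = np.array([points[labels == i].mean(axis=0) for i in range(k)])
        if np.allclose(new_centroids, centroids):   # stop when nothing changes
            break
        centroids = new_centroids                   # step 4: repeat
    return labels, centroids

# Example: two obvious groups of 2-D points
pts = np.array([[0.0, 0.0], [0.1, 0.2], [0.2, 0.1],
                [5.0, 5.0], [5.1, 4.9], [4.9, 5.2]])
labels, centers = kmeans(pts, k=2)
print(labels, centers)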

clustering algorithm 2 2

Clustering Algorithm (2/2)

Note: finding neighbors

  • Finding the centroid of points (x1, y1), …, (xn, yn):

centroid = ( (x1 + … + xn) / n , (y1 + … + yn) / n )
variation of k means
Variation of k-means

1. Use probability density rather than simple distance

eg. Gaussian mixture Models

2. Weighted Distance

3. Agglomeration method

- hierarchical clustering
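Variation 1 above (probability density instead of plain distance) is what Gaussian mixture models provide. A hedged sketch using scikit-learn's GaussianMixture on made-up 2-D points (not part of the lecture):

import numpy as np
from sklearn.mixture import GaussianMixture

pts = np.array([[0.0, 0.0], [0.1, 0.2], [0.2, 0.1],
                [5.0, 5.0], [5.1, 4.9], [4.9, 5.2]])

gmm = GaussianMixture(n_components=2, random_state=0).fit(pts)
print(gmm.predict(pts))          # hard cluster labels
print(gmm.predict_proba(pts))    # soft membership probabilities per cluster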

agglomerative algorithm
Agglomerative Algorithm

1. Start with every single record as its own cluster (N clusters)

2. Select the two closest clusters and combine them

(N-1 clusters)

3. Go to step 2

4. Stop at the right level (number of clusters)

What counts as "closest"?

distance between clusters
Distance between clusters
  • 3 measures

1. Single linkage

closest members

2. Complete linkage

most distant members

3. Centroids
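A minimal sketch of agglomerative clustering with the three linkage measures above, via SciPy's hierarchy module; the 2-D points are the same made-up data used earlier and purely illustrative.

import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

pts = np.array([[0.0, 0.0], [0.1, 0.2], [0.2, 0.1],
                [5.0, 5.0], [5.1, 4.9], [4.9, 5.2]])

for method in ("single", "complete", "centroid"):     # the three distance measures
    Z = linkage(pts, method=method)                    # merge closest clusters until one remains
    labels = fcluster(Z, t=2, criterion="maxclust")    # "stop at the right level": 2 clusters
    print(method, labels)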

clustering
Clustering
  • Strengths

1. Undirected knowledge discovery

2. Works with categorical, numeric, and textual data

3. Easy to apply

  • Weaknesses

1. Can be difficult to choose the right (distance)

measure & weights

2. Sensitive to initial parameters

3. Can be hard to interpret

decision tree contact lens

Decision Tree (contact lens)

[Diagram: tear production: reduced → none; normal → astigmatism: no → soft; yes → spectacle prescription: myope → hard, hypermetrope → none.]
concept learning

Concept Learning

[Diagram: classification - a learned function maps an input to one of Class 1 … Class n; concept learning - a learned function (e.g., a decision tree) maps an input to a yes/no decision about a single concept.]

e.g., "red", "good customer"

weather data
Weather data

[Table: the weather dataset - one instance per row, one attribute per column; abbreviations: s = sunny, o = overcast, r = rainy (outlook); h = hot, m = mild, c = cool (temperature); h = high, n = normal (humidity).]

decision tree for weather 1 4
Decision Tree for weather (1/4)

[Diagram: decision tree for the weather data - outlook: sunny → humidity (high → no, normal → yes); overcast → yes; rainy → windy (true → no, false → yes).]

If outlook = sunny and humidity = high then play = no
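As an illustration of building such a tree, here is a hedged scikit-learn sketch on a few hypothetical weather instances; the rows are made up to be consistent with the tree above and are not the full weather dataset.

import pandas as pd
from sklearn.tree import DecisionTreeClassifier, export_text

rows = pd.DataFrame({
    "outlook":  ["sunny", "sunny",  "overcast", "rainy", "rainy"],
    "humidity": ["high",  "normal", "high",     "high",  "normal"],
    "windy":    [False,   False,    False,      True,    False],
    "play":     ["no",    "yes",    "yes",      "no",    "yes"],
})

X = pd.get_dummies(rows[["outlook", "humidity", "windy"]])   # nominal -> indicator columns
y = rows["play"]

tree = DecisionTreeClassifier(criterion="entropy").fit(X, y)  # information-gain-style splits
print(export_text(tree, feature_names=list(X.columns)))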

decision tree for weather 2 4
Decision Tree for weather (2/4)

Note: temperature and humidity can also be numeric data, e.g.

temp > 30 (hot)

10 <= temp <= 30 (mild)

temp < 10 (cool)

decision tree for weather 3 4
Decision Tree for weather (3/4)
  • attribute
    • Attribute types
      • nominal (categorical, discrete)
      • ordinal (ordered categories)
      • interval [10,20]
      • ratio – real numbers
decision tree for weather 4 4
Decision Tree for weather (4/4)

Note: leaf nodes don’t have to be yes/no

--> general (multi-class) classification

[Diagram: contact-lens tree with multi-class leaves - tear production: reduced → none; normal → astigmatism: no → soft, yes → hard.]

decision tree prediction
Prediction with Decision Trees

[Diagram: build candidate trees A, B, C … on the training set; choose the best one (say B) using the test set; estimate its expected performance on a separate evaluation set; then apply it to real data.]
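A hedged sketch of that workflow with scikit-learn on synthetic data; the split sizes and candidate trees are illustrative assumptions, not part of the lecture.

import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)
X = rng.normal(size=(300, 4))
y = (X[:, 0] + X[:, 1] > 0).astype(int)

# Three-way split: training / test / evaluation.
X_train, X_rest, y_train, y_rest = train_test_split(X, y, test_size=0.4, random_state=0)
X_test, X_eval, y_test, y_eval = train_test_split(X_rest, y_rest, test_size=0.5, random_state=0)

candidates = [DecisionTreeClassifier(max_depth=d, random_state=0).fit(X_train, y_train)
              for d in (1, 3, None)]                              # trees A, B, C
best = max(candidates, key=lambda t: t.score(X_test, y_test))     # choose the best on the test set
print("expected performance:", best.score(X_eval, y_eval))        # estimate on the evaluation set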

box diagram of decision tree
Box Diagram of Decision Tree

[Diagram: box representation of the weather decision tree - the records are split into columns by outlook (sunny / overcast / rain); the sunny column is subdivided by humidity and the rain column by windy; each resulting box holds the yes/no labels of the records that fall into it.]

the effect of pruning

The effect of pruning

[Plot: error rate vs. depth of tree - error on the training data keeps decreasing with depth, while error on unseen data reaches a minimum and then rises again; "Prune here!" marks that minimum.]
  • Some issues
    • where to prune?

pruning too little (too deep a tree) → unnecessarily complex

pruning too much → information is lost

    • which attribute to split on (first)?

error rate
Error Rate

[Diagram: a node containing mostly “y” records and two “n” records → error rate er = 2/7.]

  • Adjusted error rate of a tree

AE(T) = E(T) + α · leaf_count(T)

  • Find a subtree T1 of T such that

AE(T1) <= AE(T),

then prune all the branches

that are not part of T1
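A minimal sketch of how the adjusted error rate compares a full tree with a candidate pruned subtree; all numbers here are made up for illustration.

def adjusted_error(error_rate, leaf_count, alpha):
    """AE(T) = E(T) + alpha * leaf_count(T)."""
    return error_rate + alpha * leaf_count

alpha = 0.02
full_tree = adjusted_error(error_rate=0.10, leaf_count=9, alpha=alpha)   # AE = 0.28
subtree   = adjusted_error(error_rate=0.14, leaf_count=4, alpha=alpha)   # AE = 0.22

# AE(subtree) <= AE(full tree), so the branches outside the subtree get pruned.
print(full_tree, subtree, subtree <= full_tree)

scikit-learn's DecisionTreeClassifier exposes a similar cost-complexity trade-off through its ccp_alpha parameter.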

possible sub trees for weather data 1 2
Possible sub trees for weather data (1/2)

Which attribute should be split on first?

[Diagram: (a) splitting first on outlook (sunny / overcast / rainy) and (b) splitting first on temperature (hot / mild / cool); each box lists the yes/no labels of the instances that reach it.]

possible sub trees for weather data 2 2
Possible sub trees for weather data (2/2)

[Diagram: (c) splitting first on windy (true / false) and (d) splitting first on humidity (high / normal); each box lists the yes/no labels of the instances that reach it.]

information theory entropy
Information Theory & Entropy

info([2,3]) = 0.971 bits

info([4,0]) = 0.0 bits

info([3,2]) = 0.971 bits

→ info([2,3], [4,0], [3,2])

= (5/14) × 0.971 + (4/14) × 0 + (5/14) × 0.971

= 0.693 bits

gain(outlook) = info([9,5]) − info([2,3], [4,0], [3,2])

= 0.940 − 0.693 = 0.247 bits

gain(temperature) = 0.029 bits

gain(humidity) = 0.152 bits

gain(windy) = 0.048 bits

calculating info x entropy
Calculating info(x) - entropy
  • if either #yes or #no is 0,

then info(x) = 0

  • if #yes = #no, then

info(x) takes its maximum value

  • the measure also covers multi-class situations

e.g. info([2,3,4])

= info([2,7]) + (7/9) × info([3,4])

=> entropy(p1, p2, … , pn) = − p1 log p1 − p2 log p2

− … − pn log pn

info([2,3,4]) = entropy(2/9, 3/9, 4/9)

= − (2/9) log(2/9) − (3/9) log(3/9) − (4/9) log(4/9)

= [ −2 log 2 − 3 log 3 − 4 log 4 + 9 log 9 ] / 9
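A minimal Python sketch of these calculations (not the lecture's code), reproducing the numbers above.

from math import log2

def info(counts):
    """Entropy in bits of a class-count list, e.g. info([2, 3]) ≈ 0.971."""
    total = sum(counts)
    return -sum(c / total * log2(c / total) for c in counts if c > 0)

def split_info(partitions):
    """Weighted entropy after a split, e.g. info([2,3], [4,0], [3,2]) ≈ 0.693."""
    total = sum(sum(p) for p in partitions)
    return sum(sum(p) / total * info(p) for p in partitions)

print(info([2, 3]))                                          # ≈ 0.971
print(split_info([[2, 3], [4, 0], [3, 2]]))                  # ≈ 0.693
print(info([9, 5]) - split_info([[2, 3], [4, 0], [3, 2]]))   # gain(outlook) ≈ 0.247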

algorithms cart c4 5
Algorithms: CART, C4.5
  • CART - binary trees only

Breiman ’84

  • C4.5

Quinlan; an extension of ID3 (’86)

    • Clementine
    • NCR
  • CHAID - Hartigan ’75