unsupervised learning n.
Download
Skip this Video
Loading SlideShow in 5 Seconds..
Unsupervised Learning PowerPoint Presentation
Download Presentation
Unsupervised Learning

Loading in 2 Seconds...

play fullscreen
1 / 29

Unsupervised Learning - PowerPoint PPT Presentation


  • 105 Views
  • Uploaded on

Unsupervised Learning. A little knowledge…. Agminatics Aciniformics Q-analysis Botryology Systematics Taximetrics Clumping Morphometrics. Nosography Nosology Numerical taxonomy Typology Clustering. A few “synonyms”…. Exploratory Data Analysis.

loader
I am the owner, or an agent authorized to act on behalf of the owner, of the copyrighted work described.
capcha
Download Presentation

PowerPoint Slideshow about 'Unsupervised Learning' - amos-martin


An Image/Link below is provided (as is) to download presentation

Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author.While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server.


- - - - - - - - - - - - - - - - - - - - - - - - - - E N D - - - - - - - - - - - - - - - - - - - - - - - - - -
Presentation Transcript
unsupervised learning

Unsupervised Learning

A little knowledge…

a few synonyms
Agminatics

Aciniformics

Q-analysis

Botryology

Systematics

Taximetrics

Clumping

Morphometrics

Nosography

Nosology

Numerical taxonomy

Typology

Clustering

A few “synonyms”…
exploratory data analysis
Exploratory Data Analysis
  • Visualization methods with little or no numerical manipulation
  • Dimensionality Reduction
    • Multidimensional Scaling
    • Principal Components Analysis, Factor Analysis
    • Self-Organizing Maps
  • Cluster Analysis
    • Hierarchical
      • Agglomerative
      • Divisive
    • Non-hierarchical
  • Unsupervised: No GOLD STANDARD
outline
Outline
  • Proximity
    • Distance Metrics
    • Similarity Measures
  • Multidimensional Scaling
  • Clustering
    • Hierarchical Clustering
      • Agglomerative
    • Criterion Functions for Clustering
  • Graphical Representations
algorithms similarity measures and graphical representations
Algorithms, similarity measures, and graphical representations
  • Most algorithms are not necessarily linked to a particular metric or similarity measure
  • Also not necessarily linked to a particular graphical representation
  • Be aware of the fact that old algorithms are being reused under new names!
common mistakes
Common mistakes
  • Refer to dendrograms as meaning “hierarchical clustering” in general
  • Misinterpretation of tree-like graphical representations
  • Refer to self-organizing maps as clustering
  • Ill definition of clustering criterion
    • Declare a clustering algorithm as “best”
  • Expect classification model from clusters
  • Expect robust results with little/poor data
unsupervised learning1
Unsupervised Learning
  • Dimensionality Reduction
  • MDS
  • SOM

Raw

Data

Graphical

Representation

Distance or

Similarity Matrix

  • Clustering
  • Hierarchical
  • Non-hierar.
  • Validation
  • Internal
  • Extermal
minkowski r metric
Minkowski r-metric
  • Manhattan
    • (city-block)
  • Euclidean

j

i

metric spaces

j

i

h

Metric spaces
  • Positivity

Reflexivity

  • Symmetry
  • Triangle

inequality

more metrics
More metrics

j

  • Ultrametric
  • Four-point

additive

condition

i

h

similarity measures
Similarity measures
  • Similarity function
    • For binary, “shared attributes”

i = [1,0,1]

j = [0,0,1]

variations
Variations…
  • Fraction of d attributes shared
  • Tanimoto coefficient

i = [1,0,1]

j = [0,0,1]

more variations
More variations…
  • Correlation
    • Linear
    • Rank
  • Entropy-based
    • Mutual information
  • Ad-hoc
    • Neural networks
multidimensional scaling
Multidimensional Scaling
  • Geometrical models
  • Uncover structure or pattern in observed proximity matrix
  • Objective is to determine both dimensionality d and the position of points in the d-dimensional space
metric and non metric mds
Metric and non-metric MDS
  • Metric (Torgerson 1952)
  • Non-metric (Shepard 1961)
    • Estimates nonlinear form of the monotonic function
stress and goodness of fit
Stress

20

10

5

2.5

0

Goodness of fit

Poor

Fair

Good

Excellent

Perfect

Stress and goodness-of-fit
slide23

Non-Hierarchical:

Distance threshold

Duda et al., “Pattern Classification”

additive trees
Additive Trees
  • Commonly the minimum spanning tree
  • Nearest neighbor approach to hierarchical clustering
  • Single-linkage
other linkages
Other linkages
  • Single-linkage: proximity to the closest element in another cluster
  • Complete-linkage: proximity to the most distant element
  • Average-linkage: average proximity
  • Mean: proximity to the mean (centroid)

[only that does not require all distances]

hierarchical clustering
Hierarchical Clustering
  • Agglomerative Technique
    • Successive “fusing” cases
    • Respect (or not) definitions of intra- and /or inter-group proximity
  • Visualization
    • Dendrogram, Tree, Venn diagram