
LEARNING VECTOR QUANTIZATION Presentation By : Mihajlo Grbovic


Presentation Transcript


  1. LEARNING VECTOR QUANTIZATION Presentation By: Mihajlo Grbovic

  2. Learning Vector Quantization INTRODUCTION Learning Vector Quantization (LVQ) was introduced by Kohonen as a simple, universal, and efficient learning classifier. LVQ denotes a family of algorithms that are widely used for the classification of potentially high-dimensional data. Their popularity and success in numerous applications are closely related to their easy implementation and their intuitively clear approach.

  3. Learning Vector Quantization INTRODUCTION TRAINING DATA SET: Class 1 - green, Class 2 - blue, Class 3 - red, Class 4 - yellow

  4. Learning Vector Quantization INTRODUCTION LVQ's TASK IS TO BUILD A MODEL FROM A TRAINING DATA SET: the model is a set of labeled LVQ prototypes placed in the feature space, and each test point is labeled based on the label of the closest prototype.

  5. Learning Vector Quantization INTRODUCTION LVQ classification is based on the Euclidean distance as a measure of how similar a given data point is to the so-called prototypes. The prototypes are determined during the training procedure using a labeled dataset. The idea is to start with some initial positions of the prototypes in the feature space, and then improve them in such a way that, in the end, they represent the labeled data as well as possible. An attractive feature of LVQ is that it can be easily applied to multi-class problems. Depending on the complexity of the labeled data, we choose the number of prototypes involved in representing each class. This number can vary from a single prototype per class (if the class separations are simple) to a large number of prototypes per class (if the class separations are complex). Also, different classes can involve different numbers of prototypes, depending on their distribution in space.

  6. Learning Vector Quantization INTRODUCTION During the training procedure, the positions of the prototypes are updated based on their distances from the points in the given dataset. Basically, we scan through the dataset and, for every point, determine the closest prototype. Once the closest prototype is found, it is moved towards the point if their classes match, and away from the point if they differ. LVQ is an online learning algorithm; its computational effort scales linearly with the size of the dataset. After one scan through the data the prototypes should be near their optimal positions, although some applications require multiple scans.

  7. Learning Vector Quantization INTRODUCTION There are several different LVQ algorithms that handle the prototype updates in different ways. The three main variants are LVQ1, LVQ2, and LVQ3; others include LVQ2.1, LFM, LFMW, weighted LVQ, etc.

  8. Learning Vector Quantization LVQ1 For each training point x(t), all of the reference vectors (prototypes) are searched and the reference vector closest to the point is found, using the Euclidean distance measure. If this closest prototype mi belongs to the same class as the training point x(t), it is moved closer to the point, in proportion to the distance between the two vectors: mi(t+1) = mi(t) + α(t) (x(t) – mi(t)), where α(t) is a monotonically decreasing function of time. If the closest prototype mi belongs to a class other than that of x(t), it is moved away, again in proportion to the distance between the two vectors: mi(t+1) = mi(t) – α(t) (x(t) – mi(t)). [Figure: a class-1 training point pulls the nearest class-1 prototype towards it; a class-2 prototype would be pushed away.]
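
A minimal sketch of one LVQ1 pass over the data, in Python with NumPy. The function and parameter names are illustrative, not from the presentation; alpha would be decreased between passes.

    import numpy as np

    def lvq1_epoch(X, y, protos, proto_labels, alpha):
        # X: (n, d) training points, y: (n,) their class labels.
        # protos: (k, d) prototype positions, proto_labels: (k,) their classes.
        for x, label in zip(X, y):
            # Find the prototype closest to x (Euclidean distance).
            i = np.argmin(np.linalg.norm(protos - x, axis=1))
            if proto_labels[i] == label:
                # Same class: move the winner towards the point.
                protos[i] += alpha * (x - protos[i])
            else:
                # Different class: move the winner away from the point.
                protos[i] -= alpha * (x - protos[i])
        return protos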

  9. Learning Vector Quantization LVQ2 For a given training point x(t), three conditions must be met for LVQ2 learning to occur: 1) the closest prototype to x(t), mi, has to be of the wrong class; 2) the next closest prototype, mj, has to be of the correct class; 3) the training point x(t) must fall inside a small symmetric window defined around the midpoint of mi and mj. Let di and dj be the distances from the training point x(t) to mi and mj; then x(t) falls inside the window if min(di/dj, dj/di) > s, where s is a constant factor, commonly chosen between 0.4 and 0.8. UPDATE STEP: mi(t+1) = mi(t) – α(t) (x(t) – mi(t)), mj(t+1) = mj(t) + α(t) (x(t) – mj(t)), where x(t) is a training vector belonging to class j, mi is the reference vector of the incorrect class, mj is the reference vector of the correct class, and α(t) is a monotonically decreasing function of time; a common initial value for α(0) is 0.03. It can be seen that this scheme assures that the decision line between the two vectors will eventually attain a near-optimal position given the probability distributions of the categories, namely, the place where the distributions cross.
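
A sketch of a single LVQ2 update in the same style; the window test follows the condition above, and the default s = 0.6 is just one value inside the stated 0.4 to 0.8 range. Names are illustrative.

    import numpy as np

    def lvq2_step(x, label, protos, proto_labels, alpha, s=0.6):
        d = np.linalg.norm(protos - x, axis=1)
        i, j = np.argsort(d)[:2]                       # winner and runner-up
        if proto_labels[i] != label and proto_labels[j] == label:
            di, dj = d[i], d[j]
            if min(di / dj, dj / di) > s:              # inside the symmetric window
                protos[i] -= alpha * (x - protos[i])   # push the wrong class away
                protos[j] += alpha * (x - protos[j])   # pull the correct class in
        return protos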

  10. Learning Vector Quantization LFM For each training point x(t), all of the reference vectors (prototypes) are searched and the reference vector closest to the point is found, using the Euclidean distance measure. If this closest prototype belongs to the same class as the training point, do NOTHING! If the closest prototype belongs to a class other than that of the training point, it is moved away, in proportion to the distance between the two vectors: mi(t+1) = mi(t) – α(t) (x(t) – mi(t)). After that, find the closest prototype mj of the same class as the training point; this prototype is then moved closer to the training point, again in proportion to the distance between the two vectors: mj(t+1) = mj(t) + α(t) (x(t) – mj(t)). [Figure: a class-4 training point pushes away a wrong-class winner and pulls in the nearest class-4 prototype; other prototypes are untouched.]
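
A sketch of the LFM rule for a single point. It differs from LVQ1 in that a correct winner triggers no update, and on a mistake both the wrong winner and the nearest correct-class prototype are updated. Names are illustrative.

    import numpy as np

    def lfm_step(x, label, protos, proto_labels, alpha):
        d = np.linalg.norm(protos - x, axis=1)
        i = np.argmin(d)
        if proto_labels[i] == label:
            return protos                          # correctly classified: do nothing
        protos[i] -= alpha * (x - protos[i])       # move the wrong winner away
        same = np.where(proto_labels == label)[0]
        j = same[np.argmin(d[same])]               # nearest correct-class prototype
        protos[j] += alpha * (x - protos[j])       # move it towards the point
        return protos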

  11. PROBLEMS WITH LVQ

  12. Problems with LVQ Some Issues
• How to initialize the positions of the prototypes?
• How many prototypes per class to choose? 10, 20, 30… It depends on the situation.
• Some classes have more complicated distributions in the feature space than others, so they need more prototypes. How do we detect this?
• If the data set is unbalanced (90% of the training data is of class 1 and 10% of class 2), how many prototypes do we assign to each class? More of them to class 1, or more to class 2?
• As a result of noise, some prototypes end up in positions where they increase classification error instead of decreasing it. They do more harm than good. [Figure: two Gaussians in 2D; a prototype initialized between them remains trapped.]
• If we are working on a budget (say 100 prototypes), do we use them all right away, or do we start with a smaller number of prototypes and smartly increase their number during classification?

  13. Problems with LVQ Complicated Data Sets
• It can be shown that regular LVQ doesn't cope well with complicated distributions in feature space, even in the 2D case.
• Example: training data set of 10,000 points and 4 classes; initially choose 100 points of each class as prototypes.
After 0 LVQ iterations (based on initial prototype positions): accuracy 0.6808, number of misclassified points = 3192.
After 30 LVQ iterations: accuracy 0.88, number of misclassified points = 1173.
After 60 LVQ iterations: accuracy 0.87, number of misclassified points = 1207.

  14. Problems with LVQ Complicated Data Sets
• Why are these points misclassified after so many iterations? There must be learning going on, but nevertheless these points remain misclassified.
• No matter how much these points pull the prototypes of the correct class towards them, the prototypes never seem to arrive.
• There is a simple explanation for this: some other points are dragging the prototypes back so that they themselves remain correctly classified. This means we don't have enough prototypes.
• So we come to the conclusion that we have to add more prototypes at certain places.

  15. Adaptive LVQ LVQ add / LVQ remove
• We introduce a novel modification of LVQ called Adaptive LVQ.
• The idea is to start with an equal initial number of prototypes per class, then add prototypes to better describe the more complicated class regions, and remove prototypes that increase classification error instead of decreasing it.
• We add two steps at the end of every LVQ iteration: LVQremove and LVQadd.

  16. Adaptive LVQ LVQ ADD
• LVQadd concentrates on the misclassified points of each class during LVQ training.
• Using hierarchical clustering, we find whole clusters of points that are misclassified due to an insufficient number of prototypes of their class.
• Then we add prototypes at the positions of the cluster centroids to improve classification accuracy.
• We can control the size of the clusters we take into consideration and the number of prototypes we are allowed to add.

  17. Adaptive LVQ LVQ ADD
• First we isolate the training points that are misclassified by the existing prototypes.
• Then we concentrate on each class separately to find clusters of misclassified points and determine their centroids. [Figure: misclassified points per class; large clusters are marked "interesting", isolated points "not interesting".]

  18. Adaptive LVQ LVQ ADD
• We are not interested in small clusters of data; we can control the sensitivity of our algorithm (for example, consider only clusters with 4 or more points).
• After LVQadd, the new prototypes are added to the existing ones.
• There is usually some budget involved. Say we start with 10 prototypes altogether and set a limiting budget of 50 prototypes: if LVQadd has already added 40 prototypes during the first 30 iterations, then in order to add more it has to wait for LVQremove to remove some of them. A sketch of the whole LVQadd step is given below.
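
One way the LVQadd step could look, using SciPy's agglomerative clustering. The distance threshold, the minimum cluster size, and the budget handling are illustrative assumptions, since the presentation describes them only qualitatively.

    import numpy as np
    from scipy.cluster.hierarchy import fcluster, linkage

    def lvq_add(X, y, protos, proto_labels, budget, dist_thresh=1.0, min_size=4):
        # Find the points misclassified by the current prototypes.
        nearest = np.argmin(np.linalg.norm(X[:, None] - protos[None], axis=2), axis=1)
        wrong = proto_labels[nearest] != y
        for c in np.unique(y):
            pts = X[wrong & (y == c)]
            if len(pts) < min_size or len(protos) >= budget:
                continue
            # Hierarchically cluster this class's misclassified points.
            cluster_ids = fcluster(linkage(pts, method='average'),
                                   t=dist_thresh, criterion='distance')
            for k in np.unique(cluster_ids):
                cluster = pts[cluster_ids == k]
                # Add a prototype at the centroid of each large-enough cluster,
                # as long as the overall budget is not exceeded.
                if len(cluster) >= min_size and len(protos) < budget:
                    protos = np.vstack([protos, cluster.mean(axis=0)])
                    proto_labels = np.append(proto_labels, c)
        return protos, proto_labels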

  19. Adaptive LVQ LVQ REMOVE
• LVQremove is introduced to deal with outcast prototypes, trapped prototypes, and prototypes stuck in positions where they classify more training points incorrectly than correctly.
• This can also happen to the prototypes added as a result of LVQadd.
• We gather statistics about each prototype during LVQ training and combine these statistics into a single prototype score. For each prototype i: Scorei = Ai - Bi + Ci, where Ai counts how many times prototype i classified correctly (and hasn't been moved), Bi counts how many times prototype i has been moved away as a prototype of the wrong class, and Ci counts how many times prototype i has been moved towards as a prototype of the correct class.

  20. Adaptive LVQ LVQ REMOVE
• Prototypes with a negative score increase classification error instead of decreasing it, and as a result they are removed.
• Based on the SCORE, a prototype is a "good" prototype if it has to be moved only a small number of times AND it classifies correctly a large number of times. It is STABLE!
• The purpose of LVQremove is to detect "bad" prototypes and remove them (a sketch is given below):
1. Outcast prototypes - never, or almost never, selected as the closest prototype. They have small Ai and small Ci and are not influencing any point. These prototypes can be removed simply, without even consulting the SCORE.
2. Prototypes that are too close to one another - we merge them.
3. Trapped prototypes - selected as the closest prototype a large number of times, but they usually misclassify. They have large Bi and small Ai; they can never escape their destiny and will always be moved around (the 2D Gaussian case).
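
A sketch of the removal rule, assuming the counters Ai, Bi, Ci are maintained as arrays during the LVQ pass. The outcast cutoff is an illustrative assumption, and the merging of near-duplicate prototypes (case 2) is omitted for brevity.

    import numpy as np

    def lvq_remove(protos, proto_labels, A, B, C, outcast_thresh=1):
        # Score_i = A_i - B_i + C_i, as defined on the previous slide.
        score = A - B + C
        # Outcasts: (almost) never the closest prototype, so they influence nothing.
        outcast = (A + C) < outcast_thresh
        # Keep prototypes that are neither harmful (negative score) nor outcasts.
        keep = (score >= 0) & ~outcast
        return protos[keep], proto_labels[keep], A[keep], B[keep], C[keep]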

  21. Adaptive LVQ IMPLEMENTATION
• LVQadd and LVQremove together form Adaptive LVQ, which can be applied to any algorithm in the LVQ family (with slight adjustments). For example, LVQ2 + LVQadd + LVQremove = Adaptive LVQ2.
• LVQremove and LVQadd are performed after each LVQ iteration, in that order.
• Adaptive LVQ has many interesting applications. We can use it to:
- form a multi-class BUDGET classification algorithm
- determine which classes need more prototypes and which need fewer
- determine how many prototypes are enough for good classification

  22. EXPERIMENTS AND RESULTS

  23. RESULTS COMPLICATED 2D CASE
• We use the same data set as before (training data set of 10,000 points, 4 classes). This time we start with 20 training points (5 of each class) as initial prototypes and use Adaptive LVQ to build a model.
• Our limit is 100 prototypes, since the previous experiments used this number of prototypes.
After 30 LVQ iterations: accuracy = 0.982, number of misclassified points = 174.

  24. RESULTS MAJOR DATA SETS
• We compared Adaptive LVQ to regular LVQ in classification results on 10 major data sets.
• Adaptive LVQ brings a 6.4% accuracy improvement on average.
