Agglomerative clustering (AC)








Clustering algorithms: Part 2c

• Pasi Fränti
• 25.3.2014
• Speech & Image Processing Unit
• School of Computing
• University of Eastern Finland
• Joensuu, FINLAND
Agglomerative clustering: categorization by cost function

• Single link: minimize the distance of the nearest vectors
• Complete link: minimize the distance of the two furthest vectors
• Ward's method: minimize the mean square error; in vector quantization, known as the pairwise nearest neighbor (PNN) method

We focus on Ward's method (the PNN).

Pseudo code
• PNN(X, M) → C, P
• FOR i←1 TO N DO
•     p[i]←i; c[i]←x[i];
• m←N;
• REPEAT
•     (a,b) ← FindSmallestMergeCost();
•     MergeClusters(a,b);
•     m←m−1;
• UNTIL m=M;

Time complexity: the initialization loop is O(N); each FindSmallestMergeCost is O(N²); the loop is executed N−M ≈ N times. In total: T(N) = O(N³).
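The straightforward algorithm can be sketched in Python. This is an illustrative sketch (the function and variable names are not from the slides), using Ward's merge cost on clusters stored as (size, centroid) pairs:

```python
def ward_cost(na, ca, nb, cb):
    # PNN/Ward merge cost: n_a*n_b/(n_a+n_b) * ||c_a - c_b||^2
    d2 = sum((u - v) ** 2 for u, v in zip(ca, cb))
    return na * nb / (na + nb) * d2

def pnn(X, M):
    # Every vector starts as its own cluster, stored as (size, centroid).
    clusters = [(1, list(x)) for x in X]
    while len(clusters) > M:
        # O(N^2) scan for the cheapest pair, repeated ~N times -> O(N^3).
        a, b = min(((i, j) for i in range(len(clusters))
                    for j in range(i + 1, len(clusters))),
                   key=lambda p: ward_cost(*clusters[p[0]], *clusters[p[1]]))
        (na, ca), (nb, cb) = clusters[a], clusters[b]
        clusters[a] = (na + nb,
                       [(na * u + nb * v) / (na + nb) for u, v in zip(ca, cb)])
        del clusters[b]          # b > a, so index a stays valid
    return clusters
```

For example, `pnn([[0.0], [1.0], [10.0], [11.0]], 2)` merges down to two clusters with centroids 0.5 and 10.5.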

Merge cost (Ward's criterion): d(Sa, Sb) = (na·nb)/(na+nb) · ||ca − cb||², where na, nb are the cluster sizes and ca, cb the cluster centroids.

Local optimization strategy: at every step, merge the pair of clusters with the smallest merge cost.

Nearest neighbor search:
• Find the cluster pair to be merged
• Update of the NN pointers
Example of the overall process: starting from M=5000 clusters, the pairs are merged one at a time (M=4999, M=4998, …) until the target of M=50 clusters is reached; the last merge steps shown are M=16 → M=15.

Example: merging from 25 down to 15 clusters; the MSE grows at every merge step:

• 25 clusters: MSE ≈ 1.01·10⁹
• 24 clusters: MSE ≈ 1.03·10⁹
• 23 clusters: MSE ≈ 1.06·10⁹
• 22 clusters: MSE ≈ 1.09·10⁹
• 21 clusters: MSE ≈ 1.12·10⁹
• 20 clusters: MSE ≈ 1.16·10⁹
• 19 clusters: MSE ≈ 1.19·10⁹
• 18 clusters: MSE ≈ 1.23·10⁹
• 17 clusters: MSE ≈ 1.26·10⁹
• 16 clusters: MSE ≈ 1.30·10⁹
• 15 clusters: MSE ≈ 1.34·10⁹

Storing the distance matrix
• Maintain the distance matrix and update rows only for the changed cluster.
• The number of distance calculations per step reduces from O(N²) to O(N).
• Search of the minimum pair still requires O(N²) time → still O(N³) in total.
• The matrix also requires O(N²) memory.
• With a heap of the pairwise costs [Kurita, 1991], the search reduces from O(N²) to O(log N).
• In total: O(N² log N).
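The cached-matrix idea can be illustrated with a small Python sketch (names are hypothetical, not from the slides): all pairwise costs are computed once, and after each merge only the merged cluster's row is recomputed.

```python
def pnn_matrix(X, M):
    clusters = [(1, list(x)) for x in X]

    def cost(i, j):
        # PNN/Ward merge cost between clusters i and j.
        (na, ca), (nb, cb) = clusters[i], clusters[j]
        d2 = sum((u - v) ** 2 for u, v in zip(ca, cb))
        return na * nb / (na + nb) * d2

    # Full cost "matrix", computed once in O(N^2).
    D = {(i, j): cost(i, j)
         for i in range(len(clusters)) for j in range(i + 1, len(clusters))}
    alive = set(range(len(clusters)))
    while len(alive) > M:
        # The minimum search still scans all pairs: O(N^2) per step.
        a, b = min((p for p in D if p[0] in alive and p[1] in alive), key=D.get)
        (na, ca), (nb, cb) = clusters[a], clusters[b]
        clusters[a] = (na + nb,
                       [(na * u + nb * v) / (na + nb) for u, v in zip(ca, cb)])
        alive.discard(b)
        # Only the merged cluster's row/column is recomputed: O(N) per step.
        for j in alive:
            if j != a:
                D[min(a, j), max(a, j)] = cost(a, j)
    return [clusters[i] for i in sorted(alive)]
```

The result matches the straightforward algorithm; only the number of distance calculations per step changes.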
Store nearest neighbor (NN) pointers [Fränti et al., 2000: IEEE Trans. Image Processing]

Time complexity reduces from O(N³) to Ω(N²).

Pseudo code
• PNN(X, M) → C, P
• FOR i←1 TO N DO
•     p[i]←i; c[i]←x[i];
• FOR i←1 TO N DO
•     NN[i] ← FindNearestCluster(i);
• m←N;
• REPEAT
•     a ← SmallestMergeCost(NN);
•     b ← NN[a];
•     MergeClusters(C,P,NN,a,b);
•     UpdatePointers(C,NN);
• UNTIL m=M;

Complexities: initialization O(N); building the NN pointers O(N²); SmallestMergeCost O(N) per step; pointer update O(N) per step.

http://cs.uef.fi/pages/franti/research/pnn.txt

Algorithm: Lazy-PNN

T. Kaukoranta, P. Fränti and O. Nevalainen, "Vector quantization by lazy pairwise nearest neighbor method", Optical Engineering, 38 (11), 1862-1868, November 1999

Monotony property of the merge cost [Kaukoranta et al., Optical Engineering, 1999]

Merge cost values are monotonically increasing: if (Sa, Sb) is the cheapest pair, i.e. d(Sa, Sb) ≤ d(Sa, Sc) and d(Sa, Sb) ≤ d(Sb, Sc), then after the merge d(Sa, Sc) ≤ d(Sa+b, Sc).
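The property can be checked numerically with the PNN (Ward) merge cost; a small illustrative example:

```python
def ward_cost(na, ca, nb, cb):
    # PNN/Ward merge cost between clusters (na, ca) and (nb, cb).
    d2 = sum((u - v) ** 2 for u, v in zip(ca, cb))
    return na * nb / (na + nb) * d2

def merge(na, ca, nb, cb):
    n = na + nb
    return n, [(na * u + nb * v) / n for u, v in zip(ca, cb)]

# Three singleton clusters where (a, b) is the cheapest pair.
a, b, c = (1, [0.0]), (1, [1.0]), (1, [5.0])
d_ab, d_ac = ward_cost(*a, *b), ward_cost(*a, *c)
ab = merge(*a, *b)
# Monotony: after merging the cheapest pair, costs can only grow.
assert d_ab <= d_ac <= ward_cost(*ab, *c)
```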

Lazy variant of the PNN
• Store the merge costs in a heap.
• Update a merge cost value only when it appears at the top of the heap.
• Processing time reduces by about 35%.
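The lazy strategy can be sketched with Python's heapq: each cost carries version stamps, and a stale cost is re-evaluated only when it surfaces at the top of the heap, which the monotony property permits (a recomputed cost can only be larger). This is an illustrative sketch, not the authors' implementation:

```python
import heapq

def lazy_pnn(X, M):
    # Clusters: id -> (size, centroid); version stamps detect stale heap entries.
    clusters = {i: (1, list(x)) for i, x in enumerate(X)}
    version = {i: 0 for i in clusters}

    def cost(i, j):
        (na, ca), (nb, cb) = clusters[i], clusters[j]
        d2 = sum((u - v) ** 2 for u, v in zip(ca, cb))
        return na * nb / (na + nb) * d2

    heap = [(cost(i, j), i, j, 0, 0) for i in clusters for j in clusters if i < j]
    heapq.heapify(heap)
    while len(clusters) > M:
        c, a, b, va, vb = heapq.heappop(heap)
        if a not in clusters or b not in clusters:
            continue                     # a side of the pair was merged away
        if (va, vb) != (version[a], version[b]):
            # Stale entry: re-evaluate lazily, only now that it reached the top.
            # Monotony guarantees the fresh cost is not smaller, so pushing it
            # back keeps the heap order valid.
            heapq.heappush(heap, (cost(a, b), a, b, version[a], version[b]))
            continue
        (na, ca), (nb, cb) = clusters[a], clusters[b]
        clusters[a] = (na + nb,
                       [(na * u + nb * v) / (na + nb) for u, v in zip(ca, cb)])
        version[a] += 1
        del clusters[b], version[b]
    return [clusters[k] for k in sorted(clusters)]
```

Eager variants would recompute every affected cost after each merge; here the work is deferred until a cost is actually needed.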
Algorithm: Iterative shrinking

P. Fränti and O. Virmajoki, "Iterative shrinking method for clustering problems", Pattern Recognition, 39 (5), 761-765, May 2006.

Agglomeration based on cluster removal [Fränti and Virmajoki, Pattern Recognition, 2006]: instead of merging two clusters at a time, remove one cluster and re-partition its vectors to the remaining clusters.

Cluster removal in practice:
• Find a secondary cluster for every vector.
• Calculate the removal cost for every vector.

Complexity analysis

Number of vectors per cluster:

If we iterate until M=1:

Adding the processing time per vector:
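As a rough illustration of the removal idea, here is a simplified Python sketch. It is not the exact method of the paper: as simplifying assumptions, centroids stay fixed within a step, removal costs are recomputed by brute force, and vectors of a removed cluster simply move to their secondary (nearest other) cluster.

```python
def dist2(x, c):
    return sum((u - v) ** 2 for u, v in zip(x, c))

def iterative_shrinking(X, centroids, M):
    # Repeatedly remove the cluster whose vectors can be re-assigned to their
    # secondary clusters at the smallest increase in squared error.
    centroids = [list(c) for c in centroids]
    while len(centroids) > M:
        def removal_cost(k):
            total = 0.0
            for x in X:
                if min(range(len(centroids)),
                       key=lambda j: dist2(x, centroids[j])) == k:
                    second = min((j for j in range(len(centroids)) if j != k),
                                 key=lambda j: dist2(x, centroids[j]))
                    total += dist2(x, centroids[second]) - dist2(x, centroids[k])
            return total
        k = min(range(len(centroids)), key=removal_cost)
        del centroids[k]        # remove the cheapest cluster
    assign = [min(range(len(centroids)), key=lambda j: dist2(x, centroids[j]))
              for x in X]
    return centroids, assign
```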

Algorithm: PNN with kNN-graph

P. Fränti, O. Virmajoki and V. Hautamäki, "Fast agglomerative clustering using a k-nearest neighbor graph". IEEE Trans. on Pattern Analysis and Machine Intelligence, 28 (11), 1875-1881, November 2006

Time-distortion comparison (figure): Trivial-PNN (>9999 s), PNN (229 s), Graph-PNN; MSE = 5.36.

Graph-PNN (2)

• Graph created by MSP
• Graph created by D-n-C

Conclusions

• Simple to implement, good clustering quality.
• The straightforward algorithm is slow: O(N³).
• A fast exact (yet simple) algorithm runs in O(τN²).
• Beyond this, O(τ·N·log N) complexity is possible, but it requires a complicated graph data structure and compromises the exactness of the merges.
Literature
• P. Fränti, T. Kaukoranta, D.-F. Shen and K.-S. Chang, "Fast and memory efficient implementation of the exact PNN", IEEE Trans. on Image Processing, 9 (5), 773-777, May 2000.
• P. Fränti, O. Virmajoki and V. Hautamäki, "Fast agglomerative clustering using a k-nearest neighbor graph". IEEE Trans. on Pattern Analysis and Machine Intelligence, 28 (11), 1875-1881, November 2006.
• P. Fränti and O. Virmajoki, "Iterative shrinking method for clustering problems", Pattern Recognition, 39 (5), 761-765, May 2006.
• T. Kaukoranta, P. Fränti and O. Nevalainen, "Vector quantization by lazy pairwise nearest neighbor method", Optical Engineering, 38 (11), 1862-1868, November 1999.
• T. Kurita, "An efficient agglomerative clustering algorithm using a heap", Pattern Recognition, 24 (3), 205-209, 1991.
Literature
• J. Shanbehzadeh and P.O. Ogunbona, "On the computational complexity of the LBG and PNN algorithms", IEEE Transactions on Image Processing, 6 (4), 614-616, April 1997.
• O. Virmajoki, P. Fränti and T. Kaukoranta, "Practical methods for speeding-up the pairwise nearest neighbor method", Optical Engineering, 40 (11), 2495-2504, November 2001.
• O. Virmajoki and P. Fränti, "Fast pairwise nearest neighbor based algorithm for multilevel thresholding", Journal of Electronic Imaging, 12 (4), 648-659, October 2003.
• O. Virmajoki, Pairwise Nearest Neighbor Method Revisited, PhD thesis, Computer Science, University of Joensuu, 2004.
• J.H. Ward, "Hierarchical grouping to optimize an objective function", J. Amer. Statist. Assoc., 58, 236-244, 1963.