
Jianzhi Jin¹, Yuhua Liu¹, Kaihua Xu², Fang Hu¹

An Efficient Algorithm Based on Self-adapted Fuzzy C-Means Clustering for Detecting Communities in Complex Networks. Jianzhi Jin¹, Yuhua Liu¹, Kaihua Xu², Fang Hu¹. ¹ Department of Computer Science, HuaZhong Normal University, Wuhan, 430079, China





Presentation Transcript


  1. An Efficient Algorithm Based on Self-adapted Fuzzy C-Means Clustering for Detecting Communities in Complex Networks • Jianzhi Jin¹, Yuhua Liu¹, Kaihua Xu², Fang Hu¹ • ¹ Department of Computer Science, HuaZhong Normal University, Wuhan, 430079, China • ² College of Physical Science and Technology, HuaZhong Normal University, Wuhan, 430079, China • Email: yhliu@mail.ccnu.edu.cn • 2011.12.16

  2. Outline • Introduction • Self-adapted Fuzzy C-Means Clustering in Complex Networks • Simulations and Analysis • Conclusion and Future Works

  3. Introduction(1/6) • Many complex networked systems are found to divide naturally into modules or communities: groups of vertices with relatively dense connections within groups but sparser connections between them. • Detecting communities can provide invaluable help in understanding and visualizing the structure of networks.

  4. Introduction(2/6) • Detecting Communities • Requirements: • High efficiency and high accuracy • Based on sound theoretical principles • No cut-nodes or cut-links allowed

  5. Introduction(3/6) • Detecting Communities • Validation Metrics • Modularity • Accuracy • Density
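
The slides name modularity as a validation metric without giving its formula. A minimal sketch, assuming the usual Newman-Girvan definition $Q = \frac{1}{2m}\sum_{ij}\left(A_{ij} - \frac{k_i k_j}{2m}\right)\delta(c_i, c_j)$ for an undirected, unweighted graph with hard community labels; the function name `modularity` and the toy graph are illustrative only:

```python
import numpy as np

def modularity(A, labels):
    """Newman-Girvan modularity Q of a hard partition (assumed definition)."""
    A = np.asarray(A, dtype=float)
    labels = np.asarray(labels)
    k = A.sum(axis=1)                      # node degrees
    two_m = k.sum()                        # twice the number of edges
    same = labels[:, None] == labels[None, :]   # delta(c_i, c_j)
    B = A - np.outer(k, k) / two_m         # observed minus expected edges
    return float((B * same).sum() / two_m)

# Toy graph: two triangles joined by a single edge (2-3).
A = np.zeros((6, 6))
for i, j in [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]:
    A[i, j] = A[j, i] = 1

q_split = modularity(A, [0, 0, 0, 1, 1, 1])   # one community per triangle
q_single = modularity(A, [0, 0, 0, 0, 0, 0])  # everything in one community
print(q_split, q_single)
```

Putting every node in one community gives a modularity of zero under this definition, which is why the metric rewards partitions that separate densely connected groups.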

  6. Introduction(4/6) • FCM in Complex Networks • Has been applied to detecting communities in recent years • The mainstream algorithms: AFCM, CFCM, NFCM, etc. • All use different variants of the Laplacian matrix of the graph

  7. Introduction(5/6) • FCM in Complex Networks • The Laplacian matrix $N = D - A$ is used in AFCM • $N = D^{-1}A$ is used in CFCM, and $N = D^{-1/2}(D - A)D^{-1/2}$ is used in NFCM. • $D$ is the diagonal matrix formed from the degrees of all nodes in the network, and $A$ is the adjacency matrix of the network.
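
The three matrix variants on this slide can be sketched in NumPy as follows. This is only an illustration of the formulas; the function name `fcm_matrices` is hypothetical, and the sketch assumes an undirected graph with no isolated nodes (every degree is nonzero):

```python
import numpy as np

def fcm_matrices(A):
    """Return the matrices N used by AFCM, CFCM and NFCM for adjacency A."""
    A = np.asarray(A, dtype=float)
    deg = A.sum(axis=1)                    # node degrees (assumed nonzero)
    D = np.diag(deg)
    n_afcm = D - A                         # N = D - A
    n_cfcm = A / deg[:, None]              # N = D^{-1} A (row-normalised)
    d_inv_sqrt = 1.0 / np.sqrt(deg)
    # N = D^{-1/2} (D - A) D^{-1/2}, done by scaling rows and columns
    n_nfcm = d_inv_sqrt[:, None] * (D - A) * d_inv_sqrt[None, :]
    return n_afcm, n_cfcm, n_nfcm

# Path graph 0 - 1 - 2
A = np.array([[0, 1, 0],
              [1, 0, 1],
              [0, 1, 0]])
n_afcm, n_cfcm, n_nfcm = fcm_matrices(A)
print(n_afcm, n_cfcm, n_nfcm, sep="\n\n")
```

The rows of `n_afcm` sum to zero and the rows of `n_cfcm` sum to one, which is a quick sanity check on any adjacency matrix.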

  8. Introduction(6/6) • FCM in Complex Networks • Better clustering accuracy and running efficiency • Good overall performance • Two deficiencies • Cannot determine the number of clusters automatically • Prone to getting stuck in a local extremum

  9. Self-adapted Fuzzy C-Means Clustering in Complex Networks(1/5) • SFCM in Complex Networks • A new FCM-based algorithm for detecting communities: Self-adapted FCM (SFCM). • It constructs a new validity function to find an optimal number of clusters automatically.

  10. Self-adapted Fuzzy C-Means Clustering in Complex Networks(2/5) • A New Validity Function • The inter-cluster distances should be as large as possible. • The intra-cluster distances should be as small as possible.
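
The transcript does not preserve the authors' validity function. To illustrate the two criteria only, the sketch below swaps in a well-known stand-in: the reciprocal of the Xie-Beni index, which divides the smallest squared distance between prototypes (inter-cluster) by the membership-weighted mean squared distance of points to prototypes (intra-cluster). The name `validity` and the toy data are assumptions, not the authors' definitions:

```python
import numpy as np

def validity(X, U, V, m=2.0):
    """Stand-in validity score (reciprocal Xie-Beni); larger is better."""
    X, U, V = (np.asarray(a, dtype=float) for a in (X, U, V))
    # Squared distance from prototype i to point j, shape (c, n).
    d2 = ((V[:, None, :] - X[None, :, :]) ** 2).sum(axis=2)
    intra = (U ** m * d2).sum() / X.shape[0]       # compactness
    # Smallest squared distance between two distinct prototypes.
    pv = ((V[:, None, :] - V[None, :, :]) ** 2).sum(axis=2)
    inter = pv[~np.eye(len(V), dtype=bool)].min()  # separation
    return inter / intra   # undefined if every point sits on a prototype

X = np.array([[0.0, 0.0], [0.0, 1.0], [10.0, 0.0], [10.0, 1.0]])

# Sensible split: left pair versus right pair.
good = validity(X, U=[[1, 1, 0, 0], [0, 0, 1, 1]],
                V=[[0.0, 0.5], [10.0, 0.5]])
# Poor split: bottom pair versus top pair.
bad = validity(X, U=[[1, 0, 1, 0], [0, 1, 0, 1]],
               V=[[5.0, 0.0], [5.0, 1.0]])
print(good, bad)
```

The sensible split scores far higher because its prototypes are far apart while its points sit close to them, which is the behaviour the slide asks of a validity function.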

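  11. Self-adapted Fuzzy C-Means Clustering in Complex Networks(3/5) • Steps of the Algorithm • Step 1 Initialization: set the termination condition, the cluster number and the remaining initial parameters [symbols not preserved in the transcript]. • Step 2 Construct the partition matrix. • If there exist j and r such that [condition not preserved], then [value not preserved] and [value not preserved] for [index range not preserved].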

  12. Self-adapted Fuzzy C-Means Clustering in Complex Networks(4/5) • Steps of the Algorithm • Step 3 Calculate the prototypes. • Step 4 If [termination test not preserved], stop the iteration; otherwise let [update not preserved] and go to Step 2.
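
The formulas for Steps 1 to 4 were images and are missing from the transcript. The sketch below is the textbook fuzzy c-means iteration, which has the same step structure; it is not confirmed to match the slides' exact equations. The fuzzifier `m`, the tolerance `eps`, the random initialisation and the function name `fcm` are all assumptions:

```python
import numpy as np

def fcm(X, c, m=2.0, eps=1e-5, max_iter=300, seed=0):
    """Textbook fuzzy c-means: returns memberships U (c, n), prototypes V."""
    X = np.asarray(X, dtype=float)
    rng = np.random.default_rng(seed)
    # Step 1: initial prototypes drawn from the data points.
    V = X[rng.choice(len(X), size=c, replace=False)]
    for _ in range(max_iter):
        # Step 2: partition matrix from point-to-prototype distances.
        d = np.linalg.norm(V[:, None, :] - X[None, :, :], axis=2)
        zero = d == 0
        with np.errstate(divide="ignore", invalid="ignore"):
            inv = d ** (-2.0 / (m - 1.0))
            U = inv / inv.sum(axis=0)
        # A point lying exactly on a prototype gets full membership there.
        hit = zero.any(axis=0)
        U[:, hit] = zero[:, hit] / zero[:, hit].sum(axis=0)
        # Step 3: prototypes as membership-weighted means.
        W = U ** m
        V_new = (W @ X) / W.sum(axis=1, keepdims=True)
        # Step 4: stop once the prototypes barely move.
        shift = np.linalg.norm(V_new - V)
        V = V_new
        if shift < eps:
            break
    return U, V

# Two well-separated groups of three points each.
X = np.array([[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10]])
U, V = fcm(X, c=2)
labels = U.argmax(axis=0)   # harden the fuzzy memberships
print(labels)
```

For community detection, the rows of `X` would be node coordinates derived from one of the matrices on the earlier slide rather than raw points; how the authors build those coordinates is not stated in the transcript.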

  13. Self-adapted Fuzzy C-Means Clustering in Complex Networks(5/5) • Steps of the Algorithm • Step 5 Calculate the validity function [symbol not preserved] under the current cluster number. If it is the highest value, stop the algorithm; otherwise go to Step 2 with [updated cluster number not preserved]. • Deficiency • The computational complexity is $O(n^3)$.
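
Step 5 wraps the clustering loop in an outer search over the cluster number. A minimal sketch of that outer loop, assuming the cluster number grows by one each round and the search stops at the first drop in the validity score; both assumptions fill gaps in the transcript. The names `select_cluster_number` and `score_for` are hypothetical, and `score_for(c)` stands for "run the clustering with c clusters and return its validity":

```python
def select_cluster_number(score_for, c_start=2, c_max=10):
    """Increase c until the validity score stops improving."""
    best_c, best_score = c_start, score_for(c_start)
    for c in range(c_start + 1, c_max + 1):
        score = score_for(c)
        if score <= best_score:
            break                      # previous c was the (first) peak
        best_c, best_score = c, score
    return best_c, best_score

# Made-up validity scores, for illustration only.
toy_scores = {2: 0.31, 3: 0.52, 4: 0.47, 5: 0.60}
best_c, best_score = select_cluster_number(toy_scores.__getitem__, c_max=5)
print(best_c, best_score)
```

Stopping at the first drop finds a local peak only: in the toy scores above, the higher value at c = 5 is never evaluated. An exhaustive sweep up to `c_max` avoids that at the cost of more clustering runs.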

  14. Simulations and Analysis(1/7) • Zachary’s Karate Club • Network of American Football Games • Tests on Computer-generated Networks

  15. Simulations and Analysis(2/7) • Zachary’s Karate Club • Square nodes and circle nodes represent the instructor’s faction and the administrator’s faction, respectively. The squares split into two communities, shown in blue and green, and the circles likewise split into two, shown in red and yellow.

  16. Simulations and Analysis(3/7) • Zachary’s Karate Club • None of the algorithms achieves a high modularity. • The modularity of AFCM declines substantially. • The modularity of CFCM is lower than that of NFCM and SFCM.

  17. Simulations and Analysis(4/7) • Network of American Football Games • The algorithm automatically finds ten communities, which correspond almost exactly to ten conferences. A total of 11 nodes, marked with red circles, are unclassified or misclassified, and the accuracy is 90.43%.
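
The reported accuracy is consistent with the fraction of correctly classified nodes, assuming the network is the commonly used 115-team version of the American football dataset (the node count is not stated on the slide):

```latex
\text{Accuracy} = \frac{n - n_{\text{wrong}}}{n} = \frac{115 - 11}{115} = \frac{104}{115} \approx 0.9043 = 90.43\%
```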

  18. Simulations and Analysis(5/7) • Network of American Football Games • The modularity obtained by SFCM is higher than that of the other algorithms, and so is the density. As before, the number of communities for the first three algorithms is pre-specified.

  19. Simulations and Analysis(6/7) • Tests on Computer-generated Networks • RN(c, m, k, p) • Here c is the number of communities in the network, m is the number of nodes in each community, k is the degree of each node, and p is the density parameter.

  20. Simulations and Analysis(7/7) • Tests on Computer-generated Networks • As p increases from 0 to 1, the community structure in the network becomes more cohesive. • All algorithms correctly cluster all nodes when p is no less than 0.5. • In the range of [range not preserved in the transcript], the accuracy of SFCM is better than that of the others.

  21. Conclusion and Future Works • A new validity function is defined in this algorithm to find an optimal cluster number automatically. • The simulation results verify that the algorithm is more complete and accurate. • Its higher computational complexity ultimately limits its performance. • In further research, we will focus on reducing the computational complexity with little loss of precision, and on obtaining the globally optimal solution.

  22. Please Ask Questions Thank you!
