Document Type: Original Research Paper

Authors

1 Faculty of Computer Engineering, Shahid Rajaee Teacher Training University, Tehran, Iran.

2 Faculty of Computer Engineering, K. N. Toosi University of Technology, Tehran, Iran.

10.22061/jecei.2020.7190.366

Abstract

Background and Objectives: Simplicity and flexibility are the two basic features of graph models, and they have made graphs practical models for real-life problems. Attributed graphs are popular among researchers because of their efficiency and functionality. An attributed graph is a graph whose nodes and edges can carry attributes. The nodes and edges form the structural dimension and their attributes form the contextual dimension, which makes such graphs more flexible in modeling real problems.
Methods: In this study, a new clustering algorithm based on K-Medoid is proposed. It considers the graph's structural dimension, through a heat diffusion algorithm, and its contextual dimension, through the weighted Jaccard coefficient, simultaneously. The clusters produced by the proposed algorithm are denser and contain nodes with more similar attributes.
Results: The DBLP and PBLOG real data sets are used to evaluate this algorithm and to compare it with recent and well-known clustering algorithms.
Conclusion: The results indicate that this algorithm outperforms its counterparts in terms of structural quality, contextual quality of the clusters, and time complexity.
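
The combination described under Methods can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the exact formulation, so the discrete diffusion update, the kernel normalisation, the mixing weight `beta`, the step size `alpha`, the number of `steps`, and all function names below are assumptions chosen for illustration. The sketch scores structural similarity by simulated heat diffusion, scores contextual similarity by the weighted Jaccard coefficient, blends the two into one distance, and clusters with a plain K-Medoid iteration.

```python
import math


def weighted_jaccard(a, b):
    """Weighted Jaccard similarity of two non-negative attribute vectors."""
    num = sum(min(x, y) for x, y in zip(a, b))
    den = sum(max(x, y) for x, y in zip(a, b))
    return num / den if den > 0 else 0.0


def heat_kernel(adj, steps=20, alpha=0.1):
    """Row i holds the heat at every node after diffusing one unit from node i.

    Assumed update: h <- h - alpha * L h, with L = D - A (graph Laplacian).
    alpha must be small relative to the maximum degree for stability.
    """
    n = len(adj)
    rows = []
    for source in range(n):
        heat = [0.0] * n
        heat[source] = 1.0
        for _ in range(steps):
            heat = [heat[i] + alpha * sum(heat[j] - heat[i] for j in adj[i])
                    for i in range(n)]
        rows.append(heat)
    return rows


def distance_matrix(adj, attrs, beta=0.5):
    """Blend structural and contextual similarity into a single distance."""
    n = len(adj)
    k = heat_kernel(adj)
    dist = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            # Normalise the kernel so structural similarity lies in [0, 1].
            structural = k[i][j] / math.sqrt(k[i][i] * k[j][j])
            contextual = weighted_jaccard(attrs[i], attrs[j])
            d = 1.0 - (beta * structural + (1.0 - beta) * contextual)
            dist[i][j] = dist[j][i] = d
    return dist


def k_medoids(dist, medoids, max_iter=100):
    """Alternate between assigning nodes and re-picking each cluster's medoid."""
    n = len(dist)
    medoids = list(medoids)
    labels = []
    for _ in range(max_iter):
        labels = [min(range(len(medoids)), key=lambda c: dist[i][medoids[c]])
                  for i in range(n)]
        updated = []
        for c, old in enumerate(medoids):
            members = [i for i in range(n) if labels[i] == c]
            if members:
                updated.append(min(members,
                                   key=lambda m: sum(dist[m][x] for x in members)))
            else:
                updated.append(old)
        if updated == medoids:
            break
        medoids = updated
    return labels, medoids


# Toy attributed graph: two triangles joined by the edge 2-3.
adj = [[1, 2], [0, 2], [0, 1, 3], [2, 4, 5], [3, 5], [3, 4]]
attrs = [[1.0, 0.0]] * 3 + [[0.0, 1.0]] * 3
labels, medoids = k_medoids(distance_matrix(adj, attrs), medoids=[0, 5])
print(labels, medoids)
```

In this toy graph each triangle is both densely connected and homogeneous in its attributes, so the two dimensions agree. Lowering `beta` shifts the weight toward the attributes, and raising it shifts the weight toward the structure.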

