A New Clustering Algorithm for Attributive Graphs through Information Diffusion Approaches

Kianian, S.; Farzi, S.; Samak, H.

doi:10.22061/jecei.2020.7190.366

Document Type : Original Research Paper

Authors

¹ Faculty of Computer Engineering, Shahid Rajaee Teacher Training University, Tehran, Iran.

² Faculty of Computer Engineering, K. N. Toosi University of Technology, Tehran, Iran.

https://doi.org/10.22061/jecei.2020.7190.366

Abstract

Background and Objectives: Simplicity and flexibility are the two basic features of graph models, and they have made graphs practical models for real-life problems. Attributive graphs are popular among researchers because of their efficiency and functionality. An attributive graph is a graph whose nodes and edges can carry attributes. Nodes and edges form the structural dimension and their attributes form the contextual dimension, which makes graphs more flexible in modeling real problems.
Methods: In this study, a new clustering algorithm based on K-Medoid is proposed. It addresses the graph's structural dimension through a heat diffusion algorithm and the contextual dimension through the weighted Jaccard coefficient, and it considers both dimensions simultaneously. The clusters produced by the proposed algorithm are denser and contain nodes with more similar attributes.
Results: The DBLP and PBLOG real data sets are used to evaluate this algorithm and to compare it with recent and well-known clustering algorithms.
Conclusion: The results indicate that this algorithm outperforms its counterparts in terms of structural quality, contextual quality of the clusters, and time complexity.
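
The two ingredients named in the Methods paragraph can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the toy graph, the attribute weights, the parameters alpha and steps, and the choice of an explicit Euler approximation of heat diffusion are all assumptions made for illustration. The weighted Jaccard coefficient is taken in its standard form, the sum of element-wise minima divided by the sum of element-wise maxima.

```python
def weighted_jaccard(x, y):
    """Weighted Jaccard coefficient of two attribute-weight dictionaries.

    Standard definition: sum of element-wise minima over sum of maxima.
    Attributes missing from one dictionary count as weight 0.
    """
    keys = set(x) | set(y)
    num = sum(min(x.get(k, 0.0), y.get(k, 0.0)) for k in keys)
    den = sum(max(x.get(k, 0.0), y.get(k, 0.0)) for k in keys)
    return num / den if den > 0 else 0.0


def diffuse_heat(adj, source, alpha=0.5, steps=20):
    """Approximate heat diffusion on an undirected graph (illustrative only).

    Uses explicit Euler steps for df/dt = (A - D) f over total time alpha,
    where A is the adjacency matrix and D the degree matrix. One unit of
    heat starts at `source`. alpha and steps are hypothetical parameters.
    """
    heat = {v: 0.0 for v in adj}
    heat[source] = 1.0
    dt = alpha / steps
    for _ in range(steps):
        updated = {}
        for v in adj:
            # Net heat flowing into v from its neighbours in this step.
            flow = sum(heat[u] - heat[v] for u in adj[v])
            updated[v] = heat[v] + dt * flow
        heat = updated
    return heat


# Hypothetical toy graph given as adjacency lists (undirected).
adj = {
    "a": ["b", "c"],
    "b": ["a", "c"],
    "c": ["a", "b", "d"],
    "d": ["c", "e"],
    "e": ["d"],
}

# Hypothetical attribute weights for two nodes.
attrs_a = {"db": 2.0, "ml": 1.0}
attrs_b = {"db": 1.0, "ml": 1.0, "ir": 1.0}

heat = diffuse_heat(adj, "a")
contextual = weighted_jaccard(attrs_a, attrs_b)  # minima sum to 2, maxima to 4
print(heat)
print(contextual)
```

In this sketch, the heat a node receives from a source acts as a structural closeness score, and the weighted Jaccard value acts as a contextual similarity score. How the two are combined inside the K-Medoid procedure is not stated in the abstract, so no combination rule is shown here.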

Keywords

20.1001.1.23223952.2020.8.2.11.2

Main Subjects

Artificial Intelligence

Open Access

This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit: http://creativecommons.org/licenses/by/4.0/

Publisher’s Note

JECEI Publisher remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Publisher

Shahid Rajaee Teacher Training University

Journal of Electrical and Computer Engineering Innovations (JECEI)

Volume 8, Issue 2
July 2020
Pages 273-284
