Link Prediction using Network Embedding based on Global Similarity

Mirmousavi, S. F.; Kianian, S.

doi:10.22061/jecei.2020.7135.359

Document Type : Original Research Paper

Authors

Faculty of Computer Engineering , Shahid Rajaee Teacher Training University,Tehran, Iran

https://doi.org/10.22061/jecei.2020.7135.359

Abstract

Background: The link prediction issue is one of the most widely used problems in complex network analysis. Link prediction requires knowing the background of previous link connections and combining them with available information. The link prediction local approaches with node structure objectives are fast in case of speed but are not accurate enough. On the other hand, the global link prediction methods identify all path structures in a network and can determine the similarity degree between graph-extracted entities with high accuracy but are time-consuming instead. Most existing algorithms are only using one type of feature (global or local) to represent data, which not well described due to the large scale and heterogeneity of complex networks.
Methods: In this paper, a new method presented for Link Prediction using node embedding due to the high dimensions of real-world networks. The proposed method extracts a smaller model of the input network by getting help from the deep neural network and combining global and local nodes in a way to preserve the network's information and features to the desired extent. First, the feature vector is being extracted by an encoder-decoder for each node, which is a suitable tool for modeling complex nonlinear phenomena. Secondly, both global and local information concurrently used to improve the loss function. More obvious, the clustering similarity threshold considered as the local criterion and the transitive node similarity measure used to exploit the global features. To the end, the accuracy of the link prediction algorithm increased by designing the optimization operation accurately.
Results: The proposed method applied to 4 datasets named Cora, Wikipedia, Blog catalog, Drug-drug-interaction, and the results are compared with laplacian, Node2vec, and GAE methods. Experimental results show an average accuracy achievement of 0.620, 0.723, 0.875, and 0.845 on the mentioned datasets, and confirm that the link prediction can effectively improve the prediction performance using network embedding based on global similarity.

Keywords

20.1001.1.23223952.2020.8.1.9.8

Main Subjects

Artificial Intelligence

Open Access

This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit: http://creativecommons.org/licenses/by/4.0/

Publisher’s Note

JECEI Publisher remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Publisher

Shahid Rajaee Teacher Training University

References

[1] G. Tsoumakas, I. Katakis, “Multi-label classification: An overview,” International Journal of Data Warehousing and Mining (IJDWM), 3(3): 1-13, 2007.

[2] L. Lü, T. Zhou, “Link prediction in complex networks: A survey,” Physica A: statistical mechanics and its applications, 390(6), 2011.

[3] S. Fortunato, Community detection in graphs. Physics reports, 2010.

[4] L. Lü, M. Medo, C.H. Yeung, Y.C. Zhang, Z.K. Zhang, T. Zhou, Recommender systems. Physics reports, 519(1): 1-49, 2012.

[5] S. Wang, Q.Wang, M. Gong. "Multi-Task Learning Based Network Embedding," Frontiers in Neuroscience, 2020.

[6] E.Airoldi “Mixed membership block models for relational data with application to protein-protein interactions in Proc. International Biometric Society-ENAR Annual Meetings, 2006.

[7] Z. Huang, X. Li, H. Chen, “Link prediction approach to collaborative filtering,” in Proc.5th ACM/IEEE-CS Joint Conference on Digital Libraries, NK,: 141–142.

[8] M. Al Hasan, V. Chaoji, S. Salem, M. Zaki, Link prediction using supervised learning. In SDM06: workshop on link analysis, counter-terrorism and security 30: 798-805, 2006.

[9] Z. Huang, D.K. Lin, “The time-series link prediction problem with applications in communication surveillance,” INFORMS Journal on Computing, 21(2): 286-303, 2009.

[10] L. A. Adamic, E. Adar, Friends and neighbors on the web, Social 1networks, 25(3):230 , 2003.

[11] P. Jaccard, Etude comparative de la distribution florale dans une portiondes Alpes et du Jura, Impr. Corbaz, 1901.

[12] A. Clauset, C. Moore, M.E. Newman, “Hierarchical structure and the prediction of missing links in networks,” Nature, 453: 98-101, 2008.

[13] H. C. White, S. A. Boorman, R. L. Breiger, “Social structure from multiple networks,” i. blockmodels of roles and positions, America journal of sociology780.

[14] N. Friedman, L. Getoor, D. Koller, A. Pfeffer, “Learning probabilistic relational models,” In IJCAI, 99: 1300-1309, 1999.

[15] D. Heckerman, C. Meek, D. Koller, “Probabilistic entity-relationship models, PRMs, and plate models,” Introduction to statistical relational learning, 201-238, 2007.

[16] P. Goyal, E. Ferrara, "Graph embedding techniques, applications, and performance: A survey, “Knowledge-Based Systems, 151:78-94, 2018.

[17] W. L. Hamilton, R. Ying, J. Leskovec, Representation learning on graphs: Methods and applications, “arXiv preprint,” arXiv:1709.05584, 2017.‏

[18] X. Yue, et al., "Graph embedding on biomedical networks: methods, applications and evaluations," Bioinformatics, 36(4): 1241-1251, 2020.‏

[19] A. Ahmed, N. Shervashidze, S. Narayanamurthy, V. Josifovski, A. J. Smola, “Distributed large-scale natural graph factorization,” in: Proce 22nd international conference on World Wide Web, ACM, :37–48, 2013.

[20] D. Wang, P. Cui, W. Zhu, “Structural deep network embedding,” in Proce. 22nd International Conference on Knowledge Discovery and Data Mining, ACM,:1225–12, 2016.

[21] C. Hongyun, V.W. Zheng, K. Chen-Chuan Chang. "A comprehensive survey of graph embedding: Problems, techniques, and applications." IEEE Transactions on Knowledge and Data Engineering 3(2018):1616-1637, 2018.

[22] Ou, Mingdong, et al., "Asymmetric transitivity preserving graph embedding,” in Proce. 22nd ACM SIGKDD international conference on Knowledge discovery and data mining. 2016.‏

[23] B. Perozzi, R. Al-Rfou, S. Skiena, "August. Deepwalk: "Online Learning Of Social Representations,” In Proce 20th ACM SIGKDD international conference on Knowledge discovery and data mining,: 701-710, 2014.

[24] A. Grover, J. Leskovec, “node2vec: Scalable feature learning for networks,” in Proce 22nd ACM SIGKDD international conference on Knowledge discovery and data mining, 2016.‏

[25] C. Peng, et al., “A survey on network embedding,” IEEE Transactions on Knowledge and Data Engineering, 31(5): 833-852, 2018.

[26] J. Tang, M. Qu, M. Wang, M. Zhang, J. Yan, Q. Mei, “Line: Large-scale Information Network Embedding,” in Proce. 24th International Conference on World Wide Web,: 1067-1077, 2015.

[27] T.N. Kipf, M. Welling, "Semi-supervised Classification With Graph Convolutional networks" arXiv,: 1-11, 2016.

[28] Y. Wang, Y. Hongxun, S. Zhao, "Auto-encoder based dimensionality reduction," Neurocomputing, 184: 232-242, 2016.

[29] A. Papadimitriou, “Panagiotis Symeonidis, and Yannis Manolopoulos,” Friendlink: link prediction in social networks via bounded local path traversal,” 2011 International Conference on Computational Aspects of Social Networks (CASoN). IEEE, 2011.

[30] C. Xing, et al., "The application of degree related clustering coefficient in estimating the link predictability and predicting missing links of networks," Chaos: An Interdisciplinary Journal of NonlinearScience, 29(5):053135, 2019.

[31] S. Panagiotis, E. Tiakas, Y. Manoulos, “Transitive node similarity for link prediction in social networks with positive and negative links,” in Proce. The fourth ACM conference on Recommender systems, 2010.

[32] M. Belkin, P. Niyogi, “Laplacian eigenmaps and spectral techniques for embedding and clustering,” In Advances in neural information processing systems,: 585591, 2002.

[33] T.N. Kipf, M. Welling, “Variational graph auto-encoders,” arXiv preprint arXiv:1611.07308, 2016.

[34] L. Tang, H. Liu, “Relational learning via latent social dimensions,” In Proce 15th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, :817–826,Paris,France,Jun.2009.

[35] http://www.mattmahoney.net/dc/textdata

[36] T. Yang, R. Jin, Y. Chi, Z. Shenghuo, “Combining link and content for community detection: a discriminative approach,” In 15th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining,: 927–936,Paris, France, Jun.2009.

[37] D.S. Wishart, Y.D. Feunang, A.C. Guo, et al., “Drugbank 5.0: a major update to the drugbank database for 2018,” Nucleic acids research, 46(D1), D1074–D1082, 2017.

LETTERS TO EDITOR

Journal of Electrical and Computer Engineering Innovations (JECEI) welcomes letters to the editor for the post-publication discussions and corrections which allows debate post publication on its site, through the Letters to Editor. Letters pertaining to manuscript published in JECEI should be sent to the editorial office of JECEI within three months of either online publication or before printed publication, except for critiques of original research. Following points are to be considering before sending the letters (comments) to the editor.

[1] Letters that include statements of statistics, facts, research, or theories should include appropriate references, although more than three are discouraged.

[2] Letters that are personal attacks on an author rather than thoughtful criticism of the author’s ideas will not be considered for publication.

[3] Letters can be no more than 300 words in length.

[4] Letter writers should include a statement at the beginning of the letter stating that it is being submitted either for publication or not.

[5] Anonymous letters will not be considered.

[6] Letter writers must include their city and state of residence or work.

[7] Letters will be edited for clarity and length.

Name *

Email Address *

Affiliation *

Comments *

Security Code *

Journal of Electrical and Computer Engineering Innovations (JECEI)

Link Prediction using Network Embedding based on Global Similarity

References

References

Send comment about this article

Volume 8, Issue 1
January 2020
Pages 97-108

Link Prediction using Network Embedding based on Global Similarity

References

References

Send comment about this article

Volume 8, Issue 1January 2020Pages 97-108

Volume 8, Issue 1
January 2020
Pages 97-108