Parallel and Exact Method for Solving n-Similarity Problem

Mirhosseini, M.; Fazlali, M.

doi:10.22061/jecei.2020.7247.377

Document Type : Original Research Paper

Authors

Department of Data and Computer Science, Faculty of Mathematical Sciences, Shahid Beheshti University, Tehran, Iran.

Abstract

Background and Objectives: The n-similarity problem is defined as measuring the similarity among objects and finding the group of objects in a dataset that are most similar to each other. This problem has become an important issue in information retrieval and data mining. The theory behind this concept is mathematically proven, but in practice it has high memory complexity and is very time consuming. Moreover, the solutions found by metaheuristics are not exact.
Methods: This paper proposes an exact method for solving the n-similarity problem that reduces the memory complexity and decreases the execution time through parallelism using OpenMP. The experiments are performed on the application of text document resemblance.
Results: The memory complexity of the proposed method is shown to decrease to , and the experimental results show that the method speeds up the computations by a factor of about 5.
Conclusion: The simulation results of the proposed method show a good improvement in speed, memory usage, and scalability compared with the previous exact method.
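
To make the idea of an exact, OpenMP-parallel search concrete, the following is a minimal sketch and not the authors' algorithm. It assumes that the similarity of a group is the sum of its pairwise similarities (the abstract does not give the paper's definition), and the names `BestGroup`, `extend`, and `most_similar_group` are hypothetical. Groups are enumerated on the fly instead of being stored, so apart from the pairwise matrix each thread only keeps the current group and its best result so far.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>
#include <vector>

// Hypothetical result type: the best group found and its score.
struct BestGroup {
    double score = -std::numeric_limits<double>::infinity();
    std::vector<int> members;
};

// Prefer a higher score; break ties by lexicographic order so the
// result does not depend on how threads are scheduled.
static bool better(double score, const std::vector<int>& members,
                   const BestGroup& best) {
    return score > best.score ||
           (score == best.score && members < best.members);
}

// Depth-first extension of `group` with indices larger than its last member.
// ASSUMPTION: group score = sum of pairwise similarities, updated incrementally.
static void extend(const std::vector<std::vector<double>>& sim, std::size_t n,
                   std::vector<int>& group, double score, BestGroup& best) {
    if (group.size() == n) {
        if (better(score, group, best)) {
            best.score = score;
            best.members = group;
        }
        return;
    }
    const int m = static_cast<int>(sim.size());
    const int remaining = static_cast<int>(n - group.size());
    // Stop early enough that the remaining slots can still be filled.
    for (int next = group.back() + 1; next <= m - remaining; ++next) {
        double gain = 0.0;
        for (int member : group) gain += sim[member][next];
        group.push_back(next);
        extend(sim, n, group, score + gain, best);
        group.pop_back();
    }
}

// Exhaustive (exact) search over all groups of size n.
// Returns an empty group if n is 0 or larger than the dataset.
BestGroup most_similar_group(const std::vector<std::vector<double>>& sim,
                             std::size_t n) {
    BestGroup best;
    if (n == 0 || n > sim.size()) return best;
    const int last_first = static_cast<int>(sim.size()) - static_cast<int>(n);
    #pragma omp parallel
    {
        BestGroup local;  // per-thread best, merged once at the end
        #pragma omp for schedule(dynamic) nowait
        for (int first = 0; first <= last_first; ++first) {
            std::vector<int> group{first};
            extend(sim, n, group, 0.0, local);
        }
        #pragma omp critical
        {
            if (better(local.score, local.members, best)) best = local;
        }
    }
    return best;
}
```

With GCC or Clang, compiling with `-fopenmp` enables the pragmas; without that flag they are ignored and the same search runs serially. The dynamic schedule is a design choice for this sketch: subtrees rooted at small first indices contain more groups than later ones, so static chunks would be unbalanced.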

Keywords

20.1001.1.23223952.2020.8.2.5.6

Main Subjects

Classification

Open Access

This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit: http://creativecommons.org/licenses/by/4.0/

Publisher’s Note

JECEI Publisher remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Publisher

Shahid Rajaee Teacher Training University

Journal of Electrical and Computer Engineering Innovations (JECEI)

Volume 8, Issue 2
July 2020
Pages 193-200
