A High-Performance Model based on Ensembles for Twitter Sentiment Classification

Asgarnezhad, R.; Monadjemi, A.; SoltanAghaei, M.

doi:10.22061/jecei.2020.7100.357

Document Type: Original Research Paper

Authors

¹ Department of Computer Engineering, Isfahan (Khorasgan) Branch, Islamic Azad University, Isfahan, Iran

² Senior Lecturer, School of Continuing and Lifelong Education, National University of Singapore, Singapore 119077; Faculty of Computer Engineering, University of Isfahan, Isfahan, Iran

Abstract

Background and Objectives: Twitter sentiment classification is one of the most popular fields in information retrieval and text mining. Millions of people around the world use social networks such as Twitter intensively. Twitter lets users publish tweets to say what they think about a topic. Numerous websites have been built to present Twitter content: a user can enter a sentiment target and search for tweets containing positive, negative, or neutral opinions. This helps consumers research products automatically before purchase.
Methods: This paper proposes a model for sentiment classification. The goal of the model is to investigate the role of n-grams and sampling techniques in sentiment classification, using an ensemble method on Twitter datasets. It examines both binary and multi-class classification, in which tweets are classified as positive, negative, or neutral.
Results: Twitter sentiment classification remains an open problem with very few free resources, and many of those are no longer available because of changed authorization status. With the exception of the dataset used in this work, Twitter datasets are not labeled and freely available. We show that combining ensemble methods, sampling techniques, and n-grams can improve the accuracy of Twitter sentiment classification.
Conclusion: The results confirm the superiority of the proposed model over state-of-the-art systems. The model obtained the highest results in terms of accuracy, precision, recall, and F-measure.
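The three ingredients named in the abstract (n-grams, sampling, and an ensemble) can be illustrated with a minimal Python sketch. The abstract does not say which n-gram range, sampling technique, or ensemble rule the authors used, so everything here is an assumption chosen for illustration: word unigrams and bigrams, random oversampling of minority classes, and simple majority voting. The function names `word_ngrams`, `oversample`, and `majority_vote` are hypothetical and are not taken from the paper.

```python
import random
from collections import Counter


def word_ngrams(text, n_max=2):
    """Return all word n-grams of length 1..n_max from a lowercased tweet.

    Assumption: whitespace tokenization and n_max=2 are illustrative choices.
    """
    tokens = text.lower().split()
    return [
        " ".join(tokens[i:i + n])
        for n in range(1, n_max + 1)
        for i in range(len(tokens) - n + 1)
    ]


def oversample(samples, labels, seed=0):
    """Random oversampling: duplicate minority-class items until every class
    has as many items as the largest class.

    Assumption: the paper's actual sampling technique may differ.
    """
    rng = random.Random(seed)
    by_class = {}
    for sample, label in zip(samples, labels):
        by_class.setdefault(label, []).append(sample)
    target = max(len(items) for items in by_class.values())

    out_samples, out_labels = [], []
    for label, items in by_class.items():
        extra = [rng.choice(items) for _ in range(target - len(items))]
        for sample in items + extra:
            out_samples.append(sample)
            out_labels.append(label)
    return out_samples, out_labels


def majority_vote(predictions):
    """Combine the labels predicted by several classifiers for one tweet.

    Ties go to the label that appears first in the input list.
    """
    return Counter(predictions).most_common(1)[0][0]
```

In a full pipeline, the n-grams would become features for several base classifiers trained on the rebalanced data, and `majority_vote` would merge their per-tweet predictions into one positive, negative, or neutral label.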

DOR: 20.1001.1.23223952.2020.8.1.5.4

Main Subjects

Text Classification

Open Access

This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit: http://creativecommons.org/licenses/by/4.0/

Publisher’s Note

JECEI Publisher remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Publisher

Shahid Rajaee Teacher Training University



Journal of Electrical and Computer Engineering Innovations (JECEI)


Volume 8, Issue 1
January 2020
Pages 41-52
