Document Type: Original Research Paper


1 Department of Computer Engineering, Isfahan (Khorasgan) Branch, Islamic Azad University, Isfahan, Iran

2 Senior Lecturer, School of Continuing and Lifelong Education, National University of Singapore, Singapore, 119077; Faculty of Computer Engineering, University of Isfahan, Isfahan, Iran



Background and Objectives: Twitter sentiment classification is one of the most popular fields in information retrieval and text mining. Millions of people around the world use social networks such as Twitter intensively. Twitter lets users publish tweets that express what they think about a topic, and numerous websites present Twitter content. A user can enter a sentiment target and search for tweets containing positive, negative, or neutral opinions. This is valuable for consumers who want to investigate products automatically before purchase.
Methods: This paper proposes a model for sentiment classification. The goal of the model is to investigate the role of n-grams and sampling techniques in sentiment classification using an ensemble method on Twitter datasets. It also examines both binary and multi-class classification, in which tweets are classified into positive, negative, or neutral classes.
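The three ingredients named in the Methods paragraph (n-grams, sampling, and an ensemble) can be illustrated with a minimal sketch. The abstract does not state which sampling technique or ensemble rule the model uses, so random oversampling and simple majority voting are assumptions chosen for illustration only, and all function names are hypothetical.

```python
import random
from collections import Counter


def word_ngrams(text, n):
    """Return the word n-grams (as tuples) of a lowercased, whitespace-tokenized text."""
    tokens = text.lower().split()
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def random_oversample(samples, labels, seed=0):
    """Assumed sampling step: duplicate randomly chosen items of the smaller
    classes until every class is as large as the largest one."""
    rng = random.Random(seed)
    by_class = {}
    for sample, label in zip(samples, labels):
        by_class.setdefault(label, []).append(sample)
    target = max(len(items) for items in by_class.values())
    out_samples, out_labels = [], []
    for label, items in by_class.items():
        extra = [rng.choice(items) for _ in range(target - len(items))]
        for sample in items + extra:
            out_samples.append(sample)
            out_labels.append(label)
    return out_samples, out_labels


def majority_vote(predictions):
    """Assumed ensemble rule: return the label predicted by most classifiers.
    On a tie, the label that appears first in `predictions` wins."""
    return Counter(predictions).most_common(1)[0][0]


# Toy data, invented for this example only.
tweets = ["I love this phone", "worst battery ever", "love the screen"]
labels = ["positive", "negative", "positive"]

bigrams = word_ngrams(tweets[0], 2)
balanced_tweets, balanced_labels = random_oversample(tweets, labels)
vote = majority_vote(["positive", "negative", "positive"])
```

In a full pipeline, the n-grams would feed a feature vector for each base classifier, the oversampling would be applied to the training split only, and the vote would combine the base classifiers' outputs for each tweet.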
Results: Twitter classification remains an open problem. Very few free resources exist for it, and some are no longer available because their authorization status has changed. Apart from the dataset used in this study, Twitter datasets are not both labeled and freely available. We show that combining ensemble methods, sampling techniques, and n-grams can improve the accuracy of Twitter sentiment classification.
Conclusion: The results confirmed the superiority of the proposed model over state-of-the-art systems. The model obtained the highest results in terms of accuracy, precision, recall, and F-measure.
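For readers unfamiliar with the four evaluation measures named above, the following sketch computes them for a single target class using their standard definitions. The function name and the toy labels are hypothetical, and the abstract does not specify how the paper averages these measures across classes.

```python
def binary_metrics(y_true, y_pred, positive="positive"):
    """Compute accuracy, precision, recall, and F-measure for one target class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)  # true positives
    fp = sum(t != positive and p == positive for t, p in pairs)  # false positives
    fn = sum(t == positive and p != positive for t, p in pairs)  # false negatives
    correct = sum(t == p for t, p in pairs)

    accuracy = correct / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    # F-measure is the harmonic mean of precision and recall.
    if precision + recall:
        f_measure = 2 * precision * recall / (precision + recall)
    else:
        f_measure = 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f_measure": f_measure,
    }


# Toy labels, invented for this example only.
y_true = ["positive", "positive", "negative", "negative"]
y_pred = ["positive", "negative", "negative", "positive"]
scores = binary_metrics(y_true, y_pred)
```

For the three-class setting (positive, negative, neutral), the same function can be called once per class and the per-class values averaged.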

