Data Mining
R. Asgarnezhad; A. Monadjemi; M. SoltanAghaei
Abstract
Background and Objectives: With the widespread use of web applications, review sentiment classification has attracted increasing interest in text mining research. Traditional approaches did not capture the multiple relationships connecting words while emphasizing the preprocessing phase and data reduction techniques, which make a large difference in classification performance. Methods: This study proposes an efficient model for sentiment classification that combines preprocessing techniques, sampling methods, feature selection methods, and ensemble supervised classification to increase classification performance. In the feature selection phase of the proposed model, we applied n-grams, a computational method, to optimize the feature selection procedure by extracting features based on the relationships between words. Then, the best features were selected through the particle swarm optimization algorithm, which iteratively tries to improve the feature selection. Results: In the experimental study, a comprehensive range of comparative experiments was conducted to assess the effectiveness of the proposed model against the best methods in the literature on Twitter datasets. The highest performance of the proposed model reaches 97.33%, 92.61%, 97.16%, and 96.23% in terms of precision, accuracy, recall, and f-measure, respectively. Conclusion: The proposed model classifies the sentiment of tweets and online reviews through ensemble methods. In addition, two sampling techniques were applied in the preprocessing phase. The results confirmed the superiority of the proposed model over state-of-the-art systems.
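As an illustration of the n-gram step mentioned in the Methods, the following is a minimal sketch of word n-gram extraction, assuming whitespace-tokenized text. The function name `word_ngrams` and the sample sentence are hypothetical and are not taken from the paper; the authors' actual preprocessing and the particle swarm optimization step are not reproduced here.

```python
from typing import List, Sequence, Tuple


def word_ngrams(tokens: Sequence[str], n: int) -> List[Tuple[str, ...]]:
    """Return all contiguous word n-grams of length n, in order of appearance."""
    if n < 1:
        raise ValueError("n must be at least 1")
    # Slide a window of width n over the token sequence.
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


# Hypothetical example sentence (not from the paper's datasets).
tokens = "not a good movie".split()
bigrams = word_ngrams(tokens, 2)
print(bigrams)
```

Unlike single words, a bigram such as ("not", "a") or ("good", "movie") keeps adjacent words together, which is one way a feature can reflect relationships between words.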
Computational Intelligence
Zeinab Khatoun Pourtaheri
Abstract
Background and Objectives: Given the random nature of heuristic algorithms, stability analysis of heuristic ensemble classifiers is of particular importance. Methods: The novelty of this paper is the use, for the first time, of a statistical method consisting of the Plackett-Burman design and the Taguchi method to specify not only the important parameters but also their optimal levels. The Minitab and Design Expert software programs are used to achieve the stability goals of this research. Results: The proposed approach is useful as a preprocessing method before employing heuristic ensemble classifiers; i.e., first discover the optimal levels of the important parameters and then apply these parameters to heuristic ensemble classifiers to attain the best results. Another significant difference between this research and previous works on stability analysis is the definition of the response variable; an average of three criteria of the Pareto front is used as the response variable. Finally, to clarify the performance of this method, the obtained optimal levels are applied to a typical multi-objective heuristic ensemble classifier, and its results are compared with the results of using empirical values; the results indicate improvements with the proposed method. Conclusion: This approach can analyze more parameters at lower computational cost than previous works. This capability is one of the advantages of the proposed method.
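To give a sense of what a Plackett-Burman screening design looks like, the sketch below builds the standard 12-run, 11-factor two-level design from its usual cyclic generator row. This is only an illustration: the abstract does not state how many runs or factors were used, the paper relied on Minitab and Design Expert rather than custom code, and the names `GENERATOR_12` and `plackett_burman_12` are hypothetical.

```python
from typing import List

# Standard generator row for the 12-run Plackett-Burman design
# (+1 = high level of a parameter, -1 = low level).
GENERATOR_12 = [+1, +1, -1, +1, +1, +1, -1, -1, -1, +1, -1]


def plackett_burman_12() -> List[List[int]]:
    """Build a 12-run x 11-factor design: 11 cyclic shifts plus an all-low run."""
    n = len(GENERATOR_12)
    # Each row is the generator shifted cyclically to the right by `shift`.
    rows = [GENERATOR_12[n - shift:] + GENERATOR_12[:n - shift] for shift in range(n)]
    # The final run sets every factor to its low level.
    rows.append([-1] * n)
    return rows


design = plackett_burman_12()
for run in design:
    print(" ".join(f"{level:+d}" for level in run))
```

Each column corresponds to one parameter and each row to one experimental run. The columns are balanced and mutually orthogonal, which is why the main effects of up to 11 parameters can be screened in only 12 runs instead of the 2^11 runs of a full factorial design.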