Document Type: Original Research Paper


Faculty of Computer Engineering, Shahid Rajaee Teacher Training University, Tehran, Iran



Background: Predicting students' academic performance is essential for educational systems that aim to improve student success. Accurate predictions can substantially improve the quality of teaching and learning. By applying data mining, useful and novel patterns can be extracted from educational data.
Methods: In this paper, a new metaheuristic algorithm that combines simulated annealing and genetic algorithms is proposed for predicting students’ academic performance in educational data mining. Although metaheuristic algorithms are among the best options for discovering hidden relationships in data, neither of these two algorithms, used on its own, predicts students’ academic performance accurately. The proposed method therefore integrates the advantages of both: the genetic algorithm is applied to explore new solutions, while simulated annealing is used to increase the exploitation power. With this combination, the proposed algorithm predicts students’ academic performance with high accuracy.
Results: The efficiency of the proposed algorithm is evaluated on five educational data sets: two data sets of students of Shahid Rajaee University of Tehran and three online educational data sets. Our experimental results show  and   accuracy improvement of the proposed algorithm compared with four similar metaheuristic methods and five popular classification methods, respectively.
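The division of labour described in the Methods paragraph, with the genetic algorithm exploring and simulated annealing exploiting, can be illustrated with a minimal Python sketch. This is not the authors' implementation: the bit-string encoding, the toy `fitness` function (agreement with a hidden target), the one-point crossover, the geometric cooling schedule, and all parameter values are assumptions chosen only to make the example self-contained.

```python
import math
import random


def fitness(bits, target):
    """Toy fitness (an assumption): fraction of positions matching a target."""
    return sum(b == t for b, t in zip(bits, target)) / len(target)


def anneal(bits, target, rng, temp=1.0, cooling=0.9, steps=30):
    """Simulated annealing used as local refinement (exploitation)."""
    current, cur_fit = bits[:], fitness(bits, target)
    best, best_fit = current[:], cur_fit
    for _ in range(steps):
        neighbor = current[:]
        i = rng.randrange(len(neighbor))
        neighbor[i] = 1 - neighbor[i]  # flip one randomly chosen bit
        n_fit = fitness(neighbor, target)
        delta = n_fit - cur_fit
        # Accept improvements always; accept worse moves with probability
        # exp(delta / temp), which shrinks as the temperature cools.
        if delta >= 0 or rng.random() < math.exp(delta / temp):
            current, cur_fit = neighbor, n_fit
            if cur_fit > best_fit:
                best, best_fit = current[:], cur_fit
        temp *= cooling  # geometric cooling (hypothetical schedule)
    return best


def hybrid_ga_sa(target, pop_size=20, generations=40, mutation_rate=0.05, seed=0):
    """Genetic algorithm (exploration) whose offspring are refined by SA."""
    rng = random.Random(seed)
    n = len(target)
    pop = [[rng.randint(0, 1) for _ in range(n)] for _ in range(pop_size)]
    for _ in range(generations):
        # Selection: keep the fitter half of the population unchanged.
        pop.sort(key=lambda ind: fitness(ind, target), reverse=True)
        elite = pop[: pop_size // 2]
        children = []
        while len(elite) + len(children) < pop_size:
            a, b = rng.sample(elite, 2)
            cut = rng.randrange(1, n)  # one-point crossover
            child = a[:cut] + b[cut:]
            child = [1 - g if rng.random() < mutation_rate else g for g in child]
            children.append(anneal(child, target, rng))  # SA refinement
        pop = elite + children
    return max(pop, key=lambda ind: fitness(ind, target))


if __name__ == "__main__":
    hidden = [1, 0, 1, 1, 0, 0, 1, 0, 1, 1, 1, 0, 0, 1, 0, 1]
    solution = hybrid_ga_sa(hidden, seed=1)
    print(solution, fitness(solution, hidden))
```

In a prediction setting, the bit string could stand for a candidate model or feature subset and `fitness` for validation accuracy, but that mapping is likewise an assumption, since the abstract does not specify it.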


