Document Type: Original Research Paper
Authors
1 Computer Engineering Department, Shahid Rajaee Teacher Training University, Tehran, Iran. & Faculty of Computer Science, Kabul Polytechnic University (KPU), Kabul, Afghanistan.
2 Computer Engineering Department, Shahid Rajaee Teacher Training University, Tehran, Iran.
Abstract
Background and Objectives: Discussion forums in Massive Open Online Courses (MOOCs) enable students to interact with instructors and share educational concerns. However, identifying urgent posts within the vast volume of discussions poses significant challenges. High dropout rates and the need for timely responses to urgent queries highlight the importance of efficient classification systems. This study addresses the binary classification of student posts in the Stanford MOOC Posts dataset into urgent and non-urgent categories, and aims to improve performance in the presence of class imbalance.
Methods: We propose a hybrid deep learning model that integrates BERT-based contextual embeddings with a Convolutional Neural Network (CNN) and Bidirectional Long Short-Term Memory (BiLSTM) architecture to capture both local textual features and long-term dependencies. To mitigate class imbalance, a BERT-based data augmentation technique was employed to enrich minority-class samples, enhancing model generalization and urgent post detection. The model’s performance was compared against baseline methods, including CNN, LSTM, BiLSTM, and other state-of-the-art models. Evaluation metrics such as the F1-weighted score and class-specific F1 scores were used to assess effectiveness.
Results: The model achieved a 93.3% F1-weighted score and an 84.1% F1 score for the urgent class, surpassing the best-performing state-of-the-art model by 0.6% and 0.8%, respectively. The results demonstrate the effectiveness of the augmentation and the hybrid architecture in identifying urgent posts, particularly within imbalanced datasets.
Conclusion: This research introduces a scalable and effective framework for urgent post detection in MOOCs. By leveraging BERT-based augmentation and a CNN–BiLSTM hybrid architecture that integrates contextual and sequential learning, the study enables automated forum analysis and offers timely insights for instructors and course designers to enhance student support, engagement, and retention.
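The abstract reports an F1-weighted score and a class-specific F1 score for the urgent class. As a minimal illustration of how these two metrics relate on an imbalanced binary task, the following sketch computes both from scratch. The function names, the label encoding (1 = urgent, 0 = non-urgent), and the toy labels are assumptions made for illustration only; they are not taken from the paper or from the Stanford MOOC Posts dataset, and this is not the authors' evaluation code.

```python
from collections import Counter


def per_class_f1(y_true, y_pred, label):
    """F1 score for a single class, treating `label` as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == label and p == label)
    fp = sum(1 for t, p in pairs if t != label and p == label)
    fn = sum(1 for t, p in pairs if t == label and p != label)
    if tp == 0:
        # No true positives: precision and recall are both 0 (or undefined).
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def weighted_f1(y_true, y_pred):
    """Average of per-class F1 scores, weighted by each class's support."""
    support = Counter(y_true)  # number of true samples per class
    total = len(y_true)
    return sum(
        per_class_f1(y_true, y_pred, label) * count / total
        for label, count in support.items()
    )


# Hypothetical toy data: 2 urgent posts (1) and 8 non-urgent posts (0).
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
y_pred = [1, 0, 0, 0, 0, 0, 0, 0, 0, 1]

urgent_f1 = per_class_f1(y_true, y_pred, label=1)
non_urgent_f1 = per_class_f1(y_true, y_pred, label=0)
overall = weighted_f1(y_true, y_pred)

print(f"urgent F1:     {urgent_f1:.3f}")
print(f"non-urgent F1: {non_urgent_f1:.3f}")
print(f"weighted F1:   {overall:.3f}")
```

In this toy case the weighted score is dominated by the majority (non-urgent) class, which is why a class-specific F1 for the urgent class is a useful companion metric when classes are imbalanced.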
Keywords
Main Subjects
Open Access
This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit: http://creativecommons.org/licenses/by/4.0/
Publisher’s Note
JECEI Publisher remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Publisher
Shahid Rajaee Teacher Training University