ECSYAPPS – A framework for analyzing the effectiveness of classification techniques for early prediction of students academic performance in education sector.

Bibliographic Details
Title: ECSYAPPS – A framework for analyzing the effectiveness of classification techniques for early prediction of students academic performance in education sector.
Authors: Dol, Sunita M.1 (AUTHOR) sunita_aher@yahoo.com, Jawandhiya, Pradip M.2 (AUTHOR) pmjawandhiya@gmail.com
Source: Engineering Applications of Artificial Intelligence. Aug 2024, Vol. 134, pN.PAG-N.PAG. 1p.
Subjects: Naive Bayes classification, Decision trees, Boosting algorithms, Fisher discriminant analysis, Academic achievement, Support vector machines, Classification algorithms, K-nearest neighbor classification
Abstract: Predicting students' academic performance is of paramount importance to educational institutions. Predicting that performance course-wise, semester-wise and year-wise would help students to a great extent. In this research article, we generate twenty-nine datasets from students' result analysis. Datasets 1 to 9 contain students' results in the first attempt, datasets 10 to 19 contain students' results after passing all courses of the semesters, and datasets 20 to 29 are the overall datasets of students' result analysis. To predict the course-wise, semester-wise and year-wise performance of students, we developed a framework titled ECSYAPPS (Educational Course Semester Year-wise Academic Performance Prediction System), based on classification techniques, and designed an algorithm for analyzing students' performance in the education sector. ECSYAPPS predicts the course-wise, semester-wise and year-wise grades of students. Fifteen classification algorithms, namely Logistic Regression (LR), Naive Bayes (NB), Decision Tree (DT), Random Forest (RF), Support Vector Machines (SVM), Linear Support Vector Classification (LSVC), K-Nearest Neighbors (KNN), Gradient Boosting (GB), Adaptive Boosting (AdaBoost), Bagging, Extreme Gradient Boosting (XGBoost), Light Gradient-Boosting Machine (LightGBM), Categorical Boosting (CatBoost), Linear Discriminant Analysis (LDA), and Stochastic Gradient Descent Classifier (SGDC), are selected, applied to the 29 datasets, and compared on the basis of five performance parameters: Accuracy, Precision, Recall, F1-score and Mean Absolute Error (MAE). These fifteen classification algorithms fall under ten classifier families: Linear Model, Naive Bayes, Tree, Support Vector Machine, Nearest Neighbors, Ensemble, xgboost, lightgbm, Discriminant Analysis and catboost.
If two or more classification algorithms achieve the same accuracy on a dataset, then Precision, Recall, F1-score and Mean Absolute Error (MAE) are compared to decide the best classification algorithm for that dataset. In this way, the best classification algorithm is selected for each of the 29 datasets. It is found that the xgboost classifier works best for eleven datasets, while the ensemble techniques Gradient Boosting and AdaBoost work best for two and six datasets respectively. Decision Tree, LightGBM, LDA, and KNN are noted to be the best classification algorithm for two, four, three, and two datasets respectively. The framework is tested on eight new datasets of students' results using two methods: K-Fold Cross-validation and the Train-Validation-Test method. The results on the new datasets show that the accuracy obtained on the test or validation dataset differs from the accuracy obtained on the old datasets by less than 6%. This framework will be helpful for students as well as instructors. Students can use it to improve their examination performance in the courses they find difficult, while faculty can use it to improve pedagogical practices. [ABSTRACT FROM AUTHOR]
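Note: The selection rule described in the abstract (highest accuracy first, with the remaining metrics used to break ties) might be sketched as follows. This is a minimal illustration, not the authors' algorithm. The abstract does not state the order in which the tie-breaking metrics are consulted, so the order used here (Precision, then Recall, then F1-score, then lower MAE) is an assumption, as are the names `Scores` and `select_best` and all score values, which are invented for the example.

```python
from typing import Dict, NamedTuple


class Scores(NamedTuple):
    """Evaluation metrics for one algorithm on one dataset (hypothetical container)."""
    accuracy: float
    precision: float
    recall: float
    f1: float
    mae: float  # Mean Absolute Error: lower is better


def select_best(results: Dict[str, Scores]) -> str:
    """Return the algorithm with the highest accuracy.

    Ties on accuracy are broken by precision, recall and F1-score
    (higher is better) and finally by MAE (lower is better).
    The tie-break order is an assumption, not taken from the source.
    """
    def ranking_key(name: str):
        s = results[name]
        # Tuples compare element by element, so later entries only matter
        # when all earlier ones are exactly equal. MAE is negated so that
        # a smaller error ranks higher.
        return (s.accuracy, s.precision, s.recall, s.f1, -s.mae)

    return max(results, key=ranking_key)


# Invented scores for a single dataset, for illustration only.
example = {
    "XGBoost": Scores(0.90, 0.88, 0.87, 0.875, 0.12),
    "AdaBoost": Scores(0.90, 0.89, 0.86, 0.875, 0.11),
    "KNN": Scores(0.85, 0.90, 0.90, 0.90, 0.10),
}
best = select_best(example)
print(best)
```

In this invented example, XGBoost and AdaBoost tie on accuracy, so the decision falls to precision. The comparison uses exact floating-point equality to detect ties; rounding the scores to a fixed number of decimals before comparing would be a reasonable variation.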
Copyright of Engineering Applications of Artificial Intelligence is the property of Pergamon Press - An Imprint of Elsevier Science and its content may not be copied or emailed to multiple sites without the copyright holder's express written permission. Additionally, content may not be used with any artificial intelligence tools or machine learning technologies. However, users may print, download, or email articles for individual use. This abstract may be abridged. No warranty is given about the accuracy of the copy. Users should refer to the original published version of the material for the full abstract. (Copyright applies to all Abstracts.)
Database: Engineering Source
ISSN: 0952-1976
DOI: 10.1016/j.engappai.2024.108688