What Can We Learn from College Students' Network Transactions? Constructing Useful Features for Student Success Prediction
| Title: | What Can We Learn from College Students' Network Transactions? Constructing Useful Features for Student Success Prediction |
|---|---|
| Language: | English |
| Authors: | Pytlarz, Ian; Pu, Shi; Patel, Monal; Prabhu, Rajini |
| Source: | International Educational Data Mining Society. 2018. |
| Availability: | International Educational Data Mining Society. e-mail: admin@educationaldatamining.org; Web site: http://www.educationaldatamining.org |
| Peer Reviewed: | Y |
| Page Count: | 5 |
| Publication Date: | 2018 |
| Document Type: | Speeches/Meeting Papers; Reports - Research |
| Education Level: | Higher Education; Postsecondary Education |
| Descriptors: | College Freshmen, Grade Point Average, At Risk Students, Academic Achievement, Computer Networks, Attendance, Learner Engagement, On Campus Students, Study Habits, Geographic Location, Student Behavior, Artificial Intelligence, Correlation, Data Analysis, Prediction |
| Geographic Terms: | Indiana |
| Abstract: | Identifying at-risk students at an early stage is a challenging task for colleges and universities. In this paper, we use students' on-campus network traffic volume to construct several useful features for predicting their first-semester GPA. In particular, we build proxies for their attendance, class engagement, and out-of-class study hours based on their network traffic volume. We then test how much these network-based features can increase the performance of a model with only conventional features (e.g., demographics, high school GPA, standardized test scores). We labeled students as "above median" and "below median" based on their first-term GPA. Several machine learning models were then applied, ranging from logistic regression, SVM, and random forests to AdaBoost. The results show that models with network-based features consistently outperform those without, in terms of accuracy, F1 score, and AUC. Given that network activity data is readily available at most colleges and universities, this study provides practical insights on how to build more powerful models to predict student success. [For the full proceedings, see ED593090.] |
| Abstractor: | As Provided |
| Entry Date: | 2019 |
| Accession Number: | ED593202 |
| Database: | ERIC |
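
The abstract describes labeling students as "above median" or "below median" by first-term GPA and comparing models on accuracy, F1 score, and AUC. The following is a minimal sketch of that labeling and scoring step in plain Python, not the authors' code. The function names, the rule that a GPA equal to the median counts as "below median", the 0.5 decision threshold, and all numeric values are assumptions made for illustration only.

```python
from statistics import median


def label_by_median(gpas):
    """Return 1 for GPAs strictly above the median, else 0.

    Assumption: ties with the median are labeled 0; the abstract does not say.
    """
    m = median(gpas)
    return [1 if g > m else 0 for g in gpas]


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def f1_score(y_true, y_pred):
    """Harmonic mean of precision and recall for the positive class."""
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def auc(y_true, scores):
    """Probability that a random positive outscores a random negative.

    Tied scores count as half a win (pairwise definition of ROC AUC).
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one example of each class")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Made-up values for illustration; not data from the paper.
gpas = [2.1, 3.4, 3.9, 2.8, 3.0, 3.6]
labels = label_by_median(gpas)

# Hypothetical predicted probabilities from some classifier.
scores = [0.2, 0.7, 0.9, 0.6, 0.4, 0.5]
preds = [1 if s >= 0.5 else 0 for s in scores]

print(accuracy(labels, preds), f1_score(labels, preds), auc(labels, scores))
```

To reproduce the kind of comparison the abstract reports, the same three functions would be applied once to the predictions of a model trained on conventional features only and once to a model that also uses the network-based features.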