What Can We Learn from College Students' Network Transactions? Constructing Useful Features for Student Success Prediction

Bibliographic Details
Title:	What Can We Learn from College Students' Network Transactions? Constructing Useful Features for Student Success Prediction
Language:	English
Authors:	Pytlarz, Ian; Pu, Shi; Patel, Monal; Prabhu, Rajini
Source:	International Educational Data Mining Society. 2018.
Availability:	International Educational Data Mining Society. e-mail: admin@educationaldatamining.org; Web site: http://www.educationaldatamining.org
Peer Reviewed:	Y
Page Count:	5
Publication Date:	2018
Document Type:	Speeches/Meeting Papers; Reports - Research
Education Level:	Higher Education; Postsecondary Education
Descriptors:	College Freshmen, Grade Point Average, At Risk Students, Academic Achievement, Computer Networks, Attendance, Learner Engagement, On Campus Students, Study Habits, Geographic Location, Student Behavior, Artificial Intelligence, Correlation, Data Analysis, Prediction
Geographic Terms:	Indiana
Abstract:	Identifying at-risk students at an early stage is a challenging task for colleges and universities. In this paper, we use students' on-campus network traffic volume to construct several features useful for predicting their first-semester GPA. In particular, we build proxies for their attendance, class engagement, and out-of-class study hours based on their network traffic volume. We then test how much these network-based features improve the performance of a model that uses only conventional features (e.g., demographics, high school GPA, and standardized test scores). We labeled students as "above median" or "below median" based on their first-term GPA. Several machine learning models were then applied: logistic regression, SVM, random forests, and AdaBoost. The results show that models with network-based features consistently outperform those without, in terms of accuracy, F1 score, and AUC. Given that network activity data is readily available in most colleges and universities, this study provides practical insights on how to build more powerful models to predict student success. [For the full proceedings, see ED593090.]
Abstractor:	As Provided
Entry Date:	2019
Accession Number:	ED593202
Database:	ERIC
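
The evaluation setup described in the abstract (a median split on first-term GPA, then a comparison on accuracy, F1 score, and AUC) can be illustrated with a minimal sketch. This is not the authors' code. The GPA values and model scores below are made-up toy numbers, the function names are hypothetical, and treating a GPA equal to the median as "below median" is an assumption the record does not confirm.

```python
from statistics import median


def label_above_median(gpas):
    """Label each student 1 if GPA is strictly above the cohort median, else 0.

    Assumption: ties with the median count as "below median".
    """
    cutoff = median(gpas)
    return [1 if g > cutoff else 0 for g in gpas]


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def f1_score(y_true, y_pred):
    """Harmonic mean of precision and recall for the positive class."""
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def auc(y_true, scores):
    """Probability that a random positive outranks a random negative (ties count half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy first-term GPAs (hypothetical, not data from the paper).
gpas = [3.8, 2.1, 3.2, 2.9, 3.5, 2.4]
labels = label_above_median(gpas)

# Hypothetical predicted probabilities of "above median" from some classifier.
scores = [0.7, 0.4, 0.45, 0.6, 0.8, 0.3]
preds = [1 if s >= 0.5 else 0 for s in scores]

print("accuracy:", accuracy(labels, preds))
print("F1 score:", f1_score(labels, preds))
print("AUC:", auc(labels, scores))
```

Running the same three metrics on a model trained with and without the network-based features would give the kind of side-by-side comparison the abstract reports.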
