Machine Learning-Based Risky User Behavior Detection to Mitigate Ransomware Attacks on Higher Education Institutions

Bibliographic Details
Title: Machine Learning-Based Risky User Behavior Detection to Mitigate Ransomware Attacks on Higher Education Institutions
Language: English
Authors: Godfrey F. Mendes
Source: ProQuest LLC. 2024. D.Engr. Dissertation, The George Washington University.
Availability: ProQuest LLC. 789 East Eisenhower Parkway, P.O. Box 1346, Ann Arbor, MI 48106. Tel: 800-521-0600; Web site: http://www.proquest.com/en-US/products/dissertations/individuals.shtml
Peer Reviewed: N
Page Count: 126
Publication Date: 2024
Document Type: Dissertations/Theses - Doctoral Dissertations
Education Level: Higher Education, Postsecondary Education
Descriptors: Colleges, Artificial Intelligence, Users (Information), Risk Assessment, Risk Management, Risk, Computer Security, Crime Prevention, Information Security, Computer Software Evaluation, Computer Software Selection, Praxis, Man Machine Systems
ISBN: 979-83-8363-811-8
Abstract: This Praxis develops a machine learning (ML) model to address ransomware threats in higher education institutions (HEIs). HEIs are vulnerable to cyberattacks due to their open-access environments, diverse user bases, and decentralized IT systems. These vulnerabilities are compounded by limited budgets, heightened risks from increased digital operations, and a lack of security awareness among their users. The research focuses on the critical role of user behavior in cybersecurity strategies and uses ML to proactively detect risky user behaviors that could lead to ransomware attacks. Using the CERT r4.2 Insider Threat dataset, this Praxis evaluates five ML models (Random Forest, Gradient Boosting, XGBoost, Support Vector Classifier, and Convolutional Neural Networks) to analyze user behaviors across email, HTTP, file access, device use, and logon activities. The research employs a dual-layer method. Layer 1 identifies malicious activities, and Layer 2 aggregates these activities to determine user risk levels. It uses K-means clustering to group users into risk categories and applies Explainable Artificial Intelligence techniques such as SHapley Additive exPlanations (SHAP) to enhance transparency and interpretability. Key outcomes indicate that behaviors linked to device usage and HTTP actions are significant predictors of risky behaviors. While email content is impactful, it does not play as central a role as device and HTTP activities. The Random Forest ML model is effective in detecting these behaviors. [The dissertation citations contained here are published with the permission of ProQuest LLC. Further reproduction is prohibited without permission. Copies of dissertations may be obtained by telephone at 1-800-521-0600. Web page: http://www.proquest.com/en-US/products/dissertations/individuals.shtml.]
Abstractor: As Provided
Entry Date: 2024
Access URL: https://gateway.proquest.com/openurl?url_ver=Z39.88-2004&rft_val_fmt=info:ofi/fmt:kev:mtx:dissertation&res_dat=xri:pqm&rft_dat=xri:pqdiss:31487475
Accession Number: ED659986
Database: ERIC
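
Illustrative note: the dual-layer method summarized in the abstract can be sketched roughly as follows. This is a minimal sketch under stated assumptions, not the dissertation's implementation. The toy events, the "fraction of flagged activities" risk score, the three tier names, and the helper names `user_risk_scores`, `kmeans_1d`, and `risk_tiers` are all hypothetical. Layer 1 is represented only by pre-assigned flags, where the dissertation would use a trained classifier such as Random Forest, and the clustering is a hand-written one-dimensional K-means so that the example has no dependencies.

```python
from collections import defaultdict

# Hypothetical Layer 1 output: (user, activity_type, flagged_as_malicious).
# The flags are made-up toy values standing in for a classifier's predictions.
EVENTS = [
    ("alice", "http", False), ("alice", "email", False),
    ("alice", "logon", False), ("alice", "device", False),
    ("bob", "device", True), ("bob", "http", True),
    ("bob", "logon", False), ("bob", "email", False),
    ("carol", "device", True), ("carol", "http", True),
    ("carol", "file", True), ("carol", "email", True),
    ("dave", "logon", False), ("dave", "http", False),
    ("dave", "file", False), ("dave", "email", True),
]


def user_risk_scores(events):
    """Layer 2 aggregation (assumed): share of a user's activities flagged in Layer 1."""
    totals = defaultdict(int)
    flagged = defaultdict(int)
    for user, _activity, is_malicious in events:
        totals[user] += 1
        flagged[user] += int(is_malicious)
    return {user: flagged[user] / totals[user] for user in totals}


def kmeans_1d(values, k, iterations=100):
    """Lloyd's algorithm on scalar values; requires k >= 2. Returns (centroids, labels)."""
    ordered = sorted(values)
    # Deterministic initialisation: spread the starting centroids over the sorted values.
    centroids = [ordered[(i * (len(ordered) - 1)) // (k - 1)] for i in range(k)]
    labels = []
    for _ in range(iterations):
        # Assignment step: each value goes to its nearest centroid.
        labels = [min(range(k), key=lambda c: (v - centroids[c]) ** 2) for v in values]
        # Update step: each centroid becomes the mean of its members.
        updated = []
        for c in range(k):
            members = [v for v, lab in zip(values, labels) if lab == c]
            updated.append(sum(members) / len(members) if members else centroids[c])
        if updated == centroids:
            break
        centroids = updated
    return centroids, labels


def risk_tiers(events, tier_names=("low", "medium", "high")):
    """Cluster per-user scores and name the clusters by ascending centroid."""
    scores = user_risk_scores(events)
    users = list(scores)
    centroids, labels = kmeans_1d([scores[u] for u in users], k=len(tier_names))
    order = sorted(range(len(centroids)), key=lambda c: centroids[c])
    rank = {cluster: position for position, cluster in enumerate(order)}
    return {u: tier_names[rank[lab]] for u, lab in zip(users, labels)}


if __name__ == "__main__":
    for user, tier in sorted(risk_tiers(EVENTS).items()):
        print(user, tier)
```

With these toy events, the user with no flagged activity falls in the lowest tier and the user whose activities are all flagged falls in the highest. A fuller treatment would likely replace the single score with per-channel features (device, HTTP, email, file, logon) and use a library clustering implementation.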