An Ensemble Machine Learning Approach for Detecting and Classifying Malware Attacks on Mobile Devices.

Bibliographic Details
Title: An Ensemble Machine Learning Approach for Detecting and Classifying Malware Attacks on Mobile Devices.
Authors: Alsharif, Eiman (tu4359251@taibahu.edu.sa); Alharby, Maher (mharby@taibahu.edu.sa)
Source: Arabian Journal for Science & Engineering (Springer Science & Business Media B.V.). Oct 2025, Vol. 50, Issue 19, pp. 15825-15841 (17 pp.).
Subjects: Malware, Android (Operating system), Ensemble learning, Supervised learning, Mobile apps, Web-based user interfaces, Malware prevention, Data scrubbing, Feature selection
Abstract: The widespread use of mobile devices makes them targets for cybercriminals, especially with the rise of malware. Existing malware detection studies have limitations. These include focusing on subsets of datasets, using single classification approaches, and lacking usability in practical applications. This research develops a stacking ensemble method for detecting and classifying malware attacks on Android devices, employing supervised machine learning algorithms like Random Forest, Decision Tree, Gaussian Naive Bayes, K-Nearest Neighbors, and Logistic Regression. Using the CIC-AndMal2017 dataset, we apply data preprocessing techniques to address missing data and data imbalance. We employ various feature selection methods, including Random Forest Importance, Principal Component Analysis, and Correlation-Based Selection, to help reduce data dimensionality. We also utilize a grid search technique for hyperparameter tuning. We assess model performance using evaluation metrics, including accuracy, precision, recall, and F1 score. Additionally, we measure training and prediction times to ensure efficiency. The stacking technique achieved remarkable results, with 99.86% across all metrics (accuracy, precision, recall, and F1 score) for binary classification. For multi-class classification, the results were 97.0% accuracy, 97.03% precision, 97.07% recall, and 97.03% F1 score. Finally, we develop a user-friendly web application to enhance the accessibility and usability of the proposed models in detecting Android malware, ensuring broader adoption and practical application of the developed models. [ABSTRACT FROM AUTHOR]
Copyright of Arabian Journal for Science & Engineering (Springer Science & Business Media B.V. ) is the property of Springer Nature and its content may not be copied or emailed to multiple sites without the copyright holder's express written permission. Additionally, content may not be used with any artificial intelligence tools or machine learning technologies. However, users may print, download, or email articles for individual use. This abstract may be abridged. No warranty is given about the accuracy of the copy. Users should refer to the original published version of the material for the full abstract. (Copyright applies to all Abstracts.)
Database: Engineering Source
ISSN: 2193-567X
DOI: 10.1007/s13369-025-10011-5
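
The abstract reports accuracy, precision, recall, and F1 score for both binary and multi-class classification. As a minimal sketch of how such figures can be computed from predictions, the Python example below derives accuracy and macro-averaged precision, recall, and F1. The abstract does not say which averaging scheme the authors used, so macro-averaging is an assumption here, and the function name `macro_metrics`, the class labels, and the sample predictions are all hypothetical illustrations rather than data from the paper.

```python
from collections import Counter


def macro_metrics(y_true, y_pred):
    """Return accuracy and macro-averaged precision, recall, and F1.

    Macro-averaging (an assumption, not confirmed by the abstract)
    computes each metric per class and then takes the unweighted mean.
    """
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and equal in length")

    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()

    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1   # correct prediction for this class
        else:
            fp[pred] += 1    # predicted class gets a false positive
            fn[truth] += 1   # true class gets a false negative

    def safe_div(num, den):
        # Define the ratio as 0.0 when a class has no predictions or no samples.
        return num / den if den else 0.0

    precisions = [safe_div(tp[c], tp[c] + fp[c]) for c in labels]
    recalls = [safe_div(tp[c], tp[c] + fn[c]) for c in labels]
    f1_scores = [safe_div(2 * p * r, p + r) for p, r in zip(precisions, recalls)]

    n_classes = len(labels)
    return {
        "accuracy": sum(tp.values()) / len(y_true),
        "precision": sum(precisions) / n_classes,
        "recall": sum(recalls) / n_classes,
        "f1": sum(f1_scores) / n_classes,
    }


# Hypothetical labels and predictions, for illustration only.
y_true = ["benign", "benign", "adware", "adware", "ransomware", "ransomware"]
y_pred = ["benign", "adware", "adware", "adware", "ransomware", "benign"]

metrics = macro_metrics(y_true, y_pred)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

With imbalanced classes, macro-averaged scores can differ noticeably from accuracy, which is one reason to report all four metrics together, as the abstract does.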