An Improved Detection of Cyberbullying on Social Media Using Randomized Sampling
| Title: | An Improved Detection of Cyberbullying on Social Media Using Randomized Sampling |
|---|---|
| Language: | English |
| Authors: | Nitasha Dhingra, Suhani Chawla, Oshin Saini, Rishabh Kaushal (ORCID |
| Source: | International Journal of Bullying Prevention. 2025 7(3):166-178. |
| Availability: | Springer. Available from: Springer Nature. One New York Plaza, Suite 4600, New York, NY 10004. Tel: 800-777-4643; Tel: 212-460-1500; Fax: 212-460-1700; e-mail: customerservice@springernature.com; Web site: https://link.springer.com/ |
| Peer Reviewed: | Y |
| Page Count: | 13 |
| Publication Date: | 2025 |
| Document Type: | Journal Articles; Reports - Research |
| Descriptors: | Bullying, Social Media, Computer Mediated Communication, Context Effect, Classification, Identification |
| DOI: | 10.1007/s42380-023-00188-4 |
| ISSN: | 2523-3653 2523-3661 |
| Abstract: | Due to the pandemic, the world's dependence shifted to online platforms. It has made all age groups vulnerable to cyberbullying. Now more than ever, there is a need for online behavior monitoring. Existing algorithms tend to classify friendly banter as cyberbullying. They make use of binary classification by identifying offensive keywords. The lack of analysis of the context of data posted and the unavailability of public training data make it challenging to train models accurately. Our models and research focus on the larger picture by making use of context as a significant parameter during the classification. The dataset chosen was such that its annotation was based on 5 parameters that considered the context of conversations happening online. This paper executes various machine learning algorithms, SVM, random forest, AdaBoost, and MLP algorithms, on a benchmark cyberbullying-representations dataset extracted from Twitter. We conducted randomized oversampling on the best-performing SVM model, which resulted in a significantly higher average F1 score outperforming the baseline score. |
| Abstractor: | As Provided |
| Entry Date: | 2026 |
| Accession Number: | EJ1492569 |
| Database: | ERIC |
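
Note on the method named in the abstract: randomized oversampling balances a skewed label distribution by duplicating randomly chosen minority-class examples until every class matches the largest one. The sketch below is a minimal standard-library illustration of that general technique, not the authors' implementation. The function name `random_oversample`, the toy messages, and the label names are all hypothetical; the record does not specify the paper's features, classifier settings, or class labels.

```python
import random
from collections import Counter


def random_oversample(samples, labels, seed=0):
    """Duplicate randomly chosen minority-class samples until every
    class has as many examples as the largest class."""
    rng = random.Random(seed)

    # Group the samples by their class label.
    by_class = {}
    for sample, label in zip(samples, labels):
        by_class.setdefault(label, []).append(sample)
    if not by_class:
        return [], []

    # Every class is grown to the size of the largest class.
    target = max(len(items) for items in by_class.values())

    out_samples, out_labels = [], []
    for label, items in by_class.items():
        # Draw the missing examples with replacement from this class.
        extra = [rng.choice(items) for _ in range(target - len(items))]
        for sample in items + extra:
            out_samples.append(sample)
            out_labels.append(label)
    return out_samples, out_labels


# Hypothetical toy data, not taken from the paper's dataset.
texts = ["msg a", "msg b", "msg c", "msg d", "msg e"]
labels = ["benign", "benign", "benign", "benign", "bullying"]

balanced_texts, balanced_labels = random_oversample(texts, labels)
print(Counter(balanced_labels))
```

In practice, resampling of this kind is applied to the training split only, so that duplicated examples do not leak into the data used to compute the F1 score.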