Phishing Detection Model on Social Media Enhanced With CNN and BERT

Authors

  • Nurliana Nasution, Universitas Lancang Kuning
  • Wenni Syafitri, Universitas Lancang Kuning
  • Feldiansyah Feldiansyah, Universitas Lancang Kuning

DOI:

https://doi.org/10.55583/jtisi.v4i1.2194

Keywords:

CNN-BERT, IndoBERT, Phishing detection, Natural language processing, Social media phishing

Abstract

Phishing on social media has become an increasingly serious cyber threat because attackers exploit persuasive language, conversational context, and dynamic interaction patterns to deceive users. This study proposes a hybrid CNN-BERT model for detecting phishing content in Indonesian social media text by combining BERT’s contextual semantic representation with CNN’s ability to capture locally relevant textual patterns. The dataset was preprocessed to remove noise, normalize writing variations, and prepare the text for deep learning analysis; class proportions were also examined to support fairer evaluation. Model performance was assessed under multiple data-splitting scenarios and cross-validation to examine robustness and consistency. The experimental results indicate that the proposed hybrid model achieves strong and stable performance across accuracy, precision, recall, and F1-score, and outperforms the baseline model when the BERT backbone is frozen. However, when BERT is fully fine-tuned, the performance gain from the CNN layer becomes marginal, suggesting that strong contextual representations are already highly effective for this task. These findings indicate that integrating CNN and BERT is effective for phishing detection on social media, although domain adaptation challenges, overfitting risk, and real-world deployment constraints remain important considerations. The novelty of this work lies in systematically comparing frozen versus fully fine-tuned IndoBERT backbones with and without a CNN head for Indonesian short-message phishing detection.
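
The four metrics named in the abstract can be illustrated with a minimal sketch. It assumes a binary labeling in which 1 marks phishing and 0 marks legitimate text; the function name `binary_metrics` and the toy labels are hypothetical and are not taken from the paper or its dataset.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, precision, recall, and F1 for binary labels.

    Assumption: `positive` marks the phishing class (illustrative only).
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    # Confusion-matrix counts with respect to the positive class.
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p != positive)

    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)

    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Toy labels, invented for illustration (1 = phishing, 0 = legitimate).
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 0, 1, 0, 0, 1, 0, 1]
metrics = binary_metrics(y_true, y_pred)
print(metrics)
```

Reporting precision and recall alongside accuracy matters when class proportions are uneven, since a classifier that rarely predicts the phishing class can still score high accuracy while missing most phishing messages.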

Published

2026-05-31

How to Cite

Nasution, N., Syafitri, W., & Feldiansyah, F. (2026). Phishing Detection Model on Social Media Enhanced With CNN and BERT. Jurnal Testing Dan Implementasi Sistem Informasi, 4(1), 87-101. https://doi.org/10.55583/jtisi.v4i1.2194