Sui, Hao and Zhang, Jiale and Chen, Bing and Zhu, Chengcheng and Ge, Chunpeng and Meng, Weizhi and Susilo, Willy (2026) GDetox: Purifying Backdoor Encoder in Graph Self-supervised Learning via Knowledge Distillation. IEEE Transactions on Information Forensics and Security. ISSN 1556-6013
T-IFS-22645.pdf - Accepted Version
Available under License Creative Commons Attribution.
Abstract
Graph Neural Networks (GNNs) have powerful representation capabilities for graph data, achieving excellent performance across various fields. Given the scarcity of labels in real-world scenarios, graph self-supervised learning (GSSL) has gained increasing attention for its ability to train without labels. However, recent studies have revealed that GNNs are vulnerable to stealthy backdoor attacks in GSSL scenarios: simply injecting triggers is enough for the encoder to learn backdoor features. Existing graph backdoor defenses focus mainly on supervised settings and cannot be transferred directly to self-supervised scenarios due to the lack of label guidance. To bridge this gap, we propose GDetox, the first defense against backdoored encoders in GSSL. GDetox aims to eliminate the backdoor logic in an encoder while preserving its original performance. Specifically, GDetox purifies the backdoored graph encoder through a self-supervised distillation approach that requires no label information. Furthermore, we introduce an adversarial contrastive learning strategy that augments node representations without labels to strengthen the teacher model, thereby improving the distilled encoder's performance. We evaluate GDetox on four node classification and four graph classification datasets, comparing it with four state-of-the-art (SOTA) defense methods against seven recent backdoor attacks on GSSL. Extensive experiments demonstrate that GDetox far outperforms the SOTA defenses, reducing the attack success rate to 4% with negligible degradation in encoder performance (within 2%) on both node-level and graph-level tasks.
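The core idea of label-free distillation-based purification described above can be illustrated with a minimal sketch: a student encoder is trained to align its representations with those of a teacher on unlabeled data, using only an embedding-level objective and no class labels. This is a toy, assumption-laden illustration, not the paper's actual method: the encoders here are plain linear maps rather than GNNs, the alignment loss is a simple mean-squared error rather than GDetox's distillation objective, and the adversarial contrastive augmentation of the teacher is omitted.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-ins for graph encoders: linear maps from node
# features to embeddings. (The paper's encoders are GNNs; this is a toy.)
W_teacher = rng.normal(size=(8, 4))   # fixed teacher whose behavior we distill
W_student = rng.normal(size=(8, 4))   # student encoder to be trained

X = rng.normal(size=(32, 8))          # unlabeled node features


def distill_loss(Z_student, Z_teacher):
    # Embedding-alignment objective: no labels involved, only the
    # discrepancy between student and teacher representations.
    return float(np.mean((Z_student - Z_teacher) ** 2))


lr = 0.05
losses = []
Z_teacher = X @ W_teacher
for step in range(500):
    Z_student = X @ W_student
    losses.append(distill_loss(Z_student, Z_teacher))
    # Analytic gradient of the MSE alignment loss w.r.t. the student weights.
    grad = 2.0 * X.T @ (Z_student - Z_teacher) / len(X)
    W_student -= lr * grad

print(f"alignment loss: {losses[0]:.4f} -> {losses[-1]:.6f}")
```

The point of the sketch is the supervision signal: the student learns purely from the teacher's embeddings on unlabeled inputs, which is what makes a distillation-style defense viable in the self-supervised setting the abstract describes.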