Shenghui Li: Robust Federated Learning: Defending Against Byzantine and Jailbreak Attacks
- Date: 16 January 2025, 09:00
- Location: 101121, Sonja Lyttkens, Ångström, Regementsvägen 1, Uppsala
- Type: Thesis defence
- Thesis author: Shenghui Li
- External reviewer: Karl Andersson
- Supervisors: Edith C. H. Ngai, Thiemo Voigt
- DiVA
Abstract
Federated Learning (FL) has emerged as a promising paradigm for training collaborative machine learning models across multiple participants while preserving data privacy. It is particularly valuable in privacy-sensitive domains like healthcare and finance. Recently, FL has been explored to harness the power of pre-trained Foundation Models (FMs) for downstream task adaptation, enabling customization and personalization while maintaining data locality and privacy. However, FL's distributed nature makes it inherently vulnerable to adversarial attacks. Notable threats include Byzantine attacks, which inject malicious updates to degrade model performance, and jailbreak attacks, which exploit the fine-tuning process to undermine safety alignments of FMs, leading to harmful outputs. This dissertation centers on robust FL, aiming to mitigate these threats and ensure global models remain accurate and safe even under adversarial conditions. To mitigate Byzantine attacks, we propose several Robust Aggregation Schemes (RASs) that decrease the influence of malicious updates. Additionally, we introduce Blades, an open-source benchmarking tool to systematically study the interplay between attacks and defenses in FL, offering insights into the effects of data heterogeneity, differential privacy, and momentum on RAS robustness. Exploring the synergy between FL and FMs, we present a taxonomy of research along with adaptivity, efficiency, and trustworthiness. We uncover a novel attack, “PEFT-as-an-Attack” (PaaA), where malicious FL participants jailbreak FMs through Parameter-Efficient-Fine-Tuning (PEFT) with harmful data. We evaluate defenses against PaaA and highlight critical gaps, emphasizing the need for advanced strategies balancing safety and utility in FL-FM systems. In summary, this dissertation advances FL robustness by proposing novel defenses, tools, and insights while exposing emerging attack vectors. These contributions pave the way for attack-resilient distributed machine learning systems capable of withstanding both current and emerging threats.