Federated Learning (FL) has emerged as a promising paradigm for privacy-preserving collaborative model training without sharing raw data. However, recent studies have revealed that private information can still be leaked through shared gradients and recovered by Gradient Inversion Attacks (GIA). While many GIA methods have been proposed, a detailed analysis, evaluation, and summary of these methods are still lacking. Although various survey papers summarize existing privacy attacks in FL, few studies have conducted extensive experiments to unveil the effectiveness of GIA and the factors that limit it in this context. To fill this gap, we first undertake a systematic review of GIA and categorize existing methods into three types: optimization-based GIA (OP-GIA), generation-based GIA (GEN-GIA), and analytics-based GIA (ANA-GIA). Then, we comprehensively analyze and evaluate the three types of GIA in FL, providing insights into the factors that influence their performance, practicality, and potential threats. Our findings indicate that OP-GIA is the most practical attack setting despite its unsatisfactory performance, whereas GEN-GIA has many dependencies and ANA-GIA is easily detectable, making both impractical. Finally, we offer a three-stage defense pipeline for users designing FL frameworks and protocols, and we share future research directions, from the perspectives of both attackers and defenders, that we believe should be pursued. We hope that our study can help researchers design more robust FL frameworks to defend against these attacks.
Figure 1. Taxonomy of existing GIA methods. The existing GIA methods can be divided into three types: optimization-based GIA (OP-GIA), which works by minimizing the distance between received gradients and gradients computed from dummy data; generation-based GIA (GEN-GIA), which utilizes a generator to reconstruct input data; and analytics-based GIA (ANA-GIA), which aims to recover input data in closed form. Moreover, GEN-GIA can be further divided into three categories: optimizing the latent vector z, optimizing the generator’s parameters W, and training an inversion generation model. ANA-GIA can be further divided into two categories: manipulating model architecture and manipulating model parameters. We also provide a public repository to continually track developments in this fast-evolving field: Awesome-Gradient-Inversion-Attacks.
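To make the OP-GIA objective concrete, below is a minimal DLG-style sketch in PyTorch. It is an illustrative sketch rather than any specific paper's implementation: `model`, the received `target_grads`, and the dummy label `y_dummy` are assumed to be given (in practice, labels can often be recovered analytically from the last layer's gradient beforehand).

```python
import torch
import torch.nn.functional as F

def op_gia(model, target_grads, x_shape, y_dummy, steps=300, lr=0.1):
    """Minimal DLG-style OP-GIA sketch: optimize dummy data so that its
    gradients match the gradients received from the victim client."""
    x_dummy = torch.randn(x_shape, requires_grad=True)
    opt = torch.optim.Adam([x_dummy], lr=lr)
    params = [p for p in model.parameters() if p.requires_grad]
    for _ in range(steps):
        opt.zero_grad()
        loss = F.cross_entropy(model(x_dummy), y_dummy)
        # create_graph=True so the gradient-matching loss is differentiable
        dummy_grads = torch.autograd.grad(loss, params, create_graph=True)
        # L2 distance between dummy gradients and received gradients
        grad_dist = sum(((dg - tg) ** 2).sum()
                        for dg, tg in zip(dummy_grads, target_grads))
        grad_dist.backward()
        opt.step()
    return x_dummy.detach()
```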
| Taxonomy | Batch Size | Image Resolution | # Same Label | Model Training State | Network Architecture | Practical FedAvg | Original Inputs? | Visual Quality | Extra Reliance |
|---|---|---|---|---|---|---|---|---|---|
| OP-GIA | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | Yes | Low | No |
| GEN-GIA (Opt. Latent Vector z) | ✕ | ✕ | ✕ | ✕ | ✕ | ✕ | No | High | Trained Generator |
| GEN-GIA (Opt. Generator Params. W) | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | Yes | Middle | Sigmoid Activation |
| GEN-GIA (Train Inversion Model) | ✓ | ✓ | ✓ | ✕ | ✓ | ✓ | Yes | Low | Auxiliary Dataset |
| ANA-GIA (Manip. Model Arch.) | ✕ | ✕ | ✕ | ✕ | ✕ | ✕ | Yes | High | Malicious Server |
| ANA-GIA (Manip. Model Params.) | ✕ | ✓ | ✕ | ✓ | ✓ | ✕ | Yes | Middle | Malicious Server |

Columns "Batch Size" through "Practical FedAvg" are influence factors (✓ = attack performance is affected by this factor; ✕ = not affected); "Original Inputs?" and "Visual Quality" summarize reconstruction results.
Figure 2. left. Reconstruction results of IG evaluated on models in different training states on various datasets with different image resolutions and batch sizes. right. Reconstruction results of IG with different network architectures on the CIFAR-100 dataset. The shaded region represents the standard deviation. These results show that a larger batch size, higher image resolution, more complicated network architecture, and better model training state lead to worse OP-GIA performance.
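IG in particular replaces the plain L2 gradient distance with a cosine dissimilarity plus a total-variation image prior. A sketch of such a loss, assuming `dummy_grads`/`target_grads` are lists of per-parameter gradient tensors and `tv_weight` is a hypothetical weighting knob:

```python
import torch

def ig_loss(dummy_grads, target_grads, x_dummy, tv_weight=1e-4):
    """IG-style reconstruction loss: cosine dissimilarity between the
    concatenated gradient vectors plus a total-variation image prior."""
    dg = torch.cat([g.flatten() for g in dummy_grads])
    tg = torch.cat([g.flatten() for g in target_grads])
    cos_dist = 1.0 - torch.dot(dg, tg) / (dg.norm() * tg.norm() + 1e-12)
    # Total variation encourages piecewise-smooth reconstructions (NCHW input)
    tv = (x_dummy[:, :, 1:, :] - x_dummy[:, :, :-1, :]).abs().mean() \
       + (x_dummy[:, :, :, 1:] - x_dummy[:, :, :, :-1]).abs().mean()
    return cos_dist + tv_weight * tv
```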
Figure 3. Reconstruction results of GGL. (a)-(c) Reconstruction results on the ImageNet dataset: (a) with different batch sizes and model training states; (b) under the practical FedAvg scenario; (c) with random Gaussian noise. The ground truth for (b) and (c) is similar to (a) and is omitted. (d) Reconstruction results on the CIFAR-100 dataset with a batch size of one and an untrained model. These results show that when optimizing the latent vector z, GEN-GIA can generate semantically similar images and is not affected by the factors influencing OP-GIA. However, it heavily relies on the pre-trained generator and can only achieve semantic-level recovery.
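Because GGL keeps the pre-trained generator fixed and searches only over the latent vector z, the search space is tiny compared with pixel-space optimization, which explains both its robustness to the factors above and its semantic-only recovery. A minimal sketch (the original work uses gradient-free optimizers such as CMA-ES; Adam is used here purely for brevity, and `grad_loss` is an assumed helper that computes a differentiable gradient-matching loss such as the one sketched above):

```python
import torch

def ggl_attack(generator, model, target_grads, grad_loss,
               z_dim=128, steps=200, lr=0.03):
    """Latent-vector GEN-GIA sketch: keep the pre-trained generator fixed
    and optimize z so that gradients of G(z) match the received gradients."""
    z = torch.randn(1, z_dim, requires_grad=True)
    opt = torch.optim.Adam([z], lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        x = generator(z)  # candidate image drawn from the GAN prior
        # grad_loss must build a differentiable graph (create_graph=True)
        loss = grad_loss(model, x, target_grads)
        loss.backward()
        opt.step()
    return generator(z).detach()
```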
Figure 4. left. Reconstruction results of CI-Net evaluated on ResNet-18 with different activation functions on various datasets with different batch sizes. right. Reconstruction results of CI-Net on ImageNet with different resolutions under the Sigmoid activation function. These results show that GEN-GIA that optimizes the generator's parameters W is affected by the factors that influence OP-GIA. Moreover, it only works when the target model adopts the Sigmoid activation function and fails with other activation functions.
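The parameter-optimizing subtype differs from the latent-vector variant in essentially one line: a fixed latent code is pushed through a trainable generator whose weights W are updated. A sketch under the same assumed `grad_loss` helper:

```python
import torch

def gen_param_attack(generator, model, target_grads, grad_loss,
                     z_dim=128, steps=500, lr=1e-3):
    """GEN-GIA sketch that optimizes the generator's parameters W:
    the latent code stays fixed while the generator itself is trained
    to emit an image whose gradients match the received ones."""
    z = torch.randn(1, z_dim)  # fixed latent code, never updated
    opt = torch.optim.Adam(generator.parameters(), lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        loss = grad_loss(model, generator(z), target_grads)
        loss.backward()
        opt.step()
    return generator(z).detach()
```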
Figure 5. left. Reconstruction results of LTI evaluated on different models with different training states on CIFAR-10 with different batch sizes. right. Reconstruction results of LTI on different datasets with different resolutions on LeNet. These results show that when training an inversion generation model, GEN-GIA can achieve pixel-level reconstruction but is influenced by most of the factors that affect OP-GIA, except for the model's training state.
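LTI-style attacks instead learn the inversion offline: an inversion network is trained on an auxiliary dataset to map leaked gradients back to inputs, which is why the target model's training state matters little. A minimal training-loop sketch, where `inverter`, `aux_loader`, and `compute_grads` are assumed placeholders:

```python
import torch
import torch.nn as nn

def train_inversion_model(inverter, target_model, aux_loader, compute_grads,
                          epochs=10, lr=1e-3):
    """Train an inversion network that maps a (flattened) gradient vector
    back to the input that produced it, using an auxiliary dataset."""
    opt = torch.optim.Adam(inverter.parameters(), lr=lr)
    mse = nn.MSELoss()
    for _ in range(epochs):
        for x, y in aux_loader:
            # Simulate the leaked gradient; assumed to return a detached tensor
            g = compute_grads(target_model, x, y)
            x_hat = inverter(g)  # inverter outputs flattened images here
            loss = mse(x_hat, x.flatten(1))
            opt.zero_grad()
            loss.backward()
            opt.step()
    return inverter
```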
| Batch Size | CIFAR-10 | CIFAR-100 | ImageNet |
|---|---|---|---|
| 1 | 64/64 | 64/64 | 64/64 |
| 8 | 63/64 | 64/64 | 64/64 |
| 32 | 60/64 | 61/64 | 64/64 |
| 64 | 60/64 | 60/64 | 61/64 |

Entries report the number of successfully recovered images out of 64 target images.
Figure 6. Reconstruction results of Fishing on ImageNet with different image resolutions, model training states, and network architectures. These results show that the attack performance of ANA-GIA, which manipulates model parameters, is not affected by batch size but worsens with larger image resolutions, worse model training states, and more complicated model architectures.
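The closed-form recovery that parameter-manipulation attacks build on is the classic fully-connected-layer identity: for a single sample passing through y = Wx + b, we have ∂L/∂W = (∂L/∂y)xᵀ and ∂L/∂b = ∂L/∂y, so any row with a nonzero bias gradient reveals x exactly. Attacks such as Fishing manipulate the parameters so that, in effect, only one target sample contributes to this gradient. A minimal sketch of the recovery step:

```python
import torch

def recover_input_from_fc(grad_W, grad_b, eps=1e-9):
    """Closed-form single-sample recovery for a fully connected layer
    y = W x + b: since dL/dW = (dL/dy) x^T and dL/db = dL/dy, any row i
    with dL/db[i] != 0 yields x = dL/dW[i] / dL/db[i]."""
    i = torch.argmax(grad_b.abs())  # pick the row with the strongest signal
    return grad_W[i] / (grad_b[i] + eps)
```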
Figure 7. left. Reconstruction results of Eq. (7) evaluated on the ViT-base fine-tuned with LoRA on different datasets with different batch sizes. right. Reconstruction results of Eq. (7) evaluated on different ViT architectures fine-tuned with LoRA on the CIFAR-100 dataset. These results show that attackers can breach privacy on low-resolution images but fail with high-resolution ones under PEFT. Moreover, smaller models are better at protecting privacy.
Our code is available in this repository: GIA. For newly proposed attack methods, we encourage you to use our repository to compare their performance with other methods. Similarly, for newly proposed defense methods, we encourage you to use our repository to evaluate their defense capabilities.