Implementation of dynamic aggregation for heterogeneous quantization in federated learning.
The experiments refer to the following papers; the paper and GitHub links are given.
- Communication-Efficient Learning of Deep Networks from Decentralized Data : GitHub
- SWALP: Stochastic Weight Averaging in Low-Precision Training : GitHub
requirements.txt lists the detailed requirements.
- Python3
- Pytorch
- Torchvision
- The FedHQ experiments are run on MNIST and CIFAR.
- You can choose to download the data through the code.
- --epochs: Number of communication rounds (T in the paper). Default is 150.
- --num_users: Number of clients (n in the paper). Default is 100.
- --frac: Fraction of users to be used for federated updates (C in the paper). Default is 0.1.
- --local_ep: Number of local training epochs in each user (K in the paper). Default is 1.
- --local_bs: Batch size of local updates in each user (B in the paper). Default is 600.
- --lr: Learning rate (η in the paper). Default is 0.1.
- --optimizer: The optimizer used. Default is sgd.
- --momentum: Momentum of optimizer (M in the paper). Default is 0.5.
- --weight_decay: Weight decay of optimizer (λ in the paper). Default is 0.0005.
- --average_scheme: Decide the average scheme. Default is FedHQ.
- --dataset: Name of dataset. Default is mnist.
- --gpu: To use CPU or GPU. Default set 1 to use GPU.
- --iid: Distribution of data amongst clients. Default set 1 for IID.
- --bit_4_ratio: The ratio for 4-bit quantization clients.
- --bit_8_ratio: The ratio for 8-bit quantization clients.
In our experiment, the sum of 'bit_4_ratio' and 'bit_8_ratio' is 1.
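As a rough illustration of how the options above fit together, the following sketch builds an argparse parser with the listed defaults and checks that the two ratios sum to 1. It is not the parser used in this repository: the function names `build_parser` and `parse_options`, the argument types, and the defaults for `--bit_4_ratio` and `--bit_8_ratio` (which the list above does not state) are assumptions.

```python
import argparse
import math


def build_parser():
    """Illustrative parser mirroring the options listed above (not the repository's own)."""
    p = argparse.ArgumentParser(description="FedHQ options (illustrative sketch)")
    p.add_argument("--epochs", type=int, default=150, help="communication rounds T")
    p.add_argument("--num_users", type=int, default=100, help="number of clients n")
    p.add_argument("--frac", type=float, default=0.1, help="fraction of clients C")
    p.add_argument("--local_ep", type=int, default=1, help="local epochs K")
    p.add_argument("--local_bs", type=int, default=600, help="local batch size B")
    p.add_argument("--lr", type=float, default=0.1, help="learning rate")
    p.add_argument("--optimizer", type=str, default="sgd")
    p.add_argument("--momentum", type=float, default=0.5)
    p.add_argument("--weight_decay", type=float, default=0.0005)
    p.add_argument("--average_scheme", type=str, default="FedHQ")
    p.add_argument("--dataset", type=str, default="mnist")
    p.add_argument("--gpu", type=int, default=1, help="1 to use GPU")
    p.add_argument("--iid", type=int, default=1, help="1 for IID data")
    # Assumed defaults: the option list does not give defaults for the ratios.
    p.add_argument("--bit_4_ratio", type=float, default=0.0)
    p.add_argument("--bit_8_ratio", type=float, default=1.0)
    return p


def parse_options(argv=None):
    """Parse options and enforce that the 4-bit and 8-bit ratios sum to 1."""
    args = build_parser().parse_args(argv)
    total = args.bit_4_ratio + args.bit_8_ratio
    if not math.isclose(total, 1.0, abs_tol=1e-9):
        raise ValueError(f"bit_4_ratio + bit_8_ratio must be 1, got {total}")
    return args
```

Calling `parse_options(["--bit_4_ratio=0.2", "--bit_8_ratio=0.8"])` returns a namespace with the remaining options at their defaults, while a pair of ratios that does not sum to 1 raises `ValueError`.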
Detailed results of our experiments are given in Section 6 of the paper. All commands assume that the working directory is the FedHQ folder.
- To run the FedHQ experiment with MNIST under IID condition using GPU:
python src/FedHQ_main.py --dataset=mnist --frac=1 --local_bs=600 --average_scheme=FedHQ --bit_4_ratio=0 --bit_8_ratio=1
- To run the FedHQ experiment with MNIST under non-IID condition using GPU:
python src/FedHQ_main.py --dataset=mnist --iid=0 --frac=1 --local_ep=1 --local_bs=600 --average_scheme=FedHQ --bit_4_ratio=0 --bit_8_ratio=1
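To repeat the MNIST runs for several quantization mixes, the command lines can be generated rather than typed by hand. The sketch below only builds the argument lists; `build_commands` is a hypothetical helper, not part of this repository, and it assumes that `bit_8_ratio` is always set to `1 - bit_4_ratio`.

```python
def build_commands(bit_4_ratios, iid=True):
    """Build one FedHQ_main.py command (as an argument list) per 4-bit ratio."""
    commands = []
    for r4 in bit_4_ratios:
        # Round to avoid floating-point artifacts such as 0.19999999999999996.
        r8 = round(1.0 - r4, 2)
        cmd = ["python", "src/FedHQ_main.py", "--dataset=mnist"]
        if not iid:
            cmd.append("--iid=0")
        cmd += [
            "--frac=1",
            "--local_bs=600",
            "--average_scheme=FedHQ",
            f"--bit_4_ratio={r4}",
            f"--bit_8_ratio={r8}",
        ]
        commands.append(cmd)
    return commands
```

Each returned list can be joined with spaces for display or passed to `subprocess.run` from the FedHQ folder.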
Parameter settings are as follows (only parameters that differ from the defaults are listed):
- frac: 1
Learning-rate decay is 0.9 per ten rounds. The ratios of 4-bit quantization clients are [0,0.2,0.4,0.6,0.8,1].
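The learning-rate decay can be sketched as a step schedule. This is a minimal reading of "0.9 per ten rounds", assuming the rate is multiplied by 0.9 once every ten communication rounds and that rounds are counted from 0; the function name `decayed_lr` is hypothetical and the repository may implement the decay differently.

```python
def decayed_lr(base_lr, round_idx, decay=0.9, step=10):
    """Step decay: multiply base_lr by `decay` once per `step` completed rounds.

    Assumes zero-based round indices, so rounds 0..9 use base_lr,
    rounds 10..19 use base_lr * decay, and so on.
    """
    return base_lr * decay ** (round_idx // step)
```

With the default learning rate of 0.1, rounds 0 to 9 train at 0.1, rounds 10 to 19 at about 0.09, and rounds 20 to 29 at about 0.081.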
Table 1: Number of communication rounds to reach different target accuracies on the MNIST dataset, IID partition.

| Quantization bits: ratio | Scheme | 60% | 70% | 80% | 90% | 92% | 94% | 95% |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 4-bit: 0, 8-bit: 1 | FedAvg | 13 | 15 | 25 | 33 | 42 | 46 | 65 |
| 4-bit: 0, 8-bit: 1 | FedHQ+ | 13 | 14 | 22 | 35 | 39 | 47 | 63 |
| 4-bit: 0.2, 8-bit: 0.8 | FedAvg | 12 | 18 | 19 | 32 | 42 | 54 | 82 |
| 4-bit: 0.2, 8-bit: 0.8 | Proportional | 15 | 21 | 22 | 32 | 38 | 50 | 73 |
| 4-bit: 0.2, 8-bit: 0.8 | FedHQ+ | 13 | 15 | 25 | 33 | 35 | 47 | 61 |
| 4-bit: 0.4, 8-bit: 0.6 | FedAvg | 17 | 22 | 24 | 42 | 45 | 62 | 104 |
| 4-bit: 0.4, 8-bit: 0.6 | Proportional | 11 | 17 | 22 | 37 | 41 | 53 | 83 |
| 4-bit: 0.4, 8-bit: 0.6 | FedHQ+ | 12 | 17 | 25 | 34 | 38 | 50 | 61 |
| 4-bit: 0.6, 8-bit: 0.4 | FedAvg | 13 | 31 | * | * | * | * | * |
| 4-bit: 0.6, 8-bit: 0.4 | Proportional | 11 | 27 | 35 | * | * | * | * |
| 4-bit: 0.6, 8-bit: 0.4 | FedHQ+ | 18 | 19 | 20 | 40 | 42 | 46 | 66 |
| 4-bit: 0.8, 8-bit: 0.2 | FedAvg | 21 | * | * | * | * | * | * |
| 4-bit: 0.8, 8-bit: 0.2 | Proportional | 16 | 24 | 51 | * | * | * | * |
| 4-bit: 0.8, 8-bit: 0.2 | FedHQ+ | 13 | 18 | 23 | 35 | 47 | 53 | 69 |
| 4-bit: 1, 8-bit: 0 | FedAvg | 14 | 32 | * | * | * | * | * |
| 4-bit: 1, 8-bit: 0 | FedHQ+ | 16 | 20 | 32 | 52 | 79 | * | * |

Table 2: Number of communication rounds to reach different target accuracies on the MNIST dataset, non-IID partition.

| Quantization bits: ratio | Scheme | 60% | 70% | 80% | 90% | 92% | 94% | 95% |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 4-bit: 0, 8-bit: 1 | FedAvg | 12 | 18 | 26 | 39 | 49 | 55 | 75 |
| 4-bit: 0, 8-bit: 1 | FedHQ+ | 11 | 19 | 22 | 32 | 36 | 55 | 74 |
| 4-bit: 0.2, 8-bit: 0.8 | FedAvg | 13 | 17 | 19 | 41 | 43 | 60 | 96 |
| 4-bit: 0.2, 8-bit: 0.8 | Proportional | 14 | 18 | 22 | 34 | 44 | 57 | 92 |
| 4-bit: 0.2, 8-bit: 0.8 | FedHQ+ | 13 | 20 | 28 | 41 | 43 | 60 | 96 |
| 4-bit: 0.4, 8-bit: 0.6 | FedAvg | 19 | 23 | 25 | 43 | 51 | 76 | 131 |
| 4-bit: 0.4, 8-bit: 0.6 | Proportional | 20 | 22 | 23 | 42 | 46 | 68 | 119 |
| 4-bit: 0.4, 8-bit: 0.6 | FedHQ+ | 13 | 17 | 25 | 42 | 43 | 55 | 87 |
| 4-bit: 0.6, 8-bit: 0.4 | FedAvg | 21 | 37 | * | * | * | * | * |
| 4-bit: 0.6, 8-bit: 0.4 | Proportional | 19 | 42 | * | * | * | * | * |
| 4-bit: 0.6, 8-bit: 0.4 | FedHQ+ | 19 | 26 | 31 | 51 | 59 | 87 | 133 |
| 4-bit: 0.8, 8-bit: 0.2 | FedAvg | 22 | 42 | * | * | * | * | * |
| 4-bit: 0.8, 8-bit: 0.2 | Proportional | 16 | 41 | 50 | * | * | * | * |
| 4-bit: 0.8, 8-bit: 0.2 | FedHQ+ | 21 | 23 | 31 | * | * | * | * |
| 4-bit: 1, 8-bit: 0 | FedAvg | 17 | 42 | * | * | * | * | * |
| 4-bit: 1, 8-bit: 0 | FedHQ+ | 19 | 39 | * | * | * | * | * |

- To run the FedHQ experiment with CIFAR under IID condition using GPU:
python src/FedHQ_main.py --dataset=cifar --epochs=300 --frac=0.1 --local_ep=5 --local_bs=128 --average_scheme=FedHQ --bit_4_ratio=0 --bit_8_ratio=1
- To run the FedHQ experiment with CIFAR under non-IID condition using GPU:
python src/FedHQ_main.py --dataset=cifar --epochs=150 --iid=0 --frac=0.1 --local_ep=5 --local_bs=64 --momentum=0.2 --average_scheme=FedHQ --bit_4_ratio=0 --bit_8_ratio=1
Parameter settings are as follows (only parameters that differ from the defaults are listed):
- epochs: 300 for IID, 150 for non-IID
- frac: 0.1
- local_ep: 5
- local_bs: 128
Learning-rate decay is 0.9 per ten rounds. The ratios of 4-bit quantization clients are [0, 0.2, 0.3, 0.4, 0.6, 0.8, 1].
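One way to turn a 4-bit client ratio into a concrete per-client assignment is sketched below. This is an assumption about how the ratio might be applied, not the repository's code: `assign_bit_widths` is a hypothetical helper that gives 4-bit quantization to a rounded share of the clients, 8-bit to the rest, and shuffles the result with a fixed seed.

```python
import random


def assign_bit_widths(num_users, bit_4_ratio, seed=0):
    """Return a list of bit-widths (4 or 8), one per client.

    round(num_users * bit_4_ratio) clients get 4 bits; the rest get 8 bits.
    The order is shuffled reproducibly using `seed`.
    """
    if not 0.0 <= bit_4_ratio <= 1.0:
        raise ValueError("bit_4_ratio must lie in [0, 1]")
    num_4bit = round(num_users * bit_4_ratio)
    bits = [4] * num_4bit + [8] * (num_users - num_4bit)
    random.Random(seed).shuffle(bits)
    return bits
```

For example, with 100 clients and a 4-bit ratio of 0.3, the returned list contains 30 entries equal to 4 and 70 entries equal to 8.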
Table 3: Number of communication rounds to reach different target accuracies on the CIFAR dataset, IID partition.

| Quantization bits: ratio | Scheme | 60% | 70% | 80% | 82% | 84% | 86% |
| --- | --- | --- | --- | --- | --- | --- | --- |
| 4-bit: 0, 8-bit: 1 | FedAvg | 13 | 22 | 45 | 53 | 69 | 94 |
| 4-bit: 0, 8-bit: 1 | FedHQ+ | 12 | 22 | 42 | 53 | 68 | 94 |
| 4-bit: 0.2, 8-bit: 0.8 | FedAvg | 58 | 126 | * | * | * | * |
| 4-bit: 0.2, 8-bit: 0.8 | Proportional | 31 | 58 | 179 | 285 | * | * |
| 4-bit: 0.2, 8-bit: 0.8 | FedHQ+ | 14 | 26 | 56 | 69 | 97 | 133 |
| 4-bit: 0.3, 8-bit: 0.7 | FedAvg | 144 | 276 | * | * | * | * |
| 4-bit: 0.3, 8-bit: 0.7 | Proportional | 73 | 109 | * | * | * | * |
| 4-bit: 0.3, 8-bit: 0.7 | FedHQ+ | 13 | 23 | 51 | 66 | 92 | 126 |
| 4-bit: 0.4, 8-bit: 0.6 | FedAvg | * | * | * | * | * | * |
| 4-bit: 0.4, 8-bit: 0.6 | Proportional | 119 | 227 | * | * | * | * |
| 4-bit: 0.4, 8-bit: 0.6 | FedHQ+ | 17 | 25 | 57 | 76 | 98 | 199 |
| 4-bit: 0.6, 8-bit: 0.4 | FedAvg | * | * | * | * | * | * |
| 4-bit: 0.6, 8-bit: 0.4 | Proportional | * | * | * | * | * | * |
| 4-bit: 0.6, 8-bit: 0.4 | FedHQ+ | 18 | 33 | 100 | 184 | * | * |
| 4-bit: 0.8, 8-bit: 0.2 | FedAvg | * | * | * | * | * | * |
| 4-bit: 0.8, 8-bit: 0.2 | Proportional | * | * | * | * | * | * |
| 4-bit: 0.8, 8-bit: 0.2 | FedHQ+ | 84 | * | * | * | * | * |
| 4-bit: 1, 8-bit: 0 | FedAvg | * | * | * | * | * | * |
| 4-bit: 1, 8-bit: 0 | FedHQ+ | * | * | * | * | * | * |

Table 4: Number of communication rounds to reach different target accuracies on the CIFAR dataset, non-IID partition.

| Quantization bits: ratio | Scheme | 30% | 35% | 40% | 45% | 50% | 55% |
| --- | --- | --- | --- | --- | --- | --- | --- |
| 4-bit: 0, 8-bit: 1 | FedAvg | 9 | 15 | 15 | 30 | 48 | 73 |
| 4-bit: 0, 8-bit: 1 | FedHQ+ | 9 | 14 | 15 | 27 | 48 | 71 |
| 4-bit: 0.2, 8-bit: 0.8 | FedAvg | 31 | 48 | 93 | * | * | * |
| 4-bit: 0.2, 8-bit: 0.8 | Proportional | 18 | 27 | 38 | 93 | * | * |
| 4-bit: 0.2, 8-bit: 0.8 | FedHQ+ | 18 | 57 | 60 | 83 | 93 | * |
| 4-bit: 0.3, 8-bit: 0.7 | FedAvg | 48 | 110 | * | * | * | * |
| 4-bit: 0.3, 8-bit: 0.7 | Proportional | 14 | 27 | 38 | 93 | * | * |
| 4-bit: 0.3, 8-bit: 0.7 | FedHQ+ | 11 | 14 | 18 | 38 | 88 | 110 |
| 4-bit: 0.4, 8-bit: 0.6 | FedAvg | * | * | * | * | * | * |
| 4-bit: 0.4, 8-bit: 0.6 | Proportional | 93 | * | * | * | * | * |
| 4-bit: 0.4, 8-bit: 0.6 | FedHQ+ | 22 | 49 | 60 | 60 | 93 | * |
| 4-bit: 0.6, 8-bit: 0.4 | FedAvg | * | * | * | * | * | * |
| 4-bit: 0.6, 8-bit: 0.4 | Proportional | * | * | * | * | * | * |
| 4-bit: 0.6, 8-bit: 0.4 | FedHQ+ | 33 | 40 | 62 | * | * | * |
| 4-bit: 0.8, 8-bit: 0.2 | FedAvg | * | * | * | * | * | * |
| 4-bit: 0.8, 8-bit: 0.2 | Proportional | * | * | * | * | * | * |
| 4-bit: 0.8, 8-bit: 0.2 | FedHQ+ | 117 | * | * | * | * | * |
| 4-bit: 1, 8-bit: 0 | FedAvg | * | * | * | * | * | * |
| 4-bit: 1, 8-bit: 0 | FedHQ+ | * | * | * | * | * | * |