We provide a bash script, `train.sh`, to train the models. You can modify the hyperparameters in the script to suit your needs.
For example, to train ELANDarkNet-S (designed in this repo) on 8 GPUs, use the following command:
bash train.sh elandarknet_s imagenet_1k path/to/imagenet_1k 8 1699 None
Evaluate the top-1 and top-5 accuracy of ELANDarkNet-S on the ImageNet-1K dataset:
python main.py --cuda -dataset imagenet_1k --root path/to/imagenet_1k -m elandarknet_s --batch_size 256 --img_size 224 --eval --resume path/to/elandarknet_s.pth
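
For reference, top-1 and top-5 accuracy can be computed from per-class scores as in the minimal Python sketch below. It is an illustration only, not the code in `main.py`; the function name `topk_accuracy` and its list-based inputs are assumptions made for this example.

```python
from typing import Dict, Sequence


def topk_accuracy(
    scores: Sequence[Sequence[float]],
    labels: Sequence[int],
    ks: Sequence[int] = (1, 5),
) -> Dict[int, float]:
    """Return {k: accuracy in percent} for each k in ks.

    scores[i][c] is the model score of class c for sample i (hypothetical layout).
    """
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")
    if len(labels) == 0:
        raise ValueError("need at least one sample")

    hits = {k: 0 for k in ks}
    for row, label in zip(scores, labels):
        # Class indices sorted from highest to lowest score.
        ranked = sorted(range(len(row)), key=lambda c: row[c], reverse=True)
        for k in ks:
            # A sample is correct at k if its label is among the k best classes.
            if label in ranked[:k]:
                hits[k] += 1
    return {k: 100.0 * hits[k] / len(labels) for k in ks}


if __name__ == "__main__":
    demo_scores = [
        [0.10, 0.70, 0.08, 0.06, 0.04, 0.02],  # label 1: top-1 hit
        [0.30, 0.10, 0.25, 0.20, 0.09, 0.06],  # label 2: top-5 hit only
        [0.05, 0.10, 0.15, 0.20, 0.22, 0.28],  # label 0: miss (ranked last of 6)
        [0.01, 0.02, 0.03, 0.80, 0.10, 0.04],  # label 3: top-1 hit
    ]
    print(topk_accuracy(demo_scores, [1, 2, 0, 3]))  # {1: 50.0, 5: 75.0}
```

The acc@1 column in the tables below corresponds to the `k = 1` entry of such a result, computed over the ImageNet-1K validation set.
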
Tips:
- Weak augmentation: random hflip and random crop resize.
- Strong augmentation: mixup, cutmix, rand aug, random erase, and so on. However, we don't use strong augmentation.
- AdamW with weight decay = 0.05 and base lr = 4e-3 (for a batch size of 4096) is used as the optimizer, and CosineAnnealingLR is used as the lr scheduler, with the min lr set to 1e-6.

| Model | Augment | Batch | Epoch | size | acc@1 | GFLOPs | Params | Weight |
|---|---|---|---|---|---|---|---|---|
| DarkNet-S | weak | 4096 | 100 | 224 | 68.5 | 1.6 | 4.6 M | ckpt |
| DarkNet-M | weak | 4096 | 100 | 224 | ||||
| DarkNet-L | weak | 4096 | 100 | 224 | ||||
| DarkNet-X | weak | 4096 | 100 | 224 | | | | |

| Model | Augment | Batch | Epoch | size | acc@1 | GFLOPs | Params | Weight |
|---|---|---|---|---|---|---|---|---|
| CSPDarkNet-S | weak | 4096 | 100 | 224 | 70.2 | 1.3 | 4.0 M | ckpt |
| CSPDarkNet-M | weak | 4096 | 100 | 224 | ||||
| CSPDarkNet-L | weak | 4096 | 100 | 224 | ||||
| CSPDarkNet-X | weak | 4096 | 100 | 224 | | | | |

| Model | Augment | Batch | Epoch | size | acc@1 | GFLOPs | Params | Weight |
|---|---|---|---|---|---|---|---|---|
| ELANDarkNet-N | weak | 4096 | 100 | 224 | 62.1 | 0.38 | 1.36 M | ckpt |
| ELANDarkNet-S | weak | 4096 | 100 | 224 | 71.3 | 1.48 | 4.94 M | ckpt |
| ELANDarkNet-M | weak | 4096 | 100 | 224 | | 4.67 | 11.60 M | |
| ELANDarkNet-L | weak | 4096 | 100 | 224 | | 10.47 | 19.66 M | |
| ELANDarkNet-X | weak | 4096 | 100 | 224 | | 20.56 | 37.86 M | |

| Model | Augment | Batch | Epoch | size | acc@1 | GFLOPs | Params | Weight |
|---|---|---|---|---|---|---|---|---|
| GELAN-S | weak | 4096 | 100 | 224 | 68.4 | 0.9 | 1.9 M | ckpt |
| GELAN-C | weak | 4096 | 100 | 224 | | 5.2 | 8.8 M | [ckpt]() |
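
The optimizer settings listed in the Tips can be turned into concrete numbers with a small sketch. It assumes the linear scaling rule for batch sizes other than 4096, one scheduler step per epoch over the 100 epochs listed in the tables, and no warmup. None of these assumptions is confirmed by this README, and the helper names `scaled_lr` and `cosine_annealing_lr` are hypothetical.

```python
import math


def scaled_lr(batch_size: int, base_lr: float = 4e-3, base_batch_size: int = 4096) -> float:
    """Linear scaling rule (assumption): lr is proportional to the batch size."""
    return base_lr * batch_size / base_batch_size


def cosine_annealing_lr(
    epoch: int,
    max_lr: float,
    total_epochs: int = 100,
    min_lr: float = 1e-6,
) -> float:
    """Standard cosine annealing: max_lr at epoch 0, min_lr at epoch == total_epochs."""
    progress = epoch / total_epochs
    return min_lr + 0.5 * (max_lr - min_lr) * (1.0 + math.cos(math.pi * progress))


if __name__ == "__main__":
    peak = scaled_lr(batch_size=4096)  # 4e-3, the base lr from the Tips
    for epoch in (0, 25, 50, 75, 100):
        print(epoch, cosine_annealing_lr(epoch, peak))
```

Under these assumptions, a run with a batch size of 1024 would start from a peak lr of 1e-3 instead of 4e-3 and decay to the same minimum of 1e-6.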